Today, the latest medical imaging technologies such as computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound (US) not only achieve high-resolution imaging of tumor lesions, but also support multimodal structural and functional imaging, enabling non-invasive detection of tumors. However, clinical diagnosis and treatment planning based on these images still rely mainly on the personal experience and subjective judgment of physicians, and with the growing diversity of imaging modalities and the ever-increasing volume of data, relying on individual judgment alone is becoming more and more challenging. How to use AI to carry out automatic and accurate diagnosis and treatment planning from tumor image data has therefore long been a hot, and difficult, research topic.
More and more medical users are looking for elastic, secure, efficient, and highly available solutions on AWS. Given the nature of the healthcare industry, these users require that machine learning workflows in the cloud integrate with other AWS services for monitoring, security, and auditing to meet HIPAA requirements, while also deploying flexibly and integrating seamlessly with their local business environment. Medical + AI users such as Yitikang and Crystal Technology have greatly shortened the time from conception through development to product deployment by rapidly building service platforms on AWS, and more and more users are finding that the technical advantages of AWS make model training for medical AI easier. Using open-source medical image data and a semantic segmentation algorithm as an example, this article explores the business scenarios and advantages of Amazon SageMaker in accelerating the construction of custom medical AI image segmentation algorithms.
Amazon SageMaker is a fully managed service that helps developers and data scientists quickly build, train, and deploy machine learning (ML) models. SageMaker removes the heavy lifting from every step of the machine learning process, making it easier to develop high-quality models. Compared with an on-premises AI cluster, SageMaker has the following advantages:
- SageMaker is designed from the ground up with data security, access control, audit compliance, version control, and related concerns in mind, helping every step of an AI project conform to industry regulations around the world and to AI engineering best practices.
- SageMaker decouples AI development into steps such as data processing, estimator construction, fit, and deploy, each triggered with a single call through the SageMaker API or SDK. This provides high flexibility and greatly simplifies the operational work of cluster scheduling, distributed training, data movement, and version control. The console also monitors the parameters and metrics of each task in real time, so developers have a clear picture of their jobs during fast algorithm iteration. Finally, with newer features such as SageMaker Studio (a web-based visual IDE), Autopilot (automatic model building, training, and tuning), Experiments (organizing, tracking, and evaluating training runs), and Debugger (analyzing, detecting, and alerting on problems during training), SageMaker has become a powerful assistant for ML users across the entire development cycle.
- SageMaker is tightly integrated with other AWS services, giving users fine-grained access control, detailed performance monitoring, and post-hoc auditing during training and deployment, with automatically generated reports that help meet compliance requirements. Through integration with database, big data analytics, streaming, ETL, and other services, users can achieve genuine data insight and data-driven decision-making.
- SageMaker supports Spot Instances for training. By using spare EC2 capacity in the AWS cloud, training costs can be reduced by up to 90%, and deploying models with Elastic Inference can reduce serving costs by up to 75%.
SageMaker control panel
Search for the SageMaker service in the AWS Management Console; opening it brings up the SageMaker dashboard.
Under Notebook instances, you can see the list of notebook instances on the left. If you need to create a notebook instance, click Create notebook instance in the upper right corner, then follow the steps at https://docs.aws.amazon.com/zh_cn/sagemaker/latest/dg/gs-setup-working-env.html.
When the notebook instance status turns green ("InService"), click Open Jupyter or Open JupyterLab, then click New → Terminal in the upper right corner.
Data set download and preprocessing
The dataset and code used in this experiment are stored in an S3 bucket. Please download them and run `unzip blog_files.zip` to extract the files.
Open train.py to view the original script. Note that lines 30 through 39 set the model and training-data paths for local training (without SageMaker) and cloud training (with SageMaker). When SageMaker wraps training in an estimator, the training instance first downloads the raw data from the specified S3 bucket into os.environ['SM_INPUT_DIR'], the trained model is written to os.environ['SM_MODEL_DIR'], and the model is uploaded to the specified S3 location when training ends. Hyperparameters are passed in through argparse.
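The path-switching and argument-parsing logic described above can be sketched as follows. This is a minimal illustration, not the exact contents of train.py: the flag names (`--epochs`, `--batch-size`) and the local fallback paths are assumptions.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse hyperparameters and the SageMaker I/O paths.

    Inside a SageMaker training container, SM_MODEL_DIR and
    SM_CHANNEL_TRAINING are set automatically; when running locally
    (no SageMaker), we fall back to relative paths instead.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--batch-size", type=int, default=4)
    parser.add_argument(
        "--model-dir",
        default=os.environ.get("SM_MODEL_DIR", "./model"),
        help="where the trained weights are written",
    )
    parser.add_argument(
        "--data-dir",
        default=os.environ.get("SM_CHANNEL_TRAINING", "./data"),
        help="where the training data has been downloaded",
    )
    return parser.parse_args(argv)
```

Because the SageMaker paths come from environment variables with local defaults, the same script runs unchanged in both environments.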
We can then run the script with Python in the terminal to verify that it is correct.
Next, we initialize the script and specify the S3 path and region.
Upload raw data to S3
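As a hedged sketch of this step (assuming the SageMaker Python SDK is installed and AWS credentials are configured; the local directory and key prefix names are illustrative):

```python
def upload_training_data(local_dir="data", bucket=None, prefix="medical-seg/input"):
    """Upload the raw training data to S3 via the SageMaker SDK.

    The sagemaker import lives inside the function so this file can be
    imported and inspected without an active AWS session.
    """
    import sagemaker  # requires the SageMaker Python SDK and AWS credentials

    session = sagemaker.Session()
    # Fall back to the account's default SageMaker bucket when none is given.
    bucket = bucket or session.default_bucket()
    # upload_data returns the S3 URI of the uploaded prefix.
    return session.upload_data(path=local_dir, bucket=bucket, key_prefix=prefix)
```

The returned S3 URI is what we later pass to the estimator's fit() call as the training channel.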
Next, we use the PyTorch class in the SageMaker SDK to wrap the estimator. If you want to check again on the notebook instance whether the training process is reasonable, you can specify train_instance_type='local' (or 'local_gpu' on a GPU instance).
You can also train on managed training instances to flexibly accelerate an existing script. In particular, you can use Spot Instances for training, saving up to 90% of training costs. To use Spot Instances, set train_use_spot_instances=True when constructing the estimator, and set train_max_wait as described at https://docs.aws.amazon.com/sagemaker/latest/dg/model-managed-spot-training.html#model-managed-spot-training-using.
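Putting the local-mode and Spot settings together, the estimator construction might look like the sketch below. The instance types, framework version, timeout values, and hyperparameter names are assumptions (using the SDK v1 `train_*` parameter names that the text refers to):

```python
def make_estimator(role, use_spot=False, local_mode=False):
    """Build a PyTorch estimator for train.py.

    local_mode=True runs training on the notebook instance itself;
    use_spot=True requests spare EC2 capacity at a steep discount.
    The sagemaker import is kept inside the function so this file can
    be imported without the SDK installed.
    """
    from sagemaker.pytorch import PyTorch  # SageMaker Python SDK

    spot_kwargs = {}
    if use_spot:
        spot_kwargs = {
            "train_use_spot_instances": True,
            "train_max_run": 3600,   # hard limit on training seconds
            "train_max_wait": 7200,  # max seconds to wait for Spot capacity
        }
    return PyTorch(
        entry_point="train.py",
        role=role,
        framework_version="1.4.0",
        train_instance_count=1,
        train_instance_type="local_gpu" if local_mode else "ml.p3.2xlarge",
        hyperparameters={"epochs": 10, "batch-size": 4},
        **spot_kwargs,
    )
```

Calling `estimator.fit({'training': s3_input_path})` with the S3 URI of the uploaded data then launches the training job.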
Under Training jobs in the console, you can see that the job is running. Click its name to see the detailed configuration and monitoring information of the training job.
Click the job name to open the detailed configuration page.
Note: stopping the Jupyter notebook during training does not stop the training job. You can check the progress of the current job under Training jobs in the console.
After training finishes, we can see the success message along with the cost savings achieved by the Spot Instance.
SageMaker also supports one-click model deployment.
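A minimal sketch of that deployment call (the endpoint instance type here is an assumption):

```python
def deploy_endpoint(estimator):
    """Deploy the trained estimator behind a real-time HTTPS endpoint.

    deploy() blocks until the endpoint is InService and returns a
    predictor whose predict() method sends inference requests.
    """
    predictor = estimator.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.xlarge",  # illustrative choice
    )
    return predictor
```

Remember to call `predictor.delete_endpoint()` when you are finished, since the endpoint is billed for as long as it is running.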
Note that when using the PyTorch inference image provided by SageMaker, a model_fn must be defined in the script. For details, refer to https://sagemaker.readthedocs.io/en/stable/using_pytorch.html.
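For illustration, such a model_fn hook might look like the following (the weights file name model.pth inside the artifact is an assumption):

```python
def model_fn(model_dir):
    """Model-loading hook called once by the SageMaker PyTorch serving container.

    model_dir is the directory where SageMaker has extracted the training
    artifact; the torch import is kept inside the function so this file can
    be imported without PyTorch installed.
    """
    import os

    import torch

    model = torch.load(os.path.join(model_dir, "model.pth"), map_location="cpu")
    model.eval()  # switch to inference mode
    return model
```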
In real projects, to meet medical users' flexible deployment requirements, we can also retrieve the trained model directly from S3. Find the output section on the training job's configuration page, then download the model artifact from the corresponding S3 path and decompress it.
The decompressed model file can then be loaded with the usual commands of whichever deep learning framework you use.
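The artifact SageMaker uploads is a model.tar.gz; extracting it locally is plain Python (the helper name here is our own):

```python
import tarfile


def extract_model_artifact(archive_path, out_dir="."):
    """Extract the model.tar.gz that SageMaker wrote to the S3 output path.

    Returns the file names inside the artifact; the extracted weights can
    then be loaded with the matching framework, e.g. torch.load in PyTorch.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(path=out_dir)
        return tar.getnames()
```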
After increasing the number of epochs and adjusting the batch size, more accurate prediction results can be obtained.
In this blog, we took a semantic segmentation algorithm on a public medical dataset as an example to explore the steps for migrating an on-premises deep learning algorithm in the medical field to Amazon SageMaker, and the advantages of doing so. We hope that through our efforts, medical AI developers can focus fully on optimizing their algorithms and benefiting patients.