The Cloud+ Community Technology Salon on "Tencent open source technology" recently came to a successful close. The salon invited several Tencent technical experts to present Tencent's open source projects in depth, including TencentOS tiny, TubeMQ, Kona JDK, TARS, and MedicalNet. This article is a detailed introduction to MedicalNet, a project that provides pretrained models for 3D medical imaging built from large-scale data.
1、 Overview of medical imaging AI
Medical imaging AI aims to address a worldwide problem: patients struggle to get seen by a doctor, and doctors are overloaded with diagnostic work.
Because training medical staff takes heavy investment and many years, their number cannot grow significantly in the short term. Artificial intelligence, however, can assist medical work and ease the current shortage of medical resources.
In the medical field, artificial intelligence serves two main functions: population-based screening and improving the quality of diagnosis.
For some simple diseases, artificial intelligence can reach high diagnostic performance and can therefore be used for disease screening, which partly offsets the shortage of medical staff. For diseases that are harder to diagnose and treat, artificial intelligence can give doctors a reference for their diagnosis and act as a reminder.
Medical images carry rich diagnostic information and are among the most common tools in medical diagnosis. A medical imaging AI system is "manufactured" as follows: collect annotated data, train an AI model on that data, and finally feed the patient's images into the system to obtain a diagnosis close to that of a senior doctor.
2、 The relationship between MedicalNet and the development of medical imaging AI
In recent years, advances in image and video recognition have greatly helped medical imaging AI. However, because medical staff are in short supply, labeling data is difficult, so very little labeled data from the same distribution is available for training. This conflicts with the data-driven nature of deep learning and has become the bottleneck in the development of medical imaging AI.
Medical imaging AI research therefore urgently needs large-scale datasets and corresponding models that can support the many applications that have only small amounts of data. This is exactly the motivation behind MedicalNet.
Although each open 3D medical dataset with a single distribution is small, the datasets from many medical scenarios can together form a large-scale dataset. The MedicalNet development team collects the datasets from these scenarios, trains different pretrained models on them, and then open-sources those models.
In this way, a user who needs to train a new model can use a MedicalNet model directly for transfer learning. Even if the new application has little data, the user can still train a working model.
3、 Technical implementation of MedicalNet
Implementing MedicalNet raised many technical problems. Within the images, pixel values differ in meaning, value ranges vary widely, artifacts are frequent, imaging quality is low, boundaries are blurry, and contrast is low. Across datasets, the data comes from different sources and often lacks annotation. In addition, the same tissue appears at different resolutions, and different tissues vary greatly in scale.
The MedicalNet development team addresses these problems mainly with two schemes.
The first is a dataset screening scheme, whose main purpose is to find the datasets that share common knowledge. Concretely: take a small amount of data from each scenario's dataset to form a mini proxy dataset, use it to quickly train a small network, and then decide which datasets to keep based on the quality of the segmentation predictions on the mini dataset.
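The screening step could be sketched roughly as follows. This is a minimal illustration rather than the MedicalNet team's code: the source only says that datasets are kept or dropped according to the quality of the segmentation predictions on the mini dataset. The Dice score as the quality measure, the `dice` and `screen_datasets` helpers, the 0.5 threshold, and the dataset names are all assumptions made for the example.

```python
def dice(pred, target):
    """Dice overlap between two flat binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def screen_datasets(proxy_results, threshold=0.5):
    """Keep datasets whose mean Dice on the mini dataset reaches the threshold.

    proxy_results maps a dataset name to a list of (pred, target) mask pairs
    produced by the small network trained on the mini dataset.
    The threshold is an assumed value, not one taken from the source.
    """
    kept = []
    for name, pairs in proxy_results.items():
        if not pairs:
            continue  # nothing to judge this dataset by
        scores = [dice(pred, target) for pred, target in pairs]
        if sum(scores) / len(scores) >= threshold:
            kept.append(name)
    return kept


# Hypothetical predictions for two made-up datasets.
proxy_results = {
    "liver_ct": [([1, 1, 0, 0], [1, 1, 0, 0]), ([1, 0, 0, 0], [1, 1, 0, 0])],
    "noisy_mr": [([1, 0, 0, 0], [0, 0, 1, 1])],
}
kept = screen_datasets(proxy_results)
```

In this toy run the small network segments `liver_ct` reasonably well and fails on `noisy_mr`, so only the former would be kept for joint training.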
Once the datasets have been screened, the second scheme, joint training, is applied. The data is first preprocessed with spatial and pixel normalization. To obtain as much annotation information as possible, MedicalNet uses segmentation datasets.
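The source does not say which normalization methods are used, so the following is only a sketch under assumptions: spatial normalization is taken to mean resampling every volume to a common voxel spacing (only the target grid size is computed here), and pixel normalization is taken to mean rescaling intensities to zero mean and unit variance. The function names and the example volume are hypothetical.

```python
import statistics


def resampled_shape(shape, spacing, target_spacing):
    """Grid size a volume needs so that its voxels have target_spacing.

    shape is in voxels; spacings are in millimetres per voxel, one per axis.
    """
    return tuple(
        max(1, round(n * s / t))
        for n, s, t in zip(shape, spacing, target_spacing)
    )


def normalize_intensity(values):
    """Rescale intensities to zero mean and unit standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant volume carries no contrast; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical CT volume: 512 x 512 x 100 voxels at 0.7 x 0.7 x 2.5 mm,
# resampled to an assumed common spacing of 1 mm in every direction.
new_shape = resampled_shape((512, 512, 100), (0.7, 0.7, 2.5), (1.0, 1.0, 1.0))
normalized = normalize_intensity([-1000.0, 0.0, 40.0, 400.0])
```

Bringing every volume to the same spacing and intensity scale is what lets datasets from different scanners and scenarios be mixed in one training run.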
MedicalNet consists of an encoder and a decoder, both of which are open source. So that the encoder captures as much information as possible, most of the parameters are placed in the encoder. To handle inconsistent annotations across datasets, the decoder uses multi-task learning to keep the annotations of different scenarios separate.
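One way to picture this multi-task arrangement: a single shared encoding part feeds one decoding head per source dataset, and each sample is routed only to the head of the dataset it came from, so conflicting label definitions never share an output. The sketch below is structural only and uses plain Python callables in place of real 3D network layers. The class name, the dataset names, and the stand-in functions are illustrative assumptions, not MedicalNet's actual code.

```python
class SharedEncoderMultiHead:
    """One shared encoder, one decoder head per source dataset."""

    def __init__(self, encoder, heads):
        self.encoder = encoder
        self.heads = dict(heads)

    def forward(self, volume, dataset_name):
        if dataset_name not in self.heads:
            raise KeyError(f"no decoder head for dataset {dataset_name!r}")
        features = self.encoder(volume)  # shared across all datasets
        return self.heads[dataset_name](features)  # dataset-specific labels


def toy_encoder(volume):
    """Stand-in for a real encoder: sums the intensities of each slice."""
    return [sum(slice_) for slice_ in volume]


# Stand-in heads, each with its own label convention.
heads = {
    "liver_ct": lambda feats: [1 if f > 2 else 0 for f in feats],  # 1 = liver
    "brain_mr": lambda feats: [2 if f > 2 else 0 for f in feats],  # 2 = tumor
}

model = SharedEncoderMultiHead(toy_encoder, heads)
volume = [[1, 1, 1], [0, 0, 1]]
liver_mask = model.forward(volume, "liver_ct")
brain_mask = model.forward(volume, "brain_mr")
```

Both calls reuse the same features, which is why the shared part can accumulate knowledge from every dataset while the label spaces stay separate.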
During training, different combinations of skip connections are used to alleviate the vanishing-gradient problem. After training, the encoder can be transferred to models for segmentation, classification, detection, and other tasks.
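Transferring the encoding part to a new task usually amounts to copying the pretrained parameters into the new model and leaving the task-specific layers at their fresh initialization. The sketch below shows that bookkeeping on plain dictionaries standing in for a framework's parameter tables. The `encoder.` key prefix, the parameter names, and the helper function are assumptions for illustration only.

```python
def transfer_encoder_weights(pretrained, target, prefix="encoder."):
    """Copy pretrained parameters under `prefix` into a copy of `target`.

    Only keys present in both dictionaries are copied; every other entry of
    `target` (for example a new classification head) is left untouched.
    Returns the merged dictionary and the list of transferred keys.
    """
    merged = dict(target)
    transferred = []
    for key, value in pretrained.items():
        if key.startswith(prefix) and key in merged:
            merged[key] = value
            transferred.append(key)
    return merged, transferred


# Hypothetical parameter tables; real ones would hold tensors, not floats.
pretrained = {
    "encoder.conv1.weight": 0.12,
    "encoder.layer1.weight": -0.40,
    "decoder.liver_ct.weight": 0.90,  # segmentation head, not transferred
}
new_classifier = {
    "encoder.conv1.weight": 0.0,
    "encoder.layer1.weight": 0.0,
    "classifier.fc.weight": 0.05,  # new task-specific layer
}
merged, transferred = transfer_encoder_weights(pretrained, new_classifier)
```

After the copy, the new model starts from the pretrained encoder while its own head is trained from scratch on the small target dataset.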
The final experimental results show that, in 3D medical imaging applications, MedicalNet helps networks in small-data scenarios converge faster and achieve better prediction performance.
Q: Is the code used by MedicalNet open source? Has MedicalNet been put to practical use in hospitals?
A: The MedicalNet code has been open-sourced. For details, see https://github.com/tencent/me… MedicalNet has also been used in several deployed modules.
CHEN Si Hong, a senior researcher in vision algorithms at Tencent, began working on medical imaging AI in 2014 and has published papers at top conferences and in top journals such as MICCAI and TMI. The work focuses mainly on the research and application of deep learning for medical video and 3D images.