What do we do when we don’t have a lot of training data? This is a quick introduction to using data augmentation in TensorFlow to perform in-memory image transformations during model training to help overcome this data barrier.
The success of image classification is driven, to a large extent, by the availability of large amounts of training data. Setting aside concerns such as overfitting for the moment, the more training image data we have, the better our chances of building an effective model.
But what should we do if we don’t have a lot of training data? A few broad approaches to this particular problem come to mind immediately, notably transfer learning and data augmentation.
Transfer learning is the process of applying an existing trained machine learning model to a setting it was not originally built for. This reuse can save training time and extend the usefulness of existing models, which may have been trained for a long time on very large datasets, using data and compute we do not have access to. If a model has been trained on a large amount of data, we can then fine-tune it to perform well on a small amount of data.
Data augmentation is the increase of the size and diversity of an existing training dataset without manually collecting any new data. Augmented data is obtained by performing a series of preprocessing transformations on existing data; for image data, these transformations can include horizontal and vertical flipping, shearing, cropping, rotation, and so on. Altogether, this augmented data can simulate a variety of slightly different data points, as opposed to just duplicating the same data. The nuances of these “additional” images should be enough to help train a more robust model. Again, that’s the idea.
This article focuses on implementing the second of these approaches, data augmentation, in TensorFlow to alleviate the problem of having only a small amount of image training data; a similar practical treatment of transfer learning will follow later.
How image augmentation helps
When a convolutional neural network learns image features, we want to ensure those features appear in a variety of orientations, so that the trained model can recognize that a human leg may appear either vertically or horizontally in an image. Beyond increasing the raw number of data points, augmentation can help here through transformations such as image rotation. As another example, we could also use vertical flipping to help the model learn to recognize a cat whether it is upright or photographed upside down.
Data augmentation is not a panacea; we shouldn’t expect it to solve all of our small-data problems. But it can be effective in many cases, and it can be used as part of a comprehensive model-training approach, or alongside another dataset expansion technique such as transfer learning.
Image augmentation in TensorFlow
In TensorFlow, data augmentation is accomplished with the ImageDataGenerator class. It is very easy to understand and use. The entire dataset is looped over in each epoch, and the images in the dataset are transformed according to the options and values selected. These transformations are performed in memory, so no additional storage is required (although the save_to_dir parameter can be used to save the augmented images to disk if desired).
If you are using TensorFlow, you might already be using ImageDataGenerator simply to rescale your existing images, without any additional augmentation. It might look like this:
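A minimal sketch, assuming the standard tf.keras preprocessing API:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values from [0, 255] to [0, 1]; no augmentation yet
datagen = ImageDataGenerator(rescale=1.0 / 255)
```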
Updating ImageDataGenerator to perform augmentation might look like this:
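A sketch of one possible configuration, using the parameter values walked through below (the rescale step is carried over from the simple example):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # rescale pixel values to [0, 1]
    rotation_range=20,        # random rotations of up to 20 degrees
    width_shift_range=0.2,    # horizontal shifts of up to 20% of the width
    height_shift_range=0.2,   # vertical shifts of up to 20% of the height
    shear_range=0.2,          # shear intensity (counter-clockwise, in degrees)
    zoom_range=0.2,           # random zoom in the range [0.8, 1.2]
    horizontal_flip=True,     # randomly flip images horizontally
    vertical_flip=True,       # randomly flip images vertically
    fill_mode='nearest',      # fill newly exposed pixels with the nearest values
)
```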
What does that mean?
**rotation_range** - the degree range for random rotations; 20 degrees in the example above

**width_shift_range** - the fraction of total width (when the value is < 1, as here) by which to randomly shift images horizontally; 0.2 in the example above

**height_shift_range** - the fraction of total height (when the value is < 1, as here) by which to randomly shift images vertically; 0.2 in the example above

**shear_range** - the shear angle, in degrees counter-clockwise, used for shear transformations; 0.2 in the example above

**zoom_range** - the range for random zoom; 0.2 in the example above

**horizontal_flip** - Boolean for randomly flipping images horizontally; True in the example above

**vertical_flip** - Boolean for randomly flipping images vertically; True in the example above

**fill_mode** - how points outside the input boundaries are filled, one of 'constant', 'nearest', 'reflect', or 'wrap'; 'nearest' in the example above
You can then use flow_from_directory to specify the location of your training data (and optionally create a separate validation generator as well), and then use fit_generator to train your model, streaming these augmented images to your network during training. An example of this type of code is as follows:
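A self-contained sketch; the tiny randomly generated dataset, directory layout, and small CNN here are illustrative assumptions rather than the original article's code, and in TF 2.x `model.fit` accepts generators directly (`fit_generator` is deprecated):

```python
import os
import tempfile

import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator, save_img

# Create a tiny throwaway dataset on disk: one subdirectory per class,
# a few random images each (stand-in for your real training data)
root = tempfile.mkdtemp()
for label in ('cats', 'dogs'):
    os.makedirs(os.path.join(root, label))
    for i in range(4):
        img = np.random.randint(0, 256, (64, 64, 3)).astype('uint8')
        save_img(os.path.join(root, label, f'{i}.png'), img)

# The augmenting generator from the example above
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    vertical_flip=True,
    fill_mode='nearest',
)

# flow_from_directory infers class labels from the subdirectory names
train_generator = train_datagen.flow_from_directory(
    root,
    target_size=(64, 64),
    batch_size=4,
    class_mode='binary',
)

# A small CNN just to demonstrate wiring the generator into training
model = models.Sequential([
    layers.Conv2D(8, 3, activation='relu', input_shape=(64, 64, 3)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])

# Augmented batches are generated in memory, epoch by epoch
model.fit(train_generator, epochs=1, steps_per_epoch=2)
```

With a real dataset, you would point `flow_from_directory` at your own training folder and add a second generator (without augmentation) for validation data.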