From Deep Learning for Vision Systems by Mohamed Elgendy
In this part, we will delve into image preprocessing for computer vision systems.
Take 37% off Deep Learning for Vision Systems. Just enter fccelgendy into the discount code box at checkout at manning.com.
Check out part 1 for an intro to the computer vision pipeline and part 2 for an overview of input images.
Image preprocessing
What is image preprocessing?
In machine learning projects in general, you usually go through a data preprocessing or cleaning step. As a machine learning engineer, you’ll spend a good amount of your time cleaning up and preparing the data before you build your learning model. The goal of this step is to make your data easier to analyze and process computationally, and the same is true for images. Based on the problem you’re solving and the dataset in hand, some data massaging is required before you feed your images to the ML model.
Image preprocessing can be as simple as resizing: in order to feed a dataset of images to a convolutional network, the images must all be the same size. Other preprocessing tasks include geometric and color transformations, converting color images to grayscale, and many more.
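To make this concrete, here’s a minimal sketch of resizing with OpenCV (assuming OpenCV is installed; the file name input.jpg is only an illustrative placeholder, not code from the book):

```python
import cv2

# Load an image from disk (the file name is an illustrative placeholder).
image = cv2.imread("input.jpg")

# Resize to one unified shape, e.g. 224 x 224 pixels, so every image in
# the dataset can be fed to the same convolutional network.
resized = cv2.resize(image, (224, 224))

print(image.shape, "->", resized.shape)
```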
Why image preprocessing?
The acquired data are usually messy and come from different sources. To feed them to the ML model (or neural network), they need to be standardized and cleaned up. More often than not, preprocessing is used to reduce the complexity and increase the accuracy of the applied algorithm. We can’t write a unique algorithm for every condition in which an image might be taken; instead, when we acquire an image, we convert it into a form that a general algorithm can handle.
Data preprocessing techniques might include:
- Convert color images to grayscale to reduce computational complexity: in certain problems you’ll find it useful to discard unnecessary information from your images in order to reduce space or computational complexity.
For example, you can convert your color images to grayscale, because for many objects color isn’t necessary to recognize and interpret the image; grayscale can be good enough for recognizing certain objects. Because color images contain more information than grayscale images, they can add unnecessary complexity and take up more space in memory (remember that color images are represented in three channels, so converting to grayscale cuts the number of values to process per pixel from three to one). A code sketch of this conversion appears after this list.
Figure 1
In the example above, you can see how patterns in brightness and darkness of an object (its intensity) can be used to define its shape and characteristics. In other applications, color is important for defining certain objects, such as skin cancer detection, which relies heavily on skin color (red rashes).
- Standardize images: one important constraint in some machine learning algorithms, such as CNNs, is the need to resize the images in your dataset to unified dimensions. This implies that your images must be preprocessed and scaled to identical widths and heights before being fed to the learning algorithm (as in the resizing sketch above).
- Data augmentation: another common preprocessing technique involves augmenting the existing dataset with perturbed versions of the existing images. Scaling, rotations, and other affine transformations are typical. This is done to enlarge your dataset and expose the neural network to a wide variety of variations of your images, making it more likely that your model will recognize objects when they appear in any form and shape (a code sketch appears after this list). Here’s an example of image augmentation applied to a butterfly image:
Figure 3
- Other techniques: many more preprocessing techniques can be used to get your images ready to train the machine learning model. In some projects, you might need to remove the background color from your images to reduce noise. Other projects might require that you brighten or darken your images (a brightness-adjustment sketch appears after this list). In short, any adjustment that you need to apply to your dataset is a form of preprocessing, and you’ll select the appropriate techniques based on the dataset at hand and the problem you’re solving. That’s how you build your intuition about which techniques you’ll need when working on your own projects.
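Here’s a minimal sketch of the grayscale conversion from the first bullet, again assuming OpenCV is installed and using a placeholder file name:

```python
import cv2

# Read a color image; OpenCV loads it as a three-channel BGR array.
color = cv2.imread("butterfly.jpg")

# Collapse the three color channels into a single intensity channel.
gray = cv2.cvtColor(color, cv2.COLOR_BGR2GRAY)

print(color.shape)  # (height, width, 3) -- three values per pixel
print(gray.shape)   # (height, width)    -- one value per pixel
```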
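For data augmentation, one common approach (though by no means the only one) is Keras’s ImageDataGenerator. The sketch below assumes TensorFlow/Keras is installed and uses a random array in place of a real image:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Configure a few typical perturbations: rotations, shifts, zoom, and flips.
augmenter = ImageDataGenerator(
    rotation_range=30,       # rotate by up to +/- 30 degrees
    width_shift_range=0.1,   # shift horizontally by up to 10% of the width
    height_shift_range=0.1,  # shift vertically by up to 10% of the height
    zoom_range=0.2,          # zoom in or out by up to 20%
    horizontal_flip=True,    # randomly mirror left-to-right
)

# A dummy batch of one 224 x 224 RGB image stands in for a real photo.
batch = np.random.rand(1, 224, 224, 3)

# Each call yields a new, randomly perturbed copy of the batch.
augmented = next(augmenter.flow(batch, batch_size=1))
print(augmented.shape)  # (1, 224, 224, 3)
```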
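And for brightening or darkening, a simple linear intensity adjustment with OpenCV looks like this (a sketch under the same placeholder-file assumption):

```python
import cv2

# Load the image to adjust (the path is an illustrative placeholder).
image = cv2.imread("input.jpg")

# new_pixel = alpha * pixel + beta, saturated to the valid 0-255 range.
brighter = cv2.convertScaleAbs(image, alpha=1.2, beta=30)  # boost intensity
darker = cv2.convertScaleAbs(image, alpha=0.7, beta=0)     # scale intensity down
```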
That’s all for now. Keep an eye out for part 4. If you’re interested in learning more about the book, check it out on liveBook here and see this slide deck.