Computer Vision: Image Segmentation


IMAGE SEGMENTATION

Welcome to Part 2! If you have not read it yet, check out the first part of the computer vision blog post.

Segmentation is an image processing technique used in scientific image analysis. It provides information about the regions of interest in an image by organizing the data in images or videos into meaningful categories. Because every pixel is assigned to a category, segmentation describes the content of an image at pixel-level detail.

What is an image segmentation task? An image is represented as an array of numbers, and we want a computer to understand what is in it. The simplest problem is classification, which is what the ImageNet competition is about: for the whole image, you predict only one number, the class label. But what if we also want to find where the object is? For example, with a surveillance camera we may want the computer to recognize where a person is walking and where all the cars are. This is a much more complicated problem. For detection, you predict one number for the class and four numbers for a box around the object, and the situation becomes even more complex when the image contains several objects, because you want to detect every one of them.

The most complicated version of this problem is image segmentation. Here the task is to predict a class for every pixel in the image, so it is a conversion from one array of numbers to another array of numbers: the output is like another image in which every pixel denotes a class.
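The difference between the three outputs can be sketched on a toy image. This is only an illustration: the class ids, the box layout (column, row, width, height), and the cutoff of 100 used to build the mask are assumptions made for this example, not part of any particular model.

```python
# Toy 4x4 grayscale image: an array of numbers (0 = dark, 255 = bright).
image = [
    [0,   0,   0, 0],
    [0, 200, 210, 0],
    [0, 205, 220, 0],
    [0,   0,   0, 0],
]

# Hypothetical class ids used only for this illustration.
BACKGROUND, OBJECT = 0, 1

# Classification: one number for the whole image.
classification_output = OBJECT

# Detection: one class id plus four numbers for the box
# (assumed layout: left column, top row, width, height).
detection_output = (OBJECT, 1, 1, 2, 2)

# Segmentation: one class id per pixel, same height and width as the image.
# The cutoff of 100 is an arbitrary choice that fits this toy image.
segmentation_output = [
    [OBJECT if value > 100 else BACKGROUND for value in row]
    for row in image
]

for row in segmentation_output:
    print(row)
```

The segmentation output has exactly the same shape as the input, which is what makes it "another image where every pixel denotes some class".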

Types of Image segmentation

1. Semantic segmentation

2. Instance segmentation

Semantic Segmentation

Semantic segmentation is a type of image segmentation in which all pixels belonging to the same class are given the same pixel value. It differs from image classification, object detection, and object tracking: it links every pixel in the image to a class label, and a single image can contain many such labels. In effect, it is image classification at the pixel level.

Semantic segmentation labels every pixel where an object is located. How many pixels there are depends on the resolution of the image, and each one receives a label. Because the labels follow the edges of the object, the result traces the exact shape of the object instead of enclosing it in a bounding box. This gives a more precise picture of the scene than object detection or tracking.

For example, all pixels that belong to a box are assigned to the same “Box” category.

Because every pixel is classified, semantic segmentation recovers the exact shape of the objects in each category. It has one limitation: it cannot tell us how many objects of a given category are in the image. Instance segmentation overcomes this limitation.
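A small sketch makes the limitation concrete. The mask below is a hand-written toy example with hypothetical class ids (0 for background, 1 for "box"). It contains two separate boxes, but since both share the value 1, the mask alone only tells us how many pixels belong to the class.

```python
from collections import Counter

# Hypothetical class ids for this illustration.
CLASS_NAMES = {0: "background", 1: "box"}

# Semantic mask for a toy 5x7 image containing two separate boxes.
# Both boxes get the same value (1) because they share a class.
semantic_mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

# Count pixels per class; nothing here says how many boxes there are.
pixel_counts = Counter(value for row in semantic_mask for value in row)
for class_id, count in sorted(pixel_counts.items()):
    print(f"{CLASS_NAMES[class_id]}: {count} pixels")
```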

Applications of semantic segmentation:

· Autonomous driving (Brake lights, localizing pedestrians, other vehicles, etc.)

· GeoSensing – land usage (forests, crops, roads, buildings, etc.)

· Facial segmentation

· Precision Agriculture

· Virtual fitting rooms

· Medical imaging and diagnostics (locating tumors, planning surgery, studying anatomy, measuring tissue volumes, virtual surgery simulation, etc.)

Instance Segmentation

Instance segmentation is another variation of image segmentation, in which all pixels belonging to the same individual object share a unique pixel value.

Both approaches label every pixel of the image. The difference is that semantic segmentation only assigns a category: wherever it detects a human, the pixels receive the single label “human”, no matter how many people are present. Instance segmentation additionally assigns each object its own unique ID and label.

So if two people appear in the image, one is labeled person 1 and the other person 2. Instance segmentation separates the objects along their edges and gives each one a distinct label and ID.

As a result, we can count the objects in an image. In an example image containing a bottle, cubes, and a cup, we can tell that there are three cubes, one bottle, and one cup. This is the difference between semantic segmentation and instance segmentation.

For example, all pixels of a green cup share the same pixel value. This approach is very useful when you need to measure parameters for each individual object in your image.
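One simple way to see what per-object ids look like is connected-component labeling on a binary mask. This is a stand-in chosen for illustration, not how learned instance segmentation models work, and it only separates objects that do not touch each other. The function name `label_instances` and the toy mask are assumptions made for this sketch.

```python
from collections import deque

def label_instances(binary_mask):
    """Give each 4-connected group of nonzero pixels its own id (1, 2, ...)."""
    height, width = len(binary_mask), len(binary_mask[0])
    labels = [[0] * width for _ in range(height)]
    next_id = 0
    for r in range(height):
        for c in range(width):
            # Skip background pixels and pixels that already have an id.
            if binary_mask[r][c] == 0 or labels[r][c] != 0:
                continue
            next_id += 1
            labels[r][c] = next_id
            queue = deque([(r, c)])
            # Breadth-first flood fill over up/down/left/right neighbors.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary_mask[ny][nx] != 0
                            and labels[ny][nx] == 0):
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
    return labels, next_id

# Toy mask with two separate objects of the same class.
binary_mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

instance_mask, object_count = label_instances(binary_mask)
print("objects found:", object_count)
for row in instance_mask:
    print(row)
```

With a unique id per object, per-object measurements become straightforward, for example counting the pixels that carry id 1 to get the area of the first object.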

Network architectures used for segmentation:

· U-Net, a convolutional neural network developed for biomedical image segmentation.

· Mask R-CNN (Mask Region-Based Convolutional Neural Network), which is used for instance segmentation.

How to segment images?

The approach you take for segmentation depends on the complexity of the image.

Segmenting low-complexity images

In a low-complexity image (a single object), the object can be separated from the background by applying simple histogram-based thresholding. The histogram of pixel values is used to find an appropriate threshold value that separates the object from the background.
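As a minimal sketch of histogram-based thresholding, the example below uses Otsu's method, one common way to pick a threshold from the histogram (the text above does not name a specific method, so this choice is an assumption). It assumes integer pixel values in the range 0-255, and the toy image is made up for the illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(levels):
        weight_low += histogram[t]
        sum_low += t * histogram[t]
        if weight_low == 0:
            continue  # no pixels at or below t yet
        weight_high = total - weight_low
        if weight_high == 0:
            break  # every pixel is already in the low class
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, t
    return best_threshold

# Toy image: a bright 2x2 object on a dark, slightly noisy background.
image = [
    [10,  12,  11, 13],
    [12, 200, 210, 11],
    [10, 205, 220, 13],
    [11,  12,  10, 12],
]

threshold = otsu_threshold([value for row in image for value in row])
mask = [[1 if value > threshold else 0 for value in row] for row in image]

print("threshold:", threshold)
for row in mask:
    print(row)
```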

Segmenting medium-complexity images

As the complexity of the image increases (1-10 objects), machine learning-based approaches become more efficient. When object and background have similar pixel values, histogram-based thresholding struggles to separate them, so features are extracted for each pixel and passed to a classifier instead. Traditional machine learning algorithms such as random forests or support vector machines often yield excellent results even with limited training data, which makes them trainable on any workstation.
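The general pipeline (compute features per pixel, train on a few labeled pixels, predict a label for every pixel) can be sketched without any libraries. To stay dependency-free, the sketch below swaps in a nearest-centroid classifier as a stand-in for a random forest or support vector machine. The two features (pixel value and 3x3 neighborhood mean), the function names, and the toy image are all assumptions made for this illustration.

```python
def pixel_features(image, r, c):
    """Features for one pixel: its own value and the mean of its 3x3 neighborhood."""
    height, width = len(image), len(image[0])
    neighbors = [
        image[y][x]
        for y in range(max(0, r - 1), min(height, r + 2))
        for x in range(max(0, c - 1), min(width, c + 2))
    ]
    return (image[r][c], sum(neighbors) / len(neighbors))

def fit_centroids(image, labeled_pixels):
    """Average the feature vectors of the labeled pixels of each class."""
    sums, counts = {}, {}
    for (r, c), class_id in labeled_pixels.items():
        features = pixel_features(image, r, c)
        acc = sums.setdefault(class_id, [0.0] * len(features))
        for i, feature in enumerate(features):
            acc[i] += feature
        counts[class_id] = counts.get(class_id, 0) + 1
    return {k: tuple(v / counts[k] for v in acc) for k, acc in sums.items()}

def predict_mask(image, centroids):
    """Label every pixel with the class whose centroid is closest in feature space."""
    def nearest(features):
        return min(
            centroids,
            key=lambda k: sum((a - b) ** 2 for a, b in zip(features, centroids[k])),
        )
    return [
        [nearest(pixel_features(image, r, c)) for c in range(len(image[0]))]
        for r in range(len(image))
    ]

# Toy image: dark background, a bright 3x3 object, and one bright
# outlier pixel at row 1, column 1 that belongs to the background.
image = [
    [20,  20, 20,  20,  20,  20],
    [20, 140, 20,  20,  20,  20],
    [20,  20, 20, 160, 160, 160],
    [20,  20, 20, 160, 160, 160],
    [20,  20, 20, 160, 160, 160],
]

# A handful of hand-labeled training pixels: (row, col) -> class id.
labeled_pixels = {(0, 5): 0, (4, 0): 0, (3, 4): 1, (4, 5): 1}

centroids = fit_centroids(image, labeled_pixels)
predicted_mask = predict_mask(image, centroids)
for row in predicted_mask:
    print(row)
```

In this toy case the neighborhood feature keeps the isolated bright pixel in the background class, whereas a plain cutoff halfway between the two brightness levels would have marked it as object. In practice the same structure is typically used with richer features and a classifier such as a random forest.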

Segmenting high-complexity images

Highly complex images (hundreds of objects) are very challenging to segment, and traditional machine learning approaches are not sufficient. Deep learning has proven very successful at segmenting such images, but it requires hundreds or thousands of labeled images, and training a deep learning model may take a long time. Once trained, however, these models can be used in production to segment new images.

U-Net is one such architecture. It arranges convolutional filters in a contraction path, where the input image is progressively scaled down, followed by an expansion path, where the scaled-down information is upscaled back to the original image size.
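The scale-down and scale-up bookkeeping of the two paths can be sketched in a few lines. This is not a U-Net: there are no learned convolutional filters and none of the skip connections a real U-Net uses, only 2x2 max pooling and nearest-neighbor upsampling to show how the spatial size shrinks and then returns to the original. The function names and the 8x8 toy input are assumptions for this illustration, and the pooling assumes even height and width.

```python
def max_pool_2x2(feature_map):
    """Halve height and width by keeping the maximum of each 2x2 block."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]), 2)
        ]
        for r in range(0, len(feature_map), 2)
    ]

def upsample_2x(feature_map):
    """Double height and width by repeating every value (nearest neighbor)."""
    upsampled = []
    for row in feature_map:
        wide_row = [value for value in row for _ in range(2)]
        upsampled.append(wide_row)
        upsampled.append(list(wide_row))
    return upsampled

def shape(feature_map):
    return (len(feature_map), len(feature_map[0]))

# Toy 8x8 input with arbitrary values.
image = [[(r * 8 + c) % 7 for c in range(8)] for r in range(8)]

x = image
shapes = [shape(x)]

# Contraction path: the input is progressively scaled down.
for _ in range(2):
    x = max_pool_2x2(x)
    shapes.append(shape(x))

# Expansion path: the scaled-down information is upscaled back.
for _ in range(2):
    x = upsample_2x(x)
    shapes.append(shape(x))

print(shapes)  # [(8, 8), (4, 4), (2, 2), (4, 4), (8, 8)]
```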

Related Post: Top computer vision tools 2021
