Image thresholding is a process for separating the foreground and background of an image. There are many methods for image thresholding; one of them is the Otsu method, proposed by Nobuyuki Otsu. The Otsu algorithm is a variance-based way to automatically find the threshold value at which the weighted sum of the foreground and background variances is smallest.
Different threshold values assign different pixels to the foreground and the background, so the variance of each group changes with the threshold. The key to the Otsu algorithm is to calculate the total variance from the variances of the two distributions. The process iterates through all possible threshold values and finds the threshold that makes the total variance smallest.
Algorithm
The Otsu method iteratively searches for the threshold value that minimizes the weighted sum of the variances of the two classes (foreground and background). Pixel values usually lie between 0 and 255. So if the threshold value is 100, all pixels with values less than 100 become the background, and all pixels with values greater than or equal to 100 become the foreground of the image. The formula for the variance at any threshold t is given by:
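The split described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_by_threshold` and the sample pixel values are assumptions made for the example, and the image is treated as a flat list of grayscale values.

```python
def split_by_threshold(pixels, t):
    """Split pixel values into (background, foreground) at threshold t.

    Background: values strictly less than t.
    Foreground: values greater than or equal to t.
    """
    background = [p for p in pixels if p < t]
    foreground = [p for p in pixels if p >= t]
    return background, foreground


# Hypothetical pixel values, chosen only for illustration.
pixels = [12, 40, 99, 100, 101, 180, 230, 255]
background, foreground = split_by_threshold(pixels, 100)
print(background)  # [12, 40, 99]
print(foreground)  # [100, 101, 180, 230, 255]
```

Note that a pixel whose value equals the threshold lands in the foreground, matching the "greater than or equal to" rule in the text.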
$$\theta^2(t) = \omega_{bg}(t)\,\theta_{bg}^2(t) + \omega_{fg}(t)\,\theta_{fg}^2(t) \tag{1}$$
where $\omega_{bg}(t)$ and $\omega_{fg}(t)$ are the probabilities that a pixel belongs to each class at threshold $t$ (that is, the fraction of pixels in that class), and $\theta^2$ represents the variance of the pixel values.
So, what is the probability in eq. 1? Here is a detailed explanation. Let’s assume:
$P_{all}$ is the total count of pixels in the image,
$P_{bg}(t)$ is the count of background pixels at threshold $t$,
$P_{fg}(t)$ is the count of foreground pixels at threshold $t$.
So, the weights are given by:
$$\omega_{bg}(t) = \frac{P_{bg}(t)}{P_{all}}, \qquad \omega_{fg}(t) = \frac{P_{fg}(t)}{P_{all}} \tag{2}$$
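As a rough sketch of eq. 2, the weights are just pixel counts divided by the total. The helper name `class_weights` and the sample values are hypothetical, and the sketch assumes a non-empty flat list of grayscale values.

```python
def class_weights(pixels, t):
    """Return (w_bg, w_fg), the fraction of pixels in each class at threshold t.

    Assumes `pixels` is non-empty; an empty list would divide by zero.
    """
    p_all = len(pixels)                      # total pixel count
    p_bg = sum(1 for p in pixels if p < t)   # background: value < t
    p_fg = p_all - p_bg                      # foreground: value >= t
    return p_bg / p_all, p_fg / p_all


# Hypothetical pixel values, chosen only for illustration.
pixels = [12, 40, 99, 100, 101, 180, 230, 255]
w_bg, w_fg = class_weights(pixels, 100)
print(w_bg, w_fg)  # 0.375 0.625
```

Because every pixel falls into exactly one class, the two weights always sum to 1.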
The variance of each class can be calculated using the formula below:
$$\theta^2(t) = \frac{\sum (x_i - \bar{x})^2}{N} \tag{3}$$
where,
$x_i$ is the value of pixel $i$ in the group (bg or fg),
$\bar{x}$ is the mean of the pixel values in the group (bg or fg),
$N$ is the number of pixels in the group.
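Putting eqs. 1 to 3 together, the whole search might look like the sketch below. It is a plain exhaustive search written for readability, not an optimized implementation; the function names, the `levels=256` default, and the sample pixel values are assumptions made for this example, and an empty class is given a variance of 0 by convention.

```python
def variance(values):
    """Population variance of a group (eq. 3); 0.0 for an empty group."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def within_class_variance(pixels, t):
    """Weighted sum of the background and foreground variances (eq. 1)."""
    bg = [p for p in pixels if p < t]
    fg = [p for p in pixels if p >= t]
    w_bg = len(bg) / len(pixels)   # eq. 2
    w_fg = len(fg) / len(pixels)   # eq. 2
    return w_bg * variance(bg) + w_fg * variance(fg)


def otsu_threshold(pixels, levels=256):
    """Try every threshold in 0..levels-1 and keep the one with the
    smallest weighted variance (the first one, if several tie)."""
    return min(range(levels), key=lambda t: within_class_variance(pixels, t))


# Hypothetical pixel values with two clearly separated groups.
pixels = [10, 10, 12, 12, 200, 200, 202, 202]
print(within_class_variance(pixels, 100))  # 1.0
t = otsu_threshold(pixels)
print(13 <= t <= 200)  # True
```

For this sample, every threshold from 13 through 200 produces the same split and therefore the same weighted variance, so any of them is an equally good answer; `min` simply returns the first one it encounters.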
Now, I’ll give an example to illustrate the algorithm. Here is a 4x4 image, with the threshold T set to 100:
The foreground consists of the pixels with values greater than or equal to 100, and the background consists of the pixels with values less than 100:
Now the pixel values have been separated into two parts. Thus I can get: