In most computer vision applications, one of the first and most important tasks is to isolate the foreground objects from the background. This task is often referred to as segmentation or thresholding, and how well it isolates these elements determines how successfully features can be extracted from the image.
Basic segmentation is the simplest form of segmentation: it takes the grey value of each pixel and converts it to black or white, depending on a static threshold value. Figure 3.1 shows the flow of the basic segmentation technique.
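The static threshold described above can be sketched in a few lines of C++. This is a minimal illustration, assuming an 8-bit grey image stored as a flat vector; the function name `staticThreshold` and the convention that pixels strictly above the threshold become white are assumptions, not details taken from the original implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: binarise an 8-bit grey image with a fixed threshold.
// Pixels strictly above `threshold` become white (255), all others black (0).
std::vector<std::uint8_t> staticThreshold(const std::vector<std::uint8_t>& grey,
                                          std::uint8_t threshold)
{
    std::vector<std::uint8_t> binary(grey.size());
    for (std::size_t i = 0; i < grey.size(); ++i) {
        binary[i] = (grey[i] > threshold) ? 255 : 0;
    }
    return binary;
}
```

Because the threshold is a plain parameter, the same function can be reused unchanged when the value is chosen automatically for each frame.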
Under ideal circumstances, where the lighting and camera parameters are constant, the basic segmentation technique will perform adequately for most tasks. However, most captured video images are susceptible to noise and fluctuations in brightness, so a fixed threshold will produce inconsistent segmentation results from frame to frame. To demonstrate this, figure 3.2 shows three images taken from a static video camera. Although the background elements appear to be the same, the difference images in figure 3.3 show that brightness varied even in static regions.
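One simple way to expose such brightness fluctuations is a per-pixel absolute difference between two frames. The text does not state how the differences in figure 3.3 were computed, so the sketch below is an assumption; `absoluteDifference` is a hypothetical helper operating on 8-bit grey frames stored as flat vectors.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: per-pixel absolute difference of two grey frames.
// Non-zero output values mark pixels whose brightness changed between frames.
// If the frames differ in size, only the common leading part is compared.
std::vector<std::uint8_t> absoluteDifference(const std::vector<std::uint8_t>& a,
                                             const std::vector<std::uint8_t>& b)
{
    const std::size_t n = std::min(a.size(), b.size());
    std::vector<std::uint8_t> diff(n);
    for (std::size_t i = 0; i < n; ++i) {
        const int d = static_cast<int>(a[i]) - static_cast<int>(b[i]);
        diff[i] = static_cast<std::uint8_t>(std::abs(d));
    }
    return diff;
}
```

For a truly static scene under constant lighting the result would be all zeros; any non-zero values in regions without motion indicate noise or brightness drift.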
When dealing with a single image, a static threshold value will suffice, as the value can be adjusted manually until the desired segmented image is obtained. For video sequences, manual adjustment of the threshold value is impractical, as the images are processed at a rate of 25 frames per second. A technique was therefore required that would automatically adjust the threshold from frame to frame.
Also known as optimal thresholding, dynamic thresholding calculates a threshold value from the grey-level values of the image itself. Each frame of an image sequence is analysed and its own threshold value is calculated. The advantage is that, if the luminance of the sequence is inconsistent between frames, the threshold adapts to each frame and should compensate for the variation.
One of the more popular approaches to dynamic thresholding is the Otsu method [NIXON and AGUADO 2002]. It assumes that the histogram of an image’s grey values has two peaks, one representing the background and the other representing either the foreground or an object. The method selects the threshold that maximises the separation (the between-class variance) of these two groups, which for a clearly bimodal histogram typically falls in the valley between the two peaks; see figure 3.4.
The optimal threshold value is calculated from the Otsu method using the following equations:
Using the previous equations, the final calculation can be expressed as:
Where:
k = The histogram index position in the range 0 to 255.
p = The normalised histogram value for the current index position.
ω and μ = The first and second order cumulative values of the normalised histogram.
μT = The total mean level of the image.
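For reference, the quantities defined above take the following form in the usual textbook formulation of Otsu's method. This is a sketch in standard notation rather than a transcription of the original equations; the names $\sigma_B^2$ (between-class variance) and $k^*$ (selected threshold index) are assumed here, and the sums start at index 0 to match the histogram range given above.

```latex
\begin{align*}
\omega(k) &= \sum_{i=0}^{k} p(i), &
\mu(k) &= \sum_{i=0}^{k} i\,p(i), &
\mu_T &= \sum_{i=0}^{255} i\,p(i) \\[6pt]
\sigma_B^2(k) &= \frac{\bigl(\mu_T\,\omega(k) - \mu(k)\bigr)^2}{\omega(k)\,\bigl(1 - \omega(k)\bigr)}, &
k^* &= \arg\max_{0 \le k \le 255} \sigma_B^2(k)
\end{align*}
```

The threshold is the index $k^*$ at which $\sigma_B^2(k)$ is largest; indices where $\omega(k)$ is 0 or 1 are skipped because the expression is undefined there.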
See Appendix A for an example C++ implementation of the Otsu thresholding method.
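As a quick illustration, the calculation can be sketched as follows. This is not the Appendix A code: it is a minimal version assuming an 8-bit grey image stored as a flat vector, using variable names that mirror the symbols above (`p`, `omega`, `mu`, `muT`). The function name `otsuThreshold` and the choice to return 0 for an empty or single-valued image are assumptions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the histogram index k (0..255) that maximises
// the between-class variance of an 8-bit grey image.
int otsuThreshold(const std::vector<std::uint8_t>& grey)
{
    if (grey.empty()) {
        return 0;  // assumption: no data, fall back to index 0
    }

    // Normalised histogram p(k): fraction of pixels with grey value k.
    std::array<double, 256> p{};
    for (std::uint8_t v : grey) {
        p[v] += 1.0;
    }
    for (double& x : p) {
        x /= static_cast<double>(grey.size());
    }

    // Total mean level of the image.
    double muT = 0.0;
    for (int k = 0; k < 256; ++k) {
        muT += k * p[k];
    }

    double omega = 0.0;    // cumulative sum of p up to k
    double mu = 0.0;       // cumulative sum of k * p(k) up to k
    double bestVar = -1.0;
    int bestK = 0;
    for (int k = 0; k < 256; ++k) {
        omega += p[k];
        mu += k * p[k];
        const double denom = omega * (1.0 - omega);
        if (denom < 1e-12) {
            continue;  // one class is empty, variance undefined
        }
        const double num = muT * omega - mu;
        const double var = (num * num) / denom;
        if (var > bestVar) {
            bestVar = var;
            bestK = k;
        }
    }
    return bestK;
}
```

In a video pipeline the function would be called once per frame, with pixels above the returned index treated as one class and the rest as the other.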
A final comparison of Otsu’s algorithm against a static threshold is shown in figure 3.6. The first column shows the same scene under different lighting. The second column shows the result of using a static threshold value, and the third column shows the result of the Otsu dynamic thresholding method. The Otsu method clearly outperformed the static method: the markers were more pronounced and the noise caused by shadows was greatly reduced.