The technique uses an image contrast that is defined by the local image maximum and minimum. Compared with the image gradient, this contrast is more tolerant of uneven illumination and of other types of document degradation such as smear. The technique has been tested on the dataset used in the recent Document Image Binarization Contest (DIBCO) 2009, and the experiments show superior performance.
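The tolerance to uneven illumination can be illustrated on a single scan line. The sketch below is only an illustration under stated assumptions: the text does not give the contrast formula, so the normalization `(max - min) / (max + min + eps)` is assumed, the unnormalized difference `max - min` stands in for a gradient magnitude, and the names `local_extrema`, `gradient_like`, `contrast`, and `EPS` are hypothetical.

```python
# Illustrative sketch: the normalized contrast formula is an assumption,
# not quoted from the text above.
EPS = 1e-6  # guards against division by zero in completely dark windows


def local_extrema(signal, radius=1):
    """Return the (maxima, minima) of each sample's neighbourhood in a 1-D signal."""
    n = len(signal)
    windows = [signal[max(0, i - radius):min(n, i + radius + 1)] for i in range(n)]
    return [max(w) for w in windows], [min(w) for w in windows]


def gradient_like(signal, radius=1):
    """Unnormalized local difference (max - min); it scales with brightness."""
    hi, lo = local_extrema(signal, radius)
    return [h - l for h, l in zip(hi, lo)]


def contrast(signal, radius=1):
    """Local difference normalized by local brightness (assumed formula)."""
    hi, lo = local_extrema(signal, radius)
    return [(h - l) / (h + l + EPS) for h, l in zip(hi, lo)]


# One scan line: bright paper (200) with a dark stroke (50) in the middle.
well_lit = [200, 200, 50, 50, 200, 200]
# The same scan line under half the illumination.
shaded = [v * 0.5 for v in well_lit]

# Ratio of the strongest response under shade to the strongest response when well lit.
grad_ratio = max(gradient_like(shaded)) / max(gradient_like(well_lit))
contrast_ratio = max(contrast(shaded)) / max(contrast(well_lit))
```

Under this assumed formula, halving the illumination halves the unnormalized response (`grad_ratio` is 0.5), while the normalized contrast at the stroke boundary is essentially unchanged (`contrast_ratio` stays very close to 1).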
Compared with the image gradient, this image contrast is better at detecting the high contrast image pixels (those lying around the text stroke boundary) in historical documents, which often suffer from different types of document degradation. Compared with Lu and Tan’s method, which was used in DIBCO, the method performs better on document images with complex background variation. Given a historical document image, the technique first constructs a contrast image from the local maximum and minimum. The high contrast image pixels around the text stroke boundary are then detected by global thresholding of the contrast image. Lastly, the historical document image is binarized using local thresholds that are estimated from the detected high contrast image pixels. Compared with previous methods based on image contrast, the method uses the image contrast to identify the text stroke boundary, which in turn produces more accurate binarization results.
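The three steps described above can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the authors' implementation, and several details are assumptions that the text does not specify: the normalized contrast formula, the use of Otsu's method for the global threshold, the local rule "intensity at most mean plus half the standard deviation of nearby high contrast pixels", and the hypothetical parameters `radius`, `bins`, and `min_edge_pixels`. Plain nested lists are used for brevity; a practical version would use an array library.

```python
import statistics

EPS = 1e-6  # avoids division by zero in completely dark windows


def window(img, r, c, radius):
    """Yield coordinates of the square neighbourhood of (r, c), clipped at borders."""
    for i in range(max(0, r - radius), min(len(img), r + radius + 1)):
        for j in range(max(0, c - radius), min(len(img[0]), c + radius + 1)):
            yield i, j


def contrast_image(img, radius=1):
    """Step 1: contrast from the local maximum and minimum (normalization assumed)."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            vals = [img[i][j] for i, j in window(img, r, c, radius)]
            row.append((max(vals) - min(vals)) / (max(vals) + min(vals) + EPS))
        out.append(row)
    return out


def otsu_threshold(values, bins=64):
    """Step 2 helper: Otsu's global threshold (an assumed choice) for values in [0, 1]."""
    hist = [0] * bins
    for v in values:
        hist[min(bins - 1, int(v * bins))] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0.0
    for t in range(bins - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        # Between-class variance, up to a constant factor.
        between = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return (best_t + 1) / bins  # upper edge of the last "low contrast" bin


def binarize(img, radius=2, min_edge_pixels=3):
    """Step 3: local thresholds estimated from nearby high contrast pixels."""
    contrast = contrast_image(img)
    t = otsu_threshold([v for row in contrast for v in row])
    edge = [[v >= t for v in row] for row in contrast]  # high contrast pixels
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            vals = [img[i][j] for i, j in window(img, r, c, radius) if edge[i][j]]
            if len(vals) >= min_edge_pixels:
                limit = statistics.fmean(vals) + statistics.pstdev(vals) / 2
                row.append(img[r][c] <= limit)  # True marks a text pixel
            else:
                row.append(False)  # too few boundary pixels nearby: background
        out.append(row)
    return out


# Toy page: bright background (200) with a three-pixel-wide dark stroke (40).
page = [[200, 200, 40, 40, 40, 200, 200] for _ in range(5)]
mask = binarize(page)
```

On this toy page the three stroke columns are marked as text and the background columns are not. Note that, with this assumed rule, a perfectly uniform background pixel whose window contains only bright boundary pixels would sit exactly at the local threshold, so the window radius should be chosen large enough to cover both sides of a stroke.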
The paper appeared in DAS 2010.
Figure 3. (a) The traditional image gradient that is obtained using Canny’s edge detector; (b) The image contrast that is obtained by using the local maximum and minimum; (c) One column of the image gradient in (a) (shown as a vertical white line); (d) The same column of the contrast image in (b).