Please use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/62494

Title: Assessing binarization algorithms for document images
Authors: BERNARDINO, Rodrigo Barros
Keywords: Binarization algorithms; Historical documents; Scanned documents; Photographed documents; Smartphones; Performance evaluation
Issue Date: 9-Sep-2024
Publisher: Universidade Federal de Pernambuco
Citation: BERNARDINO, Rodrigo Barros. Assessing binarization algorithms for document images. 2024. Tese (Doutorado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2024.
Abstract: Binarization algorithms are essential for document processing, analysis, compression, and recognition, and their performance is heavily influenced by document characteristics such as paper texture and noise. This thesis introduces novel algorithms and evaluation methodologies for assessing binarization performance, focusing on image quality, processing time, and file size. Nearly 70 binarization schemes were tested on 39 historical documents and 376 mobile-captured images. To expand the analysis, the Direct Binarization approach was proposed, which analyses the RGB channels of input images separately. This generated hundreds of additional images, which were used to train an automatic binarization algorithm selection tool, the Image Matcher, based solely on paper texture and the strength of the back-to-front interference. The tool demonstrated significant improvements in binarization results across various cases. Recognizing the growing prevalence of smartphone-captured documents, the thesis also investigated this type of document by proposing and extensively testing three new evaluation measures: the proportion of black pixels in the binary image, a normalized Levenshtein distance, and a combined metric incorporating both. These measures facilitated a comprehensive assessment of mobile-captured images using six widely used mobile devices under varying conditions, including strobe flash settings, illumination, and positional changes. Additionally, the compressed image size (using the TIFF Group 4 compression scheme) proved to be a valuable metric for evaluating the algorithms' efficiency. The results show that if processing time is a priority, the Michalak21a algorithm with the red channel is preferred for this type of image, whereas if compression rate is a priority, Yinyang22 is a better choice. Choosing the best algorithm for a given setup using the PL measure gave a better choice than using only the OCR accuracy.
The thesis also significantly expanded existing datasets for document image binarization by adding 24 new historical document images with manually generated ground truth and 296 mobile-captured images.
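
Two of the evaluation measures named in the abstract, the proportion of black pixels and a normalized Levenshtein distance, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the convention that black pixels have value 0, and the normalization by the longer string's length are all assumptions made here for illustration. The combined metric is left out because the abstract does not state how it is formed.

```python
def black_pixel_proportion(binary_rows):
    """Fraction of black pixels in a binary image.

    Assumption: the image is a list of rows of ints, with 0 meaning black.
    """
    total = sum(len(row) for row in binary_rows)
    if total == 0:
        return 0.0
    black = sum(1 for row in binary_rows for px in row if px == 0)
    return black / total


def levenshtein(a, b):
    """Classic edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalized_levenshtein(ocr_text, reference):
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(ocr_text), len(reference))
    if longest == 0:
        return 0.0
    return levenshtein(ocr_text, reference) / longest
```

In such a setup, `ocr_text` would be the OCR transcription of a binarized image and `reference` the expected text, so lower distances indicate a binarization that preserves the text better.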
URI: https://repositorio.ufpe.br/handle/123456789/62494
Appears in Collections: Teses de Doutorado - Ciência da Computação

Files in This Item:
File: TESE Rodrigo Barros Bernardino.pdf
Size: 62.73 MB
Format: Adobe PDF


This item is protected by original copyright



This item is licensed under a Creative Commons License.