Please use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/62494

Title : Assessing binarization algorithms for document images
Author : BERNARDINO, Rodrigo Barros
Keywords : Binarization algorithms; Historical documents; Scanned documents; Photographed documents; Smartphones; Performance evaluation
Issue date : 9-Sep-2024
Publisher : Universidade Federal de Pernambuco
Citation : BERNARDINO, Rodrigo Barros. Assessing binarization algorithms for document images. 2024. Thesis (Doctorate in Computer Science) – Universidade Federal de Pernambuco, Recife, 2024.
Abstract : Binarization algorithms are essential for document processing, analysis, compression, and recognition, and their performance is heavily influenced by document characteristics such as paper texture and noise. This thesis introduces novel algorithms and evaluation methodologies for assessing binarization performance, focusing on image quality, processing time, and file size. Nearly 70 binarization schemes were tested on 39 historical documents and 376 mobile-captured images. To expand the analysis, the Direct Binarization approach was proposed, analysing the RGB channels of input images separately. This generated hundreds of additional images, which were used to train an automatic binarization algorithm selection tool, the Image Matcher, based solely on paper texture and the strength of the back-to-front interference. The tool demonstrated significant improvements in binarization results across various cases. Recognizing the growing prevalence of smartphone-captured documents, the thesis also investigated this type of document by proposing and extensively testing three new evaluation measures: the proportion of black pixels in the binary image, a normalized Levenshtein distance, and a combined metric incorporating both. These measures enabled a comprehensive assessment of mobile-captured images taken with six widely used mobile devices under varying conditions, including strobe flash settings, illumination, and positional changes. Additionally, the compressed image size (using the TIFF Group 4 compression scheme) proved to be a valuable metric for evaluating the algorithms' efficiency. The results show that if processing time is a priority, the Michalak21a algorithm with the red channel is preferred for this type of image, whereas if compression rate is a priority, Yinyang22 is a better choice. Selecting the algorithm for a given setup with the PL measure gave better choices than relying on OCR accuracy alone.
The thesis also significantly expanded existing datasets for document image binarization by adding 24 new historical document images with manually generated ground truth and 296 mobile-captured images.
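The first two evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names are hypothetical, the binary image is assumed to be a 2D list with 0 for black and 1 for white, and the Levenshtein distance is assumed to be normalized by the length of the longer string (the abstract does not state which normalization is used). The combined metric is not sketched because its formula is not given here.

```python
def black_pixel_proportion(binary_image):
    """Fraction of black pixels (value 0) in a 2D list of 0/1 values.

    Assumption: 0 encodes black (ink) and 1 encodes white (paper).
    """
    total = sum(len(row) for row in binary_image)
    if total == 0:
        raise ValueError("image has no pixels")
    black = sum(1 for row in binary_image for px in row if px == 0)
    return black / total


def levenshtein(a, b):
    """Edit distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_levenshtein(ocr_text, reference_text):
    """Edit distance divided by the longer string's length, in [0, 1].

    Assumption: this normalization is one common choice, not necessarily
    the one adopted in the thesis.
    """
    longest = max(len(ocr_text), len(reference_text))
    if longest == 0:
        return 0.0
    return levenshtein(ocr_text, reference_text) / longest
```

Under these assumptions, a value of 0 from `normalized_levenshtein` means the OCR output of the binarized image matches the reference transcription exactly, and values closer to 1 mean more character edits are needed.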
URI : https://repositorio.ufpe.br/handle/123456789/62494
Appears in collections: Teses de Doutorado - Ciência da Computação

Files in this item:
File : TESE Rodrigo Barros Bernardino.pdf
Size : 62.73 MB
Format : Adobe PDF


This item is protected by original copyright



This item is licensed under a Creative Commons License