Please use this identifier to cite or link to this item:
https://repositorio.ufpe.br/handle/123456789/67273
| Title: | Machine Learning and Readability in Accounting: An Ensemble Learning Approach |
| Author: | COSTA NETO, Arlindo Menezes da |
| Keywords: | Informativeness; Machine Learning; Accounting information |
| Issue date: | 26-Nov-2025 |
| Publisher: | Universidade Federal de Pernambuco |
| Citation: | COSTA NETO, Arlindo Menezes da. Machine Learning and Readability in Accounting: An Ensemble Learning Approach. 2025. Tese (Doutorado em Ciências Contábeis) - Universidade Federal de Pernambuco, Recife, 2025. |
| Abstract: | We extend the literature on the value relevance of accounting information by exploring a new metric for valuing financial text. To do so, we employ a language model trained on Brazilian Portuguese (FinBERT-PT-BR) to develop an Informativeness Index, assigning scores to 26,804 quarterly financial statement notes from 1,152 Brazilian companies over a span of 12 years. To verify the model's capability to understand textual data, we calculate the usual readability metrics (Flesch-Kincaid reading ease, Fog index, SMOG index, Loughran-McDonald Index) for all the notes and employ machine learning models to evaluate which readability metric best represents an informativeness index built on the dimensions of Boilerplateness, Completeness, and Density. We expect our proposed metric to be only weakly related to the readability metrics. The evaluation is based on feature importance, which indicates that the best proxy for the readability of Portuguese financial text is the Loughran-McDonald Index. The Loughran-McDonald Index is the only metric with any relevance in the regressors, and it is based on file size; we therefore take our metric to be capable of measuring the informational value of text better than common readability metrics, while the Loughran-McDonald Index appears to be a reasonable proxy for the informational value of financial text. This research innovates by presenting a new method to quantify the informational value of financial information, contributing to the value-relevance literature and to the literature on machine learning in accounting research. It does so in a little-explored field (Portuguese-language financial information) with a reasonably large dataset. Further research may be needed to combine the proposed model with market-related metrics or human experiments to increase the validity of the metric concept. |
| URI : | https://repositorio.ufpe.br/handle/123456789/67273 |
| Appears in collections: | Teses de Doutorado - Ciências Contábeis |
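
The abstract compares conventional readability metrics with a file-size-based measure. As a rough illustration only, and not the thesis's actual implementation, the sketch below computes a Gunning Fog index and a log-file-size measure in plain Python. The function names, the vowel-group syllable heuristic, and the choice of the natural log of the UTF-8 byte count are all assumptions made for this example; the thesis may define its metrics differently.

```python
import math
import re

# Assumption: vowels (including common Portuguese accented forms) used by a
# crude vowel-group syllable heuristic. A real study would likely use a
# language-specific syllabifier.
VOWELS = "aeiouyáàâãéêíóôõúü"


def split_sentences(text):
    """Split on runs of sentence-ending punctuation, dropping empty pieces."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def split_words(text):
    """Return runs of Unicode letters (digits and underscores excluded)."""
    return re.findall(r"[^\W\d_]+", text)


def count_syllables(word):
    """Approximate syllables as the number of vowel groups (at least 1)."""
    groups = re.findall(f"[{VOWELS}]+", word.lower())
    return max(1, len(groups))


def fog_index(text):
    """Gunning Fog: 0.4 * (words per sentence + 100 * share of complex words).

    A word counts as complex here if the heuristic finds 3 or more syllables.
    """
    sentences = split_sentences(text)
    words = split_words(text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))


def file_size_measure(text):
    """Hypothetical file-size proxy: natural log of the UTF-8 byte count."""
    return math.log(len(text.encode("utf-8")))


sample = "The cat sat. The company reported additional revenue."
print(round(fog_index(sample), 2))
print(round(file_size_measure(sample), 2))
```

Scores like these could then serve as input features for a regressor whose feature importances are inspected, which is the kind of comparison the abstract describes.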
Files in this item:
| File | Description | Size | Format |
|---|---|---|---|
| TESE Arlindo Menezes da Costa Neto.pdf | | 907.42 kB | Adobe PDF |
This item is protected by original copyright.
This item is licensed under a Creative Commons License.

