Meta-scaler+: a meta-learning based solution for model-specific recommendations of scaling techniques

AMORIM, Lucas Benevides Viana de

Please use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/62327

Title:	Meta-scaler+: a meta-learning based solution for model-specific recommendations of scaling techniques
Author:	AMORIM, Lucas Benevides Viana de
Keywords:	Classification; Scaling; Meta-learning; Normalization; Preprocessing
Publication date:	18-Feb-2025
Publisher:	Universidade Federal de Pernambuco
Citation:	AMORIM, Lucas Benevides Viana de. Meta-scaler+: a meta-learning based solution for model-specific recommendations of scaling techniques. 2025. Tese (Doutorado em Curso) – Universidade Federal de Pernambuco, Recife, 2025.
Abstract:	Dataset scaling, or normalization, is an essential preprocessing step in a machine learning pipeline. It adjusts the scales of the attributes so that they all vary within the same range. This transformation is widely assumed to improve the performance of classification models, but very few studies verify the assumption empirically. As a first contribution, this thesis compares the impact of different scaling techniques (STs) on the performance of several classifiers. The results show that the choice of ST matters for classification performance, and that the performance difference between the best and the worst ST is relevant and statistically significant in most cases. However, there are several STs to choose from, and manually finding the most suitable one for a given dataset by trial and error can be infeasible. As an alternative, we propose employing meta-learning to select the best ST for a given dataset automatically. In our second study, we therefore propose the Meta-scaler, a framework that trains a set of meta-models to represent the relationship between meta-features extracted from datasets and the performance of a set of classification algorithms on those datasets when they are scaled with different techniques. These meta-models recommend a single optimal ST for a given query dataset, also taking the query classifier into account. The Meta-scaler yielded better classification performance than any choice of a single ST for 10 out of the 12 base models tested, and it also outperformed state-of-the-art meta-learning methods for ST selection.
Then, in our third study, we propose Meta-scaler+, which extends the functionality of Meta-scaler and eliminates its limitations by introducing a novel classifier characterization method, the Classifier Performance Space. This method allows us to dynamically combine meta-models to produce specialized ST recommendations for any query classifier and query dataset. Despite the additional flexibility, Meta-scaler+ performs competitively with Meta-scaler and outperforms other state-of-the-art solutions. In future work, we will invest in improving dataset representation (meta-features), improving Classifier Performance Space initialization, and making Meta-scaler+ a practical and accessible tool that can be integrated with popular machine-learning libraries.
URI :	https://repositorio.ufpe.br/handle/123456789/62327
Appears in collections:	Teses de Doutorado - Ciência da Computação
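
Illustrative note (not part of the thesis): the abstract states that the choice of scaling technique can change classification performance and that finding the best technique by trial and error is costly. The minimal Python sketch below shows that trial-and-error loop on a hand-made toy dataset. Everything in it is an assumption made for illustration: the data, the function names (`no_scaling`, `min_max`, `z_score`, `loo_accuracy_1nn`), the use of a 1-nearest-neighbour classifier, and the two scaling techniques chosen. It is not the Meta-scaler method, which replaces this exhaustive search with a recommendation learned from meta-features.

```python
import math

# Toy data (made up for illustration): feature 0 separates the classes,
# feature 1 is uninformative but has a much larger numeric range.
ROWS = [(0.0, 100), (0.1, 300), (0.2, 500), (0.1, 700),
        (0.9, 110), (1.0, 310), (0.8, 510), (0.9, 710)]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]


def no_scaling(rows):
    """Baseline: leave the attributes untouched."""
    return [list(r) for r in rows]


def min_max(rows):
    """Rescale every attribute to the [0, 1] range."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(r, lo, hi)] for r in rows]


def z_score(rows):
    """Standardize every attribute to zero mean and unit variance."""
    cols = list(zip(*rows))
    mean = [sum(c) / len(c) for c in cols]
    std = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(cols, mean)]
    return [[(v - m) / s if s > 0 else 0.0
             for v, m, s in zip(r, mean, std)] for r in rows]


def loo_accuracy_1nn(rows, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    hits = 0
    for i, query in enumerate(rows):
        others = (j for j in range(len(rows)) if j != i)
        nearest = min(others, key=lambda j: math.dist(query, rows[j]))
        hits += labels[nearest] == labels[i]
    return hits / len(rows)


# Simplification: each scaler is fitted on the full dataset. A careful
# evaluation would fit it on the training portion of each fold only.
SCALERS = {"none": no_scaling, "min-max": min_max, "z-score": z_score}
scores = {name: loo_accuracy_1nn(scale(ROWS), LABELS)
          for name, scale in SCALERS.items()}
best = max(scores, key=scores.get)

for name, acc in scores.items():
    print(f"{name:8s} accuracy = {acc:.2f}")
print("best by trial and error:", best)
```

On this toy data the unscaled distances are dominated by the large-range attribute, so the nearest-neighbour classifier does poorly until the attributes are brought to a common range. Repeating such a search for every dataset, classifier, and candidate technique is the cost that the abstract proposes to avoid with meta-learning.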

Files in this item:

File	Description	Size	Format
TESE Lucas Benevides Viana de Amorim.pdf		2.3 MB	Adobe PDF

This item is protected by original copyright

This item is licensed under a Creative Commons license