Use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/62327

Title: Meta-scaler+: a meta-learning based solution for model-specific recommendations of scaling techniques
Author(s): AMORIM, Lucas Benevides Viana de
Keywords: Classification; Scaling; Meta-learning; Normalization; Preprocessing
Document date: 18-Feb-2025
Publisher: Universidade Federal de Pernambuco
Citation: AMORIM, Lucas Benevides Viana de. Meta-scaler+: a meta-learning based solution for model-specific recommendations of scaling techniques. 2025. Tese (Doutorado em Curso) – Universidade Federal de Pernambuco, Recife, 2025.
Abstract: Dataset scaling, or normalization, is an essential preprocessing step in a machine learning pipeline. It adjusts the scales of the attributes so that they all vary within the same range. This transformation is widely assumed to improve the performance of classification models, but very few studies verify the assumption empirically. As a first contribution, this thesis compares the impact of different scaling techniques (STs) on the performance of several classifiers. The results show that the choice of ST matters for classification performance, and that the performance difference between the best and the worst ST is relevant and statistically significant in most cases. However, there are several STs to choose from, and finding the most suitable one for a given dataset manually, by trial and error, can be infeasible. As an alternative, we propose employing meta-learning to select the best ST for a given dataset automatically. In our second study, we therefore propose the Meta-scaler, a framework that trains a set of meta-models to represent the relationship between meta-features extracted from the datasets and the performance of a set of classification algorithms on these datasets when they are scaled with different techniques. These meta-models can recommend a single optimal ST for a given query dataset, also taking the query classifier into account. The Meta-scaler yielded better classification performance than any choice of a single ST for 10 out of the 12 base models tested and also outperformed state-of-the-art meta-learning methods for ST selection.
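As a minimal illustration of why scaling can matter for a distance-based classifier, the sketch below compares leave-one-out accuracy of a 1-nearest-neighbour rule on unscaled, min-max scaled, and standardized data. The toy dataset, the function names, and the choice of classifier and evaluation protocol are assumptions made for illustration; they are not taken from the thesis and do not reproduce its experiments.

```python
import math

def min_max_scale(rows):
    """Rescale each column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(r, lows, spans)] for r in rows]

def standardize(rows):
    """Centre each column on its mean and divide by its population std."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def loo_1nn_accuracy(rows, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    hits = 0
    for i, x in enumerate(rows):
        others = (k for k in range(len(rows)) if k != i)
        nearest = min(others, key=lambda k: math.dist(x, rows[k]))
        hits += labels[nearest] == labels[i]
    return hits / len(rows)

# Hypothetical toy data: feature 0 separates the classes but has a small
# range; feature 1 is uninformative but has a very large range.
X = [
    [0.10, 1000], [0.20, 5000], [0.15, 9000],   # class 0
    [0.90, 1200], [0.80, 5200], [0.85, 8800],   # class 1
]
y = [0, 0, 0, 1, 1, 1]

results = {
    "none": loo_1nn_accuracy(X, y),
    "min-max": loo_1nn_accuracy(min_max_scale(X), y),
    "standard": loo_1nn_accuracy(standardize(X), y),
}
for name, acc in results.items():
    print(f"{name:>8}: {acc:.2f}")
```

On this toy data the unscaled accuracy is 0.00, because the large-range feature dominates the Euclidean distance, while both scaled versions reach 1.00. The example only contrasts scaling with no scaling; which ST works best on real data depends on the dataset and the classifier, which is the selection problem the abstract describes.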
Then, in our third study, we propose Meta-scaler+, which extends the functionality of Meta-scaler and eliminates its limitations by introducing an innovative classifier characterization method, the Classifier Performance Space. This method allows us to dynamically combine meta-models to produce specialized ST recommendations for any query classifier and query dataset. Despite the additional flexibility, the performance of Meta-scaler+ is competitive with that of Meta-scaler and superior to that of other state-of-the-art solutions. In future work, we will invest in improving dataset representation (meta-features), improving Classifier Performance Space initialization, and making Meta-scaler+ a practical and accessible tool, enabling its integration with popular machine-learning libraries.
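The abstract does not specify how the Classifier Performance Space combines meta-models, so the following is only a loose sketch of one way such a combination could work, under stated assumptions: each known classifier is represented by a vector of performance scores, the query classifier is placed in the same space, and the per-classifier meta-model predictions are averaged with inverse-distance weights. The function `recommend_scaler`, the profile vectors, and all scores are hypothetical and are not the thesis's actual method or data.

```python
import math

def recommend_scaler(query_profile, known_profiles, meta_predictions):
    """Pick a scaling technique by inverse-distance-weighted averaging.

    query_profile    : performance vector of the query classifier (hypothetical)
    known_profiles   : {classifier_name: performance vector}
    meta_predictions : {classifier_name: {scaler_name: predicted score}}
    Returns the best scaler name and the combined score per scaler.
    """
    # Classifiers closer to the query in the performance space weigh more.
    weights = {
        name: 1.0 / (math.dist(query_profile, profile) + 1e-9)
        for name, profile in known_profiles.items()
    }
    total = sum(weights.values())

    # Weighted average of each meta-model's predicted score per scaler.
    combined = {}
    for name, w in weights.items():
        for scaler, score in meta_predictions[name].items():
            combined[scaler] = combined.get(scaler, 0.0) + (w / total) * score

    best = max(combined, key=combined.get)
    return best, combined

# Hypothetical profiles: accuracy of each known classifier on two
# reference datasets (assumed values, for illustration only).
known_profiles = {"knn": [0.90, 0.60], "tree": [0.70, 0.70]}

# Hypothetical meta-model outputs for one query dataset.
meta_predictions = {
    "knn": {"standard": 0.85, "none": 0.60},
    "tree": {"standard": 0.70, "none": 0.72},
}

best, scores = recommend_scaler([0.88, 0.62], known_profiles, meta_predictions)
print(best, scores)
```

A query classifier whose profile lies near the hypothetical `knn` entry inherits mostly that meta-model's preferences, while one that coincides with `tree` follows the `tree` predictions almost exclusively. Inverse-distance weighting is just one simple choice here; the thesis may use a different combination rule.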
URI: https://repositorio.ufpe.br/handle/123456789/62327
Appears in collections: Teses de Doutorado - Ciência da Computação (Doctoral Theses - Computer Science)

Files associated with this item:
File: TESE Lucas Benevides Viana de Amorim.pdf
Size: 2.3 MB
Format: Adobe PDF


This file is protected by copyright.



This item is licensed under a Creative Commons License.