Feature Ranking for Feature Sorting and Feature Selection: FR4(FS)2

  1. Paola Santana-Morales 1
  2. Alberto Merchán 1
  3. Alba Márquez-Rodríguez 1
  4. Antonio Tallón-Ballesteros 1
  1. 1 University of Huelva, Huelva, Spain
Book:
Bio-inspired Systems and Applications: from Robotics to Ambient Intelligence: 9th International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2022, Puerto de la Cruz, Tenerife, Spain, May 31 – June 3, 2022, Proceedings, Part II
  1. José Manuel Ferrández Vicente (conference chair)
  2. José Ramón Alvarez Sánchez (conference chair)
  3. Félix de la Paz López (conference chair)
  4. Hojjat Adeli

Publisher: Springer Switzerland

ISBN: 978-3-031-06527-9

Year of publication: 2022

Pages: 545-550

Type: Book chapter

Abstract

This paper proposes a methodology for feature sorting and feature selection in the context of supervised machine learning. Feature sorting has been shown to be a step that may play a paramount role in machine learning; nonetheless, its scalability is an important drawback. We add a further stage that retains only the attributes with a positive influence (att+) and limits them to a predefined percentage of the att+ set. The contribution is thus a methodology in which the data mining task does not use all attributes, but only those with a positive influence, up to a certain limit. We have followed two different types of sorting by means of different feature ranking methods. The approach has been assessed on three binary problems with between 1000 and 10000 features and between 200 and 7000 instances; the test bed includes challenging data sets from NIPS 2003. According to the experimental results for InfoGain and GainRatio, 90% of the attributes with positive influence are enough to obtain results that are, in most cases, comparable to those obtained with the raw data. In addition, the time required to train the classifiers is shorter, so the time saved could be used to process more instances.
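
The rank, filter and truncate idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes discrete features, takes "positive influence" to mean an information gain strictly greater than zero, and rounds the retained count up. The function names (`entropy`, `info_gain`, `select_positive_top`), the `eps` tolerance and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def info_gain(column, labels):
    """Information gain of one discrete feature with respect to the class."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_positive_top(scores, fraction=0.9, eps=1e-12):
    """Sort features by score, keep those with score > eps (assumed att+),
    then retain the top `fraction` of that att+ set (rounded up)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    positive = [i for i in ranked if scores[i] > eps]
    keep = math.ceil(fraction * len(positive))
    return positive[:keep]


# Toy data (hypothetical): 8 instances, 4 discrete features, binary class.
X = [
    [0, 7, 0, 0],
    [0, 7, 0, 1],
    [0, 7, 0, 0],
    [0, 7, 1, 1],
    [1, 7, 1, 0],
    [1, 7, 1, 1],
    [1, 7, 1, 0],
    [1, 7, 1, 1],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

columns = list(zip(*X))                       # one tuple per feature
scores = [info_gain(col, y) for col in columns]
selected = select_positive_top(scores, fraction=0.9)
print("scores:", [round(s, 3) for s in scores])
print("selected feature indices:", selected)
```

In this toy example, the constant feature and the feature that is independent of the class both score zero and are dropped before the percentage cut is applied. A gain-ratio ranking could be substituted by dividing each score by the entropy of the feature's own value distribution.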