Does the Order of Attributes Play an Important Role in Classification?

  1. Antonio J. Tallón-Ballesteros 1
  2. Simon Fong 2
  3. Rocío Leal-Díaz 3
  1. 1 Universidad de Huelva

    Huelva, Spain

    ROR https://ror.org/03a1kt624

  2. 2 University of Macau

    Macao, Macao

    ROR https://ror.org/01r4q9n85

  3. 3 Universidad de Sevilla

    Sevilla, Spain

    ROR https://ror.org/03yxnpp24

Book:
Hybrid Artificial Intelligent Systems. 14th International Conference, HAIS 2019: León, Spain, September 4–6, 2019. Proceedings
  1. Hilde Pérez García (coord.)
  2. Lidia Sánchez González (coord.)
  3. Manuel Castejón Limas (coord.)
  4. Héctor Quintián Pardo (coord.)
  5. Emilio Corchado Rodríguez (coord.)

Publisher: Springer Switzerland

ISBN: 978-3-030-29859-3 978-3-030-29858-6

Year of publication: 2019

Pages: 370-380

Conference: Hybrid Artificial Intelligent Systems (14. 2019. León)

Type: Conference contribution

Abstract

This paper proposes a methodology for feature sorting in the context of supervised machine learning algorithms. Feature sorting is defined as a procedure that reorders the initial arrangement of the attributes: a sorting algorithm assigns an ordinal number to every feature according to its importance, and the features are then arranged from first to last following those ordinal numbers. Feature ranking, a technique from the feature selection area, has been chosen as the representative technique to fulfill the sorting purpose. This contribution introduces a new methodology in which all attributes are included in the data mining task, arranged in different orders produced by different feature ranking methods. The approach has been assessed on ten binary and multi-class problems with fewer than 37 features and a number of instances ranging from 106 up to 28056; the test-bed includes one challenging data set with 21 labels and 23 attributes on which previous works were not able to achieve an accuracy of at least fifty percent. ReliefF is a strong candidate for re-sorting the initial feature space, and the C4.5 algorithm achieved a promising global performance; additionally, PART (a rule-based classifier) and Support Vector Machines obtained acceptable results.
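
The feature sorting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in a simplified Relief-style weighting (one nearest hit and one nearest miss per instance) in place of full ReliefF, and the function names `relief_weights`, `feature_order`, and `sort_features`, as well as the toy data, are hypothetical and chosen only for illustration. The point it shows is that every attribute is kept and only the column order changes before the data reach a classifier.

```python
def relief_weights(X, y):
    """Simplified Relief-style feature weights (NOT the full ReliefF algorithm).

    Assumes numeric features already scaled to [0, 1] and at least two
    instances per class, so every instance has a nearest hit and miss.
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d

    def dist(a, b):
        # Manhattan distance between two instances
        return sum(abs(u - v) for u, v in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            # Reward features that differ on the nearest miss,
            # penalise features that differ on the nearest hit.
            weights[f] += (abs(X[i][f] - X[miss][f])
                           - abs(X[i][f] - X[hit][f])) / n
    return weights


def feature_order(weights):
    """Feature indices from most to least important (ties keep original order)."""
    return sorted(range(len(weights)), key=lambda f: -weights[f])


def sort_features(X, order):
    """Rearrange the columns of X; no attribute is dropped."""
    return [[row[f] for f in order] for row in X]


# Toy data invented for illustration: the second column separates the classes.
X = [
    [0.5, 0.0, 0.3],
    [0.1, 0.1, 0.9],
    [0.9, 0.9, 0.2],
    [0.4, 1.0, 0.8],
]
y = [0, 0, 1, 1]

order = feature_order(relief_weights(X, y))
X_sorted = sort_features(X, order)
print(order)        # [1, 0, 2]
print(X_sorted[0])  # [0.0, 0.5, 0.3]
```

The re-sorted matrix `X_sorted` would then be passed to an order-sensitive learner; which classifiers actually benefit from the new order is the empirical question the paper addresses.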