Una propuesta determinista para la obtención de reglas en problemas de minería de datos

  1. Juan Luis Dominguez Olmedo
Supervised by:
  1. Jacinto Mata Vázquez Director

Defence university: Universidad de Huelva

Year of defence: 2019

Committee:
  1. Diego Gachet Paez Chair
  2. Pedro José Abad Herrera Secretary
  3. Víctor Manuel Rivas Santos Committee member
Department:
  1. TECNOLOGIAS DE LA INFORMACION

Type: Thesis

Abstract

This thesis describes in detail a new deterministic approach to generating rules for data mining problems. Because discretizing the numerical attributes of a dataset beforehand leads to a loss of information and can lower the quality of the resulting rules, the work studies how to generate rules directly by combining intervals in the conditions on the numerical attributes. To reduce the computation time that any exhaustive search entails, data structures and algorithms were developed to generate and evaluate the rules of the model efficiently, along with parameters that balance computation time against the quality of the generated rules. The proposed method has been adapted to several data mining tasks: association rules, subgroup discovery, and classification. The algorithms were applied to several test datasets, and the quality of the resulting rules was compared with that of existing methods in the literature. The significance of the results was assessed with appropriate statistical tests. The proposed method matched or improved on the results of other reference methods, both deterministic and non-deterministic. It was also applied to real data, including a medical dataset, for which it produced an interpretable predictive model with high accuracy.
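
To illustrate what a rule with interval conditions on numerical attributes looks like, the following minimal sketch evaluates one such rule against a small dataset. It is not the thesis's algorithm or its data structures: the `Interval` class, the `evaluate_rule` function, the attribute names, and the toy records are all hypothetical, and support and confidence are used here only as the standard rule-quality measures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical condition: low <= record[attribute] <= high."""
    attribute: str
    low: float
    high: float

    def covers(self, record: dict) -> bool:
        return self.low <= record[self.attribute] <= self.high


def evaluate_rule(conditions, target, records):
    """Return (support, confidence) of the rule 'conditions -> target'.

    target is an (attribute, value) pair. Support is the fraction of all
    records that satisfy both the conditions and the target; confidence is
    the fraction of covered records that also satisfy the target.
    """
    covered = [r for r in records if all(c.covers(r) for c in conditions)]
    hits = [r for r in covered if r[target[0]] == target[1]]
    support = len(hits) / len(records)
    confidence = len(hits) / len(covered) if covered else 0.0
    return support, confidence


# Toy data invented for illustration only.
records = [
    {"age": 25, "bmi": 22.0, "risk": "low"},
    {"age": 47, "bmi": 31.5, "risk": "high"},
    {"age": 52, "bmi": 29.0, "risk": "high"},
    {"age": 61, "bmi": 24.0, "risk": "low"},
]

# Rule: age in [45, 65] AND bmi in [28, 35] -> risk = high
rule = [Interval("age", 45, 65), Interval("bmi", 28.0, 35.0)]
support, confidence = evaluate_rule(rule, ("risk", "high"), records)
print(support, confidence)
```

Working directly with intervals like these avoids fixing the cut points of each numerical attribute in advance, which is the information loss the abstract attributes to prior discretization.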