Fast k-nearest neighbors for Big Data and Smart Data

  1. Maillo Hidalgo, Jesús
Supervisors:
  1. Francisco Herrera Triguero Supervisor
  2. Isaac Triguero Velázquez Supervisor

Defense university: Universidad de Granada

Defense date: 7 May 2020

Committee:
  1. Óscar Cordón García Chair
  2. Victoria Luzón García Secretary
  3. Antonio Peregrín Rubio Member
  4. Daniel Peralta Cámara Member
  5. Javier del Ser Lorente Member

Type: Thesis

Abstract

In this thesis, we have presented an extensive study of the kNN algorithm in Big Data problems and its application to transforming Big Data into Smart Data. The objective has been the design, implementation, analysis and evaluation of the proposed algorithms. The thesis started by enabling the original kNN classifier to tackle Big Data problems, and we then extended that proposal to support its fuzzy variant in order to improve scalability and accuracy. Afterwards, the role of the kNN algorithm in obtaining Smart Data is analysed, highlighting the proposal for the imputation of missing values (MVs). Finally, two complexity and density metrics specific to Big Data problems are proposed in order to study redundant information in large-scale datasets.