The Workshop on Classifier Learning from Difficult Data is organized during the International Conference on Computational Science (ICCS 2019) in Faro, Algarve, Portugal.

About

Many practical decision tasks require building models from data that pose serious difficulties, such as imbalanced class distributions, a high number of classes, high-dimensional features, a small or extremely large number of learning examples, limited access to ground truth, data incompleteness, or data in motion, to name only a few. Such characteristics may strongly degrade the performance of the final model. Developing new learning methods that can cope with these difficulties should therefore be a focus of intense research. The main aim of this workshop is to discuss the problems posed by difficult data, to identify new issues, and to shape future directions for research.

Topics of interest

  • learning from imbalanced data
  • learning from data streams, including concept drift management
  • learning with limited ground truth access
  • learning from high-dimensional data
  • learning with a high number of classes
  • learning from massive data, including instance and prototype selection
  • learning on the basis of limited data sets, including one-shot learning
  • learning from incomplete data
  • case studies and real-world applications

Key dates

Milestone                              Date
Paper submission                       15 February 2019
Notification of acceptance of papers   15 March 2019
Camera-ready papers                    5 April 2019
Author registration                    15 March – 5 April 2019
Conference                             12–14 June 2019

Keynote speaker

Alberto Cano

High Performance Data Mining on GPUs, Hadoop, Spark, and beyond
The ever-increasing dimensionality of data poses the main challenge to scaling data mining algorithms so that they run in reasonable time. Parallel and distributed architectures, particularly those based on GPUs and on the MapReduce model in Apache Hadoop or Spark, have become popular approaches to alleviating the prohibitive runtimes of machine learning algorithms on big data. Computational complexity is increased not only by the size of the data but also by the emergence of new data-level difficulties, which call for novel learning paradigms. Learning from imbalanced, high-dimensional, or multi-label data, or from data streams with concept drift, to name a few, increases the complexity of the algorithms needed to model such massive data accurately. There is therefore a need for new approaches that keep up with the increasing complexity and size of learning from difficult data. This talk reviews advances in the scalability of data mining in recent years and discusses open issues and future lines of research.

Program committee

  • Carlos Cambra, University of Burgos, Spain
  • Alberto Cano, Virginia Commonwealth University, USA
  • Sung-Bae Cho, Yonsei University, South Korea
  • Jose Alfredo F. Costa, Federal University (UFRN), Brazil
  • Richard J. Duro, Universidade da Coruña, Spain
  • Mohamed Medhat Gaber, Birmingham City University, UK
  • João Gama, University of Porto, Portugal
  • Salvador Garcia, University of Granada, Spain
  • Manuel Grana, University of the Basque Country, Spain
  • Francisco Herrera, University of Granada, Spain
  • Alvaro Herrero, University of Burgos, Spain
  • Michał Koziarski, AGH University, Poland
  • Bartosz Krawczyk, Virginia Commonwealth University, USA
  • Paweł Ksieniewicz, Wroclaw University of Science and Technology, Poland
  • Bernhard Pfahringer, University of Waikato, New Zealand
  • Piotr Porwik, Silesian University, Poland
  • Héctor Quintián, University of A Coruña, Spain
  • Jerzy Stefanowski, Poznań University of Technology, Poland
  • Arkadiusz Tomczyk, Łódź University of Technology, Poland
  • Michał Woźniak, Wroclaw University of Science and Technology, Poland

Organization committee

Michał Woźniak, Wroclaw University of Science and Technology, Poland
Bartosz Krawczyk, Virginia Commonwealth University, USA
Paweł Ksieniewicz, Wroclaw University of Science and Technology, Poland