The 7th ICDM Workshop on High Dimensional Data Mining (HDM’19)
In conjunction with the IEEE International Conference on Data Mining (IEEE ICDM 2019)

Beijing, China, November 8-11, 2019.

This workshop will take place in Room 305.


Session chair: Xi Zhang


8:00 – 8:30 – Arrival & get to know each other

8:30 – 9:00 Jiacheng Yang, Bin Chen, and Shu-Tao Xia, Mean-Removed Product Quantization for Approximate Nearest Neighbor Search


9:00 – 10:00 Invited Talk: Yulong Li / Keegan Kang


10:00 – 10:30 Coffee break


10:30 – 11:00 Xi Zhang and Ata Kaban, Experiments with Random Projections Ensembles: Linear versus Quadratic Discriminants

11:00 – 11:30 Jinyu Li, Yu Pan, Hongfeng Yu, and Qi Zhang, Prediction Approach for Ising Model Estimation


Short talks:
11:30 – 11:40 Adahlia Charles, Underdetermined blind source separation for hard clipped stereophonic mixtures

11:40 – 11:50 Qi Xu, Random Projection Ensembles for Clustering


11:50 – 12:00 Shouvik Mani, Expert-guided Regularization via Distance Metric Learning

Description of Workshop


Over a decade ago, Stanford statistician David Donoho predicted that the 21st century will be the century of data. "We can say with complete confidence that in the coming century, high-dimensional data analysis will be a very significant activity, and completely new methods of high-dimensional data analysis will be developed; we just don't know what they are yet." -- D. Donoho, 2000.


Unprecedented technological advances lead to increasingly high dimensional data sets in all areas of science, engineering and business. These include genomics and proteomics, biomedical imaging, signal processing, astrophysics, finance, web and market basket analysis, among many others. The number of features in such data is often of the order of thousands or millions, that is, much larger than the available sample size.

For a number of reasons, classical data analysis methods are inadequate, questionable, or inefficient at best when faced with high dimensional data spaces:

 1. High dimensional geometry defeats our intuition rooted in low dimensional experiences, and this makes data presentation and visualisation particularly challenging.

 2. Phenomena that occur in high dimensional probability spaces, such as the concentration of measure, are counter-intuitive for the data mining practitioner. For instance, distance concentration is the phenomenon that the contrast between pair-wise distances may vanish as the dimensionality increases.

 3. Bogus correlations and misleading estimates may result when trying to fit complex models for which the effective dimensionality is too large compared to the number of data points available.

 4. The accumulation of noise may confound our ability to find low dimensional intrinsic structure hidden in the high dimensional data.

 5. The computational cost of processing high dimensional data or carrying out optimisation over high dimensional parameter spaces is often prohibitive.
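
The distance concentration phenomenon mentioned in point 2 can be illustrated with a small simulation. The following is only a minimal sketch under assumptions of our own choosing: points are drawn uniformly from the unit cube, distances are Euclidean, and the helper name relative_contrast, the sample size and the seed are hypothetical and purely illustrative. It measures the relative contrast (d_max - d_min) / d_min between the farthest and nearest neighbour of a random query point, which tends to shrink as the dimension grows.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Relative contrast (d_max - d_min) / d_min of Euclidean distances
    from one random query point to n_points random points.

    Assumption (illustrative only): all points are drawn uniformly
    from the unit cube [0, 1]^dim.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # The printed contrast should decrease as the dimension increases.
    for dim in (2, 10, 100, 1000):
        print(f"dim={dim:5d}  relative contrast={relative_contrast(dim):.3f}")
```

In low dimensions the nearest neighbour is typically many times closer than the farthest one, whereas in high dimensions all distances fall within a narrow band around their mean. Real data sets with low intrinsic dimension may behave quite differently from this uniform toy model.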




This workshop aims to promote new advances and research directions to address the curses, as well as to uncover and exploit the blessings of high dimensionality in data mining.


This year we would like to particularly encourage submissions that define and exploit some notion of "intrinsic dimension" or more generic "intrinsic structure" in learning and/or optimisation problems that allows solving high dimensional data mining tasks more reliably and more efficiently.


Topics of interest include (but are not limited to) the following:

- What are some useful notions of intrinsic structure for high dimensional data mining?

- How to devise data mining algorithms that scale with a suitable notion of intrinsic structure?

- Plausible models of low intrinsic structure, such as sparse representation, manifold models, latent space models, and studies of their noise tolerance.

- Systematic studies of how the curse of dimensionality affects data mining methods.

- New data mining techniques that exploit some properties of high dimensional data spaces.

- Theoretical underpinning of data mining where the data dimension is larger than the sample size.

- Adaptive and non-adaptive dimensionality reduction for high dimensional data sets.

- Random projections, and random matrix theory applied to high dimensional data mining.

- Classification, regression, clustering, and visualisation of high dimensional complex data sets.

- Functional data mining.

- Data mining applications to real problems in science, engineering or business where the data is high dimensional.


Paper submission

High quality original submissions are solicited for oral and poster presentation at the workshop. The page limit for workshop papers is 8 pages in the standard IEEE 2-column format, including the bibliography and any appendices. Reviewing is triple-blind, so please do not include author-identifying information.

All papers must be formatted according to the IEEE Computer Society proceedings manuscript style, following IEEE ICDM 2019 submission guidelines, which are the same as for the main conference (except the page limit). All accepted workshop papers will be published in the IEEE Computer Society Digital Library (CSDL) and IEEE Xplore, and indexed by EI.

Important dates

- Submission deadline: extended until August 16, 2019

- Workshop paper notifications: September 4, 2019

- Camera-ready deadline for the final version of accepted papers: September 8, 2019


Registration & Expenses

Every workshop paper must have at least one full paid conference registration in order to be published. Check the main conference pages for details.


Program committee

Jo Bootkrajang – Chiang Mai University, Thailand

Arthur Flexer – Austrian Research Institute for AI, Austria

Ata Kaban – University of Birmingham

Mehmed Kantardzic – University of Louisville, USA

Minqing Li – University of Birmingham, UK

Luca Oneto – University of Pisa, Italy

Momodou Sanyang – University of The Gambia

Frank-Michael Schleif – University of Applied Sciences Würzburg, Germany

Huseyin Seker – Newcastle upon Tyne, UK

Guoxian Yu – Southwest University, China



Workshop organisation & contact

Dr. Ata Kaban

School of Computer Science, University of Birmingham, UK


Related Links & resources: Analytics, Big Data, Data Mining, & Data Science Resources