The 8th ICDM Workshop on High Dimensional Data Mining (HDM’20)
In conjunction with the IEEE International Conference on Data Mining (IEEE ICDM 2020)
November 17-20, 2020
Sorrento, Italy → Virtual
Accepted Papers
DM1140 - Accelerated SGD for Tensor Decomposition of Sparse Count Data
Huan He, Yuanzhe Xi, Joyce Ho
DM454 - Towards an Internal Evaluation Measure for Arbitrarily Oriented Subspace Clustering
Daniyal Kazempour, Peer Kröger, Thomas Seidl
DM933 - Individualized Context-Aware Tensor Factorization for Online Games Predictions
Julie Jiang, Kristina Lerman, Emilio Ferrara
S06201 - You see a set of wagons - I see one train: Towards a unified view of local and global arbitrarily oriented subspace clusters
Daniyal Kazempour, Long Matthias Yan, Peer Kröger and Thomas Seidl
S06202 - I fold you so! An internal evaluation measure for arbitrary oriented subspace clustering
Anna Beer, Daniyal Kazempour, Peer Kröger, Thomas Seidl
S25201 - Efficient distance-based global sensitivity analysis for terrestrial ecosystem modeling
Dan Lu
Description of Workshop
Over a decade ago, Stanford statistician David Donoho predicted that the 21st century will be the century of data. "We can say with complete confidence that in the coming century, high-dimensional data analysis will be a very significant activity, and completely new methods of high-dimensional data analysis will be developed; we just don't know what they are yet." - D. Donoho, 2000.
Unprecedented technological advances lead to increasingly high dimensional data sets in all areas of science, engineering and business. These include genomics and proteomics, biomedical imaging, signal processing, astrophysics, finance, web and market basket analysis, among many others. The number of features in such data is often of the order of thousands or millions, that is, much larger than the available sample size.
For a number of reasons, classical data analysis methods are inadequate, questionable, or inefficient at best when faced with high dimensional data spaces:
1. High dimensional geometry defeats our intuition rooted in low dimensional experiences, and this makes data presentation and visualisation particularly challenging.
2. Phenomena that occur in high dimensional probability spaces, such as the concentration of measure, are counter-intuitive for the data mining practitioner. For instance, distance concentration is the phenomenon that the contrast between pairwise distances may vanish as the dimensionality increases.
3. Bogus correlations and misleading estimates may result when trying to fit complex models for which the effective dimensionality is too large compared to the number of data points available.
4. The accumulation of noise may confound our ability to find low dimensional intrinsic structure hidden in the high dimensional data.
5. The computational cost of processing high dimensional data or carrying out optimisation over high dimensional parameter spaces is often prohibitive.
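The distance concentration effect mentioned in point 2 can be illustrated with a small simulation. The following is only a minimal sketch, not part of the workshop material: the function name relative_contrast, the uniform sampling in the unit cube, the sample size and the seed are all assumptions chosen for illustration. It measures the relative contrast (d_max - d_min) / d_min of Euclidean distances from a random query point to a set of random points, which tends to shrink as the dimension grows.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for the Euclidean distances from one
    random query point to n_points random points, all drawn uniformly from
    the unit cube [0, 1]^dim. (Hypothetical helper, for illustration only.)"""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # Print the relative contrast for a few increasing dimensions.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the printed contrast should decrease markedly from 2 to 1000 dimensions, meaning the nearest and farthest points become hard to tell apart by distance alone. Other data distributions or distance functions may concentrate at different rates.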
Topics
This workshop aims to promote new advances and research directions to address the curses, as well as to uncover and exploit the blessings of high dimensionality in data mining.
This year we would like to particularly encourage submissions that define and exploit some notion of "intrinsic dimension" or more generic "intrinsic structure" in learning and/or optimisation problems that allows solving high dimensional data mining tasks more reliably and more efficiently.
Topics of interest include (but are not limited to) the following:
· What are some useful notions of intrinsic structure for high dimensional data mining?
· How to devise data mining algorithms that scale with a suitable notion of intrinsic structure?
· Plausible models of low intrinsic structure, such as sparse representation, manifold models, latent space models, and studies of their noise tolerance.
· Systematic studies of how the curse of dimensionality affects data mining methods.
· New data mining techniques that exploit some properties of high dimensional data spaces.
· Theoretical underpinning of data mining where the data dimension is larger than the sample size.
· Adaptive and non-adaptive dimensionality reduction for high dimensional data sets.
· Random projections, and random matrix theory applied to high dimensional data mining.
· Classification, regression, clustering, and visualisation of high dimensional complex data sets.
· Functional data mining.
· Data mining applications to real problems in science, engineering or business where the data is high dimensional.
Paper submission
High quality original submissions are solicited for oral and poster presentation at the workshop. The page limit for workshop papers is 8 pages in the standard IEEE 2-column format (https://www.ieee.org/conferences/publishing/templates.html), including the bibliography and any appendices. Reviewing is triple-blind! Therefore, please do not include author-identifying information.
All papers must be formatted according to the IEEE Computer Society proceedings manuscript style, following the IEEE ICDM 2020 submission guidelines, which are the same as for the main conference (except for the page limit). All accepted workshop papers will be published in the IEEE Computer Society Digital Library (CSDL) and IEEE Xplore, and indexed by EI.
Important dates
· Submission deadline: 24 August 2020 (extended to 2 September 2020)
· Workshop paper notifications: 17 September 2020
Registration & Expenses
Every workshop paper must have at least one fully paid conference registration in order to be published. Check the main conference pages for details.
Program committee
Karim Abou-Moustafa
Jakramate Bootkrajang
Arthur Flexer
Ata Kaban
Miqing Li
Momodou Sanyang
Frank-Michael Schleif
Guoxian Yu
Workshop organisation & contact
School of Computer Science, University of Birmingham, UK
Previous editions:
Related links & resources: Analytics, Big Data, Data Mining, & Data Science Resources