Dimension induced clustering

Aristides Gionis, Spiros Papadimitriou, Alexander Hinneburg, Panayiotis Tsaparas

Research output: Contribution to conference; Paper; peer-review

34 Scopus citations

Abstract

It is commonly assumed that most points in high-dimensional datasets lie on low-dimensional manifolds. Detection of low-dimensional clusters is an extremely useful task for performing operations such as clustering and classification; however, it is a challenging computational problem. In this paper we study the problem of finding subsets of points with low intrinsic dimensionality. Our main contribution is to extend the definition of fractal correlation dimension, which measures average volume growth rate, in order to estimate the intrinsic dimensionality of the data in local neighborhoods. We provide a careful analysis of several key examples in order to demonstrate the properties of our measure. Based on our proposed measure, we introduce a novel approach to discover clusters with low dimensionality. The resulting algorithms extend previous density-based measures, which have been successfully used for clustering. We demonstrate the effectiveness of our algorithms for discovering low-dimensional m-flats embedded in high-dimensional spaces, and for detecting low-rank submatrices.
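As a rough illustration of the "volume growth rate" idea mentioned in the abstract, the sketch below estimates the ordinary (global) correlation dimension of a point set: it counts the fraction of point pairs within distance r and fits the slope of that fraction against r on a log-log scale. This is the standard correlation-integral estimate, not the local extension proposed in the paper; the function name `correlation_dimension`, the percentile range used for the radii, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def correlation_dimension(points, num_radii=20):
    """Estimate the global correlation dimension of a point set.

    Hypothetical helper for illustration only: it fits the slope of
    log C(r) versus log r, where C(r) is the fraction of point pairs
    whose Euclidean distance is at most r.
    """
    points = np.asarray(points, dtype=float)
    n = points.shape[0]

    # All pairwise Euclidean distances (upper triangle, no self-pairs).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    pair_dists = np.sort(dists[np.triu_indices(n, k=1)])

    # Assumed radius range: 1st to 20th percentile of pair distances,
    # chosen to stay at small scales and limit boundary effects.
    r_lo, r_hi = np.percentile(pair_dists, [1, 20])
    radii = np.geomspace(r_lo, r_hi, num_radii)

    # Correlation integral: fraction of pairs within each radius.
    counts = np.searchsorted(pair_dists, radii, side="right")
    corr = counts / pair_dists.size

    # Slope of the log-log curve is the dimension estimate.
    slope, _intercept = np.polyfit(np.log(radii), np.log(corr), 1)
    return slope


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic example: a line segment embedded in 5-dimensional space.
    t = rng.uniform(0.0, 1.0, size=(400, 1))
    direction = rng.normal(size=(1, 5))
    print(correlation_dimension(t * direction))
```

On data sampled from a line or a plane embedded in a higher-dimensional space, the estimate should come out near 1 or 2 respectively, regardless of the ambient dimension. Restricting the same computation to the neighborhood of each point is one way to think about a local variant, though the paper's exact definition may differ.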

Original language: English (US)
Pages: 51-60
Number of pages: 10
State: Published - Dec 1 2005
Externally published: Yes
Event: KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Chicago, IL, United States
Duration: Aug 21 2005 to Aug 24 2005

Other

Other: KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country: United States
City: Chicago, IL
Period: 8/21/05 to 8/24/05

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems

Keywords

  • Clustering
  • Fractal Dimension
