Using betweenness centrality to identify manifold shortcuts

William J. Cukierski, David J. Foran

Research output: Chapter in Book/Report/Conference proceedingConference contribution

30 Scopus citations

Abstract

High-dimensional data presents a significant challenge to a broad spectrum of pattern recognition and machinelearning applications. Dimensionality reduction (DR) methods serve to remove unwanted variance and make such problems tractable. Several nonlinear DR methods, such as the well known ISOMAP algorithm, rely on a neighborhood graph to compute geodesic distances between data points. These graphs may sometimes contain unwanted edges which connect disparate regions of one or more manifolds. This topological sensitivity is well known [1, 9, 13], yet managing high-dimensional, noisy data in the absence of a priori knowledge, remains an open and difficult problem. This manuscript introduces a divisive, edge-removal method based on graph betweenness centrality which can robustly identify manifold-shorting edges. The problem of graph construction in high dimensions is discussed and the proposed algorithm is inserted into the ISOMAP workflow. ROC analysis is performed and the performance is tested on both synthetic and real datasets.

Original languageEnglish (US)
Title of host publicationProceedings - IEEE International Conference on Data Mining Workshops, ICDM Workshops 2008
Pages949-958
Number of pages10
DOIs
StatePublished - 2008
EventIEEE International Conference on Data Mining Workshops, ICDM Workshops 2008 - Pisa, Italy
Duration: Dec 15 2008Dec 19 2008

Publication series

NameProceedings - IEEE International Conference on Data Mining Workshops, ICDM Workshops 2008

Other

OtherIEEE International Conference on Data Mining Workshops, ICDM Workshops 2008
Country/TerritoryItaly
CityPisa
Period12/15/0812/19/08

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Using betweenness centrality to identify manifold shortcuts'. Together they form a unique fingerprint.

Cite this