Nonlinear estimators and tail bounds for dimension reduction in l 1 using cauchy random projections

Ping Li, Trevor J. Hastie, Kenneth W. Church

Research output: Chapter in Book/Report/Conference proceedingConference contribution

4 Scopus citations

Abstract

For dimension reduction in l1, one can multiply a data matrix A ε ℝn×D by R ε ℝD×k (k ≪ D) whose entries are i.i.d. samples of Cauchy. The impossibility result says one can not recover the pairwise l1 distances in A from B = AR εℝn×k, using linear estimators. However, nonlinear estimators are still useful for certain applications in data stream computations, information retrieval, learning, and data mining. We propose three types of nonlinear estimators: the bias-corrected sample median estimator, the bias-corrected geometric mean estimator, and the bias-corrected maximum likelihood estimator. We derive tail bounds for the geometric mean estimator and establish that k = O (&logn/ε2) suffices with the constants explicitly given. Asymptotically (as k → ∞), both the sample median estimator and the geometric mean estimator are about 80% efficient compared to the maximum likelihood estimator (MLE). We analyze the moments of the MLE and propose approximating the distribution of the MLE by an inverse Gaussian.

Original languageEnglish (US)
Title of host publicationLearning Theory - 20th Annual Conference on Learning Theory, COLT 2007, Proceedings
PublisherSpringer Verlag
Pages514-529
Number of pages16
ISBN (Print)9783540729259
DOIs
StatePublished - 2007
Externally publishedYes
Event20th Annual Conference on Learning Theory, COLT 2007 - San Diego, CA, United States
Duration: Jun 13 2007Jun 15 2007

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume4539 LNAI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other20th Annual Conference on Learning Theory, COLT 2007
CountryUnited States
CitySan Diego, CA
Period6/13/076/15/07

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)

Fingerprint Dive into the research topics of 'Nonlinear estimators and tail bounds for dimension reduction in l <sub>1</sub> using cauchy random projections'. Together they form a unique fingerprint.

Cite this