Sign Cauchy projections and chi-square kernel

Ping Li, Gennady Samorodnitsky, John Hopcroft

Research output: Contribution to journalConference articlepeer-review

27 Scopus citations

Abstract

The method of stable random projections is useful for efficiently approximating the ℓα distance (0 < α ≤2) in high dimension and it is naturally suitable for data streams. In this paper, we propose to use only the signs of the projected data and we analyze the probability of collision (i.e., when the two signs differ). Interestingly, when α = 1 (i.e., Cauchy random projections), we show that the probability of collision can be accurately approximated as functions of the chi-square (χ2) similarity. In text and vision applications, the χ2 similarity is a popular measure when the features are generated from histograms (which are a typical example of data streams). Experiments confirm that the proposed method is promising for large-scale learning applications. The full paper is available at arXiv:1308.1009. There are many future research problems. For example, when α → 0, the collision probability is a function of the resemblance (of the binary-quantized data). This provides an effective mechanism for resemblance estimation in data streams.

Original languageEnglish (US)
JournalAdvances in Neural Information Processing Systems
StatePublished - 2013
Event27th Annual Conference on Neural Information Processing Systems, NIPS 2013 - Lake Tahoe, NV, United States
Duration: Dec 5 2013Dec 10 2013

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Fingerprint

Dive into the research topics of 'Sign Cauchy projections and chi-square kernel'. Together they form a unique fingerprint.

Cite this