Adaptive, unsupervised stream mining

Spiros Papadimitriou, Anthony Brockwell, Christos Faloutsos

Research output: Contribution to journal › Article › peer-review

25 Scopus citations

Abstract

Sensor devices and embedded processors are becoming widespread, especially in measurement/monitoring applications. Their limited resources (CPU, memory and/or communication bandwidth, and power) pose some interesting challenges. We need concise, expressive models that represent the important features of the data and lend themselves to efficient estimation. In particular, under these severe constraints, we want models and estimation methods that (a) require little memory and a single pass over the data, (b) can adapt and handle arbitrary periodic components, and (c) can deal with various types of noise. We propose AWSOM (Arbitrary Window Stream modeling Method), which allows sensors in remote or hostile environments to efficiently and effectively discover interesting patterns and trends. This can be done automatically, i.e., with no prior inspection of the data and no user intervention or expert tuning before or during data gathering. Our algorithms require limited resources and can thus be incorporated into sensors, possibly alongside a distributed query processing engine [10,6,27]. Updates are performed in constant time with respect to stream size using logarithmic space. Existing forecasting methods (SARIMA, GARCH, etc.) and "traditional" Fourier and wavelet analysis fall short on one or more of these requirements. To the best of our knowledge, AWSOM is the first framework that combines all of the above characteristics. Experiments on real and synthetic datasets demonstrate that AWSOM discovers meaningful patterns over long time periods. Thus, the patterns can also be used to make long-range forecasts, which are notoriously difficult to perform. In fact, AWSOM outperforms manually set up autoregressive models, both in long-term pattern detection and modeling and, by at least 10×, in resource consumption.

Original language: English (US)
Pages (from-to): 222-239
Number of pages: 18
Journal: VLDB Journal
Volume: 13
Issue number: 3
State: Published - Sep 2004
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Hardware and Architecture
