On the strength of hyperclique patterns for text categorization

Tieyun Qian, Hui Xiong, Yuanzhen Wang, Enhong Chen

Research output: Contribution to journal › Article › peer-review

13 Scopus citations

Abstract

The use of association patterns for text categorization has attracted great interest and a variety of useful methods have been developed. However, the key characteristics of pattern-based text categorization remain unclear. Indeed, there are still no concrete answers for the following two questions: Firstly, what kind of association pattern is the best candidate for pattern-based text categorization? Secondly, what is the most desirable way to use patterns for text categorization? In this paper, we focus on answering the above two questions. More specifically, we show that hyperclique patterns are more desirable than frequent patterns for text categorization. Along this line, we develop an algorithm for text categorization using hyperclique patterns. As demonstrated by our experimental results on various real-world text documents, our method provides much better computational performance than state-of-the-art methods while retaining classification accuracy.
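To make the contrast between frequent patterns and hyperclique patterns concrete, the sketch below uses the standard definition from the hyperclique-pattern literature: the h-confidence of an itemset is its support divided by the largest support of any single item in it, and a hyperclique pattern is an itemset whose h-confidence meets a chosen threshold. This is only an illustration, not the categorization algorithm developed in the paper; the function names, the toy documents, and the thresholds are all assumptions made for the example, and the brute-force enumeration is suitable only for tiny vocabularies.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions (documents) containing every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def h_confidence(itemset, transactions):
    """Support of the itemset divided by the largest single-item support."""
    max_single = max(support([item], transactions) for item in itemset)
    if max_single == 0:
        return 0.0
    return support(itemset, transactions) / max_single


def hyperclique_patterns(transactions, min_support, min_hconf, max_size=2):
    """Brute-force enumeration (illustrative only) of itemsets that pass
    both the support threshold and the h-confidence threshold."""
    vocab = sorted(set().union(*transactions))
    found = []
    for size in range(2, max_size + 1):
        for candidate in combinations(vocab, size):
            if (support(candidate, transactions) >= min_support
                    and h_confidence(candidate, transactions) >= min_hconf):
                found.append(candidate)
    return found


# Hypothetical toy corpus: each document is a set of words.
docs = [
    {"neural", "network", "the"},
    {"neural", "network", "the"},
    {"stock", "market", "the"},
    {"stock", "market", "the"},
]

patterns = hyperclique_patterns(docs, min_support=0.5, min_hconf=0.8)
print(patterns)
```

In this toy corpus, a pair such as ("neural", "the") meets the support threshold of 0.5 and would count as a frequent pattern, but its h-confidence is only 0.5 because "the" occurs in every document. The h-confidence threshold removes such pairs and keeps only word pairs whose members occur almost exclusively together.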

Original language: English (US)
Pages (from-to): 4040-4058
Number of pages: 19
Journal: Information Sciences
Volume: 177
Issue number: 19
State: Published - Oct 1 2007
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence

Keywords

  • Association rules
  • Hyperclique patterns
  • Text categorization
