Multilingual text classification using ontologies

Gerard De Melo, Stefan Siersdorfer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

20 Scopus citations

Abstract

In this paper, we investigate strategies for automatically classifying documents in different languages thematically, geographically or according to other criteria. A novel linguistically motivated text representation scheme is presented that can be used with machine learning algorithms in order to learn classifications from pre-classified examples and then automatically classify documents that might be provided in entirely different languages. Our approach makes use of ontologies and lexical resources but goes beyond a simple mapping from terms to concepts by fully exploiting the external knowledge manifested in such resources and mapping to entire regions of concepts. For this, a graph traversal algorithm is used to explore related concepts that might be relevant. Extensive testing has shown that our methods lead to significant improvements compared to existing approaches.
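The abstract does not specify the traversal algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a toy concept graph whose edges carry a relatedness weight, a hypothetical `lexicon` that maps words from several languages to shared concept IDs, and illustrative `decay` and `min_weight` parameters. Starting from the concepts a document's terms map to, a best-first traversal spreads a decaying weight to neighbouring concepts, so each document is represented by a weighted region of concepts rather than by single concepts.

```python
import heapq

def expand_concepts(graph, seeds, decay=0.5, min_weight=0.2):
    """Best-first traversal that spreads weight from seed concepts to neighbours.

    graph: dict mapping concept -> list of (neighbour, relation_weight)
    seeds: dict mapping concept -> initial weight
    Returns a dict mapping each reached concept to its highest weight.
    """
    best = {}
    # heapq is a min-heap, so weights are negated to pop the largest first.
    heap = [(-w, c) for c, w in seeds.items()]
    heapq.heapify(heap)
    while heap:
        neg_w, concept = heapq.heappop(heap)
        if concept in best:  # already settled with a weight at least as high
            continue
        weight = -neg_w
        best[concept] = weight
        for neighbour, rel_weight in graph.get(concept, []):
            w = weight * rel_weight * decay
            # Stop exploring once the propagated weight becomes negligible.
            if w >= min_weight and neighbour not in best:
                heapq.heappush(heap, (-w, neighbour))
    return best

def concept_features(tokens, lexicon, graph):
    """Map tokens to concepts via the lexicon, then expand to related concepts."""
    seeds = {}
    for tok in tokens:
        concept = lexicon.get(tok.lower())
        if concept is not None:
            seeds[concept] = 1.0
    return expand_concepts(graph, seeds)

# Hypothetical toy data: concept IDs, weights, and lexicon entries are made up.
graph = {
    "c:dog": [("c:canine", 0.9), ("c:pet", 0.8)],
    "c:canine": [("c:mammal", 0.9)],
    "c:pet": [("c:animal", 0.5)],
}
lexicon = {"dog": "c:dog", "hund": "c:dog"}

english = concept_features(["dog"], lexicon, graph)
german = concept_features(["Hund"], lexicon, graph)
```

Because the English and German tokens map to the same concept ID, both documents receive identical feature weights, which is what would let a classifier trained on one language be applied to another. The resulting dictionary could be used as a feature vector for any standard learning algorithm.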

Original language: English (US)
Title of host publication: Advances in Information Retrieval - 29th European Conference on IR Research, ECIR 2007, Proceedings
Publisher: Springer Verlag
Pages: 541-548
Number of pages: 8
ISBN (Print): 3540714944, 9783540714941
State: Published - 2007
Externally published: Yes
Event: 29th European Conference on IR Research, ECIR 2007 - Rome, Italy
Duration: Apr 2 2007 → Apr 5 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4425 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 29th European Conference on IR Research, ECIR 2007
Country/Territory: Italy
City: Rome
Period: 4/2/07 → 4/5/07

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
