WebChild: Harvesting and organizing commonsense knowledge from the web

Niket Tandon, Gerard De Melo, Fabian Suchanek, Gerhard Weikum

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

106 Scopus citations

Abstract

This paper presents a method for automatically constructing a large commonsense knowledge base, called WebChild, from Web contents. WebChild contains triples that connect nouns with adjectives via fine-grained relations such as hasShape, hasTaste, and evokesEmotion. The arguments of these assertions, nouns and adjectives, are disambiguated by mapping them onto their proper WordNet senses. Our method is based on semi-supervised Label Propagation over graphs of noisy candidate assertions. We automatically derive seeds from WordNet and by pattern matching from Web text collections. The Label Propagation algorithm provides us with domain sets and range sets for 19 different relations, and with confidence-ranked assertions between WordNet senses. Large-scale experiments demonstrate the high accuracy (more than 80 percent) and coverage (more than four million fine-grained disambiguated assertions) of WebChild.
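The abstract names semi-supervised Label Propagation over a graph of noisy candidates but does not spell out the procedure. As a rough illustration only, the sketch below runs a generic, clamped label propagation on a toy graph. The function name `propagate_labels`, the toy adjectives, the edge weights, and the two labels are all assumptions made up for this example; none of them is taken from WebChild, and the paper's actual graph construction and scoring may differ.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, iterations=50):
    """Generic clamped label propagation (illustrative, not WebChild's code).

    edges: list of (node_a, node_b, weight) with positive weights (assumed).
    seeds: dict mapping a node to its known label; seed scores stay fixed.
    Returns a dict mapping each node to a dict of label -> score.
    """
    # Build an undirected weighted adjacency list.
    neighbors = defaultdict(list)
    for a, b, w in edges:
        neighbors[a].append((b, w))
        neighbors[b].append((a, w))

    labels = sorted(set(seeds.values()))

    # Unlabeled nodes start with zero mass; seeds put all mass on their label.
    scores = {n: {l: 0.0 for l in labels} for n in neighbors}
    for n, l in seeds.items():
        scores.setdefault(n, {k: 0.0 for k in labels})[l] = 1.0

    for _ in range(iterations):
        updated = {}
        for n in scores:
            if n in seeds:
                updated[n] = scores[n]  # clamp seed nodes
                continue
            total = sum(w for _, w in neighbors[n])
            # Each score becomes the weighted average of the neighbors' scores.
            updated[n] = {
                l: sum(w * scores[m][l] for m, w in neighbors[n]) / total
                for l in labels
            }
        scores = updated
    return scores


# Toy graph (hypothetical): adjectives linked by made-up similarity weights.
edges = [
    ("sweet", "sugary", 1.0),
    ("sugary", "tart", 0.5),
    ("round", "square", 1.0),
    ("square", "tart", 0.1),
]
# Hypothetical seeds: one adjective per relation range.
seeds = {"sweet": "hasTaste", "round": "hasShape"}

scores = propagate_labels(edges, seeds)

# Rank each unlabeled node by its best label and that label's score.
ranked = sorted(
    ((n, max(s, key=s.get), max(s.values())) for n, s in scores.items() if n not in seeds),
    key=lambda item: item[2],
    reverse=True,
)
for node, label, confidence in ranked:
    print(f"{node}: {label} ({confidence:.3f})")
```

In this toy setup the per-label scores of an unlabeled node play the role of a confidence value, which is one simple way to obtain a ranked list of candidates; how WebChild derives its confidence ranking is not stated in the abstract.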

Original language: English (US)
Title of host publication: WSDM 2014 - Proceedings of the 7th ACM International Conference on Web Search and Data Mining
Publisher: Association for Computing Machinery
Pages: 523-532
Number of pages: 10
ISBN (Print): 9781450323512
State: Published - 2014
Externally published: Yes
Event: 7th ACM International Conference on Web Search and Data Mining, WSDM 2014 - New York, NY, United States
Duration: Feb 24 2014 to Feb 28 2014

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems

Keywords

  • commonsense knowledge
  • knowledge bases
  • label propagation
  • web mining
  • word sense disambiguation
