Large scale semantic indexing with deep level-wise extreme multi-label learning

Dingcheng Li, Jingyuan Zhang, Ping Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Domain ontologies are widely used to index literature so that it can be retrieved more easily. Because manually curating key aspects of the scientific literature is costly, automated methods are needed to assist the process of semantic indexing. The task is challenging, however, because a domain ontology involves a very large number of terms and complex hierarchical relations. In this paper, to lessen the curse of dimensionality and improve training efficiency, we propose an approach named Deep Level-wise Extreme Multi-label Learning and Classification (Deep Level-wise XMLC) to facilitate the semantic indexing of literature. Deep Level-wise XMLC is composed of two sequential modules. The first module, deep level-wise multi-label learning, decomposes the terms of a domain ontology into multiple levels and builds a special convolutional neural network for each level, with category-dependent dynamic max pooling and macro F-measure based weight tuning. The second module, a hierarchical pointer generation model, merges the level-wise outputs into a final summarized semantic indexing. We demonstrate the effectiveness of Deep Level-wise XMLC by comparing it with several state-of-the-art methods on two tasks: automatic MeSH labeling of literature from PubMed MEDLINE and automatic labeling of AmazonCat13K.
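Two ideas named in the abstract can be illustrated with a minimal sketch: grouping ontology terms by depth, and chunk-wise max pooling. This is not the authors' implementation. The toy hierarchy, the function names `terms_by_level` and `dynamic_max_pool`, and the rule of keeping a term at its shallowest depth are all assumptions made for illustration. The pooling shown is the generic chunked form, and the abstract does not specify how the category-dependent variant chooses its chunks.

```python
from collections import deque


def terms_by_level(children, roots):
    """Group ontology terms by depth using breadth-first search.

    `children` maps a term to its direct child terms and `roots` lists the
    top-level terms. A term reachable by several paths is kept at its
    shallowest depth (an assumption of this sketch, not taken from the paper).
    """
    depth = {root: 1 for root in roots}
    queue = deque(roots)
    while queue:
        term = queue.popleft()
        for child in children.get(term, []):
            if child not in depth:
                depth[child] = depth[term] + 1
                queue.append(child)
    levels = {}
    for term, d in depth.items():
        levels.setdefault(d, []).append(term)
    return levels


def dynamic_max_pool(feature_map, num_chunks):
    """Split a 1-D feature map into contiguous chunks and keep each chunk's max.

    This is generic chunked max pooling; how a category-dependent variant
    would pick `num_chunks` is left open here.
    """
    n = len(feature_map)
    if not 1 <= num_chunks <= n:
        raise ValueError("num_chunks must be between 1 and len(feature_map)")
    # Integer boundaries that divide the map into near-equal, non-empty chunks.
    bounds = [i * n // num_chunks for i in range(num_chunks + 1)]
    return [max(feature_map[bounds[i]:bounds[i + 1]]) for i in range(num_chunks)]


# Hypothetical toy hierarchy, loosely styled after MeSH headings.
toy_children = {
    "Diseases": ["Neoplasms", "Infections"],
    "Neoplasms": ["Carcinoma"],
}
toy_levels = terms_by_level(toy_children, ["Diseases"])
pooled = dynamic_max_pool([0.1, 0.9, 0.3, 0.4, 0.8, 0.2], 3)
```

In a level-wise setup of this kind, each entry of `toy_levels` would define the label set for one per-level classifier, and pooling over several chunks rather than the whole feature map would retain coarse positional information from the convolutional features.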

Original language: English (US)
Title of host publication: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
Publisher: Association for Computing Machinery, Inc
Pages: 950-960
Number of pages: 11
ISBN (Electronic): 9781450366748
State: Published - May 13 2019
Externally published: Yes
Event: 2019 World Wide Web Conference, WWW 2019 - San Francisco, United States
Duration: May 13 2019 to May 17 2019

Publication series

Name: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

Conference

Conference: 2019 World Wide Web Conference, WWW 2019
Country/Territory: United States
City: San Francisco
Period: 5/13/19 to 5/17/19

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Software

Keywords

  • Deep level-wise extreme multi-label learning and classification
  • Online macro F-measure optimization
  • Pointer generation
