Addressing imbalance in multi-label classification using structured Hellinger forests

Zachary A. Daniels, Dimitris N. Metaxas

Research output: Contribution to conference, Paper, peer-review

17 Scopus citations

Abstract

The multi-label classification problem involves finding a model that maps a set of input features to more than one output label. Class imbalance is a serious issue in multi-label classification. We introduce an extension of structured forests, a type of random forest used for structured prediction, called Sparse Oblique Structured Hellinger Forests (SOSHF). We explore using structured forests in the general multi-label setting and propose a new imbalance-aware formulation by altering how the splitting functions are learned in two ways. First, we account for cost-sensitivity when converting the multi-label problem to a single-label problem at each node in the tree. Second, we introduce a new objective function for determining oblique splits based on the Hellinger distance, a splitting criterion that has been shown to be robust to class imbalance. We empirically validate our method on a number of benchmarks against standard and state-of-the-art multi-label classification algorithms and obtain improved results.
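For readers unfamiliar with the Hellinger distance as a splitting criterion, the sketch below scores a candidate binary split on a two-class node using the standard Hellinger distance decision tree formulation. It is not the SOSHF objective from the paper, which targets oblique splits in structured forests. The function name `hellinger_split_score` and its arguments are hypothetical and chosen only for illustration.

```python
import math


def hellinger_split_score(labels, goes_left):
    """Score a binary split of a two-class node with the Hellinger distance.

    labels:    sequence of 0/1 class labels for the samples at the node
    goes_left: sequence of booleans, True if the sample falls in the left branch

    Illustrative only: this is the generic two-class criterion, not the
    oblique-split objective proposed in the paper.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        return 0.0  # a pure node cannot be scored; treat it as uninformative

    pos_left = sum(1 for y, left in zip(labels, goes_left) if y == 1 and left)
    neg_left = sum(1 for y, left in zip(labels, goes_left) if y == 0 and left)
    pos_right = n_pos - pos_left
    neg_right = n_neg - neg_left

    # Each term compares the fraction of positives and the fraction of negatives
    # sent to a branch. Normalizing by per-class totals means the class prior
    # never enters the score, which is why the criterion is skew-insensitive.
    left_term = (math.sqrt(pos_left / n_pos) - math.sqrt(neg_left / n_neg)) ** 2
    right_term = (math.sqrt(pos_right / n_pos) - math.sqrt(neg_right / n_neg)) ** 2
    return math.sqrt(left_term + right_term)
```

A split that sends all positives one way and all negatives the other reaches the maximum score of sqrt(2), while a split that sends the same fraction of each class to each branch scores 0, regardless of how rare the positive class is.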

Original language: English (US)
Pages: 1826-1832
Number of pages: 7
State: Published - 2017
Event: 31st AAAI Conference on Artificial Intelligence, AAAI 2017 - San Francisco, United States
Duration: Feb 4 2017 to Feb 10 2017

Other

Other: 31st AAAI Conference on Artificial Intelligence, AAAI 2017
Country: United States
City: San Francisco
Period: 2/4/17 to 2/10/17

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
