S2OSC: A Holistic Semi-Supervised Approach for Open Set Classification

Yang Yang, Hongchen Wei, Zhen Qiang Sun, Guang Yu Li, Yuanchun Zhou, Hui Xiong, Jian Yang

Research output: Contribution to journal › Article › peer-review

Abstract

Open set classification (OSC) tackles the problem of determining whether data are in-class or out-of-class during inference when only in-class examples are available at training time. Traditional OSC methods usually train discriminative or generative models on the available in-class data and then use the pre-trained models to classify test data directly. However, these methods always suffer from the embedding confusion problem: some out-of-class instances are mixed with in-class ones of similar semantics, making them difficult to classify. To solve this problem, we draw on semi-supervised learning to develop a novel OSC algorithm, S2OSC, which incorporates out-of-class instance filtering and model re-training in a transductive manner. Specifically, given a pool of newly arriving test data, S2OSC first filters the most distinct out-of-class instances using the pre-trained model and annotates them with a super-class label. Then, S2OSC trains a holistic classification model by combining the in-class and out-of-class labeled data with the remaining unlabeled test data in a semi-supervised paradigm. Furthermore, because data usually arrive in streaming form in real applications, we extend S2OSC into an incremental update framework (I-S2OSC) and adopt a knowledge memory regularization to mitigate catastrophic forgetting during incremental updates. Despite the simplicity of the proposed models, experimental results show that S2OSC achieves state-of-the-art performance across a variety of OSC tasks, including an F1 score of 85.4% on CIFAR-10 with only 300 pseudo-labels. We also demonstrate how S2OSC can be extended effectively to the incremental OSC setting with streaming data.
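
The filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the "most distinct" instances are scored, so the sketch assumes predictive entropy of the pre-trained model's softmax output as a stand-in criterion. The function names `filter_out_of_class` and `build_semi_supervised_split` are hypothetical, and this is not the authors' implementation.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def filter_out_of_class(logits, num_pseudo_labels):
    """Pick the test instances the pre-trained model is least certain about.

    Assumption: high predictive entropy is used here as a proxy for
    "most distinct out-of-class"; the abstract does not give the criterion.
    """
    probs = softmax(logits)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    k = min(num_pseudo_labels, len(entropy))
    # Indices sorted by entropy, highest first.
    return np.argsort(entropy)[::-1][:k]


def build_semi_supervised_split(logits, num_known_classes, num_pseudo_labels):
    """Split a test pool into pseudo-labeled and unlabeled parts."""
    picked = filter_out_of_class(logits, num_pseudo_labels)
    mask = np.zeros(len(logits), dtype=bool)
    mask[picked] = True
    # All filtered instances share one extra "super-class" index K.
    pseudo_labels = np.full(picked.shape, num_known_classes)
    unlabeled_idx = np.flatnonzero(~mask)
    return picked, pseudo_labels, unlabeled_idx


# Toy pool: two confident in-class predictions, two near-uniform ones.
logits = np.array([
    [10.0, 0.0, 0.0],
    [0.0, 10.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0],
])
picked, pseudo_labels, unlabeled_idx = build_semi_supervised_split(
    logits, num_known_classes=3, num_pseudo_labels=2
)
```

In this toy pool the two near-uniform rows are selected and given the extra class index 3, and the two confident rows remain unlabeled. A semi-supervised learner would then be trained on the original in-class data, the pseudo-labeled instances, and the unlabeled remainder.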

Original language: English (US)
Article number: 34
Journal: ACM Transactions on Knowledge Discovery from Data
Volume: 16
Issue number: 2
State: Published - Apr 2022

All Science Journal Classification (ASJC) codes

  • Computer Science(all)

Keywords

  • embedding confusion
  • incremental learning
  • Open set classification
  • semi-supervised learning
