TY - GEN
T1 - Adversarial Kernel Sampling on Class-imbalanced Data Streams
AU - Yang, Peng
AU - Li, Ping
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021/10/26
Y1 - 2021/10/26
N2 - This paper investigates online active learning in the setting of class-imbalanced data streams, where labels may be queried only within a limited budget. In this setup, conventional learning is biased towards the majority classes, which harms performance. To address this issue, imbalanced learning techniques adopt both asymmetric losses and asymmetric queries to handle the imbalance. Although this approach is effective, it may not guarantee performance in an adversarial setting where the actual labels are unknown and may be chosen by an adversary. To learn a promising hypothesis in a class-imbalanced and adversarial environment, we propose an asymmetric min-max optimization framework for online classification. The derived algorithm can track the imbalance and bound the choices of an adversary simultaneously. Despite this promising result, the algorithm assumes that a label is provided for every input, whereas labels are scarce and labeling is expensive in real-world applications. To this end, we design a confidence-based sampling strategy that queries the most informative labels within a budget. We theoretically analyze the algorithm in terms of its mistake bound and two asymmetric measures. Empirically, we evaluate the algorithms on multiple real-world imbalanced tasks, achieving promising results across various application domains.
AB - This paper investigates online active learning in the setting of class-imbalanced data streams, where labels may be queried only within a limited budget. In this setup, conventional learning is biased towards the majority classes, which harms performance. To address this issue, imbalanced learning techniques adopt both asymmetric losses and asymmetric queries to handle the imbalance. Although this approach is effective, it may not guarantee performance in an adversarial setting where the actual labels are unknown and may be chosen by an adversary. To learn a promising hypothesis in a class-imbalanced and adversarial environment, we propose an asymmetric min-max optimization framework for online classification. The derived algorithm can track the imbalance and bound the choices of an adversary simultaneously. Despite this promising result, the algorithm assumes that a label is provided for every input, whereas labels are scarce and labeling is expensive in real-world applications. To this end, we design a confidence-based sampling strategy that queries the most informative labels within a budget. We theoretically analyze the algorithm in terms of its mistake bound and two asymmetric measures. Empirically, we evaluate the algorithms on multiple real-world imbalanced tasks, achieving promising results across various application domains.
KW - adversarial learning
KW - imbalanced class
KW - online kernel learning
UR - http://www.scopus.com/inward/record.url?scp=85119205312&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119205312&partnerID=8YFLogxK
U2 - 10.1145/3459637.3482227
DO - 10.1145/3459637.3482227
M3 - Conference contribution
AN - SCOPUS:85119205312
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 2352
EP - 2362
BT - CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 30th ACM International Conference on Information and Knowledge Management, CIKM 2021
Y2 - 1 November 2021 through 5 November 2021
ER -