Rebalancing Using Estimated Class Distribution for Imbalanced Semi-Supervised Learning under Class Distribution Mismatch
Taemin Park, Hyuck Lee, Heeyoung Kim*
Abstract
Despite significant advances in class-imbalanced semi-supervised learning (CISSL), many existing algorithms explicitly or implicitly assume that the class distribution of unlabeled data matches that of labeled data. When this assumption fails in practice, the classification performance of such algorithms may degrade because classes are weighted incorrectly during training. We propose a novel CISSL algorithm called Rebalancing Using Estimated Class Distribution (RECD). RECD estimates the unknown class distribution of unlabeled data through Monte Carlo approximation, leveraging predicted class probabilities for unlabeled samples, and subsequently rebalances the classifier based on the estimated class distribution. Additionally, we propose an extension of feature clusters compression in the context of CISSL to mitigate feature map imbalance by densifying minority class clusters. Experimental results on four benchmark datasets demonstrate that RECD achieves state-of-the-art classification performance in CISSL.
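The abstract does not give the estimator or the rebalancing rule in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the Monte Carlo estimate is the average of predicted class probabilities over unlabeled samples, and it swaps in a standard logit-adjustment step (subtracting the log of the estimated prior) as a stand-in for the rebalancing. The function names `estimate_class_distribution` and `rebalance_logits` and the toy probabilities are hypothetical.

```python
import numpy as np

def estimate_class_distribution(probs):
    """Monte Carlo estimate of the unlabeled class distribution.

    Assumption: the estimate is the sample mean of predicted class
    probabilities, i.e. an approximation of E_x[p(y | x)].
    """
    probs = np.asarray(probs, dtype=float)
    est = probs.mean(axis=0)   # average over unlabeled samples
    return est / est.sum()     # renormalize against rounding drift

def rebalance_logits(logits, class_dist, eps=1e-12):
    """Assumed rebalancing step: logit adjustment by the estimated prior.

    Subtracting log(prior) lowers the scores of classes estimated to be
    frequent and raises those of classes estimated to be rare.
    """
    logits = np.asarray(logits, dtype=float)
    return logits - np.log(np.asarray(class_dist, dtype=float) + eps)

# Toy predicted probabilities for four unlabeled samples, three classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
])
est = estimate_class_distribution(probs)

# With uninformative (all-zero) logits, the adjustment favors the class
# with the smallest estimated prior.
adjusted = rebalance_logits(np.zeros(3), est)
```

In this toy case the estimated distribution is skewed toward the first class, so the adjusted scores rank the third (rarest) class highest when the raw logits carry no information.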