QUICK REVIEW

[Paper Review] Semi-Supervised Learning with Balanced Deep Representation Distributions

Changchun Li, Ximing Li | arXiv (Cornell University) | 2026. 03. 22.

Text and Document Classification Technologies | Citations: 0

One-Line Summary

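TL;DR: Introduces S2TC-BDD, a semi-supervised text classification method that balances label angle variances in the deep representations, improving pseudo-label accuracy and performance when labeled texts are scarce.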

ABSTRACT

Semi-Supervised Text Classification (SSTC) mainly works under the spirit of self-training. They initialize the deep classifier by training over labeled texts; and then alternatively predict unlabeled texts as their pseudo-labels and train the deep classifier over the mixture of labeled and pseudo-labeled texts. Naturally, their performance is largely affected by the accuracy of pseudo-labels for unlabeled texts. Unfortunately, they often suffer from low accuracy because of the margin bias problem caused by the large difference between representation distributions of labels in SSTC. To alleviate this problem, we apply the angular margin loss, and perform several Gaussian linear transformations to achieve balanced label angle variances, i.e., the variance of label angles of texts within the same label. More accuracy of predicted pseudo-labels can be achieved by constraining all label angle variances balanced, where they are estimated over both labeled and pseudo-labeled texts during self-training loops. With this insight, we propose a novel SSTC method, namely Semi-Supervised Text Classification with Balanced Deep representation Distributions (S2TC-BDD). We implement both multi-class classification and multi-label classification versions of S2TC-BDD by introducing some pseudo-labeling tricks and regularization terms. To evaluate S2TC-BDD, we compare it against the state-of-the-art SSTC methods. Empirical results demonstrate the effectiveness of S2TC-BDD, especially when the labeled texts are scarce.
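
The self-training loop described in the abstract can be sketched roughly as follows. This is only an illustration of the generic loop, not the paper's model: a nearest-prototype classifier on fixed feature vectors stands in for the deep classifier, and the names `fit_prototypes`, `predict`, `self_train`, and `n_rounds` are hypothetical. It also assumes every label appears at least once among the labeled texts.

```python
import numpy as np

def fit_prototypes(x, y, n_labels):
    """Stand-in 'training' step: one mean vector (prototype) per label."""
    return np.stack([x[y == k].mean(axis=0) for k in range(n_labels)])

def predict(x, prototypes):
    """Assign each text to the prototype with the highest cosine similarity."""
    xn = x / np.linalg.norm(x, axis=1, keepdims=True)
    cn = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return (xn @ cn.T).argmax(axis=1)

def self_train(x_lab, y_lab, x_unl, n_labels, n_rounds=5):
    # Initialize the classifier on the labeled texts only.
    prototypes = fit_prototypes(x_lab, y_lab, n_labels)
    for _ in range(n_rounds):
        # Predict pseudo-labels for the unlabeled texts.
        pseudo = predict(x_unl, prototypes)
        # Retrain on the mixture of labeled and pseudo-labeled texts.
        x_mix = np.concatenate([x_lab, x_unl])
        y_mix = np.concatenate([y_lab, pseudo])
        prototypes = fit_prototypes(x_mix, y_mix, n_labels)
    return prototypes
```

Because each round retrains on its own predictions, any systematic error in the pseudo-labels is fed back into the classifier, which is why pseudo-label accuracy matters so much in this setting.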

Research Motivation and Goals

Identify margin bias caused by imbalanced label representation distributions in semi-supervised text classification (SSTC).
Develop a balanced deep representation distribution (BDD) loss that applies Gaussian linear transformations to label angles so that label angle variances are balanced across labels.
Extend SSTC with both multi-class and multi-label variants that leverage balanced distributions during self-training.
Demonstrate superior performance of S2TC-BDD over state-of-the-art SSTC methods, especially with limited labeled data.

Proposed Method

Base the model on BERT with angular margin (AM) loss to learn discriminative deep representations.
Assume the label angles of each label k follow a label-specific Gaussian, and apply Gaussian linear transformations ψk(θik) to balance the variances across labels.
Define and optimize a balanced deep representation distribution (BDD) loss L_bdd that uses ψk(θik) in place of θik within the AM loss.
Estimate label angle distributions (means μk and variances σk^2) and prototypes ck from both labeled and pseudo-labeled texts during self-training.
For multi-class, use sharpening and entropy regularization; for multi-label, apply class-distribution-aware pseudo-labeling (CAP) and a low-rank regularization via ADMM.
Provide full objective functions: L_mcc for multi-class and L_mlc for multi-label, integrating supervised, unsupervised, and regularization terms.
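
A minimal NumPy sketch of the balancing step listed above, under stated assumptions. It relies on the standard fact that a linear map a·θ + b sends N(μ, σ²) to N(aμ + b, a²σ²), so choosing a_k = σ̂/σ_k and b_k = (1 − a_k)μ_k keeps each label's mean angle and sets its variance to a shared target σ̂². The specific form of ψk, the use of the mean variance as the target, the cos(θ) − m margin form, and all function names and default values of `s` and `m` are assumptions for illustration and may differ from the paper.

```python
import numpy as np

def label_angles(features, prototypes):
    """Angle (radians) between each text representation and each label prototype."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    c = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return np.arccos(np.clip(f @ c.T, -1.0, 1.0))  # shape (n_texts, n_labels)

def estimate_angle_stats(angles, labels, n_labels):
    """Per-label mean and variance of the angle to a text's own label prototype."""
    mu = np.zeros(n_labels)
    var = np.zeros(n_labels)
    for k in range(n_labels):
        own = angles[labels == k, k]
        mu[k], var[k] = own.mean(), own.var()
    return mu, var

def balance_angles(angles, mu, var, target_var):
    """Assumed linear transformation psi_k(theta) = a_k * theta + b_k."""
    a = np.sqrt(target_var / var)  # a_k = sigma_hat / sigma_k
    b = (1.0 - a) * mu             # keeps each label's mean angle at mu_k
    return a * angles + b          # broadcasts over the label axis

def am_loss(angles, labels, s=1.0, m=0.01):
    """Angular-margin softmax loss with the margin subtracted from cos(theta)."""
    logits = s * np.cos(angles)
    idx = np.arange(len(labels))
    logits[idx, labels] -= s * m                 # margin on the true label only
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[idx, labels].mean()
```

In a self-training loop, `labels` would hold both gold labels and current pseudo-labels, the statistics would be re-estimated as pseudo-labels change, and `am_loss` would be evaluated on the output of `balance_angles` rather than on the raw angles.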

Figure 1: The average difference of label angle variances (Avg.DLAV) computed in semi-supervised and supervised manners across AG News (multi-class case) and AAPD (multi-label case), respectively.

Experimental Results

Research Questions

RQ1: How does margin bias from unbalanced label representations affect pseudo-label accuracy in SSTC?
RQ2: Can transforming angles to balanced label distributions improve pseudo-label quality and final performance?
RQ3: How do multi-class and multi-label SSTC settings perform under scarce labeled data when using S2TC-BDD?
RQ4: What are the effects of sharpening, CAP, and regularization on learning with balanced representations?
RQ5: Do the proposed methods outperform existing SSTC baselines on standard benchmarks?

Key Results

S2TC-BDD outperforms state-of-the-art SSTC methods, especially when labeled data are scarce.
Balancing label angle variances via Gaussian linear transformations alleviates margin bias and improves pseudo-label accuracy.
The method extends effectively to both multi-class and multi-label text classification tasks.
Incorporating sharpening or CAP along with entropy regularization enhances learning under self-training.
Experiments on AG News, Yelp, Yahoo (multi-class) and Ohsumed, AAPD, RCV1-V2 (multi-label) demonstrate robust improvements.

Figure 2: Let solid circles and triangles denote labeled positive and negative texts, and hollow ones denote corresponding unlabeled texts. (a) The large difference between label angle variances results in the margin bias. Many unlabeled texts (in red) can be misclassified. (b) Balancing the label a


This review was generated by AI and reviewed by a human editor.