QUICK REVIEW

[Paper Review] Dynamic Curriculum Learning for Imbalanced Data Classification

Yiru Wang, Weihao Gan|arXiv (Cornell University)|Jan 21, 2019
Imbalanced Data Classification Techniques | References: 57 | Cited by: 33
One-Sentence Summary

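Introduces Dynamic Curriculum Learning (DCL), with a two-level scheduler that adaptively adjusts sampling and loss weighting on imbalanced data; reports state-of-the-art results on the CelebA and RAP datasets.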

ABSTRACT

Human attribute analysis is a challenging task in the field of computer vision, since the data distribution is largely imbalanced. Common techniques such as re-sampling and cost-sensitive learning require prior knowledge to train the system. To address this problem, we propose a unified framework called Dynamic Curriculum Learning (DCL) that adaptively adjusts the sampling strategy and loss learning online within a single batch, resulting in better generalization and discrimination. Inspired by curriculum learning, DCL consists of two-level curriculum schedulers: (1) a sampling scheduler that manages the data distribution not only from imbalanced to balanced but also from easy to hard; (2) a loss scheduler that controls the learning importance between classification and metric learning loss. Learning from these two schedulers, we demonstrate that our DCL framework achieves new state-of-the-art performance on the widely used face attribute dataset CelebA and pedestrian attribute dataset RAP.

Motivation and Objectives

  • Motivate and address imbalanced data issues in human attribute analysis.
  • Propose a unified curriculum-learning framework to adapt sampling and loss emphasis during training.
  • Demonstrate that combining cross-entropy with metric learning improves discrimination.
  • Show that two schedulers enable gradual transition from imbalanced to balanced learning while emphasizing representation learning.
  • Validate performance on CelebA and RAP benchmarks against existing methods.

Proposed Method

  • Define two-level curriculum schedulers: a sampling scheduler that shifts batch data from imbalanced to balanced and from easy to hard; a loss scheduler that balances classification loss and metric learning loss.
  • Introduce Dynamic Selective Learning (DSL) loss that applies class-wise reweighting based on target distribution in each batch.
  • Incorporate a metric learning component with Triplet Loss using Easy Anchors to stabilize embedding learning (L_TEA).
  • Design a Loss Scheduler f(l) to gradually shift emphasis from metric learning to classification as training progresses (L_DCL = L_DSL + f(l)*L_TEA).
  • Provide a general framework showing how existing imbalanced-learning methods map to DCL via different scheduler configurations.
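As a rough illustration of how the two schedulers could fit together, the following minimal sketch computes a decaying schedule value, a per-epoch target class distribution, and the combined loss L_DCL = L_DSL + f(l)*L_TEA. Only the combined-loss formula comes from the summary above. The function names, the two schedule shapes, and the rule of raising class ratios to the power g(l) are assumptions made for illustration, and L_DSL and L_TEA are treated as already-computed scalars.

```python
import math


def linear_schedule(epoch: int, total_epochs: int) -> float:
    """Assumed shape: decay linearly from 1 at epoch 0 to 0 at the last epoch."""
    return 1.0 - epoch / total_epochs


def cosine_schedule(epoch: int, total_epochs: int) -> float:
    """Assumed shape: decay from 1 to 0 along a quarter cosine (slow, then fast)."""
    return math.cos(epoch / total_epochs * math.pi / 2)


def target_distribution(class_counts: list[int], g: float) -> list[float]:
    """Interpolate between the training distribution (g = 1) and a balanced one (g = 0).

    Assumption: each class-to-smallest-class ratio is raised to the power g,
    then the results are normalised to sum to 1.
    """
    smallest = min(class_counts)
    ratios = [(count / smallest) ** g for count in class_counts]
    total = sum(ratios)
    return [r / total for r in ratios]


def dcl_loss(l_dsl: float, l_tea: float, f: float) -> float:
    """Combine the losses as L_DCL = L_DSL + f(l) * L_TEA."""
    return l_dsl + f * l_tea


# Usage: watch a hypothetical 900-vs-100 attribute move toward balance.
total_epochs = 10
for epoch in (0, 5, 10):
    g = cosine_schedule(epoch, total_epochs)
    dist = target_distribution([900, 100], g)
    print(epoch, [round(p, 3) for p in dist])
```

Because both schedules decay toward 0, using one of them as f(l) reduces the weight of the metric-learning term over time, which matches the stated shift in emphasis from metric learning to classification.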

Experimental Results

Research Questions

  • RQ1: Can dynamic curriculum scheduling improve generalization in imbalanced attribute classification tasks?
  • RQ2: How should sampling and loss components be scheduled to maximize both representation learning and classification accuracy?
  • RQ3: Does a metric-learning component with easy anchors stabilize embedding while boosting final discrimination?
  • RQ4: How does DCL perform relative to state-of-the-art methods on CelebA and RAP under varying imbalance levels?

Key Findings

  • On CelebA, DCL achieves higher mean accuracy than several baselines and state-of-the-art methods, with notable gains on highly imbalanced attributes.
  • On RAP, DCL surpasses prior methods such as LG-Net, especially at higher imbalance ratios (1:x).
  • Ablation shows that the sampling scheduler, easy-anchor triplet loss, and loss scheduler each contribute to performance gains.
  • Scheduler choices matter; convex sampling and composite loss schedules yield better results than linear or simple schedules.
  • DCL generalizes beyond CelebA and RAP, achieving improvements on CIFAR-100 as well.


This review was generated by AI and reviewed by human editors.