QUICK REVIEW

[Paper Review] PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions

Zhaoqi Leng, Mingxing Tan | arXiv (Cornell University) | Apr 26, 2022

Advanced Neural Network Applications | Cited by 84

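One-line summary

PolyLoss presents a polynomial-expansion framework for designing classification losses and shows that a simple adjustment of the leading coefficient (Poly-1) consistently outperforms cross-entropy and focal loss across image classification, detection, segmentation, and 3D detection tasks.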

ABSTRACT

Cross-entropy loss and focal loss are the most common choices when training deep neural networks for classification problems. Generally speaking, however, a good loss function can take on much more flexible forms, and should be tailored for different tasks and datasets. Motivated by how functions can be approximated via Taylor expansion, we propose a simple framework, named PolyLoss, to view and design loss functions as a linear combination of polynomial functions. Our PolyLoss allows the importance of different polynomial bases to be easily adjusted depending on the targeting tasks and datasets, while naturally subsuming the aforementioned cross-entropy loss and focal loss as special cases. Extensive experimental results show that the optimal choice within the PolyLoss is indeed dependent on the task and dataset. Simply by introducing one extra hyperparameter and adding one line of code, our Poly-1 formulation outperforms the cross-entropy loss and focal loss on 2D image classification, instance segmentation, object detection, and 3D object detection tasks, sometimes by a large margin.

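Motivation and objectives

Motivate loss-function design as a polynomial expansion in order to unify and extend cross-entropy (CE) and focal loss.
Introduce PolyLoss as a linear combination of polynomial bases (1 - P_t)^j that can be tailored to the task and dataset.
Show that adjusting the polynomial coefficients, in particular the leading term, yields practical gains with minimal code changes.
Demonstrate empirical improvements on 2D image classification, instance segmentation, 3D object detection, and multi-task settings.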

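Proposed method

Express CE and focal loss as infinite sums in powers of (1 - P_t).
Propose PolyLoss: L = sum_j alpha_j (1 - P_t)^j, where each alpha_j is non-negative.
CE corresponds to alpha_j = 1/j; focal loss shifts the coefficients horizontally by γ, so the power of each term becomes j + γ.
Propose the simple Poly-1 loss: L_Poly-1 = -log(P_t) + epsilon_1 (1 - P_t), with a single tunable epsilon_1.
Run a grid search over epsilon_1 to improve performance (the change requires only one additional line of code).
Evaluate multiple models on ImageNet-1K/21K, COCO, and the Waymo Open Dataset.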
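The Poly-1 formula above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names `softmax`, `poly1_cross_entropy`, and `poly1_from_series`, the default `epsilon=1.0`, and the example logits are all assumptions made for this sketch. The second function rebuilds the same loss from the truncated polynomial series to show that Poly-1 is simply CE with its first coefficient changed from 1 to 1 + epsilon_1.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poly1_cross_entropy(logits, label, epsilon=1.0):
    """Poly-1 loss for one example: -log(P_t) + epsilon * (1 - P_t).

    epsilon plays the role of epsilon_1; the default value is an
    arbitrary placeholder, and epsilon = 0 recovers plain cross-entropy.
    """
    # P_t is the predicted probability of the target class.
    p_t = softmax(logits)[label]
    ce = -math.log(p_t)
    # The single extra line: re-weight the leading polynomial term.
    return ce + epsilon * (1.0 - p_t)


def poly1_from_series(p_t, epsilon, terms=200):
    """Same loss built from the truncated expansion in (1 - P_t).

    Coefficients are 1/j as in CE, except the first, which is 1 + epsilon.
    """
    total = 0.0
    for j in range(1, terms + 1):
        coeff = (1.0 + epsilon) if j == 1 else 1.0 / j
        total += coeff * (1.0 - p_t) ** j
    return total


if __name__ == "__main__":
    logits, label = [2.0, 0.5, -1.0], 0
    print("CE    :", poly1_cross_entropy(logits, label, epsilon=0.0))
    print("Poly-1:", poly1_cross_entropy(logits, label, epsilon=1.0))
    p_t = softmax(logits)[label]
    print("series:", poly1_from_series(p_t, epsilon=1.0))
```

In practice epsilon_1 would be chosen per task by the grid search mentioned above; a negative value down-weights the leading term instead of up-weighting it.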

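Experimental results

Research questions

RQ1: Can a polynomial-expansion view of classification losses unify CE and focal loss under PolyLoss?
RQ2: Does vertically adjusting the leading polynomial coefficient improve task- and dataset-specific performance with minimal hyperparameter burden?
RQ3: How does PolyLoss perform against CE and focal loss across diverse tasks (2D/3D detection, segmentation, large-scale classification)?
RQ4: What insights does the PolyLoss framework offer for loss design on imbalanced datasets?

Key findings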

PolyLoss with the simple Poly-1 formulation improves over cross-entropy and focal loss across tasks: ImageNet-1K +0.4% (87.2 vs 86.8), ImageNet-21K +0.6% (46.4 vs 45.8), and Waymo 3D detection +2.5% (49.7 vs 47.2), with gains also reported on COCO detection and instance segmentation.
Across multiple models and tasks, Poly-1 consistently improves performance, with gains including +0.6, +0.4, +2.5, +2.1, +0.4, +0.7, +0.5, +0.8 over respective baselines (CE or focal).
The leading polynomial term (1 - P_t) contributes a large portion of the gradient during training, and tuning its coefficient significantly boosts accuracy.
For imbalanced ImageNet-21K, positive epsilon on the first polynomial increases prediction confidence and accuracy, while for COCO-based detection, negative epsilon can reduce overconfident predictions and improve metrics.
Poly-1 changes only the training loss, so model size and inference speed are unchanged (e.g., for EfficientNetV2), and it matches or exceeds CE/focal performance with minimal code changes.
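The finding about the leading term can be read off the expansions directly. The block below is a sketch in the review's own symbols (P_t, γ, epsilon_1), valid for 0 < P_t ≤ 1; the labels L_CE, L_FL, and L_Poly-1 are notation chosen here for illustration.

```latex
\begin{align}
L_{\mathrm{CE}} &= -\log P_t = \sum_{j=1}^{\infty} \frac{1}{j}\,(1 - P_t)^{j}, \\
L_{\mathrm{FL}} &= -(1 - P_t)^{\gamma} \log P_t = \sum_{j=1}^{\infty} \frac{1}{j}\,(1 - P_t)^{j+\gamma}, \\
L_{\text{Poly-1}} &= -\log P_t + \epsilon_1 (1 - P_t)
  = (1 + \epsilon_1)(1 - P_t) + \sum_{j=2}^{\infty} \frac{1}{j}\,(1 - P_t)^{j}, \\
-\frac{\partial L_{\mathrm{CE}}}{\partial P_t} &= \sum_{j=1}^{\infty} (1 - P_t)^{j-1} = \frac{1}{P_t}, \\
-\frac{\partial L_{\text{Poly-1}}}{\partial P_t} &= (1 + \epsilon_1) + \sum_{j=2}^{\infty} (1 - P_t)^{j-1}.
\end{align}
```

The j = 1 term contributes a constant gradient of 1 (or 1 + epsilon_1 for Poly-1) regardless of P_t, while every higher-order term shrinks as P_t approaches 1. Once predictions become confident, the leading term therefore accounts for most of the gradient, which is why rescaling its coefficient alone can have a noticeable effect.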

This review was generated by AI and checked by a human editor.