QUICK REVIEW

[Paper Review] Pareto-Conditioned Diffusion Models for Offline Multi-Objective Optimization

Jatan Shrestha, Santeri Heiskanen|arXiv (Cornell University)|Jan 31, 2026

Advanced Multi-Objective Optimization Algorithms|Citations: 0

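One-Sentence Summary

PCD recasts offline MOO as a diffusion sampling problem conditioned on target trade-offs, with no surrogate model. A reweighting strategy and reference-direction conditioning let it balance solution quality and diversity.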

ABSTRACT

Multi-objective optimization (MOO) arises in many real-world applications where trade-offs between competing objectives must be carefully balanced. In the offline setting, where only a static dataset is available, the main challenge is generalizing beyond observed data. We introduce Pareto-Conditioned Diffusion (PCD), a novel framework that formulates offline MOO as a conditional sampling problem. By conditioning directly on desired trade-offs, PCD avoids the need for explicit surrogate models. To effectively explore the Pareto front, PCD employs a reweighting strategy that focuses on high-performing samples and a reference-direction mechanism to guide sampling towards novel, promising regions beyond the training data. Experiments on standard offline MOO benchmarks show that PCD achieves highly competitive performance and, importantly, demonstrates greater consistency across diverse tasks than existing offline MOO approaches.

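Motivation and Objectives

Motivate offline MOO, where only a static dataset is available and surrogate models may be unreliable.
Develop a conditional generative framework that samples solutions directly by conditioning on target Pareto trade-offs.
Introduce a multi-objective reweighting strategy that emphasizes high-quality regions of the Pareto front.
Propose a reference-direction mechanism that generates diverse, novel conditioning points beyond the training data.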

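Proposed Method

Train a conditional diffusion model p(x|y;σ) on the offline dataset, reweighting the training samples.
Compute the sample weights from a grid over the objective space combined with dominance relations, so that high-performing points are emphasized.
Steer sampling toward the conditioning target with classifier-free guidance (CFG), where γ controls the guidance strength.
Generate conditioning points with a reference-direction mechanism inspired by NSGA-III, yielding diverse, high-quality targets.
At sampling time, apply CFG to generate solutions x conditioned on a target ŷ.
Evaluate performance by hypervolume (HV) over the top-P percentile (P = 100, 75, 50) of 256 samples.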
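The classifier-free guidance step can be sketched as a blend of two noise predictions from the same denoiser, one conditioned on the target and one unconditioned. This is a minimal illustration only: the function name `cfg_noise`, the toy arrays, and the exact parameterization of γ are assumptions, since the review does not state which CFG convention the paper uses (some implementations write the same idea as (1 + w)·ε_cond − w·ε_uncond).

```python
import numpy as np

def cfg_noise(eps_cond, eps_uncond, gamma):
    """Blend conditional and unconditional noise predictions.

    Assumed convention: gamma = 0 returns the unconditional prediction,
    gamma = 1 the purely conditional one, and gamma > 1 extrapolates
    past the conditional prediction toward the conditioning target.
    """
    eps_cond = np.asarray(eps_cond, dtype=float)
    eps_uncond = np.asarray(eps_uncond, dtype=float)
    return eps_uncond + gamma * (eps_cond - eps_uncond)

# Toy stand-ins for the two denoiser outputs at one sampling step
# (hypothetical values, not taken from the paper).
eps_c = np.array([1.0, 0.0])
eps_u = np.array([0.0, 0.0])
guided = cfg_noise(eps_c, eps_u, gamma=2.5)
```

Under this convention, larger γ pushes each denoising step harder toward the target trade-off, usually at some cost in sample diversity.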

Figure 1: Overview of the PCD framework, which reframes offline MOO as a conditional sampling problem. Training: A conditional diffusion model is trained on a static dataset, using a novel reweighting strategy to emphasize high-quality solutions near the Pareto front. Sampling: At inference, the mod

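Experimental Results

Research Questions

RQ1: How can offline MOO be formulated as conditional sampling without an explicit surrogate of the objectives?
RQ2: Can a diffusion model conditioned on target trade-offs effectively cover the Pareto front in offline MOO?
RQ3: Do reweighting and reference-direction conditioning improve the coverage and quality of the Pareto front?

Key Findings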

Method	Synthetic	MORL	RE	Scientific	MONAS	Avg. rank
D(best)	5.45 ± 0.19	1.70 ± 0.27	2.60 ± 0.07	9.35 ± 0.14	11.53 ± 0.06	7.43 ± 0.05
MOBO	8.69 ± 0.30	14.60 ± 0.42	10.00 ± 0.33	6.75 ± 0.47	8.11 ± 0.80	8.81 ± 0.34
E2E + GN	7.33 ± 0.55	5.70 ± 2.14	7.06 ± 0.32	5.35 ± 1.38	9.33 ± 0.53	7.82 ± 0.40
E2E + PC	5.93 ± 0.25	3.50 ± 1.22	6.22 ± 0.33	4.30 ± 1.32	6.60 ± 0.40	6.01 ± 0.29
E2E	6.16 ± 0.30	9.70 ± 2.08	6.06 ± 0.30	4.20 ± 1.40	5.13 ± 0.22	5.71 ± 0.16
MH + GN	8.82 ± 0.53	8.90 ± 2.16	8.14 ± 0.94	5.05 ± 2.14	12.57 ± 0.40	9.84 ± 0.33
MH + PC	8.87 ± 0.45	10.90 ± 1.08	6.74 ± 0.68	6.15 ± 0.91	7.46 ± 0.30	7.68 ± 0.33
MH	6.18 ± 0.53	8.00 ± 1.41	6.14 ± 0.29	5.80 ± 0.89	5.88 ± 0.49	6.10 ± 0.22
MM + COMs	8.02 ± 0.47	3.60 ± 1.29	6.54 ± 0.17	3.85 ± 0.68	7.22 ± 0.43	6.80 ± 0.13
MM + ICT	6.73 ± 0.46	9.10 ± 1.95	5.44 ± 0.32	5.05 ± 0.74	8.42 ± 0.40	7.08 ± 0.13
MM + IOM	5.16 ± 0.51	12.70 ± 0.91	5.76 ± 0.52	4.40 ± 1.15	5.77 ± 0.50	5.80 ± 0.20
MM + TM	6.55 ± 0.82	7.90 ± 2.16	5.78 ± 0.25	5.90 ± 1.29	7.87 ± 0.39	6.91 ± 0.20
MM	6.07 ± 0.50	9.50 ± 0.79	5.94 ± 0.41	6.55 ± 0.93	4.97 ± 0.46	5.80 ± 0.21
ParetoFlow	2.44 ± 0.28	8.50 ± 1.32	1.74 ± 0.17	9.05 ± 0.27	11.19 ± 0.52	6.74 ± 0.23
PCD (ours)	3.38 ± 0.20	5.50 ± 3.30	1.51 ± 0.13	4.05 ± 0.33	7.54 ± 0.50	4.80 ± 0.30

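PCD achieves the best average rank (4.80) across the Synthetic, MORL, RE, Scientific, and MONAS task categories.
On average rank, PCD outperforms the generative baseline ParetoFlow (4.80 vs. 6.74), although ParetoFlow ranks higher on Synthetic tasks; on multiple tasks PCD matches or beats the surrogate-based baselines.
Ablations show that the proposed reweighting and reference-direction mechanisms consistently improve HV.
With classifier-free guidance, increasing γ beyond about 2.5 yields diminishing returns.
On MORL tasks, the high-dimensional search space makes the problem hard for every method, and none surpasses the non-dominated points of the offline dataset.
PCD is robust across diverse benchmarks with a single fixed set of hyperparameters.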
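Hypervolume (HV), the metric behind the ablation findings above, measures the region of objective space that a solution set dominates, bounded by a reference point. A minimal two-objective sketch follows. It assumes both objectives are minimized and uses a hypothetical reference point and toy points; `hypervolume_2d` is an illustrative helper, not the benchmark's evaluation code.

```python
import numpy as np

def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (two objectives, both minimized)."""
    pts = np.asarray(points, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Points that are not strictly better than the reference contribute nothing.
    pts = pts[np.all(pts < ref, axis=1)]
    if len(pts) == 0:
        return 0.0
    # Sort by the first objective, breaking ties by the second.
    pts = pts[np.lexsort((pts[:, 1], pts[:, 0]))]
    hv = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:
            # New non-dominated point: add the strip it contributes.
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return float(hv)

# Toy example: three non-dominated points plus one dominated point.
front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (3.0, 3.0)]
hv = hypervolume_2d(front, ref=(4.0, 4.0))  # area is 6.0
```

The dominated point (3, 3) adds no area, which is why HV rewards both convergence to the front and spread along it. Exact HV in more than two objectives needs a dedicated algorithm and is out of scope for this sketch.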

Figure 2: Overview of the conditioning points generation procedure: a) The objective space is partitioned via direction vectors, and points are ranked based on non-dominated sorting. b) Each direction vector is paired with the point closest to it in perpendicular distance (black arrow). The rest of
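Steps a) and b) of the caption can be sketched with standard NSGA-III building blocks: a Das-Dennis lattice of direction vectors, a non-dominated filter, and a perpendicular-distance pairing. This is a sketch under assumptions, not the authors' procedure: it assumes minimized objectives already normalized so the ideal point sits at the origin, it keeps only the first non-dominated front rather than a full ranking, and all function names and toy data are hypothetical. How PCD turns these pairs into conditioning points beyond the data is not covered.

```python
from itertools import combinations

import numpy as np

def das_dennis(n_obj, n_partitions):
    """Evenly spaced direction vectors on the unit simplex (Das-Dennis lattice)."""
    dirs = []
    n_slots = n_partitions + n_obj - 1
    # Stars and bars: each choice of bar positions splits n_partitions units.
    for bars in combinations(range(n_slots), n_obj - 1):
        prev, counts = -1, []
        for b in bars:
            counts.append(b - prev - 1)
            prev = b
        counts.append(n_slots - prev - 1)
        dirs.append(counts)
    return np.array(dirs, dtype=float) / n_partitions

def non_dominated_mask(Y):
    """True for rows of Y that no other row dominates (minimization)."""
    Y = np.asarray(Y, dtype=float)
    # Entry [i, j] is True when row j dominates row i.
    no_worse = np.all(Y[None, :, :] <= Y[:, None, :], axis=2)
    better = np.any(Y[None, :, :] < Y[:, None, :], axis=2)
    return ~np.any(no_worse & better, axis=1)

def perpendicular_distance(Y, dirs):
    """Distance from each point to the line through the origin along each direction."""
    unit = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
    proj = Y @ unit.T  # scalar projections, shape (n_points, n_dirs)
    resid = Y[:, None, :] - proj[:, :, None] * unit[None, :, :]
    return np.linalg.norm(resid, axis=2)

# Toy normalized objectives; the last point is dominated by (0.5, 0.5).
Y = np.array([[0.0, 1.0], [0.5, 0.5], [1.0, 0.0], [0.8, 0.8]])
dirs = das_dennis(n_obj=2, n_partitions=2)
mask = non_dominated_mask(Y)
front = Y[mask]
# For each direction, the index of the closest point on the front.
nearest = perpendicular_distance(front, dirs).argmin(axis=0)
```

In this toy case each of the three directions lands exactly on one front point, so every direction gets its own partner.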

This review was generated by AI and checked by a human editor.