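[Paper Review] Design-MLLM: A Reinforcement Alignment Framework for Verifiable and Aesthetic Interior Design
tldr: Design-MLLM introduces a reinforcement-learning-based alignment framework that separates spatial feasibility from aesthetic preference, generating interior designs that are both executable and aesthetically coherent.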
Interior design is a requirements-to-visual-plan generation process that must simultaneously satisfy verifiable spatial feasibility and comparative aesthetic preferences. While recent multimodal large language models (MLLMs) offer a unified foundation for interpreting user intent and producing design rationales, our empirical analysis reveals a persistent contradiction in real-world deployment: MLLMs often produce layouts that are unbuildable and aesthetically inconsistent. These findings indicate that simply adding in-domain text is insufficient; effective interior design requires an alignment mechanism that separates hard constraints from soft preferences and coordinates them during optimization. To address this, we propose Design-MLLM, a reinforcement alignment framework that optimizes a feasibility-first preference objective via a dual-branch, aesthetic-oriented reward. Specifically, Design-MLLM (i) explicitly evaluates spatial feasibility using programmatic constraint checks, (ii) assesses aesthetic preference only among feasible candidates to avoid visually appealing but unexecutable shortcuts, and (iii) performs group-relative optimization to obtain stable preference signals. Through this process, Design-MLLM learns a controllable policy that consistently selects and generates solutions that are both executable and aesthetically coherent, rather than occasionally producing visually appealing but infeasible designs. Extensive experiments on various benchmark datasets demonstrate the advantages of Design-MLLM.
Research Motivation and Objectives
- Diagnose why generic multimodal LLMs struggle with deployable interior design due to feasibility–aesthetics conflicts.
- Propose a reinforcement alignment framework that decouples hard spatial constraints from soft aesthetic preferences.
- Develop a feasibility-guided generation pipeline and a dual-branch reward to learn a controllable, policy-based design generator.
- Demonstrate improvements in both spatial executability and aesthetic alignment across multiple benchmarks.
Proposed Method
- Feasibility-guided candidate generation that produces a group of design candidates and verifies geometric feasibility via constraint checks.
- A dual-branch aesthetic-oriented reward that separately evaluates spatial feasibility and aesthetic preference among feasible candidates.
- GRPO-style group-relative policy optimization to learn more aesthetic solutions within the feasible domain.
- Layout-to-image realization that translates optimized structural plans into high-fidelity renderings.
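The feasibility-first, dual-branch reward described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Box` footprint representation, the two constraint checks (room bounds and pairwise overlap), the function names, and the reward shape (0 for infeasible layouts, 1 plus an aesthetic score otherwise) are all hypothetical. The review does not specify the actual constraint set or the aesthetic scorer, so the scorer is passed in as a plain callable.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Box:
    """Axis-aligned furniture footprint (hypothetical representation)."""
    x: float  # left edge
    y: float  # bottom edge
    w: float  # width
    h: float  # depth


def inside_room(box: Box, room_w: float, room_h: float) -> bool:
    """True if the footprint lies entirely within the room rectangle."""
    return (box.x >= 0 and box.y >= 0
            and box.x + box.w <= room_w
            and box.y + box.h <= room_h)


def overlaps(a: Box, b: Box) -> bool:
    """True if two footprints share interior area (touching edges are allowed)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def is_feasible(layout: List[Box], room_w: float, room_h: float) -> bool:
    """Hard-constraint branch: every box is in bounds and no pair overlaps."""
    if not all(inside_room(b, room_w, room_h) for b in layout):
        return False
    return not any(
        overlaps(a, b)
        for i, a in enumerate(layout)
        for b in layout[i + 1:]
    )


def dual_branch_reward(
    layout: List[Box],
    room_w: float,
    room_h: float,
    aesthetic_score: Callable[[List[Box]], float],
) -> float:
    """Feasibility gates aesthetics: infeasible layouts never reach the scorer.

    Assumed reward shape: 0.0 if infeasible, else 1.0 + aesthetic score.
    """
    if not is_feasible(layout, room_w, room_h):
        return 0.0
    return 1.0 + aesthetic_score(layout)
```

With a 4 x 3 room and a constant scorer such as `lambda layout: 0.5`, a layout of two non-overlapping boxes receives the feasible bonus plus the aesthetic score, while a layout with overlapping boxes receives 0.0 regardless of how the scorer would have rated it. Gating the scorer this way is what prevents a visually appealing but unbuildable candidate from outranking a buildable one.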
Experimental Results
Research Questions
- RQ1: What are the fundamental obstacles that prevent MLLMs from reliably generating verifiable interior design layouts?
- RQ2: Can an alignment framework decouple hard spatial constraints from soft aesthetic preferences to improve both feasibility and aesthetics?
- RQ3: How can policy optimization leverage groupwise comparisons to learn designs that are executable and aesthetically coherent?
- RQ4: Does a feasibility-first, dual-branch reward approach outperform single-branch or naive prompting strategies in interior design tasks?
Key Findings
- Design-MLLM consistently improves spatial executability by enforcing feasibility checks before aesthetics.
- Aesthetic evaluation is constrained to feasible designs to avoid shortcuts that are visually appealing but unbuildable.
- Group-relative policy optimization (GRPO) provides stable learning signals by normalizing rewards within candidate groups.
- A dual-branch reward enables learning a controllable policy that favors more aesthetic designs within the feasible region.
- Ablations confirm the necessity of decoupling feasibility and aesthetics and of using a group-based optimization signal.
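The within-group reward normalization mentioned in the findings can be illustrated with a short Python sketch. It follows the commonly used GRPO-style form, in which each candidate's advantage is its reward minus the group mean, divided by the group standard deviation. The function name, the `eps` stabilizer, and the choice of population standard deviation are assumptions made for illustration; the review does not state which variant the authors use.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one candidate group.

    Each advantage is (reward - group mean) / (group std + eps), so the
    learning signal depends on how a candidate ranks against its siblings
    rather than on the absolute reward scale.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumed choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example group: two infeasible candidates (reward 0.0) and two feasible ones.
advantages = group_relative_advantages([0.0, 1.2, 1.8, 0.0])
```

In this example group, the two infeasible candidates receive the same negative advantage, and the feasible candidate with the higher reward receives the largest positive one. If every candidate in a group gets the same reward, all advantages are zero and that group contributes no preference signal.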
This review was generated by AI and checked by a human editor.