QUICK REVIEW

[Paper Review] Programming over Thinking: Efficient and Robust Multi-Constraint Planning

Derrick Goh Xin Deik, Quanyu Long|arXiv (Cornell University)|Jan 14, 2026

AI-based Problem Solving and Planning|Cited by 0

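ONE-SENTENCE SUMMARY

TL;DR: SCOPE separates query-specific reasoning from generic code execution to build reusable solver functions, achieving state-of-the-art results in multi-constraint planning with lower cost and latency on benchmarks.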

ABSTRACT

Multi-constraint planning involves identifying, evaluating, and refining candidate plans while satisfying multiple, potentially conflicting constraints. Existing large language model (LLM) approaches face fundamental limitations in this domain. Pure reasoning paradigms, which rely on long natural language chains, are prone to inconsistency, error accumulation, and prohibitive cost as constraints compound. Conversely, LLMs combined with coding- or solver-based strategies lack flexibility: they often generate problem-specific code from scratch or depend on fixed solvers, failing to capture generalizable logic across diverse problems. To address these challenges, we introduce the Scalable COde Planning Engine (SCOPE), a framework that disentangles query-specific reasoning from generic code execution. By separating reasoning from execution, SCOPE produces solver functions that are consistent, deterministic, and reusable across queries while requiring only minimal changes to input parameters. SCOPE achieves state-of-the-art performance while lowering cost and latency. For example, with GPT-4o, it reaches 93.1% success on TravelPlanner, a 61.6% gain over the best baseline (CoT) while cutting inference cost by 1.4x and time by ~4.67x. Code is available at https://github.com/DerrickGXD/SCOPE.

RESEARCH MOTIVATION AND GOALS

Motivate robust multi-constraint planning with LLMs by addressing reasoning inconsistency and high cost.
Propose a two-stage framework that separates problem reasoning from generic solver execution.
Learn to auto-generate reusable solver functions from a single example query.
Demonstrate state-of-the-art performance with lower inference cost on planning benchmarks.

PROPOSED METHOD

Two-stage workflow: Query-Specific Problem Reasoning and Generic Solver Generation.
Problem Formalization and Problem Optimization to produce combinations and constraints as structured representations.
Solver Construction to build Combination, Filter, and Deliver functions.
Solver Refinement to iteratively improve functions using a single example and ground-truth solution.
Inference uses the pre-generated solver functions to solve new queries without regenerating code.
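
The Combination, Filter, and Deliver roles listed above can be pictured with a small Python sketch. This is not the authors' code: the function names, the toy travel data, the `max_budget` and `pets_allowed` constraints, and the "cheapest plan wins" delivery rule are all assumptions made for illustration. The point it tries to show is that the solver functions stay fixed while only the input parameters change from query to query.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, Optional

Plan = dict  # hypothetical representation: slot name -> chosen option
Constraint = Callable[[Plan], bool]


def combination(options: dict) -> Iterator[Plan]:
    """Exhaustively enumerate candidate plans, one option per slot."""
    slots = list(options)
    for choice in product(*(options[slot] for slot in slots)):
        yield dict(zip(slots, choice))


def filter_plans(plans: Iterable[Plan], constraints: list) -> Iterator[Plan]:
    """Keep only the plans that satisfy every constraint."""
    for plan in plans:
        if all(check(plan) for check in constraints):
            yield plan


def total_cost(plan: Plan) -> int:
    """Sum the price of every chosen option."""
    return sum(item["price"] for item in plan.values())


def deliver(plans: Iterable[Plan]) -> Optional[Plan]:
    """Return the cheapest surviving plan, or None if nothing survives."""
    return min(plans, key=total_cost, default=None)


def solve(options: dict, constraints: list) -> Optional[Plan]:
    """Generic solver: the same code runs for every query."""
    return deliver(filter_plans(combination(options), constraints))


# Hypothetical query-specific parameters (the part that would change per query).
def max_budget(limit: int) -> Constraint:
    return lambda plan: total_cost(plan) <= limit


def pets_allowed(plan: Plan) -> bool:
    return plan["hotel"]["pets"]


options = {
    "flight": [{"name": "F1", "price": 300}, {"name": "F2", "price": 180}],
    "hotel": [
        {"name": "H1", "price": 120, "pets": True},
        {"name": "H2", "price": 90, "pets": False},
    ],
}

plan_a = solve(options, [max_budget(400)])                # cheapest overall
plan_b = solve(options, [max_budget(350), pets_allowed])  # adds a constraint
plan_c = solve(options, [max_budget(200)])                # infeasible query
```

Because every candidate is enumerated and checked by plain code, the same inputs always give the same answer, and an infeasible query yields `None` rather than an invented plan.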

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: Can separating reasoning from execution produce robust, reusable solvers for multi-constraint planning?
RQ2: Does a solver-based approach reduce inference cost while maintaining or improving success rates across benchmarks and models?
RQ3: How well does the framework generalize across different planning domains and model capabilities?
RQ4: What is the impact of each component (problem formalization, problem optimization, solver refinement) on overall performance?

KEY FINDINGS

SCOPE achieves state-of-the-art results across multiple benchmarks and models.
It substantially reduces inference cost and time compared to long-text reasoning baselines.
Performance gains are especially pronounced on weaker models (e.g., GPT-4o, Gemini-1.5-Pro).
The approach remains robust as problem complexity increases, due to exhaustive enumeration and deterministic solver logic.
A single example query is sufficient for solver generation and refinement in their setup.
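
As a rough picture of how one example with a ground-truth solution could drive refinement, the sketch below runs successive solver versions on that single example and records the mismatch as feedback. It is an assumption-laden illustration, not the paper's procedure: `feedback_for`, `refine`, and the toy solvers `v1` and `v2` are hypothetical, and in a real system the next solver version would presumably be produced by an LLM from the feedback rather than taken from a fixed list.

```python
from typing import Any, Callable, Iterable, Optional

Solver = Callable[[Any], Any]


def feedback_for(solver: Solver, example_input: Any, expected: Any) -> Optional[str]:
    """Run the solver on the single example; None means it matched."""
    try:
        actual = solver(example_input)
    except Exception as exc:  # a crash is also usable feedback
        return f"solver raised {type(exc).__name__}: {exc}"
    if actual != expected:
        return f"expected {expected!r}, got {actual!r}"
    return None


def refine(candidates: Iterable[Solver], example_input: Any, expected: Any,
           max_rounds: int = 3):
    """Try solver versions in order until one reproduces the ground truth."""
    log = []
    for round_no, solver in enumerate(candidates, start=1):
        if round_no > max_rounds:
            break
        message = feedback_for(solver, example_input, expected)
        if message is None:
            return solver, log
        log.append(message)
    return None, log


# Toy stand-ins for two generated solver versions (hypothetical).
def v1(prices):
    return max(prices)  # buggy: picks the most expensive option


def v2(prices):
    return min(prices)  # fixed: picks the cheapest option


accepted, log = refine([v1, v2], [300, 180, 240], 180)
```

Here `accepted` ends up being `v2`, and `log` holds the one mismatch message produced by `v1`.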

This review was generated by AI and reviewed by a human editor.