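[Paper Walkthrough] SWE-Hub: A Unified Production System for Scalable, Executable Software Engineering Tasks
SWE-Hub introduces an end-to-end data factory that unifies environment configuration, scalable task synthesis, and diverse task generation to produce executable SWE tasks at scale. It builds three task product lines on a shared execution substrate, covering repair, realism-oriented regression, and long-horizon repository construction.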
Progress in software-engineering agents is increasingly constrained by the scarcity of executable, scalable, and realistic data for training and evaluation. This scarcity stems from three fundamental challenges in existing pipelines: environments are brittle and difficult to reproduce across languages; synthesizing realistic, system-level bugs at scale is computationally expensive; and existing data predominantly consists of short-horizon repairs, failing to capture long-horizon competencies like architectural consistency. We introduce SWE-Hub, an end-to-end system that operationalizes the data factory abstraction by unifying environment automation, scalable synthesis, and diverse task generation into a coherent production stack. At its foundation, the Env Agent establishes a shared execution substrate by automatically converting raw repository snapshots into reproducible, multi-language container environments with standardized interfaces. Built upon this substrate, the SWE-Scale engine addresses the need for high-throughput generation, combining cross-language code analysis with cluster-scale validation to synthesize massive volumes of localized bug-fix instances. The Bug Agent generates high-fidelity repair tasks by synthesizing system-level regressions involving cross-module dependencies, paired with user-like issue reports that describe observable symptoms rather than root causes. Finally, SWE-Architect expands the task scope from repair to creation by translating natural-language requirements into repository-scale build-a-repo tasks. By integrating these components, SWE-Hub establishes a unified production pipeline capable of continuously delivering executable tasks across the entire software engineering lifecycle.
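Motivation and Goals
- Address the scarcity of executable, scalable data for training and evaluating software-engineering agents.
- Unify environment automation, scalable synthesis, and diverse task generation in a single production stack.
- Provide a shared execution substrate with standardized interfaces and validation, so that tasks are reproducible.
- Offer multiple task families covering repair, realistic regressions, and long-horizon repository construction.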
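Proposed Method
- The Env Agent converts raw repository snapshots into reproducible, multi-language container environments with standardized interfaces.
- The SWE-Scale engine generates large volumes of localized bug-fix instances through cross-language analysis and automated validation.
- The Bug Agent synthesizes system-level regressions paired with user-style issue reports that describe symptoms without hinting at the root cause.
- SWE-Architect turns natural-language requirements into repository-scale tasks that are built from scratch, supporting long-horizon data generation.
- All task lines share a Kubernetes-based execution substrate, a unified task schema, and deterministic validation gates for reproducibility.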
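The idea of a unified task schema with a deterministic validation gate can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `Task` fields, the `Runner` callable, and the acceptance rule (target tests must fail on the buggy variant, pass on the fixed variant, and give identical results on every repeat) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


# Hypothetical schema: field names are illustrative, not taken from the paper.
@dataclass(frozen=True)
class Task:
    task_id: str
    image: str                      # versioned container image tag
    test_command: str               # standardized entry point
    target_tests: Tuple[str, ...]   # tests expected to flip from fail to pass


# A runner maps (task, variant) to {test name: passed?}.
# In a real system it would execute the task's container; here it is injected.
Runner = Callable[[Task, str], Dict[str, bool]]


def passes_gate(task: Task, run: Runner, repeats: int = 3) -> bool:
    """Accept a task only if results are stable and target tests flip fail -> pass."""
    for variant, expected in (("buggy", False), ("fixed", True)):
        outcomes = [run(task, variant) for _ in range(repeats)]
        # Determinism check: every repeat must produce identical results.
        if any(o != outcomes[0] for o in outcomes[1:]):
            return False
        # Each target test must fail on "buggy" and pass on "fixed";
        # a missing test result also rejects the task.
        if any(outcomes[0].get(t) is not expected for t in task.target_tests):
            return False
    return True


def fake_runner(task: Task, variant: str) -> Dict[str, bool]:
    """Stand-in runner for the example: tests pass only on the fixed variant."""
    return {name: variant == "fixed" for name in task.target_tests}


demo_task = Task("demo-1", "example/repo:abc123", "run-tests", ("test_parse",))
print(passes_gate(demo_task, fake_runner))
```

Injecting the runner keeps the gate itself independent of how tests are executed, which is one plausible way for several task lines to share the same acceptance logic.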

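Experimental Results
Research Questions
- RQ1: How can a production system reliably turn raw repositories into executable, verifiable task instances across multiple languages?
- RQ2: Can scalable synthesis, realistic bug generation, and long-horizon repository-construction tasks be integrated under a unified validator?
- RQ3: Which architectural design enables cross-language reproducibility, deterministic environment provisioning, and scalable task generation for SWE?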
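Key Findings
- A unified data-factory architecture is proposed that standardizes environment setup, candidate generation, and verifiability in order to produce executable SWE tasks.
- An automated reproducibility substrate (the Env Agent) decouples environment readiness from test correctness and produces versioned images with standardized entry points.
- Three product lines (SWE-Scale, Bug Agent, SWE-Architect) deliver repair-oriented synthesis, realism-oriented regressions, and long-horizon repository construction on the same substrate.
- Verification is treated as a core scalability problem, addressed with stateless worker nodes and deterministic, machine-parseable outputs across multiple ecosystems.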
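A stateless verification worker with deterministic, machine-parseable output might look roughly like the sketch below. It is an assumption-laden illustration: the `verify` function, the record fields (`job_id`, `status`, `exit_code`), and the choice of JSON are hypothetical and not drawn from the paper. The function keeps no state between calls and serializes keys in sorted order so that identical runs yield identical output.

```python
import json
import subprocess
import sys
from typing import List


def verify(job_id: str, command: List[str], timeout_s: float = 60.0) -> str:
    """Run one command and return a JSON record; no state is kept between calls."""
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
        status = "pass" if proc.returncode == 0 else "fail"
        exit_code = proc.returncode
    except subprocess.TimeoutExpired:
        # A timeout is reported as its own status rather than raised.
        status, exit_code = "timeout", None
    # No timestamps or host details, so identical runs give identical records.
    record = {"job_id": job_id, "status": status, "exit_code": exit_code}
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    # Stand-in commands: the current Python interpreter exiting with 0 and 3.
    print(verify("demo-ok", [sys.executable, "-c", "raise SystemExit(0)"]))
    print(verify("demo-bad", [sys.executable, "-c", "raise SystemExit(3)"]))
```

Because each call depends only on its arguments, many such workers could run in parallel on a cluster and their records could be aggregated by any JSON-aware tool.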

This walkthrough was generated by AI and reviewed by human editors.