QUICK REVIEW

[Paper Review] SWE-Hub: A Unified Production System for Scalable, Executable Software Engineering Tasks

Yucheng Zeng, Shupeng Li|arXiv (Cornell University)|Feb 28, 2026

Software Engineering Research | Cited by 0

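One-Sentence Summary

SWE-Hub introduces an end-to-end data factory that unifies environment provisioning, scalable task synthesis, and diverse task generation to produce executable SWE tasks at scale. It builds three task product lines on a shared execution substrate: repair, realism-oriented regression, and long-horizon repository construction.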

ABSTRACT

Progress in software-engineering agents is increasingly constrained by the scarcity of executable, scalable, and realistic data for training and evaluation. This scarcity stems from three fundamental challenges in existing pipelines: environments are brittle and difficult to reproduce across languages; synthesizing realistic, system-level bugs at scale is computationally expensive; and existing data predominantly consists of short-horizon repairs, failing to capture long-horizon competencies like architectural consistency. We introduce SWE-Hub, an end-to-end system that operationalizes the data factory abstraction by unifying environment automation, scalable synthesis, and diverse task generation into a coherent production stack. At its foundation, the Env Agent establishes a shared execution substrate by automatically converting raw repository snapshots into reproducible, multi-language container environments with standardized interfaces. Built upon this substrate, SWE-Scale engine addresses the need for high-throughput generation, combining cross-language code analysis with cluster-scale validation to synthesize massive volumes of localized bug-fix instances. Bug Agent generates high-fidelity repair tasks by synthesizing system-level regressions involving cross-module dependencies, paired with user-like issue reports that describe observable symptoms rather than root causes. Finally, SWE-Architect expands the task scope from repair to creation by translating natural-language requirements into repository-scale build-a-repo tasks. By integrating these components, SWE-Hub establishes a unified production pipeline capable of continuously delivering executable tasks across the entire software engineering lifecycle.

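Motivation and Goals

Address the scarcity of executable, scalable data for training and evaluating software-engineering agents.
Unify environment automation, scalable synthesis, and diverse task generation in a single production stack.
Provide a shared execution substrate with standardized interfaces and validation so that tasks are reproducible.
Provide multiple task families covering repair, realistic regressions, and long-horizon repository construction.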

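Proposed Method

Env Agent converts raw repository snapshots into reproducible, multi-language container environments with standardized interfaces.
The SWE-Scale engine generates large volumes of localized bug-fix instances through cross-language analysis and automated validation.
Bug Agent synthesizes system-level regressions paired with user-style issue reports that describe symptoms without giving root-cause hints.
SWE-Architect turns natural-language requirements into repository-scale tasks that are built from scratch, supporting long-horizon data generation.
All task lines share a Kubernetes-based execution substrate, a unified task schema, and deterministic validation gates for reproducibility.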
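
To make the idea of a unified task schema with a deterministic validation gate more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `TaskInstance` field names, the `passes_gate` helper, and the gate rule itself (target tests must fail before the reference change and pass after it, while guard tests must pass both times) are borrowed from common SWE benchmark practice and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class TaskInstance:
    """Hypothetical unified task record; all field names are illustrative."""
    task_id: str
    product_line: str   # e.g. "swe-scale", "bug-agent", "swe-architect"
    image: str          # pinned container image reference
    entrypoint: str     # unified verification command run inside the image
    fail_to_pass: List[str] = field(default_factory=list)  # target tests
    pass_to_pass: List[str] = field(default_factory=list)  # guard tests


def passes_gate(task: TaskInstance,
                before: Dict[str, bool],
                after: Dict[str, bool]) -> bool:
    """Assumed gate rule: accept a task only if its tests behave as declared.

    `before` and `after` map test names to pass/fail results collected
    without and with the reference change applied. Missing tests count
    as failures.
    """
    for name in task.fail_to_pass:
        # A target test must fail first and pass afterwards.
        if before.get(name, False) or not after.get(name, False):
            return False
    for name in task.pass_to_pass:
        # A guard test must pass in both states.
        if not (before.get(name, False) and after.get(name, False)):
            return False
    # A task with no target tests cannot demonstrate anything.
    return bool(task.fail_to_pass)
```

Because the gate is a pure function of recorded test results, rerunning it on the same inputs always yields the same verdict, which is one simple way to read "deterministic" here.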

Figure 1: SWE-Hub architecture. Starting from raw source repositories, the Env Agent provisions a deterministic execution substrate—pinned container images, a unified verification entrypoint, and deterministic artifacts—to make code runnable and testable. On top of this substrate, three task generat

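Experimental Results

Research Questions

RQ1: How can a production system reliably turn raw repositories into executable, verifiable task instances across multiple languages?
RQ2: Can scalable synthesis, realistic bug generation, and long-horizon repository-construction tasks be integrated under a unified verifier?
RQ3: Which architectural designs enable cross-language reproducibility, deterministic environment provisioning, and scalable task generation for SWE?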

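Key Findings

Proposes a unified data-factory architecture that standardizes environment setup, candidate generation, and verifiability to produce executable SWE tasks.
An automated reproducibility substrate (Env Agent) decouples environment readiness from test correctness and produces versioned images with standardized entrypoints.
Three product lines (SWE-Scale, Bug Agent, SWE-Architect) deliver repair-oriented synthesis, realism-oriented regressions, and long-horizon repository construction on the same substrate.
Treats verification as a core scalability problem, addressed with stateless worker nodes and deterministic, machine-parseable output across multiple ecosystems.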
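
The last finding, stateless workers that emit deterministic, machine-parseable output, can be sketched roughly as follows. This is a hedged illustration rather than the system's actual interface: the `verify` function, the `STATUS_ALIASES` table, and the report fields are hypothetical, and the `runner` callable stands in for whatever executes tests inside a container.

```python
import json
from typing import Callable, Dict

# Hypothetical mapping from ecosystem-specific status words to one vocabulary.
STATUS_ALIASES = {
    "pass": "pass", "passed": "pass", "ok": "pass",
    "fail": "fail", "failed": "fail", "error": "fail",
    "skip": "skip", "skipped": "skip", "ignored": "skip",
}


def verify(job: Dict[str, str],
           runner: Callable[[Dict[str, str]], Dict[str, str]]) -> str:
    """Run one verification job and return a canonical JSON report.

    The function keeps no state between calls: everything it needs
    arrives in `job`, and everything it learns leaves in the report.
    """
    raw = runner(job)  # test name -> runner-specific status string
    results = {
        name: STATUS_ALIASES.get(status.lower(), "unknown")
        for name, status in raw.items()
    }
    report = {
        "task_id": job["task_id"],
        "results": results,
        "all_passed": bool(results) and all(s == "pass" for s in results.values()),
    }
    # Sorted keys, fixed separators, and no timestamps make the output
    # byte-identical for identical test outcomes.
    return json.dumps(report, sort_keys=True, separators=(",", ":"))
```

Since the worker holds no state, any number of copies could process jobs in parallel, and reports for the same outcome can be compared or deduplicated as plain strings.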

Figure 2: SWE-Hub Environment Layer pipeline. Phase 1 (Env Agent) performs environment provisioning and build readiness, identifying toolchains and installing dependencies to achieve environment readiness without requiring tests to pass. Phase 2 (Test Agent) establishes a unified verification interf


This review was generated by AI and checked by a human editor.