QUICK REVIEW

[Paper Review] Technical Report on Neural Language Models and Few-Shot Learning for Systematic Requirements Processing in MDSE

Vincent Bertram, Miriam Boß | arXiv (Cornell University) | Jan 1, 2022

Software Engineering Research | Cited by 2

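One-Sentence Summary

This paper proposes few-shot training of large pretrained language models (such as GPT-J6B) to automatically translate informal natural-language requirements into a formal domain-specific language (DSL) for automotive systems engineering. Its main contribution is showing that fewer than ten annotated examples can suffice for high-quality translation, which enables scalable, low-resource formalization with high syntactic consistency at very low manual cost and makes the approach suitable for legacy requirements.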

ABSTRACT

Systems engineering, in particular in the automotive domain, needs to cope with the massively increasing numbers of requirements that arise during the development process. To guarantee a high product quality and make sure that functional safety standards such as ISO26262 are fulfilled, the exploitation of potentials of model-driven systems engineering in the form of automatic analyses, consistency checks, and tracing mechanisms is indispensable. However, the language in which requirements are written, and the tools needed to operate on them, are highly individual and require domain-specific tailoring. This hinders automated processing of requirements as well as the linking of requirements to models. Introducing formal requirement notations in existing projects leads to the challenge of translating masses of requirements and process changes on the one hand and to the necessity of the corresponding training for the requirements engineers. In this paper, based on the analysis of an open-source set of automotive requirements, we derive domain-specific language constructs helping us to avoid ambiguities in requirements and increase the level of formality. The main contribution is the adoption and evaluation of few-shot learning with large pretrained language models for the automated translation of informal requirements to structured languages such as a requirement DSL. We show that support sets of less than ten translation examples can suffice to few-shot train a language model to incorporate keywords and implement syntactic rules into informal natural language requirements.

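Research Motivation and Objectives

Address inconsistent and ambiguous natural-language requirements in automotive systems engineering.
Reduce the high cost and effort of manually translating natural language into a formal, structured DSL.
Enable scalable, low-resource formalization of legacy requirements through few-shot learning with pretrained language models.
Support model-driven systems engineering (MDSE) by improving traceability, consistency, and early error detection.
Promote the adoption of formal DSLs in industrial settings, especially for small teams or single projects with limited data.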

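Proposed Method

Analyze an open-source automotive requirements dataset, identify common ambiguities, and derive domain-specific DSL constructs.
Design a DSL whose syntax and semantics are tailored to automotive requirements, emphasizing intuitive phrasing and formal structure.
Apply few-shot learning with a large language model (such as GPT-J6B), using only 1–6 support examples per translation pattern, to train the translation from natural language to the DSL.
Organize the training examples around specific linguistic patterns (such as if-then logic, modal verbs, and propositional logic) to guide the model toward the syntactic and semantic mapping.
Use manually annotated translation examples to minimize distribution shift and ensure correctness for context-specific phrasing.
Evaluate translation quality through manual annotation by five engineers with majority voting, to ensure reliability and reduce bias.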
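
The support-set idea above can be illustrated with a minimal Python sketch. It assumes the in-context variant of few-shot learning, where the support examples are concatenated into a prompt and the model completes the final line; the summary does not say whether the paper does exactly this. The `Example` class, the `build_prompt` helper, the "Requirement:/DSL:" layout, and the requirement/DSL pairs are all invented for illustration and are not taken from the paper or its dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    informal: str  # requirement as originally written
    formal: str    # hand-written translation in a (hypothetical) DSL


def build_prompt(support_set: list[Example], query: str) -> str:
    """Concatenate the support examples and leave the last translation open."""
    blocks = [
        f"Requirement: {ex.informal}\nDSL: {ex.formal}" for ex in support_set
    ]
    # The model is expected to continue the text after the final "DSL:".
    blocks.append(f"Requirement: {query}\nDSL:")
    return "\n\n".join(blocks)


# Invented support set for an if-then pattern (assumption, not paper data).
IF_THEN_SUPPORT = [
    Example(
        "When the ignition is on, the low beam shall be activated.",
        "IF ignition == on THEN low_beam MUST be activated",
    ),
    Example(
        "The horn is sounded as long as the driver presses the horn button.",
        "IF horn_button == pressed THEN horn MUST be active",
    ),
]

prompt = build_prompt(
    IF_THEN_SUPPORT, "If the door is open, the interior light turns on."
)
print(prompt)
```

Keeping one support set per linguistic pattern, as the method describes, means a separate list like `IF_THEN_SUPPORT` would be maintained for modal verbs, comparison operators, and so on. The resulting string would then be passed to whatever text-generation interface hosts the model.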

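Experimental Results

Research Questions

RQ1: Can few-shot learning with large language models accurately translate informal natural-language requirements into a formal, domain-specific DSL?
RQ2: How many few-shot examples are needed for reliable and consistent translation of different linguistic structures in automotive requirements?
RQ3: To what extent can pretrained language models generalize to the diverse syntactic patterns of requirements engineering without full fine-tuning?
RQ4: How does human supervision affect the quality and consistency of automated translation in low-data settings?
RQ5: Can the approach be applied effectively to legacy requirements without large-scale re-annotation or full fine-tuning?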

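Key Findings

Few-shot learning with GPT-J6B achieved high translation accuracy with only 1 to 6 support examples per translation pattern, demonstrating feasibility in low-data settings.
The model consistently translated complex structures such as if-then logic, modal verbs (e.g., 'must', 'may'), and propositional logic (e.g., '≤', '≥') into formal DSL syntax.
For if-then structures, the model reached 100% accuracy in the 6-shot and 4-shot settings and 91% in the 1-shot setting, indicating strong few-shot generalization.
When trained only on keyword-specific examples, translation of propositional-logic constructs (such as 'equal to' and 'less than or equal to') reached 100% accuracy, even with very little data.
The model correctly extracted semantic elements such as variables and bounds (e.g., 'horn loudness ≤50dB'), which supports downstream traceability and consistency checks.
Human evaluation confirmed high reliability; majority voting among five engineers was used to validate results and reduce annotation bias.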
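
The majority-vote evaluation can be sketched in a few lines of Python. This is only an illustration of the general technique, assuming each engineer gives a binary correct/incorrect verdict per translation; the summary does not describe the paper's actual annotation scheme. The function names and the vote data are made up.

```python
from collections import Counter


def majority_label(votes: list[bool]) -> bool:
    """Return the verdict chosen by most annotators."""
    if len(votes) % 2 == 0:
        # An odd panel (e.g. five engineers) guarantees a strict majority.
        raise ValueError("use an odd number of annotators to avoid ties")
    return Counter(votes).most_common(1)[0][0]


def accuracy(all_votes: list[list[bool]]) -> float:
    """Fraction of translations whose majority verdict is 'correct'."""
    verdicts = [majority_label(v) for v in all_votes]
    return sum(verdicts) / len(verdicts)


# Made-up votes from five annotators on three translations (True = correct).
votes = [
    [True, True, True, False, True],
    [True, False, False, True, False],
    [True, True, False, True, True],
]
print(accuracy(votes))
```

With five annotators, a single dissenting or careless verdict cannot flip the outcome for a translation, which is how majority voting reduces individual annotation bias.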

This review was generated by AI and checked by human editors.