QUICK REVIEW

[Paper Review] AssertLLM: Generating and Evaluating Hardware Verification Assertions from Design Specifications via Multi-LLMs

Wenji Fang, Mengming Li | arXiv (Cornell University) | 2024-02-01

Software Testing and Debugging Techniques | Citations: 8

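One-Line Summary

AssertLLM uses three specialized LLMs to automatically generate SystemVerilog Assertions (SVAs) from a complete design specification and evaluates them against a golden RTL design; on a full design, 89% of the generated assertions are both syntactically and functionally correct.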

ABSTRACT

Assertion-based verification (ABV) is a critical method for ensuring design circuits comply with their architectural specifications, which are typically described in natural language. This process often requires human interpretation by verification engineers to convert these specifications into functional verification assertions. Existing methods for generating assertions from natural language specifications are limited to sentences extracted by engineers, discouraging its practical application. In this work, we present AssertLLM, an automatic assertion generation framework that processes complete specification files. AssertLLM breaks down the complex task into three phases, incorporating three customized Large Language Models (LLMs) for extracting structural specifications, mapping signal definitions, and generating assertions. Our evaluation of AssertLLM on a full design, encompassing 23 I/O signals, demonstrates that 89% of the generated assertions are both syntactically and functionally accurate.

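Motivation and Goals

Automate assertion generation from complete natural-language design specifications (pre-RTL) to support ABV in hardware verification.
Decompose the task into extraction, signal mapping, and SVA generation, each handled by a specialized LLM.
Provide an open-source benchmark and evaluation methodology for assessing SVA generation across whole designs.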

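Proposed Method

Three customized LLMs are used: the SPEC Analyzer extracts structured information from the complete specification; the Signal Mapper aligns specification signals with their HDL declarations; and the SVA Generator uses Retrieval Augmented Generation (RAG) and domain knowledge to generate SVAs (bit-width, connectivity, and function).
The formal evaluation uses the golden RTL design and model checking (FPV) to classify each SVA by whether it is syntactically correct and whether it passes FPV, and it measures accuracy per signal and per design.
An open-source benchmark of 20 designs (specification, signal definitions, golden RTL) is provided for evaluating SVA generation from natural-language specifications.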
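
The three-phase flow described above can be sketched as a small Python pipeline. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`analyze_spec`, `map_signals`, `generate_svas`), the `SignalSpec` record, the toy `name | width | description` input format, and the clock name `clk` are all hypothetical, and the LLM and RAG steps are replaced by deterministic stubs so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SignalSpec:
    """Structured information for one signal (hypothetical schema)."""
    name: str         # signal name as written in the specification
    width: int        # declared bit width
    description: str  # natural-language behavior taken from the spec


def analyze_spec(spec_text: str) -> List[SignalSpec]:
    """Phase 1 stand-in for the SPEC Analyzer.

    Parses a toy 'name | width | description' line format. The real phase
    would use an LLM over the full specification document.
    """
    signals = []
    for line in spec_text.strip().splitlines():
        name, width, description = (part.strip() for part in line.split("|"))
        signals.append(SignalSpec(name, int(width), description))
    return signals


def map_signals(signals: List[SignalSpec], hdl_names: List[str]) -> Dict[str, str]:
    """Phase 2 stand-in for the Signal Mapper: spec name -> HDL name.

    Case-insensitive matching is a placeholder for LLM-based alignment.
    """
    by_lower = {hdl.lower(): hdl for hdl in hdl_names}
    return {
        s.name: by_lower[s.name.lower()]
        for s in signals
        if s.name.lower() in by_lower
    }


def generate_svas(signals: List[SignalSpec], mapping: Dict[str, str]) -> List[str]:
    """Phase 3 stand-in for the SVA Generator.

    Emits only a bit-width check per mapped signal. Connectivity and
    function SVAs would need the LLM plus retrieved context, so they are
    omitted here.
    """
    svas = []
    for s in signals:
        hdl = mapping.get(s.name)
        if hdl is not None:
            svas.append(
                f"assert property (@(posedge clk) $bits({hdl}) == {s.width});"
            )
    return svas


def run_pipeline(spec_text: str, hdl_names: List[str]) -> List[str]:
    """Chain the three phases in order."""
    signals = analyze_spec(spec_text)
    mapping = map_signals(signals, hdl_names)
    return generate_svas(signals, mapping)


if __name__ == "__main__":
    # Invented two-signal spec, used only to exercise the sketch.
    demo_spec = "RST | 1 | synchronous reset\nADDR | 8 | address bus"
    for sva in run_pipeline(demo_spec, ["rst", "addr"]):
        print(sva)
```

Keeping each phase as a separate function mirrors the decomposition the review describes: each stage has a narrow input and output, so a stub can later be swapped for a model call without touching the other stages.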

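Experimental Results

Research Questions

RQ1: Can a complete natural-language specification be automatically converted into comprehensive SVAs for each architectural signal?
RQ2: What is the syntactic and functional accuracy of LLM-generated SVAs when evaluated against the golden RTL implementation?
RQ3: Does the multi-LLM, RAG-enhanced approach outperform a single-LLM baseline at generating SVAs from unstructured specifications?
RQ4: Can the framework generalize to design types beyond the tested I2C design?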

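Key Results

AssertLLM generated 56 SVAs for 23 signals in the I2C design: 23 bit-width, 16 connectivity, and 17 function SVAs.
89% of the generated SVAs were both syntactically and functionally correct when checked against the golden RTL design.
Compared with a GPT-4 baseline, the customized approach combining the SPEC Analyzer, Signal Mapper, SVA Generator, and RAG substantially improved SVA quality and reduced syntax errors.
GPT-3.5 could not process the multimodal full specification, and GPT-4 without the specialized pipeline produced low-quality SVAs.
The benchmark enables evaluation of SVA generation across multiple design types and supports future work on writing verification-friendly specifications.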
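
The reported counts can be cross-checked, and the two-stage classification (syntax first, then FPV) can be tallied, with a short script. This is a sketch under assumptions: the `SvaResult` record and the `accuracy` and `accuracy_by_signal` helpers are hypothetical names, and the four toy records are invented purely to exercise the code. They are not results from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class SvaResult:
    """Outcome of checking one SVA (hypothetical record layout)."""
    signal: str      # signal the assertion targets
    kind: str        # "width", "connectivity", or "function"
    syntax_ok: bool  # accepted by the formal tool's parser
    fpv_pass: bool   # proven on the golden RTL


def accuracy(results: Iterable[SvaResult]) -> float:
    """Fraction of SVAs that are both syntactically valid and FPV-passing."""
    results = list(results)
    if not results:
        return 0.0
    correct = sum(1 for r in results if r.syntax_ok and r.fpv_pass)
    return correct / len(results)


def accuracy_by_signal(results: Iterable[SvaResult]) -> Dict[str, float]:
    """Group results by target signal and score each group."""
    grouped: Dict[str, list] = {}
    for r in results:
        grouped.setdefault(r.signal, []).append(r)
    return {signal: accuracy(group) for signal, group in grouped.items()}


# Per-type counts as stated in the review for the I2C design.
REPORTED_COUNTS = {"width": 23, "connectivity": 16, "function": 17}
TOTAL_SVAS = sum(REPORTED_COUNTS.values())  # matches the stated total of 56

# Invented toy records, NOT data from the paper.
TOY_RESULTS = [
    SvaResult("sig_a", "width", True, True),
    SvaResult("sig_a", "function", True, False),
    SvaResult("sig_b", "width", True, True),
    SvaResult("sig_b", "connectivity", False, False),
]

if __name__ == "__main__":
    print(TOTAL_SVAS)
    print(accuracy(TOY_RESULTS))
    print(accuracy_by_signal(TOY_RESULTS))
```

If the 89% figure is taken over all 56 SVAs, it would correspond to about 50 correct assertions (0.89 × 56 ≈ 49.8); the review does not state the exact count, so treat that as an estimate.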

This review was generated by AI and reviewed by a human editor.