QUICK REVIEW

[Paper Review] Unlocking Hardware Security Assurance: The Potential of LLMs

Xingyu Meng, Amisha Srivastava|arXiv (Cornell University)|Aug 21, 2023

Physical Unclonable Functions (PUFs) and Hardware Security | 16 citations

TL;DR

The paper presents NSPG, an NLP-based framework using HS-BERT to automatically extract hardware security properties from SoC documentation, enabling vulnerability detection and bug discovery; it identifies 326 properties from 1,723 sentences and eight bugs in OpenTitan, outperforming ChatGPT by about 15%.

ABSTRACT

System-on-Chips (SoCs) form the crux of modern computing systems. SoCs enable high-level integration through the utilization of multiple Intellectual Property (IP) cores. However, the integration of multiple IP cores also presents unique challenges owing to their inherent vulnerabilities, thereby compromising the security of the entire system. Hence, it is imperative to perform hardware security validation to address these concerns. The efficiency of this validation procedure is contingent on the quality of the SoC security properties provided. However, generating security properties with traditional approaches often requires expert intervention and is limited to a few IPs, thereby resulting in a time-consuming and non-robust process. To address this issue, we, for the first time, propose a novel and automated Natural Language Processing (NLP)-based Security Property Generator (NSPG). Specifically, our approach utilizes hardware documentation in order to propose the first hardware security-specific language model, HS-BERT, for extracting security properties dedicated to hardware design. To evaluate our proposed technique, we trained the HS-BERT model using sentences from RISC-V, OpenRISC, MIPS, OpenSPARC, and OpenTitan SoC documentation. When assessed on five untrained OpenTitan hardware IP documents, NSPG was able to extract 326 security properties from 1723 sentences. This, in turn, aided in identifying eight security bugs in the OpenTitan SoC design presented in the hardware hacking competition, Hack@DAC 2022.

Motivation & Objective

Address the challenge of generating robust hardware security properties for SoCs with multiple IP cores.
Develop an automated NLP-based property generator (NSPG) that leverages hardware-domain BERT (HS-BERT) to extract security properties from design docs.
Create domain-specific data augmentation and modification techniques to train HS-BERT and a sequence classifier for property identification.
Validate NSPG on unseen OpenTitan documents and demonstrate practical bug discovery in Hack@DAC 2022, benchmarking against ChatGPT.

Proposed method

Construct NSPG comprising data augmentation, domain-adapted pre-training (HS-BERT) via masked language modeling on hardware docs, and a sequence classification model (SCM).
Assemble hardware-document datasets: D_pre (pre-training MLM on 15,583 sentences), D_cls (4,427 labeled sentences for property/non-property), and D_val (708 labeled sentences for unseen validation).
Apply data augmentation (random swap, random deletion, synonym replacement, random insertion) and domain-specific fragment insertion to enrich training data.
Fine-tune the SCM with HS-BERT on labeled data to classify sentences as security-property related or not, selecting MOT-based data modification for best results.
Compare HS-BERT with General BERT and SciBERT across configurations (baseline, MT, MOT, MTT, MOTMT) to select the best performing model.
Evaluate on OpenTitan and OpenTitan-derived documents: extract security properties, then apply them to detect vulnerabilities in the OpenTitan design.
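
The augmentation step listed above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the probabilities, the example sentence, and the fragment list are all assumptions made for illustration, and the fragment-insertion helper is only one plausible reading of "domain-specific fragment insertion". Synonym replacement is omitted because it needs an external thesaurus.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps randomly chosen position pairs exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def insert_fragment(tokens, fragments, rng):
    """Insert one fragment at a random position.

    Hypothetical interpretation of domain-specific fragment insertion;
    the fragments themselves are supplied by the caller.
    """
    fragment = rng.choice(fragments).split()
    pos = rng.randint(0, len(tokens))
    return tokens[:pos] + fragment + tokens[pos:]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sentence = "the key register must be cleared after reset".split()  # made-up example
    print(" ".join(random_swap(sentence, 2, rng)))
    print(" ".join(random_deletion(sentence, 0.2, rng)))
    print(" ".join(insert_fragment(sentence, ["in debug mode"], rng)))
```

Each helper returns a new token list, so several augmented variants can be generated from one labeled sentence while the original stays unchanged.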

Experimental results

Research questions

RQ1: Can NSPG automatically mine and generate hardware security properties from SoC documentation?
RQ2: How does a hardware-domain tuned BERT (HS-BERT) with domain data augmentation compare to general BERT models for this task?
RQ3: What is the impact of data modification techniques on security-property classification performance?
RQ4: Can the generated properties reveal real vulnerabilities in OpenTitan designs and outperform baseline or non-domain models?

Key findings

NSPG extracted 326 security properties from 1,723 sentences in unseen OpenTitan documentation.
Eight security bugs in the Hack@DAC 2022 OpenTitan design were identified using NSPG-generated properties.
NSPG outperformed ChatGPT by about 15% in identifying security properties in OpenTitan documentation.
Among HS-BERT variants, MOT modification yielded the best validation performance with OpenTitan data (82% accuracy, 90% recall on average).
HS-BERT outperformed general BERT and SciBERT in accuracy and recall for the property extraction task on the validation set.
Under MOT-HS-BERT, validation accuracy on OpenTitan, RISC-V, and OpenRISC reached 81.5%, 79.1%, and 88.3%, with recall of 93%, 90.1%, and 87%, respectively.
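
For readers less familiar with the two metrics reported above, the sketch below shows the standard definitions of accuracy and recall for a binary property/non-property classifier. The confusion counts are invented for illustration and are not taken from the paper. The only figures from the review are 326 and 1,723, which give an extraction yield of about 18.9%.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all sentences that were classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def recall(tp, fn):
    """Fraction of true security-property sentences that the classifier found."""
    positives = tp + fn
    return tp / positives if positives else 0.0


if __name__ == "__main__":
    # Hypothetical confusion counts, NOT the paper's data.
    tp, fp, tn, fn = 90, 25, 75, 10
    print(f"accuracy = {accuracy(tp, fp, tn, fn):.3f}")
    print(f"recall   = {recall(tp, fn):.3f}")

    # Share of sentences flagged as properties, using the figures in the review.
    print(f"yield    = {326 / 1723:.1%}")
```

High recall with somewhat lower accuracy, as in the reported results, means the classifier misses few true property sentences and accepts some false positives in exchange.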

This review was created by AI and reviewed by human editors.