[Paper Review] Risks of Practicing Large Language Models in Smart Grid: Threat Modeling and Validation
The paper analyzes two threat models for LLMs in smart grids—bad data injection and domain knowledge extraction—and validates them using GPT-3.5 and GPT-4, showing attackers can inject malicious data and extract domain knowledge from smart-grid LLM applications.
Large language models (LLMs) represent significant breakthroughs in artificial intelligence and hold potential for applications within smart grids. However, as demonstrated in previous literature, AI technologies are susceptible to various types of attacks. It is crucial to investigate and evaluate the risks associated with LLMs before deploying them in critical infrastructure like smart grids. In this paper, we systematically evaluated the risks of LLMs and identified two major types of attacks relevant to potential smart grid LLM applications, presenting the corresponding threat models. We validated these attacks using popular LLMs and real smart grid data. Our validation demonstrates that attackers are capable of injecting bad data and retrieving domain knowledge from LLMs employed in different smart grid applications.
Motivation & Objective
- Assess how LLMs differ from traditional AI in smart grid contexts and identify risk types specific to LLM deployment.
- Propose two generalized threat models for LLMs in smart grids: bad data injection and domain knowledge extraction.
- Validate the proposed threats through experiments using popular LLMs and real smart grid data.
- Provide open-source data, code, and evaluation results to enable replication and further research.
Proposed method
- Characterize LLM workflow and prompt-based interaction in smart grid settings.
- Define two threat models for LLMs: bad data injection via public access to LLMs, and domain knowledge extraction from domain-specific prompts.
- Design and conduct validation experiments using GPT-3.5 and GPT-4 with real datasets (renewable energy incident reports and AMI data).
- Use phase-based attack simulations to measure impact on classification/detection tasks under malicious inputs.
- Compare normal vs injected input performance with standard metrics (accuracy, precision, recall, F1).
- Publish data, code, and results to an open-source repository for reproducibility.
Experimental results
Research questions
- RQ1What vulnerabilities do LLMs introduce when applied to smart grid tasks compared with traditional ML models?
- RQ2Can attackers inject malicious data into LLM-based smart grid applications via public interfaces and degrade performance?
- RQ3Can insiders or attackers extract domain knowledge (prompts) from LLMs handling utility data, revealing sensitive information?
- RQ4Do GPT-3.5 and GPT-4 demonstrate resilience against these threat vectors under realistic smart grid scenarios?
- RQ5What open data and tooling are needed to reproduce and extend these experiments?
Key findings
- Attackers can significantly degrade LLM-based incident detection when bad data is injected (e.g., GPT-3.5: accuracy drops from 89.1% to 33.5% under reverse injection).
- GPT-4 shows similar vulnerability to bad data injection, with accuracy dropping to 43.4% under reverse injection.
- Under bad data injection, precision and recall drop dramatically for several cases, showing output manipulation risks.
- Domain knowledge extraction can cause GPT-3.5 and GPT-4 to disclose aggregated domain information when clever prompts are used, indicating confidentiality risks.
- Normal-input scenarios yield high performance, but well-crafted inputs enable data leakage or misleading outputs.
- The authors provide open-source data, code, and evaluation results to enable further research.
Better researchstarts right now
From paper design to paper writing, dramatically reduce your research time.
No credit card · Free plan available
This review was created by AI and reviewed by human editors.