A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly



Introduction

A Large Language Model (LLM) is a language model with a massive number of parameters that understands and processes human language, typically pretrained with objectives such as masked language modeling and autoregressive next-token prediction. A capable LLM should have four key characteristics: a deep understanding of natural-language context, the ability to generate human-like text, contextual awareness (especially in knowledge-intensive domains), and strong instruction-following ability.
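The autoregressive prediction objective mentioned above can be illustrated with a toy example. The bigram model below is only a stand-in for the transformer used in real LLMs, and the corpus is invented for illustration: the task is the same (predict the next token from preceding context), just at a vastly smaller scale.

```python
from collections import Counter, defaultdict

# Toy autoregressive "language model": predict the next word from the
# previous one using bigram counts. Real LLMs perform this same task
# with a transformer over subword tokens and billions of parameters.
corpus = "the model reads text and the model writes text".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    # Return the most likely next word observed after `word` in the corpus.
    counts = bigrams.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "model" follows "the" twice in the corpus
```

Training a real LLM amounts to fitting this conditional next-token distribution over web-scale text, which is what produces the four characteristics listed above.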

LLMs have also attracted widespread attention in the security community. For example, GPT-3 discovered 213 security vulnerabilities in a codebase. These early efforts motivated this paper's three core research questions about LLM security and privacy: (1) how can LLMs make positive contributions to security and privacy, (2) how can LLMs be misused to attack security and privacy, and (3) what vulnerabilities exist within LLMs themselves, and how can they be defended?

To answer these questions, this paper reviews 281 related papers and categorizes them into three groups: “good” (beneficial applications), “bad” (offensive applications), and “ugly” (LLM vulnerabilities and defenses).

The paper's main conclusions are presented as Findings I–IV in the sections below.

Contributions of this paper: it provides the first comprehensive summary of the role of LLMs in security and privacy, covering their positive impact, offensive uses, inherent vulnerabilities, and defense mechanisms. The paper finds that LLMs contribute more to the security field than they harm it, and identifies user-level attacks as currently the most significant threat.

Background

Large Language Models (LLMs)

LLMs are the evolution of language models, whose scale grew dramatically after the emergence of the Transformer architecture. These models are trained on massive datasets to understand and generate text that closely mimics human language. A capable LLM should have four key characteristics:

  1. Deep understanding of natural language: able to extract information from text and perform tasks such as translation.
  2. Human-like text generation: able to complete sentences and write coherent paragraphs or even entire articles.
  3. Contextual awareness: possesses domain expertise, i.e., performs well on “knowledge-intensive” tasks.
  4. Strong problem-solving ability: able to use textual information for tasks such as information retrieval and question answering.

The table below shows a variety of language models from different vendors, illustrating the rapid development of this field. Newer models such as GPT-4 continue to emerge. Although most models are not open source, the open-sourcing of models such as BERT and LLaMA has promoted community development. In general, the more parameters a model has, the stronger its capabilities, but the higher its computational requirements. “Tunability” refers to whether the model can be fine-tuned for specific tasks.

| Model         | Date    | Provider   | Open Source | Parameters | Tunable |
| gpt-4         | 2023.03 | OpenAI     | No          | 1.7T       | No      |
| gpt-3.5-turbo | 2021.09 | OpenAI     | No          | 175B       | Yes     |
| gpt-3         | 2020.06 | OpenAI     | No          | 175B       | Yes     |
| cohere-medium | 2022.07 | Cohere     | No          | 6B         | Yes     |
| cohere-large  | 2022.07 | Cohere     | No          | 13B        | Yes     |
| cohere-xlarge | 2022.06 | Cohere     | No          | 52B        | Yes     |
| BERT          | 2018.08 | Google     | Yes         | 340M       | Yes     |
| T5            | 2019    | Google     | Yes         | 11B        | Yes     |
| PaLM          | 2022.04 | Google     | No          | 540B       | No      |
| LLaMA         | 2023.02 | Meta AI    | Yes         | 65B        | Yes     |
| CTRL          | 2019    | Salesforce | Yes         | 1.6B       | Yes     |
| Dolly 2.0     | 2023.04 | Databricks | Yes         | 12B        | Yes     |

Overview

Scope

This paper aims to provide a comprehensive literature review of security and privacy research in the context of LLMs, identify the current state of the art, and pinpoint knowledge gaps. The focus of this paper is strictly limited to security and privacy issues.

Research Questions

This paper is organized around three core research questions: (1) How can LLMs make positive contributions to security and privacy? (2) What risks arise when LLMs are used offensively? (3) What vulnerabilities exist within LLMs themselves, and how can they be defended?

This paper collected 281 related papers, including 83 in the “good” category, 54 in the “bad” category, and 144 in the “ugly” category. As shown in the figure below, most papers were published in 2023, indicating a rapid rise in research interest in this field.

[Figure: year-of-publication distribution of the 281 surveyed papers]

Finding I: In security-related applications, most researchers tend to use LLMs to enhance security, such as vulnerability detection, rather than as attack tools. Overall, LLMs contribute more positively than negatively to the security community.

Positive Impacts (Good)

This section discusses beneficial applications of LLMs in code security and data security and privacy.

Applications of LLMs in Code Security

With their strong language understanding and contextual analysis capabilities, LLMs can play a key role throughout the entire lifecycle of code security, including secure coding, test case generation, vulnerability/malicious code detection, and code repair.
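As a concrete illustration of one lifecycle stage, the sketch below shows how an LLM-based vulnerability-detection step might be assembled. This is a minimal, hypothetical example, not any surveyed system's implementation: `ask_llm` is a placeholder for whatever chat-completion API a real tool would call, and the prompt wording is invented.

```python
# Hypothetical prompt construction for LLM-based vulnerability detection.
# `ask_llm` is a stand-in for a real chat-completion call.
PROMPT_TEMPLATE = (
    "You are a security auditor. Review the following code and report "
    "any vulnerabilities (CWE ID, affected lines, severity).\n\n"
    "```\n{code}\n```"
)

def build_audit_prompt(code: str) -> str:
    # Embed the code under review into the auditing instruction.
    return PROMPT_TEMPLATE.format(code=code)

def audit(code: str, ask_llm) -> str:
    # ask_llm: callable taking a prompt string, returning the model's reply.
    return ask_llm(build_audit_prompt(code))
```

Keeping the model call behind a callable like `ask_llm` is one way such tools stay model-agnostic, which matters given the variety of LLMs (Codex, ChatGPT, LLaMA) used across the works in the table below.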

| Work                    | Lifecycle task                  | LLM          | Domain           | Advantage over SOTA             |
| Sandoval et al. [234]   | Coding (C)                      | Codex        | -                | Negligible risk                 |
| SVEN [98]               | Coding (C)                      | CodeGen      | -                | Faster/safer                    |
| SALLM [254]             | Coding (C)                      | ChatGPT et al. | -              | -                               |
| Madhav et al. [197]     | Coding (C)                      | ChatGPT      | Hardware         | -                               |
| Zhang et al. [343]      | Test case generation (TCG)      | ChatGPT      | Supply chain     | More effective cases            |
| Libro [136]             | TCG                             | LLaMA        | -                | Higher coverage                 |
| TitanFuzz [56]          | TCG                             | Codex        | DL libraries     | Higher coverage                 |
| FuzzGPT [57]            | TCG                             | ChatGPT      | DL libraries     | Higher coverage                 |
| Fuzz4All [313]          | TCG                             | ChatGPT      | Languages        | High-quality tests              |
| WhiteFox [321]          | TCG                             | GPT-4        | Compiler         | 4x faster                       |
| Zhang et al. [337]      | TCG                             | ChatGPT      | API              | Higher coverage, but high FP/FN |
| CHATAFL [190]           | TCG                             | ChatGPT      | Protocols        | Low FP/FN                       |
| Henrik [105]            | Malicious code detection        | N/A          | -                | Not better than SOTA            |
| Apiiro [74]             | Malicious code detection        | ChatGPT      | -                | Cost-effective                  |
| Noever [201]            | Vulnerability detection         | ChatGPT      | -                | -                               |
| Bakhshandeh et al. [15] | Vulnerability detection         | ChatGPT      | -                | -                               |
| Moumita et al. [218]    | Vulnerability detection         | ChatGPT      | -                | -                               |
| Cheshkov et al. [41]    | Vulnerability detection         | GPT          | -                | Reduces manual effort           |
| LATTE [174]             | Vulnerability detection         | Codex        | -                | Higher accuracy/speed           |
| DefectHunter [296]      | Vulnerability detection         | ChatGPT      | Blockchain       | Fixes more vulnerabilities      |
| Chen et al. [37]        | Vulnerability detection         | ChatGPT      | Blockchain       | CI pipeline                     |
| Hu et al. [110]         | Vulnerability detection         | LLaMA        | Web applications | -                               |
| KARTAL [233]            | Vulnerability detection         | Codex        | Libraries        | -                               |
| VulLibGen [38]          | Vulnerability detection         | Codex        | Hardware         | Zero-shot                       |
| Ahmad et al. [3]        | Repair                          | Codex et al. | APR              | Higher accuracy                 |
| InferFix [125]          | Repair                          | ChatGPT      | APR              | Higher accuracy                 |
| Pearce et al. [211]     | Vulnerability detection, repair | ChatGPT et al. | APR            | Higher accuracy                 |
| Fu et al. [83]          | Vulnerability detection, repair | ChatGPT      | APR              | -                               |
| Sobania et al. [257]    | Vulnerability detection, repair | -            | -                | -                               |
| Jiang et al. [123]      | Repair                          | -            | -                | -                               |

Finding II: Most studies (17 of 25) find that LLM-based methods outperform traditional approaches to code security, offering advantages such as higher code coverage, higher detection accuracy, and lower cost. The most commonly criticized weakness is their tendency toward relatively high false-negative and false-positive rates when detecting vulnerabilities.
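The false-positive and false-negative rates that Finding II criticizes are computed as below; this is a minimal sketch over binary vulnerable/benign labels, not any surveyed paper's evaluation code.

```python
def fp_fn_rates(preds, labels):
    """preds/labels: 1 = flagged / actually vulnerable, 0 = benign."""
    fp = sum(1 for p, l in zip(preds, labels) if p == 1 and l == 0)
    fn = sum(1 for p, l in zip(preds, labels) if p == 0 and l == 1)
    negatives = sum(1 for l in labels if l == 0)
    positives = sum(1 for l in labels if l == 1)
    # False-positive rate over benign samples; false-negative rate
    # over truly vulnerable samples.
    return fp / max(negatives, 1), fn / max(positives, 1)

# A detector that flags everything has a 100% false-positive rate:
print(fp_fn_rates([1, 1, 1, 1], [1, 0, 1, 0]))  # (1.0, 0.0)
```

A high FN rate means real vulnerabilities slip through; a high FP rate buries developers in spurious warnings — which is why both are weighted heavily when comparing LLM detectors against traditional static analyzers.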

Applications of LLMs in Data Security and Privacy

LLMs have also contributed to data security and privacy, mainly along four dimensions: data integrity (I), confidentiality (C), reliability (R), and traceability (T).

(Features: I = integrity, C = confidentiality, R = reliability, T = traceability; + = addressed, ○ = not addressed.)

| Work                 | I | C | R | T | Model        | Domain       | Advantage over SOTA     |
| Fang [294]           | ○ | + | ○ | + | ChatGPT      | Ransomware   | -                       |
| Liu et al. [187]     | ○ | + | ○ | + | ChatGPT      | Ransomware   | -                       |
| Amine et al. [73]    | ○ | ○ | ○ | + | ChatGPT      | Semantics    | Comparable to SOTA      |
| HuntGPT [8]          | ○ | ○ | ○ | + | ChatGPT      | Network      | More effective          |
| Chris et al. [71]    | ○ | ○ | ○ | + | ChatGPT      | Logs         | Reduces manual effort   |
| AnomalyGPT [91]      | ○ | ○ | ○ | + | ChatGPT      | Video        | Reduces manual effort   |
| LogGPT [221]         | ○ | ○ | ○ | + | ChatGPT      | Logs         | Reduces manual effort   |
| Arpita et al. [286]  | + | ○ | + | + | BERT etc.    | -            | -                       |
| Takashi et al. [142] | + | + | ○ | + | ChatGPT      | Phishing     | High accuracy           |
| Fredrik et al. [102] | + | + | ○ | + | ChatGPT etc. | Phishing     | Effective               |
| IPSDM [119]          | + | + | ○ | + | BERT         | Phishing     | -                       |
| Kwon et al. [149]    | + | ○ | + | + | ChatGPT      | -            | Friendly to non-experts |
| Scanlon et al. [237] | + | + | + | ○ | ChatGPT      | Forensics    | More effective          |
| Sladić et al. [255]  | + | + | + | ○ | ChatGPT      | Honeypot     | More realistic          |
| WASA [297]           | + | + | ○ | ○ | -            | Watermarking | More effective          |
| REMARK [340]         | + | + | ○ | ○ | -            | Watermarking | More effective          |
| SWEET [154]          | + | + | ○ | ○ | -            | Watermarking | More effective          |
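Among the works listed above, the watermarking systems (WASA, REMARK, SWEET) trace text back to an LLM by biasing generation toward a pseudo-random "green list" of tokens; a detector then measures how often green tokens appear. The toy detector below sketches only this general idea — the hashing scheme and thresholds are invented for illustration and are not the actual WASA/REMARK/SWEET algorithms.

```python
import hashlib

def is_green(prev_token: str, token: str, green_ratio: float = 0.5) -> bool:
    # Pseudo-randomly assign each (context, token) pair to the green list
    # by hashing; a watermarking generator would prefer green tokens.
    h = int(hashlib.sha256(f"{prev_token}|{token}".encode()).hexdigest(), 16)
    return (h % 1000) < green_ratio * 1000

def green_fraction(tokens, green_ratio=0.5):
    # Fraction of tokens landing on the green list: watermarked text
    # should score well above `green_ratio`, unwatermarked text near it.
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    return sum(is_green(p, t, green_ratio) for p, t in pairs) / len(pairs)
```

Because the green list is derived from a keyed hash rather than stored, the detector needs only the key and the suspect text, which is what makes such schemes practical for tracing LLM output.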

Finding III: LLMs perform exceptionally well in data protection, often outperforming existing solutions while requiring less human intervention. Across the security applications surveyed, ChatGPT is the most widely used LLM.

Negative Impacts (Bad)

This section examines the offensive applications of LLMs and classifies them into five categories based on their position in the system architecture.

[Figure: taxonomy of LLM-based attacks across five architectural levels]

Summary of the Taxonomy

This paper divides LLM-based cyberattacks into five levels according to their position in the system architecture:

  1. Hardware-level attacks
  2. OS-level attacks
  3. Software-level attacks (e.g., malware generation)
  4. Network-level attacks (e.g., phishing)
  5. User-level attacks (e.g., misinformation and social engineering)

Finding IV: User-level attacks are the most prevalent, mainly due to LLMs’ increasingly human-like reasoning abilities, which enable them to generate realistic conversations and content. At present, LLMs have limited access to OS and hardware capabilities, which constrains the prevalence of attacks at other levels.

Vulnerabilities and Defenses of LLMs (Ugly)

This section examines the vulnerabilities of LLMs themselves, the threats they face, and the corresponding defense measures.

Vulnerabilities and Threats of LLMs

This paper categorizes threats against LLMs into two types: vulnerabilities inherent to the AI model itself and vulnerabilities that are not inherent to the AI model.