LLM Prompt Injection
Description
The application accepts user input that is directly incorporated into prompts sent to a Large Language Model (LLM) without proper validation or sanitization. Attackers can craft malicious input containing instructions that override the application's intended prompts, causing the LLM to execute unintended commands, bypass security controls, or disclose sensitive information. This vulnerability arises when user-controlled data is treated as trusted instructions rather than untrusted content.
Remediation
Implement a defense-in-depth approach to mitigate prompt injection risks:<br/><br/><strong>1. Input Validation and Sanitization:</strong> Treat all user input as untrusted data. Implement strict input validation, filtering, and encoding before incorporating user content into prompts.<br/><br/><strong>2. Prompt Design:</strong> Use clear delimiters to separate instructions from user content. Structure prompts to explicitly distinguish between system instructions and user-provided data.<br/><pre>system_prompt = "You are a helpful assistant. Respond to the user query below."
user_query = sanitize_input(user_input)
final_prompt = f"{system_prompt}\n\n--- USER QUERY ---\n{user_query}\n--- END USER QUERY ---"</pre><br/><strong>3. Output Filtering:</strong> Implement content filtering on LLM responses to detect and block potentially malicious or sensitive outputs before returning them to users.<br/><br/><strong>4. Privilege Limitation:</strong> Apply the principle of least privilege to LLM capabilities. Restrict access to sensitive functions, data sources, and system commands based on user roles and context.<br/><br/><strong>5. Monitoring and Detection:</strong> Log all LLM interactions and implement anomaly detection to identify suspicious prompt patterns or unexpected model behavior. Establish alerts for potential injection attempts.
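The `sanitize_input` helper referenced in the prompt-design example is not defined above; one possible sketch is shown below. The checks here (delimiter stripping, control-character removal, length capping) are illustrative assumptions, and a real implementation should be tailored to the application's delimiter scheme and threat model.

```python
import re

# Assumed maximum query length; tune for your application.
MAX_QUERY_LENGTH = 2000

def sanitize_input(user_input: str) -> str:
    """Hypothetical sanitizer for user text destined for an LLM prompt."""
    # Strip sequences that could forge the prompt's "---" delimiters.
    cleaned = user_input.replace("---", "")
    # Remove non-printable control characters.
    cleaned = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", cleaned)
    # Truncate overly long input to reduce the injection surface.
    return cleaned[:MAX_QUERY_LENGTH].strip()
```

Note that sanitization alone cannot reliably stop prompt injection; it is one layer alongside the delimiter-based prompt structure and the output filtering described above.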
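Output filtering (step 3) can be sketched as a pattern scan over the model's response before it reaches the user. The patterns and function name below are assumptions for illustration, not a specific library's API; production systems typically combine such rules with dedicated content-moderation or secret-detection tooling.

```python
import re

# Illustrative deny-list: signs of leaked credentials or echoed
# injection payloads. Extend to match your data-sensitivity policy.
BLOCKED_PATTERNS = [
    re.compile(r"(?i)api[_-]?key\s*[:=]"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"(?i)ignore (all )?previous instructions"),
]

def filter_response(response: str) -> str:
    """Block the response if any deny-list pattern matches."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(response):
            # Returning a fixed message; logging the hit supports step 5.
            return "[response blocked by content filter]"
    return response
```

A deny-list like this is inherently incomplete, so it should back up, not replace, the privilege limitation and monitoring controls above.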