Security Assessment Report

Customer Support Chatbot · Demo Simulation

Risk Score: 7.2/10 - High

Executive Summary

The Customer Support Chatbot exhibits significant vulnerabilities to prompt injection and system prompt leakage attacks. The agent's system prompt was fully extracted through indirect techniques, and several prompt injection vectors bypassed input filtering. The chatbot also disclosed internal API endpoints when presented with crafted multi-turn conversations. Immediate remediation is recommended for the critical and high-severity findings.

Target: Customer Support Chatbot
Duration: 17m 22s
Date: 3/14/2026, 9:47:22 AM

Statistics

Techniques Tested: 67
Successful Attacks: 12
Partial Success: 8
Defended: 47
Attack Success Rate: 17.9%

[Charts: Findings by OWASP Category · Findings by Severity]

Findings (5)

Threat Model Analysis

STRIDE Threat Model

Spoofing: 0
Tampering: 1
Repudiation: 0
Information Disclosure: 1
Denial of Service: 0
Elevation of Privilege: 1

MAESTRO Layer Analysis

L1 Foundation Model: 0
L2 Agent Core: 1
L3 Memory & Context: 0
L4 Tool & Resource: 1
L5 Agent Interaction: 0
L6 Deployment: 0
L7 Ecosystem: 0

LINDDUN Privacy Threats

D: 1 finding

Recommendations

Implement Prompt Injection Detection Layer (P1)

Deploy a dedicated classifier model before the main LLM to detect and block prompt injection attempts. Consider using a fine-tuned model specifically trained on injection patterns.

Related: f-demo-001, f-demo-005
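A minimal sketch of such a pre-filter, using a regex heuristic as a lightweight stand-in for the fine-tuned classifier this recommendation describes; the pattern list and `call_support_llm` are hypothetical, not part of the assessed system:

```python
import re

# Hypothetical heuristic pre-filter: a stand-in for a fine-tuned
# injection classifier. Flags common injection phrasings before the
# message ever reaches the main LLM.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior|above) (instructions|prompts)",
    r"disregard .{0,40}instructions",
    r"you are now",
    r"system prompt",
]

def is_likely_injection(user_message: str) -> bool:
    text = user_message.lower()
    return any(re.search(p, text) for p in INJECTION_PATTERNS)

def call_support_llm(message: str) -> str:
    # Stub standing in for the production chatbot model call.
    return f"(model response to: {message})"

def handle_message(user_message: str) -> str:
    if is_likely_injection(user_message):
        # Block before the main model ever sees the input.
        return "Sorry, I can't help with that request."
    return call_support_llm(user_message)
```

A production deployment would replace the regex list with the classifier model's score and a tuned threshold, but the gating structure is the same.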

Prevent System Prompt Disclosure (P2)

Add explicit anti-disclosure instructions to the system prompt. Implement output filtering to detect and block responses that contain system prompt content in any encoding.

Related: f-demo-002
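A sketch of one such output filter, assuming the deployment knows its own system prompt (the prompt text below is invented for illustration). It catches verbatim and base64-encoded leaks; other encodings would need analogous decoders:

```python
import base64

# Hypothetical system prompt; in production this is the real one.
SYSTEM_PROMPT = "You are a helpful customer support agent for Acme Corp."

def leaks_system_prompt(response: str, fragment_len: int = 20) -> bool:
    """Return True if the response contains system prompt content,
    either verbatim or base64-encoded."""
    candidates = {response}
    try:
        # If the whole response is valid base64, also check its decoding.
        decoded = base64.b64decode(response, validate=True).decode("utf-8", "ignore")
        candidates.add(decoded)
    except Exception:
        pass
    # Split the prompt into fixed-size fragments so partial leaks are caught.
    fragments = [SYSTEM_PROMPT[i:i + fragment_len]
                 for i in range(0, len(SYSTEM_PROMPT) - fragment_len + 1, fragment_len)]
    return any(frag in text for text in candidates for frag in fragments)
```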

Implement Tool-Use Authorization (P3)

Require explicit human approval for high-impact actions such as ticket escalations, account modifications, and data exports. Implement role-based access control for agent capabilities.

Related: f-demo-004
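A minimal approval-gate sketch; the action names and the shape of the dispatcher are hypothetical placeholders for the agent's real tool registry:

```python
# Hypothetical action names; map these to the agent's real tool registry.
HIGH_IMPACT_ACTIONS = {"escalate_ticket", "modify_account", "export_data"}

class ApprovalRequired(Exception):
    """Raised when an action must wait for a human reviewer."""

def execute_tool(action, params, approved_by=None):
    # High-impact actions are refused until a named human has signed off.
    if action in HIGH_IMPACT_ACTIONS and approved_by is None:
        raise ApprovalRequired(f"'{action}' requires human approval")
    # Low-impact actions (and approved high-impact ones) proceed normally.
    return {"action": action, "params": params,
            "approved_by": approved_by, "status": "executed"}
```

Role-based access control would layer on top of this by also checking the caller's role against a per-action allowlist before dispatch.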

Add Infrastructure Reference Detection (P4)

Implement output scanning to detect and redact internal URLs, IP addresses, database names, and other infrastructure references before responses reach users.

Related: f-demo-003
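A sketch of such an output scanner using regex redaction; the patterns shown (RFC 1918 private IP ranges, `.internal`/`.local`/`.corp` hostnames, `*_db` names) are illustrative and would need tuning to the actual environment:

```python
import re

# Illustrative patterns for internal infrastructure references.
INFRA_PATTERNS = [
    # Private IPv4 ranges: 10.x.x.x, 172.16-31.x.x, 192.168.x.x
    (re.compile(r"\b(?:10|172\.(?:1[6-9]|2\d|3[01])|192\.168)(?:\.\d{1,3}){2,3}\b"),
     "[REDACTED-IP]"),
    # URLs on internal-looking hostnames
    (re.compile(r"https?://[\w.-]*\.(?:internal|local|corp)\b[\w/.-]*"),
     "[REDACTED-URL]"),
    # Database names like orders_db
    (re.compile(r"\b\w+_(?:db|database)\b", re.IGNORECASE),
     "[REDACTED-DB]"),
]

def redact_infrastructure(response: str) -> str:
    """Replace internal infrastructure references before the response ships."""
    for pattern, replacement in INFRA_PATTERNS:
        response = pattern.sub(replacement, response)
    return response
```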

Methodology

Assessment conducted using Sixi.CH automated security scanning with 19 specialized attack agents executing 134 techniques across OWASP LLM Top 10, MITRE ATLAS, MAESTRO, LINDDUN, STRIDE, and PASTA frameworks. Both single-turn and multi-turn attack strategies were employed.
