standards: tags; curated suites are hand-authored.
Agent red-team
Suites
| Suite ID | Standard / version | Count | Focus |
|---|---|---|---|
| owasp-llm-top10 | OWASP LLM Top 10 (2025) | 10 | Prompt injection, sensitive disclosure, supply chain, poisoning, agency, hallucination, limits |
| owasp-agentic-ai | OWASP Agentic AI Top 10 (2024) | 10 | Goal hijack, tool misuse, identity abuse, memory poisoning, inter-agent comms, rogue agents |
| owasp-mcp-top10 | OWASP MCP Top 10 (2025), agent-side | 10 | How an agent handles MCP tool calls, server trust, scope, and resource boundaries |
| owasp-api | OWASP API Security Top 10 (2023) | 10 | BOLA, BFLA, RBAC, PII via API/DB tools, SQL/shell injection, debug exposure, goal hijack |
| eu-ai-act-bias | EU AI Act: Bias & Fairness (2024) | 4 | Demographic bias: age, disability, gender, race |
| output-trust-and-safety | Output Trust & Safety (v1) | 8 | Hallucination, sycophancy, imitation, contractual overreach, drift, reasoning DoS, smuggling |
| harmful-content | MLCommons + HarmBench taxonomy | 12 | CBRN, IEDs, malicious code, CSAM, sex/violent crime, drug synthesis, self-harm, radicalization |
owasp-llm-top10
| ID | Name | Severity | OWASP |
|---|---|---|---|
| prompt-injection | Prompt Injection | critical | LLM01 |
| sensitive-disclosure | Sensitive Information Disclosure | critical | LLM02 |
| supply-chain | Supply Chain Vulnerabilities | high | LLM03 |
| data-poisoning | Data and Model Poisoning | high | LLM04 |
| improper-output-handling | Improper Output Handling | high | LLM05 |
| excessive-agency | Excessive Agency | high | LLM06 / ASI02 |
| system-prompt-leakage | System Prompt Leakage | critical | LLM07 |
| vector-embedding-weaknesses | Vector and Embedding Weaknesses | high | LLM08 |
| misinformation | Misinformation | high | LLM09 |
| unbounded-consumption | Unbounded Consumption | high | LLM10 |
owasp-agentic-ai
| ID | Name | Severity | OWASP |
|---|---|---|---|
| agent-goal-hijack | Agent Goal Hijacking | critical | ASI01 |
| tool-misuse | Tool Misuse and Exploitation | critical | ASI02 |
| identity-privilege-abuse | Identity and Privilege Abuse | critical | ASI03 |
| supply-chain | Supply Chain Vulnerabilities | high | ASI04 |
| unexpected-code-execution | Unexpected Code Execution | critical | ASI05 |
| memory-poisoning | Memory and Context Poisoning | high | ASI06 |
| inter-agent-communication | Insecure Inter-Agent Communication | high | ASI07 |
| cascading-failures | Cascading Failures | high | ASI08 |
| human-agent-trust | Human-Agent Trust Exploitation | high | ASI09 |
| rogue-agents | Rogue Agents | critical | ASI10 |
owasp-mcp-top10 (agent-side)
Probes how an agent target handles MCP tool calls, server trust, scope, and resource boundaries. To probe an MCP server directly, see the MCP red-team catalog.
| ID | Name | Severity | OWASP |
|---|---|---|---|
| mcp-credential-exposure | MCP Credential Exposure | critical | MCP01 |
| mcp-scope-escalation | MCP Scope Escalation | high | MCP02 |
| mcp-tool-description-injection | MCP Tool Description Injection | critical | MCP03 |
| mcp-supply-chain-trust | MCP Supply Chain Trust | high | MCP04 |
| mcp-tool-injection-payload | MCP Tool-Injection Payload Forwarding | critical | MCP05 |
| mcp-intent-subversion | MCP Intent Subversion | high | MCP06 |
| mcp-missing-authentication | MCP Missing Authentication | high | MCP07 |
| mcp-audit-bypass | MCP Audit and Telemetry Bypass | medium | MCP08 |
| mcp-shadow-server | MCP Shadow Server Detection | high | MCP09 |
| mcp-cross-resource-leakage | MCP Cross-Resource Leakage | high | MCP10 |
owasp-api
| ID | Name | Severity | OWASP |
|---|---|---|---|
| bola | Broken Object Level Authorization (BOLA) | high | API1 |
| bfla | Broken Function Level Authorization (BFLA) | high | API5 |
| rbac | Broken Role-Based Access Control | high | API5 |
| pii-direct | PII Direct Disclosure | critical | API4 |
| pii-api-db | PII Disclosure via API/DB Tool Calls | critical | API4 |
| pii-session | PII Cross-Session Leakage | high | API4 |
| shell-injection | Shell Injection | critical | API7 |
| sql-injection | SQL Injection | critical | API7 |
| debug-access | Debug and Admin Interface Exposure | high | API7 |
| hijacking | Goal Hijacking | high | API8 |
eu-ai-act-bias
| ID | Name | Severity | Standard |
|---|---|---|---|
| bias-age | Age Bias | medium | EU AI Act Art. 10 / 15 |
| bias-disability | Disability Bias | medium | EU AI Act Art. 10 / 15 |
| bias-gender | Gender Bias | medium | EU AI Act Art. 10 / 15 |
| bias-race | Racial Bias | medium | EU AI Act Art. 10 / 15 |
output-trust-and-safety
| ID | Name | Severity | Tag |
|---|---|---|---|
| hallucination | Hallucination | high | LLM09 |
| overreliance | Overreliance and Sycophancy | medium | LLM09 |
| imitation | Unauthorized Imitation | high | LLM09 |
| contracts | Unauthorized Contractual Commitments | high | LLM05 |
| competitors | Competitor Endorsement | medium | none |
| off-topic | Off-Topic Drift | low | none |
| reasoning-dos | Reasoning Denial of Service | high | LLM10 |
| ascii-smuggling | ASCII / Unicode Smuggling | high | LLM01 |
harmful-content
Content-harm coverage drawn from the MLCommons hazard taxonomy and HarmBench. Best paired with raw LLM endpoints; compose with owasp-llm-top10, owasp-api, and eu-ai-act-bias for full coverage.
| ID | Name | Severity | Tag |
|---|---|---|---|
| harmful-bioweapons | Chemical & Biological Weapons | critical | LLM10 / AML.T0054 |
| harmful-indiscriminate-weapons | Indiscriminate Weapons (CBRN) | critical | LLM10 / AML.T0054 |
| harmful-weapons-ied | Improvised Explosive Devices | critical | LLM10 / AML.T0054 |
| harmful-cybercrime-malicious-code | Cybercrime & Malicious Code | critical | LLM10 / AML.T0054 |
| harmful-child-exploitation | Child Exploitation (CSAM) | critical | LLM10 / AML.T0054 |
| harmful-sex-crime | Sex Crime | critical | LLM10 / AML.T0054 |
| harmful-violent-crime | Violent Crime | high | LLM10 / AML.T0054 |
| harmful-illegal-drugs | Illegal Drug Synthesis & Trafficking | high | LLM10 / AML.T0054 |
| harmful-self-harm | Self-Harm & Suicide | high | LLM10 / AML.T0054 |
| harmful-radicalization | Radicalization & Extremism | high | LLM10 / AML.T0054 |
| harmful-specialized-advice | Unqualified Specialized Advice | high | LLM09 / AML.T0048 |
| harmful-unsafe-practices | Promotion of Unsafe Practices | high | LLM09 / AML.T0048 |
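The composition suggested for harmful-content can be sized by taking the union of evaluator IDs across the four suites. This is a minimal sketch: the ID lists are transcribed from the tables in this catalog, and the compose helper is hypothetical, not part of opfor.

```python
# Evaluator IDs transcribed from the suite tables in this catalog.
SUITES = {
    "harmful-content": [
        "harmful-bioweapons", "harmful-indiscriminate-weapons",
        "harmful-weapons-ied", "harmful-cybercrime-malicious-code",
        "harmful-child-exploitation", "harmful-sex-crime",
        "harmful-violent-crime", "harmful-illegal-drugs",
        "harmful-self-harm", "harmful-radicalization",
        "harmful-specialized-advice", "harmful-unsafe-practices",
    ],
    "owasp-llm-top10": [
        "prompt-injection", "sensitive-disclosure", "supply-chain",
        "data-poisoning", "improper-output-handling", "excessive-agency",
        "system-prompt-leakage", "vector-embedding-weaknesses",
        "misinformation", "unbounded-consumption",
    ],
    "owasp-api": [
        "bola", "bfla", "rbac", "pii-direct", "pii-api-db", "pii-session",
        "shell-injection", "sql-injection", "debug-access", "hijacking",
    ],
    "eu-ai-act-bias": [
        "bias-age", "bias-disability", "bias-gender", "bias-race",
    ],
}


def compose(suite_ids):
    """Return evaluator IDs from the given suites, in order, without duplicates."""
    seen = {}
    for suite_id in suite_ids:
        for evaluator_id in SUITES[suite_id]:
            # Keep the first suite that contributed each ID.
            seen.setdefault(evaluator_id, suite_id)
    return list(seen)


full_coverage = compose(
    ["harmful-content", "owasp-llm-top10", "owasp-api", "eu-ai-act-bias"]
)
print(len(full_coverage))  # 36
```

These four suites share no evaluator IDs, so the union equals the sum of their counts (12 + 10 + 10 + 4). The deduplication matters for other combinations: supply-chain appears in both owasp-llm-top10 and owasp-agentic-ai.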
Standalone (not in any suite)
| ID | Name | Severity | OWASP |
|---|---|---|---|
| jailbreaking | Jailbreaking | high | LLM10 |
Not part of any suite; select it explicitly with selection.evaluators: ["jailbreaking"].
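Assuming the dotted key denotes nesting in the config (the config file format itself is not shown here), the setting expands to a nested structure. The set_dotted helper below is hypothetical and only illustrates that expansion.

```python
import json


def set_dotted(config, dotted_key, value):
    """Set config[a][b]...[z] = value for a key written as 'a.b...z'."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        # Create intermediate mappings as needed, keeping existing ones.
        node = node.setdefault(part, {})
    node[leaf] = value
    return config


config = set_dotted({}, "selection.evaluators", ["jailbreaking"])
print(json.dumps(config))  # {"selection": {"evaluators": ["jailbreaking"]}}
```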
MCP red-team
Suite
| Suite ID | Standard / version | Count | Focus |
|---|---|---|---|
| owasp-mcp-top10 | OWASP MCP Top 10 (2025) | 14 | Server-side: secret exposure, OAuth passthrough, scope escalation, supply chain, tool poisoning, command injection, SSRF, and more |
Evaluators
| ID | Name | Severity | OWASP |
|---|---|---|---|
| secret-exposure | Secret and Token Exposure | critical | MCP01 |
| oauth-token-passthrough | OAuth Confused Deputy and Token Passthrough | critical | MCP01 |
| scope-escalation | Scope Escalation and Privilege Bypass | high | MCP02 |
| tool-description-injection | Tool Poisoning (Description Injection, Rug Pull, Schema Poisoning) | critical | MCP03 |
| tool-description-scan | Tool Description Poisoning Scan | critical | MCP03 |
| content-injection | Second-Order Content Injection | high | MCP03 |
| supply-chain | Software Supply Chain Attacks & Dependency Tampering | high | MCP04 |
| command-injection | Command Injection and STDIO RCE | critical | MCP05 |
| ssrf | Server-Side Request Forgery (SSRF) | critical | MCP05 |
| intent-subversion | Intent Flow Subversion | high | MCP06 |
| missing-authentication | Missing Authentication | critical | MCP07 |
| audit-telemetry | Lack of Audit and Telemetry | medium | MCP08 |
| shadow-mcp-server | Shadow MCP Server Detection | high | MCP09 |
| cross-resource-leakage | Context Injection, Over-Sharing & Cross-Resource Leakage | critical | MCP10 |
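The suite lists 14 evaluators against a ten-item standard because some categories have more than one evaluator. A small sketch that tallies the table by OWASP tag; the pairs are transcribed from the Evaluators table and nothing here calls opfor.

```python
from collections import Counter

# (evaluator ID, OWASP MCP tag) pairs transcribed from the Evaluators table.
MCP_EVALUATORS = [
    ("secret-exposure", "MCP01"),
    ("oauth-token-passthrough", "MCP01"),
    ("scope-escalation", "MCP02"),
    ("tool-description-injection", "MCP03"),
    ("tool-description-scan", "MCP03"),
    ("content-injection", "MCP03"),
    ("supply-chain", "MCP04"),
    ("command-injection", "MCP05"),
    ("ssrf", "MCP05"),
    ("intent-subversion", "MCP06"),
    ("missing-authentication", "MCP07"),
    ("audit-telemetry", "MCP08"),
    ("shadow-mcp-server", "MCP09"),
    ("cross-resource-leakage", "MCP10"),
]

# Count how many evaluators map to each OWASP MCP category.
per_tag = Counter(tag for _, tag in MCP_EVALUATORS)
print(len(MCP_EVALUATORS))     # 14
print(len(per_tag))            # 10
print(per_tag.most_common(1))  # [('MCP03', 3)]
```

All ten categories are covered; MCP03 has three evaluators, and MCP01 and MCP05 have two each.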
Auto-fired
| ID | Name | Severity | OWASP |
|---|---|---|---|
| resource-exposure | MCP Resource Exposure | critical | MCP01 |
resource-exposure runs automatically before any attacks: opfor calls resources/list, then resources/read on every listed resource, and judges the contents for secret or PII exposure. Disable it with mcp.scanResources: false in the config.
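The list-then-read flow can be sketched as follows. This is an illustration under stated assumptions, not opfor's implementation: FakeMcpServer is a hypothetical in-memory stand-in, the regex patterns stand in for whatever judge opfor actually uses, and the enabled flag merely mirrors mcp.scanResources. The response shapes follow the MCP resources/list and resources/read results.

```python
import re

# Illustrative patterns only; a real judge would be far more thorough.
SECRET_PATTERNS = {
    "aws-access-key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


class FakeMcpServer:
    """In-memory stand-in for an MCP server (hypothetical test double)."""

    def __init__(self, resources):
        self._resources = resources  # maps uri -> text

    def request(self, method, params=None):
        if method == "resources/list":
            return {"resources": [{"uri": uri} for uri in self._resources]}
        if method == "resources/read":
            uri = params["uri"]
            return {"contents": [{"uri": uri, "text": self._resources[uri]}]}
        raise ValueError(f"unsupported method: {method}")


def scan_resources(server, enabled=True):
    """List every resource, read each one, and flag pattern matches."""
    if not enabled:  # mirrors mcp.scanResources: false
        return []
    findings = []
    listing = server.request("resources/list")
    for resource in listing["resources"]:
        result = server.request("resources/read", {"uri": resource["uri"]})
        for content in result["contents"]:
            text = content.get("text", "")
            for label, pattern in SECRET_PATTERNS.items():
                if pattern.search(text):
                    findings.append((resource["uri"], label))
    return findings


server = FakeMcpServer({
    # AWS's published documentation example key, not a real credential.
    "file:///app/.env": "AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE",
    "file:///app/README.md": "Nothing sensitive here.",
})
print(scan_resources(server))  # [('file:///app/.env', 'aws-access-key')]
```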