Evaluator reference

Opfor maintains two evaluator catalogs: agent and MCP. Standard suites are auto-derived from `standards:` tags; curated suites are hand-authored.

Agent red-team

Suites

Suite ID	Standard / version	Count	Focus
`owasp-llm-top10`	OWASP LLM Top 10 (2025)	10	Prompt injection, sensitive disclosure, supply chain, poisoning, agency, hallucination, limits
`owasp-agentic-ai`	OWASP Agentic AI Top 10 (2024)	10	Goal hijack, tool misuse, identity abuse, memory poisoning, inter-agent comms, rogue agents
`owasp-mcp-top10`	OWASP MCP Top 10 (2025) — agent-side	10	How an agent handles MCP tool calls, server trust, scope, and resource boundaries
`owasp-api`	OWASP API Security Top 10 (2023)	10	BOLA, BFLA, RBAC, PII via API/DB tools, SQL/shell injection, debug exposure, goal hijack
`eu-ai-act-bias`	EU AI Act — Bias & Fairness (2024)	4	Demographic bias: age, disability, gender, race
`output-trust-and-safety`	Output Trust & Safety (v1)	8	Hallucination, sycophancy, imitation, contractual overreach, drift, reasoning DoS, smuggling
`harmful-content`	MLCommons + HarmBench taxonomy	12	CBRN, IEDs, malicious code, CSAM, sex/violent crime, drug synthesis, self-harm, radicalization

`owasp-llm-top10`

ID	Name	Severity	OWASP
`prompt-injection`	Prompt Injection	critical	LLM01
`sensitive-disclosure`	Sensitive Information Disclosure	critical	LLM02
`supply-chain`	Supply Chain Vulnerabilities	high	LLM03
`data-poisoning`	Data and Model Poisoning	high	LLM04
`improper-output-handling`	Improper Output Handling	high	LLM05
`excessive-agency`	Excessive Agency	high	LLM06 / ASI02
`system-prompt-leakage`	System Prompt Leakage	critical	LLM07
`vector-embedding-weaknesses`	Vector and Embedding Weaknesses	high	LLM08
`misinformation`	Misinformation	high	LLM09
`unbounded-consumption`	Unbounded Consumption	high	LLM10

`owasp-agentic-ai`

ID	Name	Severity	OWASP
`agent-goal-hijack`	Agent Goal Hijacking	critical	ASI01
`tool-misuse`	Tool Misuse and Exploitation	critical	ASI02
`identity-privilege-abuse`	Identity and Privilege Abuse	critical	ASI03
`supply-chain`	Supply Chain Vulnerabilities	high	ASI04
`unexpected-code-execution`	Unexpected Code Execution	critical	ASI05
`memory-poisoning`	Memory and Context Poisoning	high	ASI06
`inter-agent-communication`	Insecure Inter-Agent Communication	high	ASI07
`cascading-failures`	Cascading Failures	high	ASI08
`human-agent-trust`	Human-Agent Trust Exploitation	high	ASI09
`rogue-agents`	Rogue Agents	critical	ASI10

`owasp-mcp-top10` (agent-side)

Probes how an agent target behaves around MCP. To probe an MCP server directly, see the MCP catalog.

ID	Name	Severity	OWASP
`mcp-credential-exposure`	MCP Credential Exposure	critical	MCP01
`mcp-scope-escalation`	MCP Scope Escalation	high	MCP02
`mcp-tool-description-injection`	MCP Tool Description Injection	critical	MCP03
`mcp-supply-chain-trust`	MCP Supply Chain Trust	high	MCP04
`mcp-tool-injection-payload`	MCP Tool-Injection Payload Forwarding	critical	MCP05
`mcp-intent-subversion`	MCP Intent Subversion	high	MCP06
`mcp-missing-authentication`	MCP Missing Authentication	high	MCP07
`mcp-audit-bypass`	MCP Audit and Telemetry Bypass	medium	MCP08
`mcp-shadow-server`	MCP Shadow Server Detection	high	MCP09
`mcp-cross-resource-leakage`	MCP Cross-Resource Leakage	high	MCP10

`owasp-api`

ID	Name	Severity	OWASP
`bola`	Broken Object Level Authorization (BOLA)	high	API1
`bfla`	Broken Function Level Authorization (BFLA)	high	API5
`rbac`	Broken Role-Based Access Control	high	API5
`pii-direct`	PII Direct Disclosure	critical	API4
`pii-api-db`	PII Disclosure via API/DB Tool Calls	critical	API4
`pii-session`	PII Cross-Session Leakage	high	API4
`shell-injection`	Shell Injection	critical	API7
`sql-injection`	SQL Injection	critical	API7
`debug-access`	Debug and Admin Interface Exposure	high	API7
`hijacking`	Goal Hijacking	high	API8

`eu-ai-act-bias`

ID	Name	Severity	Standard
`bias-age`	Age Bias	medium	EU AI Act Art. 10 / 15
`bias-disability`	Disability Bias	medium	EU AI Act Art. 10 / 15
`bias-gender`	Gender Bias	medium	EU AI Act Art. 10 / 15
`bias-race`	Racial Bias	medium	EU AI Act Art. 10 / 15

`output-trust-and-safety`

ID	Name	Severity	Tag
`hallucination`	Hallucination	high	LLM09
`overreliance`	Overreliance and Sycophancy	medium	LLM09
`imitation`	Unauthorized Imitation	high	LLM09
`contracts`	Unauthorized Contractual Commitments	high	LLM05
`competitors`	Competitor Endorsement	medium	—
`off-topic`	Off-Topic Drift	low	—
`reasoning-dos`	Reasoning Denial of Service	high	LLM10
`ascii-smuggling`	ASCII / Unicode Smuggling	high	LLM01

`harmful-content`

Content-harm coverage drawn from the MLCommons hazard taxonomy and HarmBench. Best paired with raw LLM endpoints; compose with `owasp-llm-top10`, `owasp-api`, and `eu-ai-act-bias` for full coverage.

ID	Name	Severity	Tag
`harmful-bioweapons`	Chemical & Biological Weapons	critical	LLM10 / AML.T0054
`harmful-indiscriminate-weapons`	Indiscriminate Weapons (CBRN)	critical	LLM10 / AML.T0054
`harmful-weapons-ied`	Improvised Explosive Devices	critical	LLM10 / AML.T0054
`harmful-cybercrime-malicious-code`	Cybercrime & Malicious Code	critical	LLM10 / AML.T0054
`harmful-child-exploitation`	Child Exploitation (CSAM)	critical	LLM10 / AML.T0054
`harmful-sex-crime`	Sex Crime	critical	LLM10 / AML.T0054
`harmful-violent-crime`	Violent Crime	high	LLM10 / AML.T0054
`harmful-illegal-drugs`	Illegal Drug Synthesis & Trafficking	high	LLM10 / AML.T0054
`harmful-self-harm`	Self-Harm & Suicide	high	LLM10 / AML.T0054
`harmful-radicalization`	Radicalization & Extremism	high	LLM10 / AML.T0054
`harmful-specialized-advice`	Unqualified Specialized Advice	high	LLM09 / AML.T0048
`harmful-unsafe-practices`	Promotion of Unsafe Practices	high	LLM09 / AML.T0048
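
As a rough illustration of the composition suggested above, the sketch below merges the evaluator IDs of `harmful-content`, `owasp-llm-top10`, `owasp-api`, and `eu-ai-act-bias` into one de-duplicated run list. The IDs are copied from the tables in this reference; the `SUITES` dictionary and the `compose` helper are hypothetical and are not part of Opfor's API.

```python
# Evaluator IDs copied from the suite tables in this reference.
# SUITES and compose() are illustrative helpers, not Opfor APIs.
SUITES = {
    "harmful-content": [
        "harmful-bioweapons", "harmful-indiscriminate-weapons",
        "harmful-weapons-ied", "harmful-cybercrime-malicious-code",
        "harmful-child-exploitation", "harmful-sex-crime",
        "harmful-violent-crime", "harmful-illegal-drugs",
        "harmful-self-harm", "harmful-radicalization",
        "harmful-specialized-advice", "harmful-unsafe-practices",
    ],
    "owasp-llm-top10": [
        "prompt-injection", "sensitive-disclosure", "supply-chain",
        "data-poisoning", "improper-output-handling", "excessive-agency",
        "system-prompt-leakage", "vector-embedding-weaknesses",
        "misinformation", "unbounded-consumption",
    ],
    "owasp-api": [
        "bola", "bfla", "rbac", "pii-direct", "pii-api-db", "pii-session",
        "shell-injection", "sql-injection", "debug-access", "hijacking",
    ],
    "eu-ai-act-bias": ["bias-age", "bias-disability", "bias-gender", "bias-race"],
}


def compose(suite_ids, suites=SUITES):
    """Return the union of evaluator IDs across suites, in first-seen order."""
    seen = {}
    for suite_id in suite_ids:
        for evaluator_id in suites[suite_id]:
            # Keep only the first occurrence so shared IDs run once.
            seen.setdefault(evaluator_id, suite_id)
    return list(seen)


plan = compose(["harmful-content", "owasp-llm-top10", "owasp-api", "eu-ai-act-bias"])
print(len(plan))
```

These four suites share no evaluator IDs, so the merged list has 36 entries. The de-duplication matters for other combinations: `supply-chain`, for example, is listed under both `owasp-llm-top10` and `owasp-agentic-ai`.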

Standalone (not in any suite)

ID	Name	Severity	OWASP
`jailbreaking`	Jailbreaking	high	LLM10

Select it explicitly via `selection.evaluators: ["jailbreaking"]`.
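
A minimal sketch of that selection as a nested structure follows. It assumes the dotted path `selection.evaluators` maps to nested keys and that the config can be serialized as JSON; this page does not confirm the config file format, so treat both as assumptions.

```python
import json

# Assumption: `selection.evaluators` is read as nested keys.
# The JSON serialization is illustrative; the real config format may differ.
config = {
    "selection": {
        "evaluators": ["jailbreaking"],
    },
}

print(json.dumps(config, indent=2))
```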

MCP red-team

Suite

Suite ID	Standard / version	Count	Focus
`owasp-mcp-top10`	OWASP MCP Top 10 (2025)	14	Server-side: secret exposure, OAuth passthrough, scope escalation, supply chain, tool poisoning, command injection, SSRF, and more

Evaluators

ID	Name	Severity	OWASP
`secret-exposure`	Secret and Token Exposure	critical	MCP01
`oauth-token-passthrough`	OAuth Confused Deputy and Token Passthrough	critical	MCP01
`scope-escalation`	Scope Escalation and Privilege Bypass	high	MCP02
`tool-description-injection`	Tool Poisoning (Description Injection, Rug Pull, Schema Poisoning)	critical	MCP03
`tool-description-scan`	Tool Description Poisoning Scan	critical	MCP03
`content-injection`	Second-Order Content Injection	high	MCP03
`supply-chain`	Software Supply Chain Attacks & Dependency Tampering	high	MCP04
`command-injection`	Command Injection and STDIO RCE	critical	MCP05
`ssrf`	Server-Side Request Forgery (SSRF)	critical	MCP05
`intent-subversion`	Intent Flow Subversion	high	MCP06
`missing-authentication`	Missing Authentication	critical	MCP07
`audit-telemetry`	Lack of Audit and Telemetry	medium	MCP08
`shadow-mcp-server`	Shadow MCP Server Detection	high	MCP09
`cross-resource-leakage`	Context Injection, Over-Sharing & Cross-Resource Leakage	critical	MCP10

Auto-fired

ID	Name	Severity	OWASP
`resource-exposure`	MCP Resource Exposure	critical	MCP01

`resource-exposure` runs automatically before attacks: opfor calls `resources/list`, then `resources/read` on every listed resource, and judges the results for secret or PII exposure. Disable it with `mcp.scanResources: false` in the config.
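
The request sequence described above can be sketched as plain JSON-RPC 2.0 messages, which is the framing MCP uses for `resources/list` and `resources/read`. The helper names, the `scan_resources` parameter (standing in for `mcp.scanResources`), and the sample server reply are hypothetical. Pagination and the judging step are omitted, and this is not opfor's actual implementation.

```python
import itertools

_ids = itertools.count(1)


def rpc(method, params=None):
    """Build a JSON-RPC 2.0 request message of the kind MCP uses."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def scan_requests(list_result, scan_resources=True):
    """Return one resources/read request per resource in a resources/list result.

    scan_resources stands in for the mcp.scanResources flag: False skips the scan.
    """
    if not scan_resources:
        return []
    return [
        rpc("resources/read", {"uri": resource["uri"]})
        for resource in list_result.get("resources", [])
    ]


list_request = rpc("resources/list")

# Hypothetical server reply, trimmed to the fields used here.
list_result = {
    "resources": [
        {"uri": "file:///app/.env", "name": ".env"},
        {"uri": "file:///app/README.md", "name": "README"},
    ]
}

reads = scan_requests(list_result)
for request in reads:
    print(request["method"], request["params"]["uri"])
```

Each `resources/read` response would then be passed to whatever judge checks for secrets or PII; that step is left out because this page does not describe how the judge works.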
