Open source · 130 community tests, 380 proprietary
FILE NO. 020 / OSS · CC BY-SA 4.0

Independent tests. A quarter of them open.

130 foundational tests published on GitHub under CC BY-SA 4.0 — vetted in the wild, used in education, aligned to the OWASP LLM Top 10. 380 proprietary tests ship inside paid engagements where adaptive escalation and adversarial chaining belong.

Open tests: 130 on GitHub
Proprietary: 380 in audits
Frameworks mapped: 4
License: CC BY-SA 4.0
01 / Open source · What ships on GitHub

The baseline, in public.

Foundational OWASP LLM Top 10 coverage — the same tests that anchor every paid engagement. Vetted by the community, used by educators, free to read and free to run.

01 · The repository

130 foundational tests, every category.

Baseline coverage of the OWASP LLM Top 10. Markdown definitions with expected-behaviour indicators, severity scoring guidance, and usage notes for manual testing. Designed for transparency and education first.

MIT-aligned · CC BY-SA 4.0
02 · Why open

Trust through transparency.

The baseline builds on the OWASP LLM Top 10 and ships under CC BY-SA 4.0, in line with the OWASP license. A public methodology earns community trust. And a fair line: the foundational layer is open; the adaptive layer is what you hire us for.

No black box on the basics
03

Markdown definitions

Each test ships as a single Markdown file — attack prompt, expected refusal, expected vulnerable response, OWASP category, severity guidance. Readable in any editor.

Human-readable
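A definition file of the kind described under "Markdown definitions" could be loaded with a few lines of Python. This is a minimal sketch, not the repository's actual tooling: the file layout (a top-level title, `- Key: value` metadata bullets, and one `##` section per field), the field names, and the `EXAMPLE-001` identifier are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Definition:
    title: str
    owasp_category: str
    severity: str
    attack_prompt: str
    expected_refusal: str
    expected_vulnerable_response: str


def parse_definition(markdown: str) -> Definition:
    """Parse one test definition from the ASSUMED Markdown layout."""
    title_match = re.search(r"^# (.+)$", markdown, flags=re.MULTILINE)
    if title_match is None:
        raise ValueError("definition has no top-level heading")

    # Metadata bullets of the form "- Key: value" (hypothetical convention).
    meta = {
        key.strip().lower(): value.strip()
        for key, value in re.findall(
            r"^- ([^:\n]+):(.*)$", markdown, flags=re.MULTILINE
        )
    }

    # re.split with one capture group yields
    # [preamble, heading1, body1, heading2, body2, ...].
    parts = re.split(r"^## (.+)$", markdown, flags=re.MULTILINE)
    sections = {
        heading.strip().lower(): body.strip()
        for heading, body in zip(parts[1::2], parts[2::2])
    }

    return Definition(
        title=title_match.group(1).strip(),
        owasp_category=meta["owasp category"],
        severity=meta["severity"],
        attack_prompt=sections["attack prompt"],
        expected_refusal=sections["expected refusal"],
        expected_vulnerable_response=sections["expected vulnerable response"],
    )


# Placeholder content only; a real definition would carry real text.
SAMPLE = """\
# EXAMPLE-001: Placeholder test

- OWASP category: LLM01
- Severity: HIGH

## Attack prompt
<attack prompt text>

## Expected refusal
<what a safe response looks like>

## Expected vulnerable response
<what a failing response looks like>
"""

definition = parse_definition(SAMPLE)
print(definition.owasp_category, definition.severity)
```

Keeping every field in a plain heading or bullet means the same file stays readable in an editor and trivially machine-parseable, with no schema tooling required.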
04

Severity guides

Calibration notes for each OWASP category — what HIGH looks like, what CRITICAL looks like, and the line between them. The same scoring rubric we use internally.

Calibrated
05

OWASP-aligned

Based directly on the OWASP LLM Top 10 (2025). The same categories your security team is already reading.

LLM01–LLM10
02 / Proprietary · What stays inside engagements

The adaptive layer. Where the catastrophic findings live.

Static tests find the door. Adaptive tests walk through it. The 380 proprietary tests cover the chains, encodings, and multi-turn escalations that real attackers actually use — and that are not safe to ship publicly.

01 · Why these are not public

Adversarial chains are dual-use.

Restricted release · audit-only

A working multi-turn jailbreak, an indirect-injection vector that survives RAG retrieval, an agent-tool exploitation chain — these are operational weapons as much as test cases. They live inside paid engagements where they reach defenders, not attackers.

02

Adaptive engine

The escalation logic that took CS-001 from a single HIGH refusal to a CATASTROPHIC trigger word. Iterative attack generation, not single-shot.

Up to 7 iterations
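The difference between single-shot and iterative testing can be sketched as a bounded loop: send a prompt, score the response, refine, and stop at a severity threshold or after seven rounds. This is only an illustrative skeleton, not the proprietary engine. The `Severity` ordering, the `escalate` function, and the `send`, `judge`, and `refine` callables are hypothetical names, and the stubs below contain no attack logic.

```python
from enum import IntEnum
from typing import Callable, List, Tuple


class Severity(IntEnum):
    # ASSUMED ordering; this page names only HIGH, CRITICAL and CATASTROPHIC.
    NONE = 0
    HIGH = 1
    CRITICAL = 2
    CATASTROPHIC = 3


def escalate(
    seed: str,
    send: Callable[[str], str],
    judge: Callable[[str], Severity],
    refine: Callable[[str, str], str],
    stop_at: Severity = Severity.CATASTROPHIC,
    max_iterations: int = 7,
) -> List[Tuple[str, Severity]]:
    """Run a bounded refine-and-retry loop; return the (prompt, severity) trail."""
    trail: List[Tuple[str, Severity]] = []
    prompt = seed
    for _ in range(max_iterations):
        response = send(prompt)       # query the system under test
        severity = judge(response)    # score the response
        trail.append((prompt, severity))
        if severity >= stop_at:       # threshold reached: stop early
            break
        prompt = refine(prompt, response)  # build the next attempt
    return trail


# Stub target: it only "fails" once the prompt has been refined twice.
trail = escalate(
    seed="probe",
    send=lambda p: p,
    judge=lambda r: Severity.CATASTROPHIC if r.count("+") >= 2 else Severity.HIGH,
    refine=lambda p, r: p + "+",
)
print(len(trail), trail[-1][1].name)  # prints "3 CATASTROPHIC"
```

Returning the full trail rather than only the final result keeps every intermediate prompt and score available as evidence.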
03

Agentic chains

Multi-step tool abuse, MCP exploitation, RAG poisoning, credential harvesting, agent impersonation — 40+ chain tests built for real production architectures.

AGENTIC-001 → 043
04

Encoded & evasive

Encoded payloads, obfuscated prompts, multi-turn baiting, persona drift — every published jailbreak family plus the adaptive variants the static set won't catch.

Production-tuned
03 / Contributing · How to participate

Open a PR. Or open an engagement.

Two ways to contribute: extend the open suite on GitHub, or run a paid engagement and let your remediated findings feed back into the methodology (anonymised, with your consent).

01 · Open contributions

Pull requests welcome.

Issues, new test cases, severity refinements, framework cross-references — all welcome under CC BY-SA 4.0. Read the contribution guide on the repo before opening a PR.

02 · Run the full suite

Audit-grade evidence.

Want the 130 open tests plus the 380 adaptive ones, with forensic-grade evidence and cross-framework mapping? That's an engagement.

Based on the OWASP LLM Top 10 (2025). License: CC BY-SA 4.0.
Begin · The baseline is free; the depth is hired

Run the open tests. Then hire us for the rest.

The 130-test community edition is yours, today, on GitHub. The 380-test adaptive engine ships with your audit — evidence-grade, cross-framework, signed.