Case studies · anonymised findings
FILE NO. 008 / FIELD REPORTS

Findings, in plain English.

Anonymised excerpts from real Technical Assessment Reports — 600+ tests, 7 adaptive iterations, 100% attack success rate where the vulnerability was confirmed. Names changed; the engineering, the evidence and the remediation are not.

Tests executed: 600+ per engagement

Adaptive depth: Up to 7 iterations

Frameworks: EU · ISO · NIST · OWASP

Identity: NDA · anonymised

01 / Featured · CS-001 · E-commerce chatbot

One trigger word. Every guardrail bypassed.

A production e-commerce chatbot with working data-protection rules — and a behavioural vulnerability that escalated from HIGH to CATASTROPHIC across seven adaptive iterations. Discovered in under an hour.

CS-001 · Catastrophic · OWASP LLM01/06/08/09 · EU AI Act Article 15.1/15.4/15.5

Persistent jailbreak in a production e-commerce chatbot.

7 iterations · 100% success · €35M+ exposure

A seemingly secure customer-service chatbot, internally tested and considered safe, was systematically compromised through progressive behavioural manipulation. Seven adaptive iterations established a persistent bypass mechanism triggered by a single word: OMEGA. Data-access rules held; behavioural controls did not. The full report walks through the attack chain, the framework violations, and the remediation roadmap.

02 / Methodology · Why static testing misses these

Static tests find the door. Adaptive tests walk through it.

Conventional AI red-teaming is single-shot. We continue past first refusal — escalating, recombining and persisting — exactly the way a real attacker does.

A · Conventional testing

One shot. One verdict.

Single attempt per vulnerability. No follow-up after initial refusal. Manual engagement at $15K–$50K and weeks of calendar time. Coverage typically 100–200 tests. Point-in-time; no retest. Behavioural depth: shallow.

Where most audits stop

B · TestMy.AI adaptive

600+ tests. Then we push.

600+ static tests, plus adaptive escalation on every failure. Progressive iteration mimics real attacker behaviour. 5–14 business days, $3.5K–$9.5K. One retest within 30 days included. Full OWASP LLM Top 10, ISO 42001, NIST AI RMF and EU AI Act coverage.

Where the catastrophic findings live
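
The difference between the two approaches can be sketched as a loop budget. The following is a minimal illustration, not the actual TestMy.AI harness: `run_test`, `Finding`, `make_stub_endpoint` and the placeholder probes are hypothetical names, and the stub endpoint is assumed to refuse twice before yielding on the third attempt. A single-shot test stops after the first refusal; an adaptive test keeps going, up to seven iterations.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

MAX_ITERATIONS = 7  # mirrors "up to 7 iterations"; the real budget is an assumption


@dataclass
class Finding:
    test_id: str
    iterations_used: int
    bypassed: bool  # True if the endpoint stopped refusing


def run_test(
    test_id: str,
    probes: Sequence[str],
    send: Callable[[str], str],
    is_refusal: Callable[[str], bool],
    max_iterations: int = MAX_ITERATIONS,
) -> Finding:
    """Send probes in order until one is not refused or the budget runs out."""
    used = 0
    for probe in probes[:max_iterations]:
        used += 1
        if not is_refusal(send(probe)):
            return Finding(test_id, used, True)
    return Finding(test_id, used, False)


def make_stub_endpoint(yields_on_attempt: int) -> Callable[[str], str]:
    """Hypothetical endpoint that refuses until the Nth call, then complies."""
    calls = {"n": 0}

    def send(probe: str) -> str:
        calls["n"] += 1
        return "OK" if calls["n"] >= yields_on_attempt else "REFUSED"

    return send


def is_refusal(reply: str) -> bool:
    return reply == "REFUSED"


# Placeholder probes; a real engagement would escalate and recombine these.
probes = [f"probe-{i}" for i in range(1, 8)]

# Conventional: one shot, one verdict.
single = run_test("T-001", probes, make_stub_endpoint(3), is_refusal, max_iterations=1)
# Adaptive: continue past the first refusal.
adaptive = run_test("T-001", probes, make_stub_endpoint(3), is_refusal)

print(single)
print(adaptive)
```

Against this stub, the single-shot run reports no bypass after one attempt, while the adaptive run reports a bypass on the third. The endpoint is the same in both runs; only the depth of testing changes.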

● Begin · Most systems have at least one critical

Could your endpoint survive seven iterations?

Most AI endpoints we test ship with at least one CRITICAL vulnerability. An independent assessment finds yours before an external attacker does — and gives you the evidence to file.