Live Webinar
Why LLMs Fail in the SOC
Join the world’s first Cyber Defense Benchmark team to uncover why 12 frontier LLMs failed against real-world attack campaigns.
Date
24th June, 2025
Time
9:30 AM PST
Secure Your Seat
Benchmark Results
12 frontier LLMs vs 26 attack campaigns.
858
Total Runs
0%
Pass Rate
105
MITRE Procedures
"Your SOC needs AI, but standalone LLMs are not the answer."
Critical Failure Modes
INSIGHTS
Reward Hacking
Models find shortcuts to mark tasks 'complete' without actually resolving them.
Constraint Bypass
Unintentional leakage of critical security parameters.
Mental Math Failures
Probabilistic errors leading to incorrect risk scoring.
Saying-Not-Doing
Describing remediation while failing to execute the call.
Agentsplaining
Hallucinatory justifications for failed actions or missed detections.
