Safer Agentic AI Auto-Assessor

Assess your AI systems against the Safer Agentic AI Framework. Upload your evidence documents, receive scored evaluations powered by Claude, and pinpoint compliance gaps across all 16 safety suites.

Testing Ground

See how the auto-assessor works before committing. Browse pre-computed evaluation results using synthetic data from a fictional “Acme Autonomous Vehicles” company.

  • Pre-computed results across all 16 suites, 63 evidence requirements
  • Scores from 1 to 5 with justifications and gap analysis
  • No API key or account needed

AI Safety Assessment

Evaluate your own organization's AI systems with live Claude analysis. Paste your evidence documents and get real-time scoring against any of the 847 evidence requirements.

  • Uses your own Anthropic API key — stored in memory only, never saved
  • Live evaluation with detailed justifications and gap analysis
  • Your evidence is sent only within the API call and is never stored
  • 16 Safety Suites: 9 Drivers + 7 Inhibitors
  • 214 Sub-goals: with Safety Functional Requirements (SFRs) and evidence items
  • 847 Evidence Requirements: each individually assessable

How It Works

  1. Browse the 16 safety suites covering drivers (goals, values, security, transparency, governance) and inhibitors (deception, opacity, uncertainty).
  2. Select a sub-goal and review its Safety Functional Requirements (SFRs) and evidence requirements.
  3. Paste or upload evidence documentation and run automated evaluation against each requirement.
  4. Receive scores (1–5), justifications, identified gaps, and relevant excerpts for each criterion.
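The evaluation step (3–4) can be sketched in miniature: assemble a prompt pairing one evidence requirement with the pasted documentation, then extract the 1–5 score from the model's reply. The prompt wording, function names, and reply format here are illustrative assumptions, not the app's actual implementation, and the live Claude call itself is omitted:

```python
import re

def build_prompt(requirement: str, evidence: str) -> str:
    # Assemble one evaluation prompt (wording is a sketch, not the app's real prompt).
    return (
        "Assess the evidence below against one safety evidence requirement.\n"
        f"Requirement: {requirement}\n"
        f"Evidence:\n{evidence}\n"
        "Reply with 'Score: N' (1-5), a justification, and identified gaps."
    )

def parse_score(reply: str) -> int:
    # Pull the 1-5 score out of a model reply; fail loudly if it is absent.
    match = re.search(r"Score:\s*([1-5])\b", reply)
    if not match:
        raise ValueError("no score found in reply")
    return int(match.group(1))

prompt = build_prompt(
    "Document the system's fail-safe behavior.",
    "Section 4 describes sensor-failure fallback modes.",
)
print(parse_score("Score: 4\nJustification: most aspects addressed."))  # → 4
```

In the live tool, the prompt would be sent through the Anthropic API with your own key and the reply parsed the same way; running one call per evidence requirement keeps each score individually attributable.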

Scoring Rubric

  • 5 (Excellent): Evidence comprehensively addresses all aspects of the requirement.
  • 4 (Good): Most aspects addressed. Minor areas for improvement remain.
  • 3 (Average): Some aspects addressed, but significant gaps remain.
  • 2 (Poor): Some relevant information, but major aspects unaddressed.
  • 1 (Unacceptable): Evidence does not address the requirement at all.
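The rubric is simple enough to carry around as data. A convenience mapping (not the app's own code) that turns a numeric score back into its label and description:

```python
# Rubric labels and descriptions keyed by score, matching the scoring rubric above.
RUBRIC = {
    5: ("Excellent", "Evidence comprehensively addresses all aspects of the requirement."),
    4: ("Good", "Most aspects addressed. Minor areas for improvement remain."),
    3: ("Average", "Some aspects addressed, but significant gaps remain."),
    2: ("Poor", "Some relevant information, but major aspects unaddressed."),
    1: ("Unacceptable", "Evidence does not address the requirement at all."),
}

def label(score: int) -> str:
    # Look up the short label for a 1-5 score; raises KeyError outside that range.
    return RUBRIC[score][0]

print(label(3))  # → Average
```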