SaferAgenticAI

SAAI Auto-Assessor

LLM-powered conformity assessor for the Safer Agentic AI Framework. Upload evidence documents, select criteria, and get scored evaluations against the framework’s safety requirements.

Built with Next.js 14, Tailwind CSS, and the Anthropic API, which is called directly from the browser with no server involvement.

Quick Start

cp .env.example .env.local        # Add your Anthropic API key
npm install                        # Install dependencies
npm run dev                        # http://localhost:3000/assessor/

Project Structure

src/
  app/
    page.tsx               # Home/landing page
    assess/page.tsx        # Full assessment workflow
    playground/page.tsx    # Quick evaluation playground
    data-handling/page.tsx # Data handling info page
    layout.tsx             # Root layout
    error.tsx              # Error boundary
    api/evaluate/route.ts  # (deprecated, unused)
  components/
    CriteriaTree.tsx       # Framework criteria browser
    SuiteDetail.tsx        # Suite-level detail view
    RadarChart.tsx         # Score visualization
    DocumentIngestion.tsx  # File upload and parsing
    EvidenceEvaluator.tsx  # Evaluation orchestration
    ScoreCard.tsx          # Score display
    ApiKeyInput.tsx        # API key entry
  lib/
    types.ts               # TypeScript type definitions
    criteria.ts            # Criteria data loader
    scoring.ts             # Score calculation (weighted averages)
    client-evaluator.ts    # Anthropic API calls (browser-side)
    document-parser.ts     # PDF, DOCX, image, text extraction
    evaluator.ts           # (deprecated, unused)
  data/
    criteria-v1.json       # Framework criteria definitions
    synthetic/             # Sample evidence documents for testing

Scripts

| Command | Description |
| --- | --- |
| `npm run dev` | Start dev server on port 3000 |
| `npm run build` | Production build (static export) |
| `npm run lint` | Run Next.js linting |
| `npm test` | Run all Playwright tests |
| `npm run test:chromium` | Tests in Chromium only |
| `npm run test:loading` | App loading tests |
| `npm run test:playground` | Playground page tests |
| `npm run test:assess` | Assessment workflow tests |
| `npm run test:mobile` | Mobile responsiveness tests |
| `npm run test:a11y` | Accessibility tests |
| `npm run test:visual` | Visual design tests |
| `npm run test:nav` | Navigation tests |
| `npm run test:report` | Open HTML test report |

Running Tests

Tests use Playwright. First-time setup:

npx playwright install             # Downloads browser binaries (one-time)
npm test                           # Runs all test suites

Architecture Notes

All LLM evaluation happens client-side: the browser calls the Anthropic API directly, so the user’s API key and documents never touch our servers. The src/app/api/evaluate/route.ts and src/lib/evaluator.ts files are unused, deprecated stubs from an earlier server-side design.
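To illustrate what a browser-side call can look like, here is a minimal sketch. It is not the code in src/lib/client-evaluator.ts: the function names, prompt wording, model id, and token limit are assumptions made for illustration. The endpoint and headers are those of Anthropic's Messages API, including the opt-in header Anthropic requires for direct browser access.

```typescript
// Sketch only: function names, prompt wording, model id, and max_tokens are
// illustrative assumptions, not taken from src/lib/client-evaluator.ts.

interface AnthropicRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the request without sending it, so the shape is easy to inspect and test.
function buildEvaluationRequest(
  apiKey: string,
  criterion: string,
  evidence: string,
  model: string = "claude-3-5-sonnet-latest", // assumed model id; substitute your own
): AnthropicRequest {
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
        // Opt-in header Anthropic requires for calls made directly from a browser.
        "anthropic-dangerous-direct-browser-access": "true",
      },
      body: JSON.stringify({
        model,
        max_tokens: 1024,
        messages: [
          {
            role: "user",
            content: `Criterion:\n${criterion}\n\nEvidence:\n${evidence}\n\nScore the evidence from 1 to 5 and justify the score.`,
          },
        ],
      }),
    },
  };
}

// Sends the request straight from the browser; nothing passes through an app server.
async function evaluateInBrowser(apiKey: string, criterion: string, evidence: string): Promise<string> {
  const { url, init } = buildEvaluationRequest(apiKey, criterion, evidence);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Anthropic API request failed with status ${response.status}`);
  }
  const data = await response.json();
  // The Messages API returns an array of content blocks; text blocks carry a `text` field.
  return data.content[0].text;
}
```

Separating request construction from the network call keeps the key handling visible in one place: the key only ever appears in the `x-api-key` header of a request sent to api.anthropic.com.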

Scoring uses a 1-5 scale (1=Unacceptable, 2=Poor, 3=Average, 4=Good, 5=Excellent). Aggregate scores are weighted averages in which normative SFRs carry a 1.5x weight. See src/lib/scoring.ts for the calculation logic.
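As a rough illustration of the weighting, the sketch below computes a weighted average over a set of scores. The type and function names are hypothetical, and the weight of 1.0 for non-normative items is an assumption; src/lib/scoring.ts remains the authoritative implementation.

```typescript
// Sketch only: names are hypothetical and the non-normative weight of 1.0 is
// an assumption. See src/lib/scoring.ts for the real calculation.

interface CriterionScore {
  id: string;
  score: 1 | 2 | 3 | 4 | 5; // 1=Unacceptable ... 5=Excellent
  normative: boolean;
}

const NORMATIVE_WEIGHT = 1.5;
const DEFAULT_WEIGHT = 1.0; // assumed weight for non-normative items

// Returns the weighted mean of the scores, or null when there is nothing to average.
function weightedAverage(scores: CriterionScore[]): number | null {
  if (scores.length === 0) return null;
  let weightedSum = 0;
  let totalWeight = 0;
  for (const s of scores) {
    const weight = s.normative ? NORMATIVE_WEIGHT : DEFAULT_WEIGHT;
    weightedSum += s.score * weight;
    totalWeight += weight;
  }
  return weightedSum / totalWeight;
}

const example: CriterionScore[] = [
  { id: "A", score: 4, normative: true },
  { id: "B", score: 2, normative: false },
];
console.log(weightedAverage(example)); // (4 * 1.5 + 2 * 1.0) / 2.5 = 3.2
```

With equal weights the two scores above would average 3.0; the 1.5x weight pulls the result toward the normative item's score.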