SaferAgenticAI

SAAI Auto-Assessor

LLM-powered conformity assessor for the Safer Agentic AI Framework. Upload evidence documents, select criteria, and get scored evaluations against the framework’s safety requirements.

Built with Next.js 14, Tailwind CSS, and the Anthropic API. The API is called directly from the browser; no application server is involved.

Quick Start

cp .env.example .env.local        # Copy the template, then add your Anthropic API key to .env.local
npm install                        # Install dependencies
npm run dev                        # http://localhost:3000/assessor/

Project Structure

src/
  app/
    page.tsx               # Home/landing page
    assess/page.tsx        # Full assessment workflow
    playground/page.tsx    # Quick evaluation playground
    data-handling/page.tsx # Data handling info page
    layout.tsx             # Root layout
    error.tsx              # Error boundary
    api/evaluate/route.ts  # (deprecated, unused)
  components/
    CriteriaTree.tsx       # Framework criteria browser
    SuiteDetail.tsx        # Suite-level detail view
    RadarChart.tsx         # Score visualization
    DocumentIngestion.tsx  # File upload and parsing
    EvidenceEvaluator.tsx  # Evaluation orchestration
    ScoreCard.tsx          # Score display
    ApiKeyInput.tsx        # API key entry
  lib/
    types.ts               # TypeScript type definitions
    criteria.ts            # Criteria data loader
    scoring.ts             # Score calculation (weighted averages)
    client-evaluator.ts    # Anthropic API calls (browser-side)
    document-parser.ts     # PDF, DOCX, image, text extraction
    evaluator.ts           # (deprecated, unused)
  data/
    criteria-v1.json       # Framework criteria definitions
    synthetic/             # Sample evidence documents for testing

Scripts

Command                    Description
npm run dev                Start dev server on port 3000
npm run build              Production build (static export)
npm run lint               Run Next.js linting
npm test                   Run all Playwright tests
npm run test:chromium      Tests in Chromium only
npm run test:loading       App loading tests
npm run test:playground    Playground page tests
npm run test:assess        Assessment workflow tests
npm run test:mobile        Mobile responsiveness tests
npm run test:a11y          Accessibility tests
npm run test:visual        Visual design tests
npm run test:nav           Navigation tests
npm run test:report        Open HTML test report

Running Tests

Tests use Playwright. First-time setup:

npx playwright install             # Downloads browser binaries (one-time)
npm test                           # Runs all test suites

Architecture Notes

All LLM evaluation happens client-side. The user’s API key and documents are sent directly from the browser to the Anthropic API and never pass through our servers. The src/app/api/evaluate/route.ts and src/lib/evaluator.ts files are deprecated, unused stubs from an earlier server-side design.
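As a rough illustration of what a browser-side call involves, the sketch below builds (but does not send) a request to the Anthropic Messages API. It is not taken from src/lib/client-evaluator.ts: the function name buildEvaluationRequest, the BrowserRequest type, the prompt wording, and the max_tokens value are all assumptions made for this example, and the model name is left to the caller.

```typescript
// Illustrative sketch only; the real code in src/lib/client-evaluator.ts
// may be structured differently. All names below are hypothetical.

interface BrowserRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildEvaluationRequest(
  apiKey: string,
  model: string,      // supplied by the caller; no specific model is assumed
  criterion: string,  // text of the criterion being assessed
  evidence: string,   // text extracted from the uploaded document
): BrowserRequest {
  if (apiKey.trim() === "") {
    throw new Error("An Anthropic API key is required");
  }
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
        // Opt-in header for calling the API directly from a browser.
        "anthropic-dangerous-direct-browser-access": "true",
      },
      body: JSON.stringify({
        model,
        max_tokens: 1024, // assumed value for this sketch
        messages: [
          {
            role: "user",
            // Assumed prompt wording, not the project's actual prompt.
            content: `Criterion:\n${criterion}\n\nEvidence:\n${evidence}\n\nScore the evidence from 1 to 5.`,
          },
        ],
      }),
    },
  };
}
```

The result can be passed straight to fetch(request.url, request.init) in the browser, so the key and the document text go only to the Anthropic endpoint.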

Scores are on a 1-5 scale (1=Unacceptable, 2=Poor, 3=Average, 4=Good, 5=Excellent). Aggregate scores are weighted averages in which normative SFRs count 1.5x. See src/lib/scoring.ts for the calculation logic.
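A minimal sketch of such a weighted average follows. It is not the code in src/lib/scoring.ts: the CriterionScore shape, the weightedAverage name, and the weight of 1.0 for non-normative items are assumptions made for illustration; only the 1-5 range and the 1.5x normative weight come from the description above.

```typescript
// Hypothetical shape; the real types live in src/lib/types.ts and may differ.
interface CriterionScore {
  id: string;
  score: number;       // 1 (Unacceptable) to 5 (Excellent)
  normative: boolean;  // true for normative SFRs
}

const NORMATIVE_WEIGHT = 1.5; // from the scoring description
const DEFAULT_WEIGHT = 1.0;   // assumed weight for everything else

// Returns the weighted mean of the scores, or null when there is nothing to average.
function weightedAverage(scores: CriterionScore[]): number | null {
  if (scores.length === 0) return null;

  let weightedSum = 0;
  let totalWeight = 0;
  for (const s of scores) {
    if (!(s.score >= 1 && s.score <= 5)) {
      throw new RangeError(`Score for ${s.id} must be between 1 and 5`);
    }
    const weight = s.normative ? NORMATIVE_WEIGHT : DEFAULT_WEIGHT;
    weightedSum += s.score * weight;
    totalWeight += weight;
  }
  return weightedSum / totalWeight;
}

// One normative item scored 4 and one non-normative item scored 2:
// (4 * 1.5 + 2 * 1.0) / (1.5 + 1.0) = 8 / 2.5 = 3.2
const example = weightedAverage([
  { id: "A", score: 4, normative: true },
  { id: "B", score: 2, normative: false },
]);
console.log(example); // 3.2
```

Under these assumptions a normative item pulls the aggregate toward its own score more strongly than a non-normative one: an unweighted mean of the same two scores would be 3.0.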