MCP Server

Serve the Safer Agentic AI framework — canonical criteria plus 233 Implementation Patterns — to Claude Code, Cursor, Windsurf, or any MCP client, so your coding assistant grounds safety recommendations in the canonical Drivers, Inhibitors, and patterns as you write code.

What you get

  • 233 Implementation Patterns — one concrete pattern per subgoal: 116 code-applicable, 55 governance, 40 process, 17 ecosystem. Each anchored to its SFRs and evidence requirements.
  • 10 MCP tools for lookup, filtered listing, field-weighted search, cross-reference navigation, and task-first pattern discovery.
  • Task-first lookup — ask “what patterns apply to a tool-using agent executing shell commands?” and the server returns relevant patterns grouped by suite, ranked by field-weighted keyword matching.
  • Hot reload — pattern YAML edits are picked up on each tool call, no restart needed.

The Implementation Patterns layer is developer guidance derived from the framework — not normative. Compliance claims anchor to the framework itself. Patterns help teams get there.

Worked example

Inside Claude Code, ask:

Use the saferagenticai MCP to find the framework guidance
relevant to building an agent that runs untrusted shell commands.

The assistant calls find_patterns_for_task. Trimmed response:

{
  "task": "sandbox an autonomous agent executing untrusted shell commands...",
  "keywords_used": ["sandbox", "autonomous", "executing", "untrusted", "shell", "commands", "network", "access"],
  "total_matches": 171,
  "flat_top": [
    {
      "pattern_id": "D3::idx2::sandboxing",
      "display_id": "D3.2",
      "title": "Sandboxing",
      "score": 61, "matched_in": "title",
      "confidence": "high", "needs_human_review": false
    },
    {
      "pattern_id": "D6::idx9::management-of-access-and-usage-restricti",
      "display_id": "D6.9",
      "title": "Management of Access and Usage Restrictions",
      "score": 25, "matched_in": "title"
    }
    // ...truncated
  ],
  "next_step_hint": "For each pattern_id above, call get_requirement(id, ...)"
}

The assistant then calls get_requirement("D3::idx2::sandboxing") for the full pattern: five named design patterns (multi-layer isolation, pre-validation gate, environment-scoped credentials, staging fidelity, deny-logged), five anti-patterns, pseudocode, and evidence-production hints. It then grounds its recommendation in that material rather than in training data alone.

Note the score=25 hit on D6.9. That's the system telling you a sandboxing question also implicates access-management. Cross-cutting design questions hit multiple suites by design.

Install

Pick the path that matches your setup.

1. From PyPI (recommended)

pipx install saferagenticai-mcp
# or, with the modern uv toolchain:
uv tool install saferagenticai-mcp
# or plain pip:
pip install --user saferagenticai-mcp

The wheel bundles criteria-v1.json, 233 pattern YAMLs, and 4 exemplars, so it works without any repo checkout.

For audit trails, pin the version so reviewers can reproduce results: pipx install saferagenticai-mcp==0.3.0. Older releases (0.2.0) lack the verbosity param on find_patterns_for_task / search_patterns.
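A minimal sketch of checking the installed release before relying on the verbosity parameter. It assumes the distribution name matches the PyPI name used above and that version strings are plain dotted integers; parse_version, supports_verbosity, and installed_version are hypothetical helpers, not part of the package.

```python
from importlib import metadata

# Per the note above: 0.2.0 lacks verbosity, 0.3.0 has it.
MIN_VERBOSITY_VERSION = (0, 3, 0)


def parse_version(text):
    """Turn '0.3.0' into (0, 3, 0). Assumes plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def supports_verbosity(version_text):
    """True if this release should accept the verbosity parameter."""
    return parse_version(version_text) >= MIN_VERBOSITY_VERSION


def installed_version(dist_name="saferagenticai-mcp"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("saferagenticai-mcp is not installed in this environment")
    else:
        print(found, "supports verbosity:", supports_verbosity(found))
```

Run it with the interpreter of the environment that holds the package; a pipx or uv tool install lives in its own isolated environment.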

2. With uvx (no manual install)

If you have uv installed, your MCP client can launch the server on demand:

uvx saferagenticai-mcp

uv handles isolation and caches the install. This works well as a single-command entry in ~/.claude/mcp.json.

3. Manual venv (editable checkout)

If you have a local checkout and want pattern edits picked up live:

python3 -m venv research/mcp/.venv
research/mcp/.venv/bin/pip install -e research/mcp/server

Produces research/mcp/.venv/bin/saferagenticai-mcp.

Configure (Claude Code)

Add the server to ~/.claude/mcp.json (or your IDE's MCP config). Restart Claude Code after editing; the server loads on the first tool call.

With pipx or PyPI install

{
  "mcpServers": {
    "saferagenticai": {
      "command": "saferagenticai-mcp"
    }
  }
}

With uvx

{
  "mcpServers": {
    "saferagenticai": {
      "command": "uvx",
      "args": ["saferagenticai-mcp"]
    }
  }
}

With a manual venv checkout

Use the absolute path to the venv binary:

"command": "/absolute/path/to/research/mcp/.venv/bin/saferagenticai-mcp"

Cursor / Windsurf

Same shape, different file. For Cursor, edit ~/.cursor/mcp.json:

{
  "mcpServers": {
    "saferagenticai": {
      "command": "uvx",
      "args": ["saferagenticai-mcp"]
    }
  }
}

Windsurf uses ~/.codeium/windsurf/mcp_config.json with the same JSON body. Any MCP-compatible client that supports stdio servers should work; only the config-file location differs.
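If a config file already lists other servers, hand-editing risks clobbering them. The sketch below merges one entry into an existing config. It assumes only the JSON shape shown above; add_server and update_config_file are hypothetical helpers written for illustration.

```python
import json
from pathlib import Path


def add_server(config, name, command, args=None):
    """Return a copy of config with one entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    entry = {"command": command}
    if args:
        entry["args"] = list(args)
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path, name, command, args=None):
    """Read path (if it exists), add the entry, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    updated = add_server(config, name, command, args)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

For Cursor, for example, the call would be update_config_file(Path.home() / ".cursor" / "mcp.json", "saferagenticai", "uvx", ["saferagenticai-mcp"]). Back up the file first if it holds other entries.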

Verify it works

After restarting your client, two checks confirm the wiring end-to-end:

1. Transport check (in Claude Code)

Type /mcp. saferagenticai should appear with a connected status and 10 tools listed. If it shows as failed, jump to troubleshooting.

2. Data check (no client needed)

python3 -c "
from saferagenticai_mcp.framework_loader import load_framework
idx = load_framework()
print(f'{len(idx.subgoals)} subgoals, '
      f'{sum(1 for s in idx.subgoals.values() if s.has_pattern)} with patterns')
"

Expect 233 subgoals, 233 with patterns. Anything less means the data files didn't ship correctly — reinstall with pipx reinstall saferagenticai-mcp.

Troubleshooting

Server doesn't appear in /mcp output

Restart Claude Code after editing ~/.claude/mcp.json. The config is read once at startup; mid-session edits aren't picked up.

command not found: saferagenticai-mcp

The binary isn't on the PATH the client launches with. Three fixes, in order of preference:

  • Install via pipx (puts the binary in ~/.local/bin, on most PATHs) instead of a manual venv.
  • Use the absolute path in command: /Users/you/.../research/mcp/.venv/bin/saferagenticai-mcp.
  • Switch to uvx saferagenticai-mcp; uv resolves the binary itself.

Server connects but tool calls return errors

Run the data check above directly. If it errors, the bundled data is missing or the package version is too old. Upgrade with pipx upgrade saferagenticai-mcp, or pin explicitly: pipx install saferagenticai-mcp==0.3.0 --force.

I edited a pattern YAML and the server returns the old answer

Hot-reload is suite-aware: it stat-walks research/mcp/suites/ on every tool call. It only fires for editable installs (mode 1 in framework_loader.py). PyPI / wheel installs ship a frozen copy of the data, so pattern edits in your local repo won't be visible. Reinstall in editable mode to get hot-reload: pip install -e research/mcp/server.
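To make "stat-walk" concrete, here is a minimal sketch of detecting edited YAML files by comparing file metadata between two walks. It illustrates the general technique only; it is not the code in framework_loader.py, and snapshot and changed_files are hypothetical names.

```python
import os


def snapshot(root):
    """Map each YAML file under root to its (mtime_ns, size)."""
    seen = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith((".yaml", ".yml")):
                full = os.path.join(dirpath, filename)
                info = os.stat(full)
                seen[full] = (info.st_mtime_ns, info.st_size)
    return seen


def changed_files(before, after):
    """Paths added, removed, or modified between two snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

A loader built this way would take a snapshot on each tool call and re-parse only the paths that changed_files reports.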

Search returns confidently wrong-looking results

Search is keyword-only with substring matching, so "coding" matches inside "encoding". The internal eval (below) measures top-3 hit rate at 61.9%. Until embedding search is added, treat low-score (< 15) hits with skepticism and prefer hits where matched_in: title.
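The triage advice can be applied mechanically on the client side. The sketch below assumes hits shaped like the flat_top entries in the worked example (score and matched_in keys); the triage helper and its tie-breaking order are illustrative choices, not server behaviour.

```python
def triage(hits, min_score=15):
    """Split hits into (keep, doubtful).

    Hits scoring below min_score go to doubtful. The rest are ordered
    with title matches first, then by descending score.
    """
    keep = [h for h in hits if h.get("score", 0) >= min_score]
    doubtful = [h for h in hits if h.get("score", 0) < min_score]
    keep.sort(key=lambda h: (h.get("matched_in") != "title", -h.get("score", 0)))
    return keep, doubtful


hits = [
    {"display_id": "D6.9", "score": 25, "matched_in": "title"},
    {"display_id": "X.1", "score": 30, "matched_in": "body"},  # hypothetical hit
    {"display_id": "D3.2", "score": 61, "matched_in": "title"},
    {"display_id": "X.2", "score": 9, "matched_in": "body"},   # hypothetical hit
]
keep, doubtful = triage(hits)
print([h["display_id"] for h in keep])      # ['D3.2', 'D6.9', 'X.1']
print([h["display_id"] for h in doubtful])  # ['X.2']
```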

Tools (10 total)

Tool | Input | Returns
list_suites | (none) | 16 suites with titles and subgoal counts
get_requirement | id, include_pattern | One subgoal plus its Pattern layer; falls back to fuzzy candidates if no exact match
list_requirements | suite / type / content_type / confidence filters | Filtered subgoal list with reliability signals
search_patterns | query, limit, verbosity | Field-weighted ranked matches with matched_in and (in full mode) snippets + confidence flags
get_cross_references | id, include_inferred | Outgoing adjacencies
get_reverse_references | id | Incoming adjacencies (who cites this pattern)
resolve_id | query | Canonicalise a partial id, slug fragment, or display_id; always returns candidates
find_patterns_for_task | task, limit, verbosity | Top patterns grouped by suite for a task description; defaults to compact mode for cheap triage
list_unreviewed | limit | Patterns without reviewed_by, sorted low-confidence first
review_stats | (none) | Coverage %, per-suite, per-confidence; plus validation issue count
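MCP clients normally issue these calls for you. For orientation, the sketch below builds the JSON-RPC 2.0 message that an MCP tools/call request carries, using the tool and argument names from the table. The task text and limit value are illustrative, and a real session also needs the MCP initialize handshake before any tool call.

```python
import json


def tool_call_request(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 'tools/call' request as used by MCP."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


request = tool_call_request(
    1,
    "find_patterns_for_task",
    {"task": "agent that runs untrusted shell commands", "limit": 5},
)
print(json.dumps(request, indent=2))
```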

Trust signals in every result

Each pattern hit carries provenance so an LLM (or a human reviewer) can weight it appropriately:

  • confidence: low / medium / high, the drafting subagent's honest assessment. Current spread: 62 high, 142 medium, 10 low.
  • needs_human_review — flagged when the subagent saw an ambiguity in the normative content (e.g., scope mismatch between an SFR and its subgoal). 144 of 233 patterns currently carry this flag.
  • reviewed_by — populated when a human redlines and signs off on a pattern. Currently 0/233; Phase 3 in progress.

If you're using the MCP to draft compliance evidence, prefer high-confidence + reviewed patterns; treat medium-confidence drafts as a starting point, not as authority. Compliance claims always anchor to the canonical layer, never to the Pattern layer.
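One way to act on these signals is to order candidate patterns by trust before drafting evidence. The sketch below assumes each hit exposes the three fields listed above; the precedence (reviewed first, then unflagged, then confidence) is an illustrative choice, and trust_key and order_by_trust are hypothetical helpers.

```python
CONFIDENCE_RANK = {"high": 0, "medium": 1, "low": 2}


def trust_key(pattern):
    """Sort key: reviewed first, then unflagged, then by confidence."""
    return (
        not pattern.get("reviewed_by"),
        bool(pattern.get("needs_human_review")),
        CONFIDENCE_RANK.get(pattern.get("confidence"), len(CONFIDENCE_RANK)),
    )


def order_by_trust(patterns):
    """Return patterns sorted from most to least trustworthy."""
    return sorted(patterns, key=trust_key)
```

Unknown or missing confidence values sort last, so a malformed entry never outranks a labelled one.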

Search quality

Search is field-weighted keyword matching, not semantic. A 21-task internal smoke test (research/mcp/eval/ranking_eval.yaml) gives current numbers:

33.3% top-1 hit
61.9% top-3 hit
0.52 MRR (limit 10)

Eval authored by the same team that built the search; not blinded. Treat as a smoke test, not a benchmark.
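For readers unfamiliar with these metrics, here is a sketch of how top-k hit rate and mean reciprocal rank are conventionally computed. It assumes the standard definitions; the eval script's exact method is not shown here, and ranking_metrics is a hypothetical helper. With 21 tasks, the reported percentages correspond to 7 and 13 tasks respectively.

```python
def ranking_metrics(ranks, limit=10):
    """Compute top-1, top-3, and MRR.

    ranks holds the 1-based rank of the expected pattern for each task,
    or None when it was not returned. Ranks beyond limit count as misses.
    """
    total = len(ranks)
    if total == 0:
        return {"top1": 0.0, "top3": 0.0, "mrr": 0.0}
    found = [r for r in ranks if r is not None and r <= limit]
    return {
        "top1": sum(1 for r in found if r == 1) / total,
        "top3": sum(1 for r in found if r <= 3) / total,
        "mrr": sum(1.0 / r for r in found) / total,
    }


# Toy data, not the real eval: four tasks, one miss.
print(ranking_metrics([1, 2, None, 4]))
# {'top1': 0.25, 'top3': 0.5, 'mrr': 0.4375}
print(round(7 / 21 * 100, 1), round(13 / 21 * 100, 1))  # 33.3 61.9
```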

The misses are instructive. "Psychological safety for engineers raising concerns" doesn't find D7.2 Culture of Safety because the canonical text uses different vocabulary. "Scope what tools an agent is allowed to call" doesn't find D3.1 Authorization for the same reason. Vocabulary misses like these are the argument for embedding-based search, which would be worth adding at ~10× the current scale; for now, prefer hits where matched_in: title and treat low-score results skeptically.

Field weights (transparent so you can reason about ranking):

Field | Weight | Notes
title | 10× | Highest signal; canonical language
summary | | Pattern-author one-paragraph framing
sfr | | Normative requirement text
description | | Canonical subgoal description
body | | Catch-all over the full pattern YAML

The result's matched_in field tells you which weighted field produced the hit, so you can spot e.g. a body-only match (often spurious) vs. a title hit (usually solid).
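A rough sketch of field-weighted substring scoring, to show why a body-only hit deserves less trust. Only the title weight of 10 comes from the table; the body weight of 1, the additive formula, and the rule for choosing matched_in are assumptions made for illustration, not the server's actual scoring.

```python
def score_pattern(pattern, keywords, weights):
    """Sum field weights for every (keyword, field) substring match.

    Returns (score, matched_in), where matched_in is the
    highest-weight field that matched, or None.
    """
    score = 0
    matched_fields = []
    for field, weight in weights.items():
        text = str(pattern.get(field, "")).lower()
        hits = sum(1 for kw in keywords if kw.lower() in text)
        if hits:
            score += weight * hits
            matched_fields.append((weight, field))
    matched_in = max(matched_fields)[1] if matched_fields else None
    return score, matched_in


# Hypothetical pattern text and weights (body weight is a placeholder).
pattern = {
    "title": "Sandboxing",
    "body": "Isolate shell execution; check output encoding.",
}
weights = {"title": 10, "body": 1}

print(score_pattern(pattern, ["sandbox", "shell"], weights))  # (11, 'title')
print(score_pattern(pattern, ["coding"], weights))            # (1, 'body')
```

The second call is the substring pitfall in miniature: "coding" matches inside "encoding", producing a low-score, body-only hit.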

Inferred cross-references: get_cross_references(include_inferred=true) returns same-suite siblings only. There is no embedding-similarity component. Treat inferred entries as "neighbours worth scanning," not as endorsed dependencies.
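Same-suite siblings can be derived from ids alone. The sketch below assumes the suite is the first "::"-separated segment of a pattern_id, as in D3::idx2::sandboxing; suite_of and inferred_siblings are hypothetical helpers, and the D3::idx1 id is made up for the example.

```python
def suite_of(pattern_id):
    """'D3::idx2::sandboxing' -> 'D3' (assumes suite is the first segment)."""
    return pattern_id.split("::", 1)[0]


def inferred_siblings(pattern_id, all_ids):
    """Other ids in the same suite, in their original order."""
    suite = suite_of(pattern_id)
    return [
        pid for pid in all_ids
        if pid != pattern_id and suite_of(pid) == suite
    ]


all_ids = [
    "D3::idx1::authorization",  # hypothetical id
    "D3::idx2::sandboxing",
    "D6::idx9::management-of-access-and-usage-restricti",
]
print(inferred_siblings("D3::idx2::sandboxing", all_ids))
# ['D3::idx1::authorization']
```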

Data sources

  • Canonical framework — extracted from the Safer Agentic AI Recommended Practices and shipped as criteria-v1.json.
  • Pattern layer — one YAML file per subgoal under suites/.
  • Exemplars — four hand-written anchor patterns used as few-shot references (D3.2 Sandboxing, D7.2 Culture of Safety, D9.2 Compliance, I7.3).

At startup the server loads both layers and builds an in-memory index keyed by pattern_id. display_id lookups (e.g. D3.2) are also supported and may resolve to multiple subgoals when underlined variants exist.
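A minimal sketch of such an index: one dict keyed by pattern_id plus a secondary map from display_id to a list of ids, which is what lets a display_id resolve to several subgoals. The function names and sample ids are hypothetical; the server's internal structures may differ.

```python
from collections import defaultdict


def build_index(patterns):
    """Return (by_id, by_display) for a list of pattern dicts."""
    by_id = {}
    by_display = defaultdict(list)
    for pattern in patterns:
        by_id[pattern["pattern_id"]] = pattern
        by_display[pattern["display_id"]].append(pattern["pattern_id"])
    return by_id, dict(by_display)


def lookup(by_id, by_display, key):
    """Return matching pattern_ids for an exact id or a display_id."""
    if key in by_id:
        return [key]
    return list(by_display.get(key, []))


patterns = [
    {"pattern_id": "D3::idx2::sandboxing", "display_id": "D3.2"},
    {"pattern_id": "D3::idx2::variant", "display_id": "D3.2"},  # hypothetical
]
by_id, by_display = build_index(patterns)
print(lookup(by_id, by_display, "D3.2"))
# ['D3::idx2::sandboxing', 'D3::idx2::variant']
```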

Two-layer authorship

The framework is served in two layers with different review bars and update cadences:

  • Canonical layer — the normative SFRs and evidence requirements, authored by Nell Watson and Ali Hessami. Stable, auditable, versioned via criteria-v1.json.
  • Implementation Patterns layer — LLM-drafted, human-reviewed guidance that translates normative requirements into code patterns, governance templates, and process scaffolds. Versioned independently.

Every pattern entry carries provenance: drafted_by, anchor_exemplar, confidence, needs_human_review, and reviewed_by once redlined.

Versioning & drift

  • Canonical framework — follows criteria-v1.json's version field.
  • Pattern layer: v1-draft during active review; v1 once all patterns have a human reviewed_by.
  • Server — semantic versioning. 0.3.0 introduces verbosity on the search tools; pin if you want reproducibility.
  • Drift audit: research/mcp/audit_pattern_drift.py compares each pattern's anchored SFR/evidence letters against the current canonical and flags new, deleted, or renamed entries. Run after any framework edit; non-zero exit means at least one pattern needs re-anchoring.
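The core of a drift check is a set comparison. The sketch below is not audit_pattern_drift.py; it assumes both layers can be reduced to a mapping from subgoal id to a set of SFR/evidence letters, and it reports only added and removed letters (rename detection is out of scope). The function names and sample data are hypothetical.

```python
def drift_report(anchored, canonical):
    """Compare anchored letters per subgoal against the canonical set.

    Both arguments map subgoal id -> set of SFR/evidence letters.
    """
    report = {}
    for subgoal in sorted(anchored.keys() | canonical.keys()):
        have = anchored.get(subgoal, set())
        want = canonical.get(subgoal, set())
        stale = sorted(have - want)    # anchored but no longer canonical
        missing = sorted(want - have)  # canonical but not anchored yet
        if stale or missing:
            report[subgoal] = {"stale": stale, "missing": missing}
    return report


def exit_code(report):
    """Non-zero when at least one subgoal needs re-anchoring."""
    return 1 if report else 0


anchored = {"D3.2": {"a", "b", "c"}, "D6.9": {"a"}}   # hypothetical data
canonical = {"D3.2": {"a", "b", "d"}, "D6.9": {"a"}}  # hypothetical data
report = drift_report(anchored, canonical)
print(report)             # {'D3.2': {'stale': ['c'], 'missing': ['d']}}
print(exit_code(report))  # 1
```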

Scope & limits

  • Transport — stdio only; no remote or authenticated transport.
  • Search — field-weighted keyword scoring. Sufficient at 233 patterns; embedding-based semantic search would be worth it at ~10× this scale.
  • Read-only — no mark_reviewed write tool. Review edits go through the YAML files directly so editor + git diff stay auditable.

Questions or issues? Reach us through the contact form on the homepage.

Last updated 2026-04-18 · Server v0.3.0 · Pattern layer v1-draft