MCP Server
Serve the Safer Agentic AI framework — canonical criteria plus 233 Implementation Patterns — to Claude Code, Cursor, Windsurf, or any MCP client, so your coding assistant grounds safety recommendations in the canonical Drivers, Inhibitors, and patterns as you write code.
What you get
- 233 Implementation Patterns — one concrete pattern per subgoal: 116 code-applicable, 55 governance, 40 process, 17 ecosystem. Each anchored to its SFRs and evidence requirements.
- 10 MCP tools for lookup, filtered listing, field-weighted search, cross-reference navigation, and task-first pattern discovery.
- Task-first lookup — ask “what patterns apply to a tool-using agent executing shell commands?” and the server returns relevant patterns grouped by suite, ranked by field-weighted keyword matching.
- Hot reload: in editable installs, pattern YAML edits are picked up on each tool call, with no restart needed. PyPI / wheel installs ship a frozen copy of the data.
The Implementation Patterns layer is developer guidance derived from the framework — not normative. Compliance claims anchor to the framework itself. Patterns help teams get there.
Worked example
Inside Claude Code, ask:
Use the saferagenticai MCP to find the framework guidance
relevant to building an agent that runs untrusted shell commands.
The assistant calls find_patterns_for_task. Trimmed response:
{
"task": "sandbox an autonomous agent executing untrusted shell commands...",
"keywords_used": ["sandbox", "autonomous", "executing", "untrusted", "shell", "commands", "network", "access"],
"total_matches": 171,
"flat_top": [
{
"pattern_id": "D3::idx2::sandboxing",
"display_id": "D3.2",
"title": "Sandboxing",
"score": 61, "matched_in": "title",
"confidence": "high", "needs_human_review": false
},
{
"pattern_id": "D6::idx9::management-of-access-and-usage-restricti",
"display_id": "D6.9",
"title": "Management of Access and Usage Restrictions",
"score": 25, "matched_in": "title"
}
// ...truncated
],
"next_step_hint": "For each pattern_id above, call get_requirement(id, ...)"
}
The assistant then calls get_requirement("D3::idx2::sandboxing") for the full pattern —
five named design patterns (multi-layer isolation, pre-validation gate, environment-scoped credentials,
staging fidelity, deny-logged), five anti-patterns, pseudocode, and evidence-production hints —
and grounds its recommendation in that material rather than in training data alone.
Note the score=25 hit on D6.9. That's the system telling you a sandboxing question also implicates access-management. Cross-cutting design questions hit multiple suites by design.
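A minimal sketch of how a client might triage the trimmed response above before fetching full patterns. The `pick_followups` helper and its `min_score` default are assumptions for illustration, not part of the server; only the field names come from the response shown. The cut-off of 15 mirrors the low-score threshold mentioned under Troubleshooting.

```python
# Hypothetical client-side triage of a find_patterns_for_task response.
# Only the field names (flat_top, pattern_id, score, matched_in) come from
# the trimmed response above; the helper itself is illustrative.

response = {
    "total_matches": 171,
    "flat_top": [
        {"pattern_id": "D3::idx2::sandboxing", "display_id": "D3.2",
         "title": "Sandboxing", "score": 61, "matched_in": "title"},
        {"pattern_id": "D6::idx9::management-of-access-and-usage-restricti",
         "display_id": "D6.9",
         "title": "Management of Access and Usage Restrictions",
         "score": 25, "matched_in": "title"},
    ],
}


def pick_followups(resp, min_score=15):
    """Return pattern_ids worth a get_requirement call, best score first."""
    hits = [h for h in resp["flat_top"] if h["score"] >= min_score]
    hits.sort(key=lambda h: h["score"], reverse=True)
    return [h["pattern_id"] for h in hits]


followups = pick_followups(response)
print(followups)
```

Raising `min_score` to 30 would keep only the D3.2 hit, which is one way to see how much of a result set rests on weaker matches.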
Install
Pick the path that matches your setup.
1 From PyPI (recommended)
pipx install saferagenticai-mcp
# or, with the modern uv toolchain:
uv tool install saferagenticai-mcp
# or plain pip:
pip install --user saferagenticai-mcp
The wheel bundles criteria-v1.json, 233 pattern YAMLs, and 4 exemplars, so it works
without any repo checkout.
For audit trails, pin the version so reviewers can reproduce results:
pipx install saferagenticai-mcp==0.3.0. Older releases (0.2.0) lack
the verbosity parameter on find_patterns_for_task and
search_patterns.
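A small sketch of a version check that a setup script could run before relying on `verbosity`. The helper names are hypothetical, and the parser assumes plain MAJOR.MINOR.PATCH strings (a pre-release tag such as `rc1` would need extra handling). Because pipx installs into its own environment, `installed_release` only sees the package when run with that environment's interpreter.

```python
# Sketch: check whether a release string is new enough for `verbosity`.
# Helper names are illustrative; assumes numeric MAJOR.MINOR.PATCH versions.
from importlib.metadata import PackageNotFoundError, version

MIN_VERBOSITY_VERSION = (0, 3, 0)


def supports_verbosity(release: str) -> bool:
    """True if `release` is at or above 0.3.0."""
    parts = tuple(int(p) for p in release.split(".")[:3])
    return parts >= MIN_VERBOSITY_VERSION


def installed_release(dist: str = "saferagenticai-mcp"):
    """Return the installed version string, or None if not installed here."""
    try:
        return version(dist)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_release()
    if found is None:
        print("saferagenticai-mcp is not installed in this environment")
    else:
        print(found, "supports verbosity:", supports_verbosity(found))
```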
2 With uvx (no manual install)
If you have uv installed, your MCP client can launch the server on demand:
uvx saferagenticai-mcp
uv handles isolation and caches the install. This works well as a single-command entry
in ~/.claude/mcp.json.
3 Manual venv (editable checkout)
If you have a local checkout and want pattern edits picked up live:
python3 -m venv research/mcp/.venv
research/mcp/.venv/bin/pip install -e research/mcp/server
Produces research/mcp/.venv/bin/saferagenticai-mcp.
Configure (Claude Code)
Add the server to ~/.claude/mcp.json (or your IDE's MCP config). Restart Claude Code
after editing; the server loads on the first tool call.
With pipx or PyPI install
"mcpServers": {
"saferagenticai": {
"command": "saferagenticai-mcp"
}
}
With uvx
"mcpServers": {
"saferagenticai": {
"command": "uvx",
"args": ["saferagenticai-mcp"]
}
}
With a manual venv checkout
Use the absolute path to the venv binary:
"command": "/absolute/path/to/research/mcp/.venv/bin/saferagenticai-mcp"
Cursor / Windsurf
Same shape, different file. For Cursor, edit ~/.cursor/mcp.json:
{
"mcpServers": {
"saferagenticai": {
"command": "uvx",
"args": ["saferagenticai-mcp"]
}
}
}
Windsurf uses ~/.codeium/windsurf/mcp_config.json with the same JSON body.
Any MCP-compatible client that supports stdio servers should work; only the config-file location differs.
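Since the JSON body is the same across clients, one way to avoid hand-editing is a small merge script. This is a sketch under assumptions: `add_server` is a hypothetical helper, it uses the uvx form of the entry, and it does not handle an existing file that is empty or not valid JSON.

```python
# Sketch: merge the saferagenticai entry into an MCP client config file.
# The helper name and the merge-not-overwrite behaviour are assumptions for
# illustration; the JSON shape is the one shown above.
import json
from pathlib import Path

SERVER_ENTRY = {"command": "uvx", "args": ["saferagenticai-mcp"]}


def add_server(config_path: Path, name: str = "saferagenticai") -> dict:
    """Add or update one server entry, keeping any other servers intact."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("mcpServers", {})[name] = dict(SERVER_ENTRY)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For Cursor, for example, this could be called as `add_server(Path.home() / ".cursor" / "mcp.json")`; restart the client afterwards so the new entry is read.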
Verify it works
After restarting your client, two checks confirm the wiring end-to-end:
1 Transport check (in Claude Code)
Type /mcp. saferagenticai should appear with a connected status
and 10 tools listed. If it shows as failed, jump to troubleshooting.
2 Data check (no client needed)
python3 -c "
from saferagenticai_mcp.framework_loader import load_framework
idx = load_framework()
print(f'{len(idx.subgoals)} subgoals, '
f'{sum(1 for s in idx.subgoals.values() if s.has_pattern)} with patterns')
"
Expect 233 subgoals, 233 with patterns. Anything less means the data files
didn't ship correctly — reinstall with pipx reinstall saferagenticai-mcp.
Troubleshooting
Server doesn't appear in /mcp output
Restart Claude Code after editing ~/.claude/mcp.json. The config is read once at
startup; mid-session edits aren't picked up.
command not found: saferagenticai-mcp
The binary isn't on the PATH the client launches with. Three fixes, in order of preference:
- Install via pipx (puts the binary in ~/.local/bin, on most PATHs) instead of a manual venv.
- Use the absolute path in command: /Users/you/.../research/mcp/.venv/bin/saferagenticai-mcp.
- Switch to uvx saferagenticai-mcp; uv resolves the binary itself.
Server connects but tool calls return errors
Run the data check above directly. If it errors, the bundled data is missing or the package
version is too old. Upgrade with pipx upgrade saferagenticai-mcp, or pin
explicitly: pipx install saferagenticai-mcp==0.3.0 --force.
I edited a pattern YAML and the server returns the old answer
Hot-reload is suite-aware: it stat-walks research/mcp/suites/ on every tool call.
It only fires for editable installs (mode 1 in framework_loader.py).
PyPI / wheel installs ship a frozen copy of the data — pattern edits in your local repo
won't be visible. Reinstall in editable mode to get hot-reload:
pip install -e research/mcp/server.
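To make the stat-walk idea concrete, here is a minimal sketch of a change detector. It is an illustration, not the code in framework_loader.py: `newest_mtime` and `ReloadGate` are hypothetical names, and it assumes a reload is due whenever any YAML file under the directory has a newer modification time than the last one seen. Deleted files are not detected by this simple version.

```python
# Sketch of a stat-walk change detector, assuming reload is triggered when
# any YAML file under the suites directory has a newer mtime than last seen.
# Names are hypothetical; this is not the server's implementation.
import os


def newest_mtime(root: str) -> float:
    """Return the latest modification time among YAML files under root."""
    latest = 0.0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith((".yaml", ".yml")):
                path = os.path.join(dirpath, name)
                latest = max(latest, os.stat(path).st_mtime)
    return latest


class ReloadGate:
    """Remembers the last seen mtime and reports when a reload is due."""

    def __init__(self, root: str):
        self.root = root
        self.seen = newest_mtime(root)

    def needs_reload(self) -> bool:
        current = newest_mtime(self.root)
        if current > self.seen:
            self.seen = current
            return True
        return False
```

Calling `needs_reload()` at the top of each tool handler costs one `stat` per file, which is cheap at a few hundred files.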
Search returns confidently wrong-looking results
Search is keyword-only with substring matching, so "coding" matches inside
"encoding". The internal eval (below)
measures top-3 hit rate at 61.9%. Until embedding search is added, treat low-score
(< 15) hits with skepticism and prefer hits where matched_in: title.
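The "coding" inside "encoding" case is easy to reproduce. The sketch below contrasts a plain substring test with a whole-word test; neither function is the server's implementation, and the case-insensitive comparison is an assumption.

```python
# Sketch: why substring matching over-matches, and a word-boundary
# alternative. Neither function is the server's implementation.
import re


def substring_match(keyword: str, text: str) -> bool:
    """Substring test (case-insensitivity is an assumption)."""
    return keyword.lower() in text.lower()


def whole_word_match(keyword: str, text: str) -> bool:
    """Match the keyword only when it stands as a whole word."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


text = "Character encoding for log output"
print(substring_match("coding", text))   # True: matches inside "encoding"
print(whole_word_match("coding", text))  # False
```

Whole-word matching removes this class of false positive but would also drop useful partial hits such as "sandbox" in "Sandboxing", so it is a trade-off rather than a strict improvement.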
Tools (10 total)
| Tool | Input | Returns |
|---|---|---|
| list_suites | (none) | 16 suites with titles and subgoal counts |
| get_requirement | id, include_pattern | One subgoal plus its Pattern layer; falls back to fuzzy candidates if no exact match |
| list_requirements | suite / type / content_type / confidence filters | Filtered subgoal list with reliability signals |
| search_patterns | query, limit, verbosity | Field-weighted ranked matches with matched_in and (in full mode) snippets + confidence flags |
| get_cross_references | id, include_inferred | Outgoing adjacencies |
| get_reverse_references | id | Incoming adjacencies (who cites this pattern) |
| resolve_id | query | Canonicalise a partial id, slug fragment, or display_id; always returns candidates |
| find_patterns_for_task | task, limit, verbosity | Top patterns grouped by suite for a task description; defaults to compact mode for cheap triage |
| list_unreviewed | limit | Patterns without reviewed_by, sorted low-confidence first |
| review_stats | (none) | Coverage %, per-suite, per-confidence; plus validation issue count |
Trust signals in every result
Each pattern hit carries provenance so an LLM (or a human reviewer) can weight it appropriately:
- confidence: low/medium/high, the drafting subagent's honest assessment. Current spread: 62 high, 142 medium, 10 low.
- needs_human_review: flagged when the subagent saw an ambiguity in the normative content (e.g., scope mismatch between an SFR and its subgoal). 144 of 233 patterns currently carry this flag.
- reviewed_by: populated when a human redlines and signs off on a pattern. Currently 0/233; Phase 3 in progress.
If you're using the MCP to draft compliance evidence, prefer high-confidence + reviewed patterns; treat medium-confidence drafts as a starting point, not as authority. Compliance claims always anchor to the canonical layer, never to the Pattern layer.
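One way to apply this guidance mechanically is to sort hits into tiers by their provenance fields. The sketch below is an assumption-laden illustration: the field names come from this section, but `trust_tier` and its three tier labels are hypothetical, and missing fields are treated conservatively.

```python
# Sketch: classify hits by provenance. Field names come from the section
# above; the function name and the tier labels are illustrative assumptions.

def trust_tier(hit: dict) -> str:
    """Classify a pattern hit by its provenance fields."""
    reviewed = bool(hit.get("reviewed_by"))
    confidence = hit.get("confidence", "low")      # missing: assume low
    flagged = hit.get("needs_human_review", True)  # missing: assume flagged
    if reviewed and confidence == "high":
        return "preferred"
    if confidence == "high" and not flagged:
        return "usable-with-care"
    return "starting-point"
```

Whatever tier a pattern lands in, the compliance claim itself still has to cite the canonical layer.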
Search quality
Search is field-weighted keyword matching, not semantic. A 21-task internal smoke test
(research/mcp/eval/ranking_eval.yaml) puts the current top-3 hit rate at 61.9%.
Eval authored by the same team that built the search; not blinded. Treat as a smoke test, not a benchmark.
The misses are instructive. "Psychological safety for engineers raising concerns" doesn't find
D7.2 Culture of Safety because the canonical text uses different vocabulary. "Scope what tools an
agent is allowed to call" doesn't find D3.1 Authorization for the same reason. This is the case
for embedding-based search at ~10× the current scale; for now, prefer hits where
matched_in: title and treat low-score results skeptically.
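For readers who want to reproduce a top-3 hit rate on their own task list, the calculation can be sketched as follows. The case layout (an expected display_id plus the ranked display_ids returned) is hypothetical, the three cases are toy data, and the real ranking_eval.yaml schema may differ.

```python
# Sketch of a top-k hit-rate calculation. The case layout and the three toy
# cases are made up for illustration; they are not the real eval data.

def top_k_hit_rate(cases, k=3):
    """Fraction of cases whose expected id appears in the first k results."""
    if not cases:
        return 0.0
    hits = sum(1 for c in cases if c["expected"] in c["ranked"][:k])
    return hits / len(cases)


cases = [
    {"expected": "D3.2", "ranked": ["D3.2", "D6.9", "D3.1"]},  # hit at rank 1
    {"expected": "D7.2", "ranked": ["D9.2", "D6.9", "D3.1"]},  # miss
    {"expected": "D3.1", "ranked": ["D6.9", "D3.2", "D3.1"]},  # hit at rank 3
]
print(f"{top_k_hit_rate(cases):.1%}")  # 66.7%
```

By the same arithmetic, a 61.9% rate over 21 tasks corresponds to 13 tasks with the expected pattern in the top three.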
Field weights (transparent so you can reason about ranking):
| Field | Weight | Notes |
|---|---|---|
| title | 10× | Highest signal; canonical language |
| summary | 4× | Pattern-author one-paragraph framing |
| sfr | 3× | Normative requirement text |
| description | 2× | Canonical subgoal description |
| body | 1× | Catch-all over the full pattern YAML |
The result's matched_in field tells you which weighted field produced the hit, so you can
spot e.g. a body-only match (often spurious) vs. a title hit (usually solid).
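A rough sketch of how such weights could combine into a score and a `matched_in` value. How the server actually aggregates per-field hits is not documented here, so summing weight times matched-keyword count, and reporting the highest-weight matching field, are both assumptions; the sample pattern is invented.

```python
# Sketch of field-weighted keyword scoring using the weights in the table.
# The aggregation rule (weight x number of matched keywords, summed over
# fields) is an assumption, not the server's documented behaviour.

FIELD_WEIGHTS = {"title": 10, "summary": 4, "sfr": 3, "description": 2, "body": 1}


def score_pattern(pattern: dict, keywords: list) -> tuple:
    """Return (score, matched_in) for one pattern."""
    score = 0
    matched_in = None
    # Walk fields from highest to lowest weight so that matched_in records
    # the strongest field that produced a hit.
    for field, weight in sorted(FIELD_WEIGHTS.items(), key=lambda kv: -kv[1]):
        text = pattern.get(field, "").lower()
        hits = sum(1 for kw in keywords if kw.lower() in text)
        if hits:
            score += weight * hits
            if matched_in is None:
                matched_in = field
    return score, matched_in


pattern = {  # invented sample, not a real pattern record
    "title": "Sandboxing",
    "summary": "Isolate untrusted execution.",
    "body": "Run shell commands inside a sandbox.",
}
print(score_pattern(pattern, ["sandbox", "shell"]))  # (12, 'title')
```

Here the title contributes 10 and the body contributes 2, which illustrates why a body-only match tends to score an order of magnitude lower than a title hit.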
Inferred cross-references: get_cross_references(include_inferred=true) returns
same-suite siblings only. There is no embedding-similarity component. Treat inferred entries as
"neighbours worth scanning," not as endorsed dependencies.
Data sources
- Canonical framework: extracted from the Safer Agentic AI Recommended Practices and shipped as criteria-v1.json.
- Pattern layer: one YAML file per subgoal under suites/.
- Exemplars: four hand-written anchor patterns used as few-shot references (D3.2 Sandboxing, D7.2 Culture of Safety, D9.2 Compliance, I7.3).
At startup the server loads both layers and builds an in-memory index keyed by pattern_id.
display_id lookups (e.g. D3.2) are also supported and may resolve to
multiple subgoals when underlined variants exist.
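The two lookup paths can be pictured as a primary map plus a secondary one-to-many map. This is a sketch only: the record layout, `build_index`, and `lookup` are hypothetical, and the server's actual index may be organised differently.

```python
# Sketch of a two-key index: exact lookup by pattern_id, plus a display_id
# map that can hold several entries. Names and record layout are assumptions.
from collections import defaultdict


def build_index(patterns: list):
    """Return (by_id, by_display) maps built from pattern records."""
    by_id = {}
    by_display = defaultdict(list)
    for p in patterns:
        by_id[p["pattern_id"]] = p
        by_display[p["display_id"]].append(p["pattern_id"])
    return by_id, dict(by_display)


def lookup(key: str, by_id: dict, by_display: dict) -> list:
    """Return matching pattern_ids: one for an exact id, possibly several
    for a display_id, and an empty list when nothing matches."""
    if key in by_id:
        return [key]
    return list(by_display.get(key, []))
```

Returning a list in every case means callers handle the ambiguous display_id case with the same code path as the exact one.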
Two-layer authorship
The framework is served in two layers with different review bars and update cadences:
- Canonical layer: the normative SFRs and evidence requirements, authored by Nell Watson and Ali Hessami. Stable, auditable, versioned via criteria-v1.json.
- Implementation Patterns layer: LLM-drafted, human-reviewed guidance that translates normative requirements into code patterns, governance templates, and process scaffolds. Versioned independently.
Every pattern entry carries provenance: drafted_by, anchor_exemplar,
confidence, needs_human_review, and reviewed_by once redlined.
Versioning & drift
- Canonical framework: follows criteria-v1.json's version field.
- Pattern layer: v1-draft during active review; v1 once all patterns have a human reviewed_by.
- Server: semantic versioning. 0.3.0 introduces verbosity on the search tools; pin if you want reproducibility.
- Drift audit: research/mcp/audit_pattern_drift.py compares each pattern's anchored SFR/evidence letters against the current canonical and flags new, deleted, or renamed entries. Run after any framework edit; non-zero exit means at least one pattern needs re-anchoring.
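The core of a drift check like this is a set comparison. The sketch below shows that idea only: `diff_letters` and `exit_code` are hypothetical names, the data shapes are assumed, and rename detection (which the real audit reports) is left out.

```python
# Sketch of the comparison the drift audit describes: anchored letters vs.
# the current canonical set. Function names and data shapes are hypothetical;
# rename detection is omitted.

def diff_letters(anchored: set, canonical: set) -> dict:
    """Report letters added to or removed from the canonical set."""
    return {
        "new": sorted(canonical - anchored),
        "deleted": sorted(anchored - canonical),
    }


def exit_code(reports: dict) -> int:
    """Non-zero when any pattern has drifted, mirroring the audit's contract."""
    drifted = any(r["new"] or r["deleted"] for r in reports.values())
    return 1 if drifted else 0
```

A non-zero result from a check of this kind is what makes the audit usable as a CI gate after framework edits.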
Scope & limits
- Transport: stdio only; no remote or authenticated transport.
- Search: field-weighted keyword scoring. Sufficient at 233 patterns; embedding-based semantic search would be worth it at ~10× this scale.
- Read-only: no mark_reviewed write tool. Review edits go through the YAML files directly so editor + git diff stay auditable.
Questions or issues? Reach us through the contact form on the homepage.
Last updated 2026-04-18 · Server v0.3.0 · Pattern layer v1-draft