Flint — A policy engine for AI traffic

The Problem

MCP has no enforcement layer

MCP servers expose tools. Agents call them. Nothing in between decides whether they should.

🔓

No access control

Any agent can call any tool regardless of what it's supposed to do. No roles, no scopes, no default deny.

🔗

Authorized call chains

Each individual call passes inspection. The attack happens across calls. Read secrets here, leak them there.

💉

Prompt injection

External content in tool responses instructs the agent to make calls it was never meant to make.

📄

Bulk extraction

Pagination-based extraction looks routine per request. It only becomes visible as a pattern across the session.

The Solution

Two independent layers

Layer 1 catches unauthorized calls. Layer 2 catches authorized calls that chain into attacks.

Layer 1: RBAC

Kubernetes-style access control

Agents, roles, bindings, scopes, verbs, constraints. Allow rules compose additively, deny rules win globally, server-prefixed selectors expand at load time. Default deny.

roles:
  - name: sql-readonly
    rules:
      - name: "Read-only SQL"
        resources: ["db.execute_sql"]
        verbs: [invoke]
        constraints:
          sql_intent: [select]      # DELETE blocked at the argument level

  - name: github-guard
    rules:
      - name: "Block all GitHub tools"
        effect: deny                # deny wins, even if another role would allow
        resources: ["server:github"] # expands to every github.* tool
        verbs: [invoke]

bindings:
  - agent: support-bot
    roles: [sql-readonly, github-guard]
    scopes: [support, customer_data]   # can't touch billing or code
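
The evaluation order described above (additive allows, global deny precedence, default deny, selector expansion) can be sketched in a few lines of Go. This is a minimal illustration and not the engine/authz code: the `Rule` struct, `Expand`, `Decide`, and the github tool inventory are assumptions made for the example, and scopes, verbs, and constraints are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a simplified, hypothetical shape for one policy rule.
type Rule struct {
	Name      string
	Deny      bool     // true when the rule has effect: deny
	Resources []string // tool names or "server:<name>" selectors
}

// Expand rewrites "server:<name>" selectors into that server's concrete tools,
// mimicking expansion at load time.
func Expand(resources []string, toolsByServer map[string][]string) []string {
	var out []string
	for _, r := range resources {
		if strings.HasPrefix(r, "server:") {
			out = append(out, toolsByServer[strings.TrimPrefix(r, "server:")]...)
			continue
		}
		out = append(out, r)
	}
	return out
}

// Decide reports whether tool may be invoked and which rule decided.
// A matching deny wins immediately; with no matching allow, the default is deny.
func Decide(rules []Rule, tool string) (bool, string) {
	allowedBy := ""
	for _, r := range rules {
		for _, res := range r.Resources {
			if res != tool {
				continue
			}
			if r.Deny {
				return false, r.Name
			}
			if allowedBy == "" {
				allowedBy = r.Name
			}
		}
	}
	if allowedBy == "" {
		return false, "default-deny"
	}
	return true, allowedBy
}

func main() {
	// Hypothetical tool inventory for the github server.
	tools := map[string][]string{"github": {"github.create_issue", "github.delete_repo"}}
	rules := []Rule{
		{Name: "sql-readonly", Resources: []string{"db.execute_sql"}},
		{Name: "github-guard", Deny: true, Resources: Expand([]string{"server:github"}, tools)},
	}
	for _, tool := range []string{"db.execute_sql", "github.delete_repo", "slack.post_message"} {
		ok, by := Decide(rules, tool)
		fmt.Printf("%-22s allowed=%-5v rule=%s\n", tool, ok, by)
	}
}
```

Because a deny returns as soon as it matches, the order in which roles are listed does not change the outcome, which is the property "deny wins globally" asks for.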

Layer 2: Behavioral Analysis

Session-level threat detection

Builds a causal graph across tool calls and detects when data flows from a restricted source to an external sink, even across multiple hops.

# Every call passed RBAC. The session is the attack surface.

[1] support.read_ticket     allowed   ticket_id: TICKET-4829
[2] db.execute_sql          allowed   SELECT api_key FROM integrations
[3] support.post_reply      allowed   body: "Here are the keys: sk-live-..."

secret_relay detected  event 2 -> event 3  confidence=0.97
Restricted SQL response flowed into a public-facing egress tool.
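
One way to picture multi-hop detection is as reachability over the causal graph: start at each restricted event and follow edges until an egress event turns up. The sketch below assumes events are identified by integers and that edges, restricted sources, and sinks have already been computed; `Edge` and `FindFlow` are illustrative names, not the engine's API.

```go
package main

import "fmt"

// Edge records that data from event From reappeared in event To.
type Edge struct{ From, To int }

// FindFlow does a breadth-first search from each restricted event and returns
// the first (source, sink) pair connected by one or more edges.
func FindFlow(edges []Edge, restricted []int, sinks map[int]bool) (int, int, bool) {
	next := map[int][]int{}
	for _, e := range edges {
		next[e.From] = append(next[e.From], e.To)
	}
	for _, src := range restricted {
		seen := map[int]bool{src: true}
		queue := []int{src}
		for len(queue) > 0 {
			cur := queue[0]
			queue = queue[1:]
			for _, n := range next[cur] {
				if seen[n] {
					continue
				}
				if sinks[n] {
					return src, n, true
				}
				seen[n] = true
				queue = append(queue, n)
			}
		}
	}
	return 0, 0, false
}

func main() {
	// Event 2 (restricted SQL read) fed event 3 (public reply), as in the trace above.
	edges := []Edge{{From: 2, To: 3}}
	src, sink, ok := FindFlow(edges, []int{2}, map[int]bool{3: true})
	fmt.Println(src, sink, ok)
}
```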

Same engine, different surface

Also a task-aware LLM router

Tool calls and model calls are the same shape of problem. The same policy engine that gates MCP traffic also picks which model on OpenRouter answers each prompt.

cmd/router

Classifier → policy → OpenRouter

Drop-in for openrouter.ai/api/v1. Every incoming chat completion is classified by a cheap model into {task, complexity, capabilities}, matched against a declarative YAML policy, and forwarded to the chosen target. The router records the real cost reported by OpenRouter alongside a counterfactual baseline, so savings are measured against what the baseline model would have cost.

classifier:
  model: openai/gpt-4o-mini
  timeout_ms: 1500

baseline:
  model: openai/gpt-4o            # what every call WOULD have cost

routes:
  - name: "Medium- and high-complexity code"
    if:
      task: code
      complexity: [medium, high]
    target: anthropic/claude-sonnet-4.6
    fallback: [anthropic/claude-haiku-4.5]

  - name: "Cheap classification"
    if:
      task: classification
      complexity: [low]
    target: openai/gpt-4o-mini

  - name: "Vision needed"
    if:
      capabilities_include: vision
    target: google/gemini-flash-latest

default: { target: openai/gpt-4o-mini }
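
To make the matching semantics concrete, here is a small Go sketch of first-match-wins routing over classifier output. The `Classification` and `Route` types and the `Pick` function are assumptions for illustration; fallback chains are omitted, and an empty matcher field is treated as "matches anything".

```go
package main

import "fmt"

// Classification is the classifier's output for one request.
type Classification struct {
	Task         string
	Complexity   string
	Capabilities []string
}

// Route is a hypothetical, flattened form of one entry under routes:.
type Route struct {
	Name       string
	Task       string   // empty matches any task
	Complexity []string // empty matches any complexity
	NeedsCap   string   // empty means no capability requirement
	Target     string
}

func contains(list []string, v string) bool {
	for _, x := range list {
		if x == v {
			return true
		}
	}
	return false
}

func (r Route) matches(c Classification) bool {
	if r.Task != "" && r.Task != c.Task {
		return false
	}
	if len(r.Complexity) > 0 && !contains(r.Complexity, c.Complexity) {
		return false
	}
	if r.NeedsCap != "" && !contains(c.Capabilities, r.NeedsCap) {
		return false
	}
	return true
}

// Pick returns the target of the first matching route, or def if none match.
func Pick(routes []Route, def string, c Classification) string {
	for _, r := range routes {
		if r.matches(c) {
			return r.Target
		}
	}
	return def
}

func main() {
	routes := []Route{
		{Name: "code", Task: "code", Complexity: []string{"medium", "high"}, Target: "anthropic/claude-sonnet-4.6"},
		{Name: "classification", Task: "classification", Complexity: []string{"low"}, Target: "openai/gpt-4o-mini"},
		{Name: "vision", NeedsCap: "vision", Target: "google/gemini-flash-latest"},
	}
	c := Classification{Task: "code", Complexity: "high"}
	fmt.Println(Pick(routes, "openai/gpt-4o-mini", c))
}
```

Since the first match wins, route order matters: a code request that also needs vision would go to the code route here, because that route is listed first.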

Live numbers

Real workload, observable savings

A 13-prompt mixed workload (classification, code generation, RAG, summarization, reasoning, vision-hint, tool-use) routed across 3 models. Dashboard tails the audit log in real time.

SESSION TOTALS
  total calls   13
  total cost    $0.0141     # actually spent
  baseline cost $0.0229     # if all via openai/gpt-4o
  savings       $0.0088   38.5%

PER-MODEL
  openai/gpt-4o-mini                9 calls    $0.0007
  anthropic/claude-haiku-4.5        3 calls    $0.0017
  anthropic/claude-sonnet-4.6       2 calls    $0.0108
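
The savings line is plain arithmetic: baseline cost minus actual cost, expressed as a share of the baseline. The Go sketch below recomputes it from the rounded session totals; `Savings` is an illustrative helper, not part of the router. From the rounded figures the percentage comes out at roughly 38.4%, so the 38.5% shown was presumably computed from unrounded per-call costs (an assumption).

```go
package main

import "fmt"

// Savings returns the absolute saving and the saving as a percentage of baseline.
func Savings(actual, baseline float64) (abs, pct float64) {
	abs = baseline - actual
	if baseline > 0 {
		pct = abs / baseline * 100
	}
	return abs, pct
}

func main() {
	// Rounded session totals from the report above.
	abs, pct := Savings(0.0141, 0.0229)
	fmt.Printf("savings $%.4f (%.1f%%)\n", abs, pct)
}
```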

Demo

Three scenarios

Run against real trace files from the command line. No infrastructure needed. The full live stack with dashboard is one `make demo` away.

flint-replay

$ ./flint-replay traces/rbac_demo_allow.json
loaded RBAC policy: 4 roles, 3 bindings
FLINT SESSION REPORT: rbac_demo_allow
-------------------------------------------
Session: sess_rbac_allow_001
Events: 5 Findings: 0
RBAC allowed: 3 denied: 0
✓ agent=support-bot tool=support.read_ticket role=support-agent
✓ agent=support-bot tool=db.execute_sql role=sql-readonly
✓ agent=support-bot tool=support.post_reply role=support-agent
SESSION ALLOWED

$ ./flint-replay traces/rbac_demo_sql_blocked.json
RBAC allowed: 1 denied: 1
✓ agent=support-bot tool=support.read_ticket role=support-agent
✗ agent=support-bot tool=db.execute_sql reason=constraint_violation
DELETE blocked by sql_intent constraint. Never reached the database.

$ ./flint-replay traces/rbac_demo_no_binding.json
RBAC allowed: 0 denied: 3
✗ agent=rogue-agent tool=db.execute_sql reason=no_binding
✗ agent=rogue-agent tool=crm.lookup_customer reason=no_binding
✗ agent=rogue-agent tool=slack.post_message reason=no_binding
0 events reached the behavioral engine.

Specs

What's built

The engine and every service are written in Go; the dashboard is TypeScript. The evaluators are pure, compiled policy is immutable, and every decision leaves a full audit trail.

Component	Description	Status
engine/authz	Pure RBAC evaluator. Allow + deny rules with global deny precedence, server-prefixed selectors, atomic PolicyHolder for hot reload, hardcoded denial reason precedence, per-constraint alias tables.	Done
engine/routing	Pure routing policy evaluator. First-match-wins over classifier output, capability filters, complexity ranges, fallback chains.	Done
engine/session	Session state and event types. Verb constants, PolicyDecision struct with full rule provenance on every decision.	Done
engine/lineage	Causal edge builder using token matching and field overlap across payloads.	Done
engine/rules	Behavioral rule evaluator with 6 detection rules covering known MCP attack patterns.	Done
engine/risk	Cumulative risk scorer. Outputs a session disposition: allow, warn, pause, or terminate.	Done
cmd/gateway	Inline MCP stdio proxy. JSONL audit (schema-versioned, fsync'd), SIGHUP hot reload, SIGTERM graceful drain, HTTP sidecar.	Done
cmd/router	OpenAI-compatible drop-in for openrouter.ai. Classifier stage, policy stage, OpenRouter forwarder with real cost capture, counterfactual baseline.	Done
cmd/control	HTTP + WebSocket control plane. Tails both audit files via fsnotify, fans live decisions to subscribers, YAML round-trip editing with atomic write + hot reload.	Done
cmd/replay	CLI for loading and running JSON session fixtures against the full engine.	Done
ui/	Vite + React + TypeScript dashboard. Seven screens: Connections, Connection detail, Agents, Sessions/Live, Roles editor, Router Live, Router Routes.	Done
config/	YAML policy files for both surfaces. Loaded and compiled at startup; hot-reloadable from the dashboard.	Done
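
Hot reload with an immutable compiled policy usually comes down to swapping one pointer atomically: readers load the current policy, a reload stores a new one, and nobody observes a half-updated state. The sketch below shows that pattern with `sync/atomic`'s generic `atomic.Pointer` (Go 1.19+). The `Policy` fields and the `Load`/`Swap` method names are assumptions; the real PolicyHolder may look different.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Policy stands in for an immutable compiled policy.
type Policy struct {
	Version int
	Allowed map[string]bool
}

// PolicyHolder publishes whole policies atomically so concurrent readers
// always see either the old policy or the new one, never a mix.
type PolicyHolder struct {
	current atomic.Pointer[Policy]
}

// Load returns the policy currently in effect (nil before the first Swap).
func (h *PolicyHolder) Load() *Policy { return h.current.Load() }

// Swap installs a newly compiled policy, for example after a SIGHUP.
func (h *PolicyHolder) Swap(p *Policy) { h.current.Store(p) }

func main() {
	var h PolicyHolder
	h.Swap(&Policy{Version: 1, Allowed: map[string]bool{"db.execute_sql": true}})

	inFlight := h.Load() // a request that started before the reload
	h.Swap(&Policy{Version: 2, Allowed: map[string]bool{}})

	// The in-flight request keeps its snapshot; new requests see version 2.
	fmt.Println(inFlight.Version, h.Load().Version)
}
```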

Detection rules

6 of 10 planned rules are implemented. The remaining four are scheduled for Phase 5.

Rule	Detects
secret_relay	A secret token from a restricted response relayed to an egress tool
restricted_read_external_write	Restricted data flowing to an external write path
pagination_exfiltration	Repeated calls with monotonically increasing page parameters
cross_scope_data_movement	Data from one scope being used in another
tool_poisoning_indicator	Instruction-like content embedded in tool responses
filesystem_traversal_sequence	File operations escalating toward sensitive paths
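
As an illustration of how one of these rules might work, the sketch below flags pagination_exfiltration by measuring the longest run of strictly increasing page parameters in a session. The function names and the threshold are assumptions chosen for the example, not the engine's actual heuristics.

```go
package main

import "fmt"

// LongestIncreasingRun returns the length of the longest run of consecutive
// calls whose page parameter strictly increases.
func LongestIncreasingRun(pages []int) int {
	if len(pages) == 0 {
		return 0
	}
	best, run := 1, 1
	for i := 1; i < len(pages); i++ {
		if pages[i] > pages[i-1] {
			run++
		} else {
			run = 1
		}
		if run > best {
			best = run
		}
	}
	return best
}

// IsPaginationExfiltration flags a session once the run reaches threshold.
// The threshold is a hypothetical tuning knob.
func IsPaginationExfiltration(pages []int, threshold int) bool {
	return LongestIncreasingRun(pages) >= threshold
}

func main() {
	pages := []int{1, 2, 3, 4, 5, 6}
	fmt.Println(LongestIncreasingRun(pages), IsPaginationExfiltration(pages, 5))
}
```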

Architecture

How it works

Every request goes through two independent gates. Denied events are recorded but never reach the behavioral engine.

Agent makes a tool call

The request carries an agent identity, tool name, scope, and argument payload.

RBAC gate `engine/authz`

Evaluates against the compiled policy. Checks binding, scope, verb, and argument constraints. If denied, the decision is recorded and the call is dropped. If allowed, it continues.
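
An argument constraint such as sql_intent can be pictured as a check on the statement's leading keyword. The Go sketch below is deliberately naive and only meant to show where the check sits: `SQLIntent` and `IntentAllowed` are hypothetical helpers, and a real implementation would need to handle comments, CTEs, and stacked statements.

```go
package main

import (
	"fmt"
	"strings"
)

// SQLIntent returns the lowercased leading keyword of a statement,
// or "" when the statement is empty.
func SQLIntent(query string) string {
	fields := strings.Fields(query)
	if len(fields) == 0 {
		return ""
	}
	return strings.ToLower(fields[0])
}

// IntentAllowed reports whether the statement's intent is in the allowed list.
func IntentAllowed(query string, allowed []string) bool {
	intent := SQLIntent(query)
	for _, a := range allowed {
		if a == intent {
			return true
		}
	}
	return false
}

func main() {
	allowed := []string{"select"}
	fmt.Println(IntentAllowed("SELECT api_key FROM integrations", allowed))
	fmt.Println(IntentAllowed("DELETE FROM integrations", allowed))
}
```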

Fingerprint extraction

Tokens, secrets, and structured fields are extracted and hashed from the payload.
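
A minimal sketch of this step, assuming secrets can be recognized by a simple pattern: find token-like substrings and keep only a truncated SHA-256 digest, so later stages compare fingerprints without holding the raw secret. The regular expression and the 8-byte truncation are assumptions for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"regexp"
)

// tokenPattern is a hypothetical pattern for secret-looking tokens.
var tokenPattern = regexp.MustCompile(`sk-[A-Za-z0-9-]{8,}`)

// Fingerprints returns one short hex digest per token found in payload.
func Fingerprints(payload string) []string {
	var out []string
	for _, tok := range tokenPattern.FindAllString(payload, -1) {
		sum := sha256.Sum256([]byte(tok))
		out = append(out, hex.EncodeToString(sum[:8]))
	}
	return out
}

func main() {
	// Made-up token value used in both a response and a later request.
	response := `{"api_key": "sk-live-abcdef123456"}`
	reply := "Here are the keys: sk-live-abcdef123456"
	fmt.Println(Fingerprints(response))
	fmt.Println(Fingerprints(reply))
}
```

The same token yields the same fingerprint wherever it appears, which is what lets the next stage link the two events.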

Lineage building

Token matches and field overlaps between events create directed edges, building a causal graph of data movement across the session.
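
Edge construction can be sketched as a single pass over the session: remember which event first carried each fingerprint, and draw an edge whenever a later event carries it again. The `Event` and `Edge` types and `BuildEdges` below are illustrative assumptions and cover only exact fingerprint matches, not field overlap.

```go
package main

import "fmt"

// Event is one tool call with the fingerprints extracted from its payload.
type Event struct {
	ID           int
	Fingerprints []string
}

// Edge records that data first seen in From reappeared in To.
type Edge struct{ From, To int }

// BuildEdges links each event to the earlier events that first carried
// any of its fingerprints. Events must be in session order.
func BuildEdges(events []Event) []Edge {
	firstSeen := map[string]int{} // fingerprint -> ID of first carrier
	var edges []Edge
	for _, ev := range events {
		linked := map[int]bool{} // avoid duplicate edges into this event
		for _, fp := range ev.Fingerprints {
			src, ok := firstSeen[fp]
			if !ok {
				firstSeen[fp] = ev.ID
				continue
			}
			if src != ev.ID && !linked[src] {
				edges = append(edges, Edge{From: src, To: ev.ID})
				linked[src] = true
			}
		}
	}
	return edges
}

func main() {
	events := []Event{
		{ID: 1, Fingerprints: []string{"t1"}},
		{ID: 2, Fingerprints: []string{"k1"}},
		{ID: 3, Fingerprints: []string{"k1"}},
	}
	fmt.Println(BuildEdges(events))
}
```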

Rule evaluation

Detection rules run against the graph. Each match produces a finding with severity, confidence, and a recommended action.

Risk scoring

Findings accumulate into a risk score. The session gets a final disposition: allow, warn, pause, or terminate.
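
A cumulative scorer can be as simple as summing weighted findings and mapping the total onto the four dispositions. In the sketch below, the `Finding` fields, the severity-times-confidence weighting, and the thresholds are all assumptions picked to illustrate the shape of the computation; the engine's real scoring may differ.

```go
package main

import "fmt"

// Finding is a simplified, hypothetical detection result.
type Finding struct {
	Rule       string
	Severity   float64 // hypothetical weight on a 0-10 scale
	Confidence float64 // 0.0 to 1.0
}

// Score sums severity weighted by confidence across all findings.
func Score(findings []Finding) float64 {
	total := 0.0
	for _, f := range findings {
		total += f.Severity * f.Confidence
	}
	return total
}

// Disposition maps a score to a session outcome using assumed thresholds.
func Disposition(score float64) string {
	switch {
	case score >= 8:
		return "terminate"
	case score >= 5:
		return "pause"
	case score >= 2:
		return "warn"
	default:
		return "allow"
	}
}

func main() {
	findings := []Finding{{Rule: "secret_relay", Severity: 9, Confidence: 0.97}}
	s := Score(findings)
	fmt.Printf("score=%.2f disposition=%s\n", s, Disposition(s))
}
```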

Timeline

Roadmap

Four phases shipped. Detection rule catalog and approval workflows are next.

Phase 1 · Shipped

RBAC + Behavioral Engine

Kubernetes-style RBAC evaluator with a full audit trail per decision
Additive role bindings with argument-level constraints
Behavioral engine: lineage, fingerprinting, 6 detection rules
Trace replay CLI for development and attack simulation

Phase 2 · Shipped

Live Gateway + Engine Extensions

Inline MCP stdio proxy between agents and MCP servers
tools/list filtering based on the discover verb
Deny rules with global precedence, server-prefixed selectors, rule names
JSONL audit (schema-versioned, fsync'd), SIGHUP hot reload, SIGTERM graceful drain
HTTP sidecar for healthz and reload

Phase 3 · Shipped

Control Plane + Dashboard

HTTP + WebSocket control plane on localhost
fsnotify audit tail with 500ms poll fallback and truncation handling
Vite + React + TypeScript dashboard with seven screens
Policy editors for RBAC and routing with atomic-write hot reload
Live decision feed with deny-row highlighting and behavioral-finding banner

Phase 4 · Shipped

LLM Router on OpenRouter

OpenAI-compatible /v1/chat/completions drop-in
Cheap classifier stage with JSON-mode reliability and timeout fallback
Routing policy with task, complexity, and capability matchers and fallback chains
Real usage.cost capture plus counterfactual baseline savings
Live cost / savings dashboard tied to the same engine pattern as the gateway

Phase 5 · Up Next

Detection Rules + Approvals

privilege_escalation_chain, sql_mutation_after_read, rapid_tool_switching, instruction_injection_echo
Async human-in-the-loop approval workflow for high-risk calls
Streaming response proxying through the router
Embedding-based semantic cache for repeat router queries

Phase 6 · Planned

Production Hardening

Response redaction and PII classification
Per-user and per-team policy composition
Authentication on the control plane and the router
Multi-upstream gateway with per-upstream health and tool namespacing
