Flint - Session Firewall for MCP Agents

The Problem

MCP has no enforcement layer

MCP servers expose tools. Agents call them. Nothing in between decides whether they should.

🔓

No access control

Any agent can call any tool regardless of what it's supposed to do. No roles, no scopes, no default deny.

🔗

Authorized call chains

Each individual call passes inspection. The attack happens across calls. Read secrets here, leak them there.

💉

Prompt injection

External content in tool responses instructs the agent to make calls it was never meant to make.

📄

Bulk extraction

Pagination-based extraction looks routine per request. It only becomes visible as a pattern across the session.

The Solution

Two independent layers

Layer 1 catches unauthorized calls. Layer 2 catches authorized calls that chain into attacks.

Layer 1: RBAC

Kubernetes-style access control

Agents, roles, bindings, scopes, verbs, and constraints. Default deny. If you haven't granted it, it doesn't happen.

roles:
  - name: sql-readonly
    rules:
      - resources: ["db.execute_sql"]
        verbs: [invoke]
        constraints:
          sql_intent: [select]   # DELETE is blocked at the argument level

bindings:
  - agent: support-bot
    roles: [sql-readonly, support-agent]
    scopes: [support, customer_data]  # can't touch billing or code
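
To make the default-deny behavior concrete, here is a minimal sketch of how a policy like the one above could be evaluated. It is not Flint's actual code: the type names, the `Evaluate` function, the naive first-keyword `sqlIntent` check, and the `scope_denied` and `no_matching_rule` reason strings are all assumptions made for illustration. Only `constraint_violation` and `no_binding` appear in the replay output.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical types mirroring the YAML policy; not Flint's real structs.
type Rule struct {
	Resources []string
	Verbs     []string
	SQLIntent []string // allowed leading SQL keywords; empty means unconstrained
}

type Role struct {
	Name  string
	Rules []Rule
}

type Binding struct {
	Agent  string
	Roles  []string
	Scopes []string
}

type Request struct {
	Agent, Tool, Verb, Scope, SQL string
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// sqlIntent is a naive stand-in for real SQL parsing: it returns the
// first keyword of the statement, lowercased.
func sqlIntent(sql string) string {
	fields := strings.Fields(sql)
	if len(fields) == 0 {
		return ""
	}
	return strings.ToLower(fields[0])
}

// Evaluate returns (allowed, reason). Nothing is allowed unless a bound
// role grants the tool and verb and every constraint on that rule passes.
func Evaluate(roles map[string]Role, bindings []Binding, req Request) (bool, string) {
	reason := "no_binding"
	for _, b := range bindings {
		if b.Agent != req.Agent {
			continue
		}
		if !contains(b.Scopes, req.Scope) {
			reason = "scope_denied" // hypothetical reason string
			continue
		}
		reason = "no_matching_rule" // hypothetical reason string
		for _, name := range b.Roles {
			for _, rule := range roles[name].Rules {
				if !contains(rule.Resources, req.Tool) || !contains(rule.Verbs, req.Verb) {
					continue
				}
				if len(rule.SQLIntent) > 0 && !contains(rule.SQLIntent, sqlIntent(req.SQL)) {
					reason = "constraint_violation"
					continue
				}
				return true, "allowed_by:" + name
			}
		}
	}
	return false, reason
}

func main() {
	roles := map[string]Role{
		"sql-readonly": {Name: "sql-readonly", Rules: []Rule{{
			Resources: []string{"db.execute_sql"},
			Verbs:     []string{"invoke"},
			SQLIntent: []string{"select"},
		}}},
	}
	bindings := []Binding{{
		Agent:  "support-bot",
		Roles:  []string{"sql-readonly", "support-agent"},
		Scopes: []string{"support", "customer_data"},
	}}
	reqs := []Request{
		{"support-bot", "db.execute_sql", "invoke", "customer_data", "SELECT id FROM tickets"},
		{"support-bot", "db.execute_sql", "invoke", "customer_data", "DELETE FROM tickets"},
		{"rogue-agent", "db.execute_sql", "invoke", "customer_data", "SELECT 1"},
	}
	for _, r := range reqs {
		ok, reason := Evaluate(roles, bindings, r)
		fmt.Printf("agent=%s tool=%s allowed=%t reason=%s\n", r.Agent, r.Tool, ok, reason)
	}
}
```

In this sketch the evaluator takes the policy and the request as plain values and touches no shared state, which is one way to get the "pure evaluator" property listed under Specs.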

Layer 2: Behavioral Analysis

Session-level threat detection

Builds a causal graph across tool calls and detects when data flows from a restricted source to an external sink, even across multiple hops.

# Every call passed RBAC. The session is the attack surface.

[1] support.read_ticket     allowed   ticket_id: TICKET-4829
[2] db.execute_sql          allowed   SELECT api_key FROM integrations
[3] support.post_reply      allowed   body: "Here are the keys: sk-live-..."

secret_relay detected  event 2 -> event 3  confidence=0.97
Restricted SQL response flowed into a public-facing egress tool.
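
The multi-hop part of this detection can be sketched as a reachability search over the session graph. The example below is an illustration under stated assumptions, not Flint's rule engine: the `Event` and `Flow` types, the `Restricted` and `Egress` flags, and the `FindRelays` function are hypothetical names, and the edges are supplied by hand rather than derived from payloads.

```go
package main

import "fmt"

// Event is a hypothetical, simplified view of one tool call in a session.
type Event struct {
	ID         int
	Tool       string
	Restricted bool // the response came from a restricted source
	Egress     bool // the tool writes to an external or public sink
}

// Flow is a directed edge: data observed in event From reappeared in event To.
type Flow struct{ From, To int }

// FindRelays returns every (source, sink) pair where a restricted event can
// reach an egress event by following flows, across any number of hops.
func FindRelays(events []Event, flows []Flow) [][2]int {
	next := map[int][]int{}
	for _, f := range flows {
		next[f.From] = append(next[f.From], f.To)
	}
	byID := map[int]Event{}
	for _, e := range events {
		byID[e.ID] = e
	}
	var hits [][2]int
	for _, src := range events {
		if !src.Restricted {
			continue
		}
		// Breadth-first search from each restricted source.
		seen := map[int]bool{src.ID: true}
		queue := []int{src.ID}
		for len(queue) > 0 {
			cur := queue[0]
			queue = queue[1:]
			for _, n := range next[cur] {
				if seen[n] {
					continue
				}
				seen[n] = true
				if byID[n].Egress {
					hits = append(hits, [2]int{src.ID, n})
				}
				queue = append(queue, n)
			}
		}
	}
	return hits
}

func main() {
	events := []Event{
		{ID: 1, Tool: "support.read_ticket"},
		{ID: 2, Tool: "db.execute_sql", Restricted: true},
		{ID: 3, Tool: "support.post_reply", Egress: true},
	}
	flows := []Flow{{From: 2, To: 3}}
	for _, h := range FindRelays(events, flows) {
		fmt.Printf("restricted data flowed: event %d -> event %d\n", h[0], h[1])
	}
}
```

Because the search follows edges transitively, an intermediate call that merely passes the data along does not hide the path from source to sink.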

Demo

Three scenarios

Run against real trace files from the command line. No infrastructure needed.

flint-replay

$ ./flint-replay traces/rbac_demo_allow.json

loaded RBAC policy: 4 roles, 3 bindings

FLINT SESSION REPORT: rbac_demo_allow

-------------------------------------------

Session: sess_rbac_allow_001

Events: 5 Findings: 0

RBAC allowed: 3 denied: 0

✓ agent=support-bot tool=support.read_ticket role=support-agent

✓ agent=support-bot tool=db.execute_sql role=sql-readonly

✓ agent=support-bot tool=support.post_reply role=support-agent

SESSION ALLOWED

$ ./flint-replay traces/rbac_demo_sql_blocked.json

RBAC allowed: 1 denied: 1

✓ agent=support-bot tool=support.read_ticket role=support-agent

✗ agent=support-bot tool=db.execute_sql reason=constraint_violation

DELETE blocked by sql_intent constraint. Never reached the database.

$ ./flint-replay traces/rbac_demo_no_binding.json

RBAC allowed: 0 denied: 3

✗ agent=rogue-agent tool=db.execute_sql reason=no_binding

✗ agent=rogue-agent tool=crm.lookup_customer reason=no_binding

✗ agent=rogue-agent tool=slack.post_message reason=no_binding

0 events reached the behavioral engine.

Specs

What's built

Everything is written in Go: a pure evaluator, an immutable compiled policy, and a full audit trail on every decision.

Component	Description	Status
engine/authz	Pure-function RBAC evaluator with an immutable compiled policy, hardcoded denial precedence, and per-constraint alias tables.	Done
engine/session	Session state and event types. Verb constants, PolicyDecision struct with full rule provenance on every decision.	Done
engine/lineage	Causal edge builder using token matching and field overlap across payloads.	Done
engine/rules	Behavioral rule evaluator with 6 detection rules covering known MCP attack patterns.	Done
engine/risk	Cumulative risk scorer. Outputs a session disposition: allow, warn, pause, or terminate.	Done
cmd/replay	CLI for loading and running JSON session fixtures against the full engine.	Done
config/	YAML policy files (roles.yaml and bindings.yaml) loaded and compiled at startup.	Done
cmd/gateway	Inline MCP proxy that intercepts live traffic between agents and MCP servers.	Phase 2

Detection rules

6 of 10 rules are implemented. The remaining four are planned for Phase 3.

Rule	Detects
secret_relay	A secret token from a restricted response relayed to an egress tool
restricted_read_external_write	Restricted data flowing to an external write path
pagination_exfiltration	Repeated calls with monotonically increasing page parameters
cross_scope_data_movement	Data from one scope being used in another
tool_poisoning_indicator	Instruction-like content embedded in tool responses
filesystem_traversal_sequence	File operations escalating toward sensitive paths
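
As an illustration of how a session-level rule differs from a per-call check, here is a minimal sketch of the idea behind pagination_exfiltration. The `Call` type, the `PaginationRun` function, and the threshold of 4 are assumptions for this example; the real rule's inputs and cutoffs are not documented here.

```go
package main

import "fmt"

// Call is a hypothetical record of one tool invocation and its page argument.
type Call struct {
	Tool string
	Page int
}

// PaginationRun returns the longest run of consecutive calls to the same
// tool with strictly increasing page numbers, and the tool it belongs to.
func PaginationRun(calls []Call) (tool string, length int) {
	run := 0
	for i, c := range calls {
		if i > 0 && c.Tool == calls[i-1].Tool && c.Page > calls[i-1].Page {
			run++
		} else {
			run = 1
		}
		if run > length {
			tool, length = c.Tool, run
		}
	}
	return tool, length
}

func main() {
	calls := []Call{
		{"support.read_ticket", 0},
		{"crm.lookup_customer", 1},
		{"crm.lookup_customer", 2},
		{"crm.lookup_customer", 3},
		{"crm.lookup_customer", 4},
	}
	const threshold = 4 // assumed cutoff, chosen only for this example
	if tool, n := PaginationRun(calls); n >= threshold {
		fmt.Printf("pagination_exfiltration candidate: tool=%s run=%d\n", tool, n)
	}
}
```

No single call in the slice is suspicious on its own; the signal exists only in the sequence, which is why the check needs session state.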

Architecture

How it works

Every request goes through two independent gates. Denied events are recorded but never reach the behavioral engine.

Agent makes a tool call

The request carries an agent identity, tool name, scope, and argument payload.

RBAC gate `engine/authz`

Evaluates against the compiled policy. Checks binding, scope, verb, and argument constraints. If denied, the decision is recorded and the call is dropped. If allowed, it continues.

Fingerprint extraction

Tokens, secrets, and structured fields are extracted and hashed from the payload.
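
A minimal sketch of this step, assuming secrets can be recognized by a simple pattern: the regular expression, the `Fingerprints` function, and the example key below are made up for illustration and are not Flint's extraction logic.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"regexp"
)

// secretPattern is an assumed heuristic that matches tokens shaped like
// "sk-<env>-<key>". A real extractor would cover many more formats.
var secretPattern = regexp.MustCompile(`sk-[a-z]+-[A-Za-z0-9]{8,}`)

// Fingerprints returns the SHA-256 hex digest of every distinct
// secret-like token found in payload, in order of first appearance.
func Fingerprints(payload string) []string {
	seen := map[string]bool{}
	var out []string
	for _, tok := range secretPattern.FindAllString(payload, -1) {
		sum := sha256.Sum256([]byte(tok))
		h := hex.EncodeToString(sum[:])
		if !seen[h] {
			seen[h] = true
			out = append(out, h)
		}
	}
	return out
}

func main() {
	// The key below is a made-up example value.
	response := `{"api_key": "sk-live-abc123def456"}`
	reply := "Here are the keys: sk-live-abc123def456"
	a, b := Fingerprints(response), Fingerprints(reply)
	fmt.Println("same fingerprint:", len(a) == 1 && len(b) == 1 && a[0] == b[0])
}
```

Comparing digests instead of raw tokens lets two payloads be matched without the session state holding the secret itself.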

Lineage building

Token matches and field overlaps between events create directed edges, building a causal graph of data movement across the session.
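
One way to picture edge creation, shown here as a sketch rather than the actual engine/lineage code: an edge is added whenever a fingerprint that appeared in an earlier event's response shows up in a later event's arguments. The `Observation` and `Edge` types and the `BuildEdges` function are hypothetical, and field-overlap matching is left out.

```go
package main

import "fmt"

// Observation holds the fingerprints seen in one event, split by direction.
// The shape is an assumption made for this example.
type Observation struct {
	EventID int
	In      []string // fingerprints found in the call arguments
	Out     []string // fingerprints found in the tool response
}

// Edge records that data produced by event From was consumed by event To.
type Edge struct {
	From, To int
	Shared   string
}

// BuildEdges walks events in order and links each input fingerprint to
// every earlier event whose response contained the same fingerprint.
func BuildEdges(obs []Observation) []Edge {
	producers := map[string][]int{}
	var edges []Edge
	for _, o := range obs {
		for _, fp := range o.In {
			for _, from := range producers[fp] {
				edges = append(edges, Edge{From: from, To: o.EventID, Shared: fp})
			}
		}
		// Register outputs after inputs so an event never links to itself.
		for _, fp := range o.Out {
			producers[fp] = append(producers[fp], o.EventID)
		}
	}
	return edges
}

func main() {
	obs := []Observation{
		{EventID: 1, Out: []string{"fp-ticket"}},
		{EventID: 2, Out: []string{"fp-key"}},
		{EventID: 3, In: []string{"fp-key"}},
	}
	for _, e := range BuildEdges(obs) {
		fmt.Printf("edge: event %d -> event %d (shared %s)\n", e.From, e.To, e.Shared)
	}
}
```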

Rule evaluation

Detection rules run against the graph. Each match produces a finding with severity, confidence, and a recommended action.

Risk scoring

Findings accumulate into a risk score. The session gets a final disposition: allow, warn, pause, or terminate.
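
The accumulation step might look like the sketch below. The four dispositions come from the text above; the `Finding` fields, the severity-times-confidence weighting, and the numeric thresholds are all assumptions chosen for illustration.

```go
package main

import "fmt"

// Finding is a hypothetical, reduced form of a rule match.
type Finding struct {
	Rule       string
	Severity   float64 // assumed weight in the range 0..1
	Confidence float64 // assumed probability in the range 0..1
}

// Score sums severity weighted by confidence across all findings.
func Score(findings []Finding) float64 {
	total := 0.0
	for _, f := range findings {
		total += f.Severity * f.Confidence
	}
	return total
}

// Disposition maps a cumulative score to one of the four session outcomes.
// The thresholds are placeholders, not Flint's real values.
func Disposition(score float64) string {
	switch {
	case score >= 1.5:
		return "terminate"
	case score >= 0.8:
		return "pause"
	case score >= 0.3:
		return "warn"
	default:
		return "allow"
	}
}

func main() {
	findings := []Finding{
		{Rule: "secret_relay", Severity: 1.0, Confidence: 0.97},
	}
	s := Score(findings)
	fmt.Printf("score=%.2f disposition=%s\n", s, Disposition(s))
}
```

Because the score is cumulative, several low-severity findings in one session can escalate the disposition even when no single finding would.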

Timeline

Roadmap

Phase 1 is shipped. Phase 2 is next.

Phase 1 · Complete

RBAC + Behavioral Engine

Kubernetes-style RBAC evaluator with a full audit trail per decision
Additive role bindings with argument-level constraints
Behavioral engine: lineage, fingerprinting, 6 detection rules
Trace replay CLI for development and attack simulation
31 tests, race detector clean

Phase 2 · Up Next

Live Gateway

Inline MCP proxy sitting between agents and MCP servers
tools/list filtering based on the discover verb
Real-time enforcement, not just replay

Phase 3 · Planned

Remaining Detection Rules

privilege_escalation_chain
sql_mutation_after_read
rapid_tool_switching
instruction_injection_echo

Phase 4 · Planned

Enterprise Features

Response redaction and PII classification
Approval workflows for high-risk actions
Audit log export
Operator UI
