Open Source  ·  MVP

Session firewall for MCP agents

Kubernetes-style RBAC for AI agents. Granular access control over which agents can call which tools, plus behavioral analysis that catches attack chains even when every individual call is authorized.

MCP has no enforcement layer

MCP servers expose tools. Agents call them. Nothing in between decides whether they should.

🔓

No access control

Any agent can call any tool regardless of what it's supposed to do. No roles, no scopes, no default deny.

🔗

Authorized call chains

Each individual call passes inspection. The attack happens across calls. Read secrets here, leak them there.

💉

Prompt injection

External content in tool responses instructs the agent to make calls it was never meant to make.

📄

Bulk extraction

Pagination-based extraction looks routine per request. It only becomes visible as a pattern across the session.

Two independent layers

Layer 1 catches unauthorized calls. Layer 2 catches authorized calls that chain into attacks.

Layer 1: RBAC

Kubernetes-style access control

Agents, roles, bindings, scopes, verbs, and constraints. Default deny. If you haven't granted it, it doesn't happen.

roles:
  - name: sql-readonly
    rules:
      - resources: ["db.execute_sql"]
        verbs: [invoke]
        constraints:
          sql_intent: [select]   # DELETE is blocked at the argument level

bindings:
  - agent: support-bot
    roles: [sql-readonly, support-agent]
    scopes: [support, customer_data]  # can't touch billing or code
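
The policy above can be read as a default-deny lookup: find the agent's binding, check the scope, then look for a rule that grants the tool and verb and whose constraints accept the arguments. The following is a minimal sketch of that flow, not the project's actual evaluator. The Go type names, the `Evaluate` signature, the naive first-keyword SQL classifier, the `customer_data` scope on the call, and the `scope_denied` and `no_matching_rule` reason strings are all assumptions made for illustration; only `no_binding` and `constraint_violation` appear in the replay output.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule, Role, Binding, and Policy mirror the YAML fields; the Go names are hypothetical.
type Rule struct {
	Resources []string
	Verbs     []string
	SQLIntent []string // allowed leading SQL keywords; empty means unconstrained
}

type Role struct{ Rules []Rule }

type Binding struct {
	Roles  []string
	Scopes []string
}

type Policy struct {
	Roles    map[string]Role
	Bindings map[string]Binding // keyed by agent name
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// sqlIntent is deliberately naive: it returns the lowercased first keyword.
func sqlIntent(query string) string {
	fields := strings.Fields(query)
	if len(fields) == 0 {
		return ""
	}
	return strings.ToLower(fields[0])
}

// Evaluate returns (allowed, reason). Anything not explicitly granted is denied.
func (p Policy) Evaluate(agent, tool, verb, scope, sql string) (bool, string) {
	b, ok := p.Bindings[agent]
	if !ok {
		return false, "no_binding"
	}
	if !contains(b.Scopes, scope) {
		return false, "scope_denied" // hypothetical reason string
	}
	constraintHit := false
	for _, roleName := range b.Roles {
		for _, r := range p.Roles[roleName].Rules {
			if !contains(r.Resources, tool) || !contains(r.Verbs, verb) {
				continue
			}
			if len(r.SQLIntent) > 0 && !contains(r.SQLIntent, sqlIntent(sql)) {
				constraintHit = true
				continue
			}
			return true, "role=" + roleName
		}
	}
	if constraintHit {
		return false, "constraint_violation"
	}
	return false, "no_matching_rule" // hypothetical reason string
}

// demoPolicy builds the sql-readonly role and support-bot binding from the YAML.
func demoPolicy() Policy {
	return Policy{
		Roles: map[string]Role{"sql-readonly": {Rules: []Rule{{
			Resources: []string{"db.execute_sql"},
			Verbs:     []string{"invoke"},
			SQLIntent: []string{"select"},
		}}}},
		Bindings: map[string]Binding{"support-bot": {
			Roles:  []string{"sql-readonly", "support-agent"},
			Scopes: []string{"support", "customer_data"},
		}},
	}
}

func main() {
	p := demoPolicy()
	for _, q := range []string{"SELECT api_key FROM integrations", "DELETE FROM integrations"} {
		ok, reason := p.Evaluate("support-bot", "db.execute_sql", "invoke", "customer_data", q)
		fmt.Println(ok, reason)
	}
}
```

Because the evaluator only ever returns true from inside a matching rule, an unknown agent, an unlisted scope, or an ungranted tool all fall through to a denial without any explicit deny rules.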

Layer 2: Behavioral Analysis

Session-level threat detection

Builds a causal graph across tool calls and detects when data flows from a restricted source to an external sink, even across multiple hops.

# Every call passed RBAC. The session is the attack surface.

[1] support.read_ticket     allowed   ticket_id: TICKET-4829
[2] db.execute_sql          allowed   SELECT api_key FROM integrations
[3] support.post_reply      allowed   body: "Here are the keys: sk-live-..."

secret_relay detected  event 2 -> event 3  confidence=0.97
Restricted SQL response flowed into a public-facing egress tool.

Three scenarios

Run against real trace files from the command line. No infrastructure needed.

flint-replay
$ ./flint-replay traces/rbac_demo_allow.json
loaded RBAC policy: 4 roles, 3 bindings

FLINT SESSION REPORT: rbac_demo_allow
-------------------------------------------
Session: sess_rbac_allow_001
Events: 5    Findings: 0

RBAC allowed: 3   denied: 0

✓ agent=support-bot tool=support.read_ticket role=support-agent
✓ agent=support-bot tool=db.execute_sql role=sql-readonly
✓ agent=support-bot tool=support.post_reply role=support-agent

SESSION ALLOWED


$ ./flint-replay traces/rbac_demo_sql_blocked.json

RBAC allowed: 1   denied: 1

✓ agent=support-bot tool=support.read_ticket role=support-agent
✗ agent=support-bot tool=db.execute_sql reason=constraint_violation

DELETE blocked by sql_intent constraint. Never reached the database.


$ ./flint-replay traces/rbac_demo_no_binding.json

RBAC allowed: 0   denied: 3

✗ agent=rogue-agent tool=db.execute_sql reason=no_binding
✗ agent=rogue-agent tool=crm.lookup_customer reason=no_binding
✗ agent=rogue-agent tool=slack.post_message reason=no_binding

0 events reached the behavioral engine.

What's built

Everything is written in Go. The RBAC evaluator is a pure function over an immutable compiled policy, and every decision carries a full audit trail.

| Component | Description | Status |
|---|---|---|
| engine/authz | Pure function RBAC evaluator. Immutable compiled policy, hardcoded denial precedence, and per-constraint alias tables. | Done |
| engine/session | Session state and event types. Verb constants, PolicyDecision struct with full rule provenance on every decision. | Done |
| engine/lineage | Causal edge builder using token matching and field overlap across payloads. | Done |
| engine/rules | Behavioral rule evaluator with 6 detection rules covering known MCP attack patterns. | Done |
| engine/risk | Cumulative risk scorer. Outputs a session disposition: allow, warn, pause, or terminate. | Done |
| cmd/replay | CLI for loading and running JSON session fixtures against the full engine. | Done |
| config/ | YAML policy files (roles.yaml and bindings.yaml) loaded and compiled at startup. | Done |
| cmd/gateway | Inline MCP proxy that intercepts live traffic between agents and MCP servers. | Phase 2 |

Detection rules: 6 of 10 implemented.

| Rule | Detects |
|---|---|
| secret_relay | A secret token from a restricted response relayed to an egress tool |
| restricted_read_external_write | Restricted data flowing to an external write path |
| pagination_exfiltration | Repeated calls with monotonically increasing page parameters |
| cross_scope_data_movement | Data from one scope being used in another |
| tool_poisoning_indicator | Instruction-like content embedded in tool responses |
| filesystem_traversal_sequence | File operations escalating toward sensitive paths |
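
To make one of these rules concrete, here is a rough sketch of how a check like pagination_exfiltration could work: track the page parameter across calls to one tool and flag the session once a strictly increasing run gets long enough. The `Call` type, the function names, and the threshold of 5 are assumptions for illustration and may not match how the project implements the rule.

```go
package main

import "fmt"

// Call is a hypothetical, simplified view of one tool call in a session.
type Call struct {
	Tool string
	Page int
}

// longestPageRun returns the length of the longest run of calls to the given
// tool in which the page parameter strictly increases from call to call.
// Interleaved calls to other tools are ignored and do not break the run.
func longestPageRun(calls []Call, tool string) int {
	best, run, last := 0, 0, 0
	for _, c := range calls {
		if c.Tool != tool {
			continue
		}
		if run > 0 && c.Page > last {
			run++
		} else {
			run = 1 // first call, or the page went backwards: start over
		}
		last = c.Page
		if run > best {
			best = run
		}
	}
	return best
}

// isPaginationExfiltration flags the session once the run reaches threshold.
func isPaginationExfiltration(calls []Call, tool string, threshold int) bool {
	return longestPageRun(calls, tool) >= threshold
}

func main() {
	var calls []Call
	for page := 1; page <= 6; page++ {
		calls = append(calls, Call{Tool: "crm.lookup_customer", Page: page})
	}
	// The threshold of 5 is an arbitrary value chosen for this sketch.
	fmt.Println(isPaginationExfiltration(calls, "crm.lookup_customer", 5))
}
```

No single call in the run looks unusual; the signal only exists in the sequence, which is why the check needs session state rather than per-request inspection.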

How it works

Every request goes through two independent gates. Denied events are recorded but never reach the behavioral engine.

Agent makes a tool call

The request carries an agent identity, tool name, scope, and argument payload.

RBAC gate  engine/authz

Evaluates against the compiled policy. Checks binding, scope, verb, and argument constraints. If denied, the decision is recorded and the call is dropped. If allowed, it continues.

Fingerprint extraction

Tokens, secrets, and structured fields are extracted and hashed from the payload.
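
As a rough illustration of this step, the sketch below pulls secret-like tokens out of a payload and hashes them so later stages can compare fingerprints instead of raw values. The regular expression (keys shaped like `sk-live-...`), the truncated SHA-256 digest, and the function names are assumptions chosen for the example; the project's extractor may use different patterns and hashing.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"regexp"
)

// secretPattern is an illustrative assumption: it only matches tokens shaped
// like "sk-live-..." or "sk-test-..." followed by at least 8 alphanumerics.
var secretPattern = regexp.MustCompile(`sk-(?:live|test)-[A-Za-z0-9]{8,}`)

// fingerprint hashes a token so the raw secret does not need to be stored.
func fingerprint(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:8]) // first 8 bytes, shortened for readability
}

// extractFingerprints returns the fingerprint of each secret-like token in
// payload, deduplicated, in order of first appearance.
func extractFingerprints(payload string) []string {
	seen := map[string]bool{}
	var out []string
	for _, tok := range secretPattern.FindAllString(payload, -1) {
		fp := fingerprint(tok)
		if !seen[fp] {
			seen[fp] = true
			out = append(out, fp)
		}
	}
	return out
}

func main() {
	response := `{"api_key": "sk-live-abcdef123456"}`
	reply := "Here are the keys: sk-live-abcdef123456"
	// The same token yields the same fingerprint in both payloads.
	fmt.Println(extractFingerprints(response), extractFingerprints(reply))
}
```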

Lineage building

Token matches and field overlaps between events create directed edges, building a causal graph of data movement across the session.
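
A minimal sketch of the token-matching half of this step, assuming each event already carries the fingerprints found in its arguments and in its response: when a fingerprint first produced by one event's response shows up in a later event's arguments, an edge is added from the producer to the consumer. The `Event` and `Edge` types and the `buildEdges` name are hypothetical, and field-overlap matching is left out.

```go
package main

import "fmt"

// Event is a hypothetical record of one allowed tool call.
type Event struct {
	ID   int
	Tool string
	In   []string // fingerprints found in the call arguments
	Out  []string // fingerprints found in the tool response
}

// Edge says data first seen in From's response reappeared in To's arguments.
type Edge struct {
	From, To    int
	Fingerprint string
}

// buildEdges walks events in session order and links each consumed
// fingerprint back to the earliest event whose response produced it.
func buildEdges(events []Event) []Edge {
	var edges []Edge
	producer := map[string]int{} // fingerprint -> ID of first producing event
	for _, ev := range events {
		for _, fp := range ev.In {
			if src, ok := producer[fp]; ok {
				edges = append(edges, Edge{From: src, To: ev.ID, Fingerprint: fp})
			}
		}
		// Register outputs after inputs so an event never links to itself.
		for _, fp := range ev.Out {
			if _, ok := producer[fp]; !ok {
				producer[fp] = ev.ID
			}
		}
	}
	return edges
}

func main() {
	events := []Event{
		{ID: 1, Tool: "support.read_ticket"},
		{ID: 2, Tool: "db.execute_sql", Out: []string{"fp-key"}},
		{ID: 3, Tool: "support.post_reply", In: []string{"fp-key"}},
	}
	for _, e := range buildEdges(events) {
		fmt.Printf("event %d -> event %d (%s)\n", e.From, e.To, e.Fingerprint)
	}
}
```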

Rule evaluation

Detection rules run against the graph. Each match produces a finding with severity, confidence, and a recommended action.
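
One way a rule such as secret_relay could be evaluated over the graph is a reachability check: start at every event whose tool is marked restricted, follow edges, and report any egress tool that can be reached, including through intermediate hops. The sketch below makes that concrete. The types, the restricted and egress tool sets, and the fixed severity, confidence, and action values are placeholders chosen for illustration, not the project's scoring.

```go
package main

import "fmt"

type Event struct {
	ID   int
	Tool string
}

type Edge struct{ From, To int }

// Finding carries the fields the text describes; the values set below are placeholders.
type Finding struct {
	Rule       string
	From, To   int
	Severity   string
	Confidence float64
	Action     string
}

// reachable returns every event ID reachable from start by following edges.
func reachable(adj map[int][]int, start int) []int {
	seen := map[int]bool{start: true}
	stack := []int{start}
	var out []int
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, next := range adj[n] {
			if !seen[next] {
				seen[next] = true
				out = append(out, next)
				stack = append(stack, next)
			}
		}
	}
	return out
}

// secretRelay reports each restricted source whose data reaches an egress tool.
func secretRelay(events []Event, edges []Edge, restricted, egress map[string]bool) []Finding {
	adj := map[int][]int{}
	for _, e := range edges {
		adj[e.From] = append(adj[e.From], e.To)
	}
	tool := map[int]string{}
	for _, ev := range events {
		tool[ev.ID] = ev.Tool
	}
	var findings []Finding
	for _, ev := range events {
		if !restricted[ev.Tool] {
			continue
		}
		for _, id := range reachable(adj, ev.ID) {
			if egress[tool[id]] {
				findings = append(findings, Finding{
					Rule: "secret_relay", From: ev.ID, To: id,
					Severity: "high", Confidence: 0.9, Action: "terminate", // placeholder values
				})
			}
		}
	}
	return findings
}

func main() {
	events := []Event{{1, "support.read_ticket"}, {2, "db.execute_sql"}, {3, "support.post_reply"}}
	edges := []Edge{{From: 2, To: 3}}
	restricted := map[string]bool{"db.execute_sql": true}
	egress := map[string]bool{"support.post_reply": true}
	for _, f := range secretRelay(events, edges, restricted, egress) {
		fmt.Printf("%s detected  event %d -> event %d\n", f.Rule, f.From, f.To)
	}
}
```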

Risk scoring

Findings accumulate into a risk score. The session gets a final disposition: allow, warn, pause, or terminate.
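
The accumulation can be pictured as a weighted sum mapped onto thresholds. In the sketch below each finding contributes its severity weight scaled by its confidence, and the total selects one of the four dispositions. The weights, the thresholds, and the type and function names are invented for illustration; the source does not state how the project's scorer is calibrated.

```go
package main

import "fmt"

// Finding is reduced here to the two fields the scorer needs.
type Finding struct {
	Severity   string
	Confidence float64 // expected in the range 0..1
}

// severityWeight holds hypothetical weights; unknown severities score 0.
var severityWeight = map[string]float64{
	"low":      10,
	"medium":   25,
	"high":     50,
	"critical": 100,
}

// riskScore sums severity weight times confidence over all findings.
func riskScore(findings []Finding) float64 {
	total := 0.0
	for _, f := range findings {
		total += severityWeight[f.Severity] * f.Confidence
	}
	return total
}

// disposition maps a score to allow, warn, pause, or terminate.
// The thresholds are arbitrary values chosen for this sketch.
func disposition(score float64) string {
	switch {
	case score >= 80:
		return "terminate"
	case score >= 50:
		return "pause"
	case score >= 20:
		return "warn"
	default:
		return "allow"
	}
}

func main() {
	findings := []Finding{{Severity: "critical", Confidence: 0.97}}
	fmt.Println(disposition(riskScore(findings)))
}
```

Keeping the score cumulative means several low-severity findings in one session can still add up to a pause or terminate, which matches the idea that the session, not the single call, is what gets judged.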

Roadmap

Phase 1 is shipped. Phase 2 is next.

Phase 1  ·  Complete

RBAC + Behavioral Engine

  • Kubernetes-style RBAC evaluator with a full audit trail per decision
  • Additive role bindings with argument-level constraints
  • Behavioral engine: lineage, fingerprinting, 6 detection rules
  • Trace replay CLI for development and attack simulation
  • 31 tests, race detector clean

Phase 2  ·  Up Next

Live Gateway

  • Inline MCP proxy sitting between agents and MCP servers
  • tools/list filtering based on the discover verb
  • Real-time enforcement, not just replay

Phase 3  ·  Planned

Remaining Detection Rules

  • privilege_escalation_chain
  • sql_mutation_after_read
  • rapid_tool_switching
  • instruction_injection_echo

Phase 4  ·  Planned

Enterprise Features

  • Response redaction and PII classification
  • Approval workflows for high-risk actions
  • Audit log export
  • Operator UI