Kubernetes-style RBAC for AI agents. Granular access control over which agents can call which tools, plus behavioral analysis that catches attack chains even when every individual call is authorized.
MCP servers expose tools. Agents call them. Nothing in between decides whether they should.
Any agent can call any tool regardless of what it's supposed to do. No roles, no scopes, no default deny.
Each individual call passes inspection. The attack happens across calls. Read secrets here, leak them there.
External content in tool responses instructs the agent to make calls it was never meant to make.
Pagination-based extraction looks routine per request. It only becomes visible as a pattern across the session.
Layer 1 catches unauthorized calls. Layer 2 catches authorized calls that chain into attacks.
Agents, roles, bindings, scopes, verbs, and constraints. Default deny. If you haven't granted it, it doesn't happen.
roles:
  - name: sql-readonly
    rules:
      - resources: ["db.execute_sql"]
        verbs: [invoke]
        constraints:
          sql_intent: [select]  # DELETE is blocked at the argument level
bindings:
  - agent: support-bot
    roles: [sql-readonly, support-agent]
    scopes: [support, customer_data]  # can't touch billing or code
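To make the default-deny behavior concrete, here is a minimal Go sketch of how a policy like the one above might be evaluated. It is not the engine/authz implementation: the `Policy`, `Rule`, `Binding`, and `Request` types, the `Allow` method, and the first-keyword `sqlIntent` classifier are all assumptions made for illustration, and the `tickets` table is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule grants verbs on tools, optionally restricted to certain SQL intents.
type Rule struct {
	Resources  []string
	Verbs      []string
	SQLIntents []string // empty means no argument constraint
}

// Binding attaches roles and scopes to one agent.
type Binding struct {
	Roles  []string
	Scopes []string
}

// Policy is a hypothetical in-memory form of the YAML policy.
type Policy struct {
	Roles    map[string][]Rule
	Bindings map[string]Binding
}

// Request is a single tool call to authorize.
type Request struct {
	Agent, Tool, Verb, Scope, SQL string
}

func contains(list []string, v string) bool {
	for _, x := range list {
		if x == v {
			return true
		}
	}
	return false
}

// sqlIntent is a naive classifier (assumption): the lowercased first keyword.
func sqlIntent(sql string) string {
	fields := strings.Fields(sql)
	if len(fields) == 0 {
		return ""
	}
	return strings.ToLower(fields[0])
}

// Allow returns true only when some rule explicitly grants the call.
// Every other path falls through to false, which is the default deny.
func (p Policy) Allow(r Request) bool {
	b, ok := p.Bindings[r.Agent]
	if !ok || !contains(b.Scopes, r.Scope) {
		return false
	}
	for _, role := range b.Roles {
		for _, rule := range p.Roles[role] {
			if !contains(rule.Resources, r.Tool) || !contains(rule.Verbs, r.Verb) {
				continue
			}
			if len(rule.SQLIntents) > 0 && !contains(rule.SQLIntents, sqlIntent(r.SQL)) {
				continue
			}
			return true
		}
	}
	return false
}

// examplePolicy mirrors the sql-readonly role and support-bot binding.
func examplePolicy() Policy {
	return Policy{
		Roles: map[string][]Rule{
			"sql-readonly": {{
				Resources:  []string{"db.execute_sql"},
				Verbs:      []string{"invoke"},
				SQLIntents: []string{"select"},
			}},
		},
		Bindings: map[string]Binding{
			"support-bot": {
				Roles:  []string{"sql-readonly", "support-agent"},
				Scopes: []string{"support", "customer_data"},
			},
		},
	}
}

func main() {
	p := examplePolicy()
	sel := Request{Agent: "support-bot", Tool: "db.execute_sql", Verb: "invoke", Scope: "support", SQL: "SELECT id FROM tickets"}
	del := sel
	del.SQL = "DELETE FROM tickets"
	billing := sel
	billing.Scope = "billing"
	fmt.Println(p.Allow(sel), p.Allow(del), p.Allow(billing)) // true false false
}
```

The only `return true` sits behind every check, so an unknown agent, an unbound scope, or an unlisted SQL intent is denied without needing an explicit deny rule.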
Builds a causal graph across tool calls and detects when data flows from a restricted source to an external sink, even across multiple hops.
# Every call passed RBAC. The session is the attack surface.
[1] support.read_ticket   allowed   ticket_id: TICKET-4829
[2] db.execute_sql        allowed   SELECT api_key FROM integrations
[3] support.post_reply    allowed   body: "Here are the keys: sk-live-..."
secret_relay detected: event 2 -> event 3, confidence=0.97
Restricted SQL response flowed into a public-facing egress tool.
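The trace above can be reproduced with a small Go sketch of a secret-relay check. This is an illustration under assumptions, not the engine's rule: the `Event` and `Finding` types, the `Restricted` and `Egress` labels, the `sk-live-` regular expression, and the dummy token value are all hypothetical, and no confidence score is computed.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"regexp"
)

// Event is one allowed tool call in a session.
type Event struct {
	ID         int
	Tool       string
	Args       string
	Response   string
	Restricted bool // response carries restricted data (assumed label)
	Egress     bool // tool writes to a public-facing channel (assumed label)
}

// Finding links a restricted source event to an egress sink event.
type Finding struct {
	Rule     string
	From, To int
}

// secretPattern is an assumed token shape, used only for this sketch.
var secretPattern = regexp.MustCompile(`sk-live-[A-Za-z0-9]+`)

// tokenHashes extracts secret-like tokens and returns their SHA-256 hashes,
// so the raw secret never has to be stored in session state.
func tokenHashes(text string) [][32]byte {
	var out [][32]byte
	for _, tok := range secretPattern.FindAllString(text, -1) {
		out = append(out, sha256.Sum256([]byte(tok)))
	}
	return out
}

// DetectSecretRelay reports egress calls whose arguments contain a token
// first seen in the response of an earlier restricted call.
func DetectSecretRelay(events []Event) []Finding {
	seen := map[[32]byte]int{} // token hash -> ID of the restricted source event
	var findings []Finding
	for _, e := range events {
		if e.Egress {
			for _, h := range tokenHashes(e.Args) {
				if src, ok := seen[h]; ok {
					findings = append(findings, Finding{Rule: "secret_relay", From: src, To: e.ID})
				}
			}
		}
		if e.Restricted {
			for _, h := range tokenHashes(e.Response) {
				if _, ok := seen[h]; !ok {
					seen[h] = e.ID
				}
			}
		}
	}
	return findings
}

func main() {
	// sk-live-EXAMPLE123 is a dummy value, not a real key.
	events := []Event{
		{ID: 1, Tool: "support.read_ticket", Args: "ticket_id: TICKET-4829"},
		{ID: 2, Tool: "db.execute_sql", Args: "SELECT api_key FROM integrations",
			Response: "api_key=sk-live-EXAMPLE123", Restricted: true},
		{ID: 3, Tool: "support.post_reply",
			Args: "Here are the keys: sk-live-EXAMPLE123", Egress: true},
	}
	for _, f := range DetectSecretRelay(events) {
		fmt.Printf("%s: event %d -> event %d\n", f.Rule, f.From, f.To)
	}
	// prints "secret_relay: event 2 -> event 3"
}
```

Each call here would pass an RBAC check on its own. The finding only appears because token hashes are carried across events in the same session.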
Run against real trace files from the command line. No infrastructure needed.
Everything is written in Go. Pure evaluator, immutable compiled policy, and a full audit trail on every decision.
| Component | Description | Status |
|---|---|---|
| engine/authz | Pure function RBAC evaluator. Immutable compiled policy, hardcoded denial precedence, and per-constraint alias tables. | Done |
| engine/session | Session state and event types. Verb constants, PolicyDecision struct with full rule provenance on every decision. | Done |
| engine/lineage | Causal edge builder using token matching and field overlap across payloads. | Done |
| engine/rules | Behavioral rule evaluator with 6 detection rules covering known MCP attack patterns. | Done |
| engine/risk | Cumulative risk scorer. Outputs a session disposition: allow, warn, pause, or terminate. | Done |
| cmd/replay | CLI for loading and running JSON session fixtures against the full engine. | Done |
| config/ | YAML policy files (roles.yaml and bindings.yaml) loaded and compiled at startup. | Done |
| cmd/gateway | Inline MCP proxy that intercepts live traffic between agents and MCP servers. | Phase 2 |
6 of 10 detection rules are implemented.
| Rule | Detects |
|---|---|
| secret_relay | A secret token from a restricted response relayed to an egress tool |
| restricted_read_external_write | Restricted data flowing to an external write path |
| pagination_exfiltration | Repeated calls with monotonically increasing page parameters |
| cross_scope_data_movement | Data from one scope being used in another |
| tool_poisoning_indicator | Instruction-like content embedded in tool responses |
| filesystem_traversal_sequence | File operations escalating toward sensitive paths |
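As one example of how a rule in this table might work, the Go sketch below flags a tool once it has been called with strictly increasing page values several times in a row. The `Call` type, the `DetectPagination` function, the `minRun` threshold, and the `crm.list_customers` tool name are assumptions for illustration. The actual pagination_exfiltration rule may weigh other signals.

```go
package main

import "fmt"

// Call is one tool invocation with its page parameter already parsed out.
type Call struct {
	Tool string
	Page int
}

// DetectPagination returns the tools whose page parameter increased
// strictly across at least minRun successive calls to that tool.
// Calls to other tools in between do not break a run.
func DetectPagination(calls []Call, minRun int) []string {
	last := map[string]int{}     // tool -> last page seen
	run := map[string]int{}      // tool -> length of the current increasing run
	flagged := map[string]bool{} // tool -> already reported
	var out []string
	for _, c := range calls {
		if prev, ok := last[c.Tool]; ok && c.Page > prev {
			run[c.Tool]++
		} else {
			run[c.Tool] = 1 // first call, or page went backwards: restart the run
		}
		last[c.Tool] = c.Page
		if run[c.Tool] >= minRun && !flagged[c.Tool] {
			flagged[c.Tool] = true
			out = append(out, c.Tool)
		}
	}
	return out
}

func main() {
	calls := []Call{
		{Tool: "crm.list_customers", Page: 1},
		{Tool: "crm.list_customers", Page: 2},
		{Tool: "support.read_ticket", Page: 0},
		{Tool: "crm.list_customers", Page: 3},
		{Tool: "crm.list_customers", Page: 4},
	}
	fmt.Println(DetectPagination(calls, 4)) // [crm.list_customers]
}
```

Any single request in this sequence looks routine. The signal is the run length, which only exists as session-level state.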
Every request goes through two independent gates. Denied events are recorded but never reach the behavioral engine.
The request carries an agent identity, tool name, scope, and argument payload.
engine/authz evaluates the request against the compiled policy. It checks binding, scope, verb, and argument constraints. If denied, the decision is recorded and the call is dropped. If allowed, it continues.
Tokens, secrets, and structured fields are extracted and hashed from the payload.
Token matches and field overlaps between events create directed edges, building a causal graph of data movement across the session.
Detection rules run against the graph. Each match produces a finding with severity, confidence, and a recommended action.
Findings accumulate into a risk score. The session gets a final disposition: allow, warn, pause, or terminate.
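The last step of this pipeline, turning findings into a disposition, could look roughly like the Go sketch below. The severity levels, the weights, the thresholds, and the `Score` and `Disposition` functions are invented for illustration. The source only states that findings carry severity and confidence and that the outcome is allow, warn, pause, or terminate.

```go
package main

import "fmt"

// Finding is the output of one detection rule match.
type Finding struct {
	Rule       string
	Severity   string  // "low", "medium", "high", "critical" (assumed levels)
	Confidence float64 // 0.0 to 1.0
}

// severityWeight holds assumed weights, not values taken from engine/risk.
var severityWeight = map[string]float64{
	"low":      1,
	"medium":   3,
	"high":     6,
	"critical": 10,
}

// Score sums weight times confidence over all findings in the session.
func Score(findings []Finding) float64 {
	total := 0.0
	for _, f := range findings {
		total += severityWeight[f.Severity] * f.Confidence
	}
	return total
}

// Disposition maps a cumulative score onto the four session outcomes.
// The thresholds are assumptions chosen for this sketch.
func Disposition(score float64) string {
	switch {
	case score >= 9:
		return "terminate"
	case score >= 5:
		return "pause"
	case score >= 2:
		return "warn"
	default:
		return "allow"
	}
}

func main() {
	findings := []Finding{
		{Rule: "secret_relay", Severity: "critical", Confidence: 0.97},
	}
	s := Score(findings)
	fmt.Printf("score=%.1f disposition=%s\n", s, Disposition(s))
	// prints "score=9.7 disposition=terminate"
}
```

Because the score is cumulative, several medium-severity findings can push a session past a threshold that no single finding would reach.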
Phase 1 is shipped. Phase 2 is next.