Policies

Turn a risk score into an enforceable verdict per agent, tool, and request attribute.

Content inspection produces a risk score and signals; a policy turns these into a verdict (allow, deny, require_approval, or redact) for a given tool, agent, and set of request attributes. Policies are authored in the dashboard (Outbound → Policies) or over the workspace API.

Not part of the public gateway contract

Policy CRUD is a workspace-authenticated control-plane API (session or an API key with policies:read / policies:write), not part of the versioned gateway contract. The shapes below are the source of truth.

Anatomy of a policy

Field	Type	Notes
`name`	string	1–120 chars.
`toolPattern`	string (glob)	Matches the tool name; `*` matches one or more characters (`github.*`, `aws.delete_*`, exact `gmail.send`).
`action`	`allow` · `deny` · `require_approval` · `redact`	The verdict when the rule fires. `redact` masks matched PII/secrets in a model completion and passes it through; it is the weakest enforcement action, overridden by both `deny` and `require_approval`.
`riskThreshold`	int 0–100 · `null`	Fallback rule: deny when the call's risk ≥ threshold. `null` for a non-threshold rule.
`signalCategory`	`secret` · `pii` · `destructive` · `injection` · `egress` · `malware` · `financial` · `null`	Fire only when a detector flagged that category.
`context`	object · `null`	Attribute conditions (ABAC). See below.
`priority`	int	Lower is evaluated first. Default `100`.
`enabled`	boolean	Default `true`.
`enforcementMode`	`monitor` · `enforce`	`enforce` (the default for hand-authored rules) acts on the verdict immediately; `monitor` records the would-be decision without blocking, for safe rollout. AI-proposed rules are minted as `monitor`.
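
The toolPattern glob can be illustrated with a short matcher. This is a minimal sketch, not the gateway's implementation: the function names are hypothetical, and it assumes `*` is the only wildcard and that the whole tool name must match.

```python
import re


def tool_pattern_to_regex(pattern: str) -> "re.Pattern":
    """Translate a toolPattern glob into a regex.

    Assumption: `*` is the only wildcard and stands for one or more
    characters; every other character is matched literally.
    """
    literal_parts = (re.escape(piece) for piece in pattern.split("*"))
    return re.compile(".+".join(literal_parts))


def tool_matches(pattern: str, tool_name: str) -> bool:
    """Return True when the full tool name satisfies the pattern."""
    return tool_pattern_to_regex(pattern).fullmatch(tool_name) is not None


print(tool_matches("aws.delete_*", "aws.delete_bucket"))
print(tool_matches("aws.delete_*", "aws.delete_"))
print(tool_matches("gmail.send", "gmail.send"))
```

Under these assumptions the bare prefix `aws.delete_` does not match `aws.delete_*`, because the wildcard needs at least one character.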

How a verdict is reached

Enabled rules whose toolPattern and context match the call are evaluated in priority order (lowest number first) under deny-overrides:

A matching deny wins: the call is blocked.
Otherwise a matching require_approval holds the call for a human (the SDK waits the hold out and resolves to the final allow / deny).
Otherwise a matching allow passes the call.
If no rule sets a verdict, the highest-priority riskThreshold rule (the one with the lowest priority number) denies when the call's risk ≥ its threshold.
Otherwise, allow.

Explicit rules (no signalCategory) take precedence over signal-aware ones, so a blanket deny can't be undercut by a narrower signal rule.
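
The precedence described in this section can be sketched as follows. This is an illustration under stated assumptions, not the gateway's implementation: `Rule` and `decide` are hypothetical names, the input is assumed to contain only rules whose toolPattern and context already matched, a rule with a riskThreshold is treated purely as a fallback rule, and `redact`, `enforcementMode`, and the explicit-versus-signal precedence are left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Rule:
    name: str
    action: str                           # "allow", "deny", or "require_approval"
    priority: int = 100                   # lower is evaluated first
    risk_threshold: Optional[int] = None  # set only on fallback rules
    enabled: bool = True


def decide(matching_rules: List[Rule], risk: int) -> Tuple[str, Optional[str]]:
    """Return (verdict, name of the deciding rule) under deny-overrides."""
    active = sorted((r for r in matching_rules if r.enabled),
                    key=lambda r: r.priority)
    verdict_rules = [r for r in active if r.risk_threshold is None]

    # deny beats require_approval, which beats allow.
    for action in ("deny", "require_approval", "allow"):
        for rule in verdict_rules:
            if rule.action == action:
                return action, rule.name

    # Fallback: only the first threshold rule in priority order is consulted.
    for rule in active:
        if rule.risk_threshold is not None:
            if risk >= rule.risk_threshold:
                return "deny", rule.name
            break

    return "allow", None


rules = [Rule("allow-github", "allow", priority=10),
         Rule("block-deletes", "deny", priority=200)]
print(decide(rules, risk=5))
```

In this sketch the deny wins even though the allow rule has the lower priority number, which is the point of deny-overrides.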

Attribute conditions (ABAC)

A policy's optional context fires the rule only when the request satisfies every constraint present (AND). Each set constraint supports negate.

Constraint	Shape	Matches
`ip`	`{ anyOf: string[], negate? }`	Source IP in a CIDR list (or exact IPv4).
`geo`	`{ anyOf: ["US","DE"…], negate? }`	Caller country (ISO-3166 alpha-2), resolved at the edge. Deny-on-unknown, like `ip`.
`time`	`{ windows: [{ days?, start, end }], tz?, negate? }`	Time-of-day windows in an IANA `tz` (default UTC); a window wraps past midnight when `end ≤ start`.
`resource.environment`	`{ anyOf: ["production"…], negate? }`	The target's environment, from the MCP server registry.
`resource.type`	`{ anyOf: ["database","http_api","filesystem","messaging","other"], negate? }`	The target's resource type.
`resource.host`	`{ anyOf: string[], negate? }`	Destination host; `*.corp.com` matches that suffix or below.
`agent.labels`	`{ anyOf: string[], negate? }`	The governing agent's identity labels (case-insensitive).
`mlThreatClass`	`{ anyOf: ["prompt_injection","jailbreak","data_exfiltration","malware","social_engineering","policy_violation"] }`	The agent's most recent ML threat assessment. Fail-open when no assessment exists.
`args`	`[{ path, …operators, negate? }]`	Argument-value least-privilege on the call's inputs. See Argument-value constraints below.

A combined example that holds any `db.*` call against a production database outside the 09:00–18:00 America/New_York window:

{
  "name": "Approve prod DB writes off-hours",
  "toolPattern": "db.*",
  "action": "require_approval",
  "context": {
    "resource": {
      "environment": { "anyOf": ["production"] },
      "type": { "anyOf": ["database"] }
    },
    "time": {
      "windows": [{ "days": [1, 2, 3, 4, 5], "start": "09:00", "end": "18:00" }],
      "tz": "America/New_York",
      "negate": true
    }
  },
  "priority": 50,
  "enabled": true
}
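
How the time constraint in the example above might be evaluated can be sketched like this. The helper names are hypothetical, and the sketch makes several assumptions: the caller has already converted the instant to the constraint's `tz`, `days` uses ISO weekday numbers (1 = Monday), a window's `end` is exclusive, and the day is checked against the local date of the instant.

```python
from datetime import datetime, time


def in_window(now: time, start: str, end: str) -> bool:
    """Check a time of day against a start/end pair given as "HH:MM"."""
    s = time.fromisoformat(start)
    e = time.fromisoformat(end)
    if e <= s:
        # The window wraps past midnight, e.g. 22:00 to 06:00.
        return now >= s or now < e
    return s <= now < e


def time_constraint_matches(constraint: dict, local_now: datetime) -> bool:
    """Evaluate a `time` constraint; `local_now` is already in the target tz."""
    hit = any(
        ("days" not in w or local_now.isoweekday() in w["days"])
        and in_window(local_now.time(), w["start"], w["end"])
        for w in constraint["windows"]
    )
    return not hit if constraint.get("negate") else hit


business_hours_negated = {
    "windows": [{"days": [1, 2, 3, 4, 5], "start": "09:00", "end": "18:00"}],
    "tz": "America/New_York",
    "negate": True,
}
# Wednesday 22:30 local time falls outside the window, so the constraint is met.
print(time_constraint_matches(business_hours_negated, datetime(2024, 1, 3, 22, 30)))
```

Because the constraint is negated, it is satisfied exactly when no window contains the current time, which also covers whole days not listed in `days`.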

Argument-value constraints

args narrows a rule by the tool's inputs, not just by which tool is called. Each entry selects a value by dot-path into the call's arguments and constrains it; all entries are ANDed. This holds any s3.put_object call whose bucket name does not start with public-:

{
  "name": "Approve non-public S3 writes",
  "toolPattern": "s3.put_object",
  "action": "require_approval",
  "context": {
    "args": [{ "path": "params.bucket", "glob": "public-*", "negate": true }]
  }
}

Each entry supports set membership (anyOf), glob, regex, contains, numeric bounds (gte / lte / gt / lt), and exists, with an optional negate. A set guard is deny-on-unknown (an absent argument leaves the allowlist unmet); the numeric, glob, regex, and contains operators fail open on a missing or non-coercible value, so a malformed call never denies by accident.
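
One possible reading of these semantics is sketched below for three representative operators (anyOf, glob, and the numeric bounds). Everything here is an assumption made for illustration: the function names are hypothetical, the call's arguments are taken to be a nested object rooted so that `params.bucket` resolves, Python's `fnmatchcase` stands in for the glob dialect, and "fail open" is read as "the entry does not match, regardless of negate".

```python
import operator
from fnmatch import fnmatchcase

_MISSING = object()
_BOUNDS = {"gte": operator.ge, "lte": operator.le,
           "gt": operator.gt, "lt": operator.lt}


def resolve(args: dict, path: str):
    """Follow a dot-path into nested dicts; return _MISSING when absent."""
    node = args
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def entry_matches(entry: dict, args: dict) -> bool:
    value = resolve(args, entry["path"])
    if "anyOf" in entry:
        # Set guard: an absent value leaves the allowlist unmet.
        hit = value is not _MISSING and value in entry["anyOf"]
    elif "glob" in entry:
        if not isinstance(value, str):
            return False  # fail open on a missing or non-string value
        hit = fnmatchcase(value, entry["glob"])
    elif any(op in entry for op in _BOUNDS):
        try:
            number = float(value)
        except (TypeError, ValueError):
            return False  # fail open on a missing or non-coercible value
        hit = all(check(number, entry[op])
                  for op, check in _BOUNDS.items() if op in entry)
    else:
        raise ValueError("operator not covered by this sketch")
    return not hit if entry.get("negate") else hit


def args_match(entries: list, args: dict) -> bool:
    """All entries are ANDed."""
    return all(entry_matches(e, args) for e in entries)


rule_args = [{"path": "params.bucket", "glob": "public-*", "negate": True}]
print(args_match(rule_args, {"params": {"bucket": "internal-logs"}}))
print(args_match(rule_args, {"params": {}}))
```

Under this reading, a call with no bucket argument does not trigger the hold, while a negated set guard on an absent argument does fire.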

More predicates

A policy context also carries the spend and behavior predicates documented on their own pages: budget and model / callCostUsd (see Budgets and Spend governance), plus agent identity (ids, labels, attributes), agentRisk, agentDaysOld, and novelty. Every predicate present is ANDed, and any one absent leaves the rule behaving exactly as before.

Test before you ship

POST /api/policies/simulate: dry-run a tool call against your live policy set; returns the decision, reason, risk, signals, and which policy matched.
POST /api/policies/backtest: replay a draft rule over recent audit logs and report how many decisions would flip, before you enable it.
POST /api/policy-suggestions/generate: mine recent traffic for candidate rules; accept or dismiss each via PATCH /api/policy-suggestions/{id}.

Next steps

Content-inspection engine: the signals and risk a policy acts on.
Secrets broker: inject upstream credentials a policy never exposes.
MCP gateway: where a server's environment and resource type are set.