The Inference Gateway: the security boundary

The component that holds the keys and guards the egress: its pipeline, the threat model around it, what anonymization does to outbound prompts, how model routing stays inspectable, and where the inference-tier boundary sits.

Orthogonal boundary — the Inference Choice Spectrum

The Inference Choice Spectrum (PRD §1.5.2) is a seventh boundary that runs along a different axis from R1–R6. R1–R6 restrain what the model may decide, spend, run, or touch. The Inference Choice Spectrum restrains where the data goes during inference.

The five tiers (PRD §1.5.2): local-only (Tier 1), customer-hosted cloud inference (Tier 2), enterprise managed inference with ZDR / no-training commitments (Tier 3), standard cloud API (Tier 4), consumer or free tier (Tier 5).
Skills, Projects, and requests can require a minimum tier (R2-adapted, above); the gateway refuses routing decisions that violate the floor (tier_below_minimum).
The audit log records every routing decision (inference_routing_log per PRD §5.5).
Tier 3 is the pragmatic recommendation for most enterprise deployments; Tier 1 is recommended for the most sensitive privileged work.
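The floor check and the routing log described above can be sketched as follows. This is a minimal illustration, not the gateway's actual code: the `Tier` enum member names, the `Gateway` class, its `route` method, and the log record fields are all hypothetical. It also assumes that "minimum tier" means "at least as protective as", so a higher tier number falls below the floor, and that when a Skill, a Project, and a request each set a floor, the strictest one applies. Only `tier_below_minimum` and `inference_routing_log` are names taken from the text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import IntEnum
from typing import Iterable


class Tier(IntEnum):
    """The five tiers from PRD §1.5.2. Member names are illustrative."""
    LOCAL_ONLY = 1
    CUSTOMER_HOSTED_CLOUD = 2
    ENTERPRISE_MANAGED_ZDR = 3
    STANDARD_CLOUD_API = 4
    CONSUMER_OR_FREE = 5


class TierBelowMinimum(Exception):
    """Raised when a routing decision would violate the tier floor."""
    code = "tier_below_minimum"


@dataclass
class Gateway:
    # Hypothetical in-memory stand-in for the audit log.
    inference_routing_log: list = field(default_factory=list)

    def route(self, request_id: str, target: Tier, floors: Iterable[Tier]) -> Tier:
        # Assumption: the strictest floor wins, i.e. the lowest tier number.
        # With no floor set, any tier is acceptable.
        floor = min(floors, default=Tier.CONSUMER_OR_FREE)
        allowed = target <= floor

        # Every decision is logged, whether it is allowed or refused.
        self.inference_routing_log.append({
            "request_id": request_id,
            "target_tier": int(target),
            "floor": int(floor),
            "allowed": allowed,
            "reason": None if allowed else TierBelowMinimum.code,
            "at": datetime.now(timezone.utc).isoformat(),
        })

        if not allowed:
            raise TierBelowMinimum(
                f"{request_id}: Tier {int(target)} is below the Tier {int(floor)} floor"
            )
        return target
```

Under these assumptions, a request with a Tier 3 floor may be routed to Tier 1, 2, or 3, while an attempt to route it to Tier 4 is refused and still leaves a log record. Writing the log entry before raising keeps refused decisions visible in the audit trail.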

This boundary is documented separately in:

PRD §1.5.2 (the spectrum's five-tier definition)
PRD §3.13 (the Inference Tier badge in the UI)
PRD §4.4 (gateway configuration of tier mapping)
PRD §1.8 (security posture; calls the spectrum "the central security trade-off")

It is named here so that a reader does not conflate the two boundaries: a deployment can ship all of R1–R6 and still expose customer data to a weaker tier through configuration choices, or vice versa.
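One way to keep the two boundaries from being conflated is to evaluate them as separate checks that never feed into each other. The sketch below assumes a hypothetical `Deployment` record with invented field names; it is not a structure defined in the PRD, and the tier comparison again assumes that a lower tier number is more protective.

```python
from dataclasses import dataclass

# The six restraints named in the text.
RESTRAINTS = frozenset({"R1", "R2", "R3", "R4", "R5", "R6"})


@dataclass(frozen=True)
class Deployment:
    # All field names are hypothetical, chosen for illustration only.
    restraints_enabled: frozenset
    weakest_routable_tier: int  # 1 (local-only) .. 5 (consumer or free)
    required_tier_floor: int    # the weakest tier the data may be sent to


def posture(d: Deployment) -> dict:
    """Report each boundary on its own; neither result implies the other."""
    return {
        "restraints_complete": RESTRAINTS <= d.restraints_enabled,
        "tier_floor_respected": d.weakest_routable_tier <= d.required_tier_floor,
    }


# All of R1-R6 shipped, but configuration still allows routing to Tier 4
# when the data requires Tier 3 or stronger.
full_restraints_weak_tier = Deployment(RESTRAINTS, 4, 3)

# The reverse case: local-only inference with most restraints missing.
strong_tier_partial_restraints = Deployment(frozenset({"R1"}), 1, 3)
```

Here `posture(full_restraints_weak_tier)` reports the restraints as complete and the tier floor as violated, and `posture(strong_tier_partial_restraints)` reports the opposite, which is the distinction the paragraph above draws.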