The Inference Gateway: the security boundary

The component that holds the keys and guards the egress: its pipeline, the threat model around it, what anonymization does to outbound prompts, how model routing stays inspectable, and where the inference-tier boundary sits.

Trust boundaries

LQ.AI runs as seven services in a single operator-controlled deployment (Docker Compose for dev; Helm/Kubernetes for production, per deploy/helm/lq-ai/). Per PRD §4, the Inference Gateway is the only component that holds plaintext provider API keys, and this defines the primary trust boundary. There are three trust zones: everything inside the operator's deployment is one; the LLM providers (Anthropic, OpenAI, etc.) are a second; and the operator's IdP, if integrated, is a third.

┌──────────────────────────────────────────┐
│ Operator deployment                      │
│  ┌──────┐   ┌─────────┐   ┌──────────┐   │   ┌────────────┐
│  │ web  │──>│   api   │──>│ gateway  │───┼──>│ providers  │
│  └──────┘   └────┬────┘   └────┬─────┘   │   └────────────┘
│                  ▼             ▼         │
│              ┌────────┐    ┌─────────┐   │
│              │postgres│    │  minio  │   │
│              └────────┘    └─────────┘   │
│              ┌────────┐                  │
│              │ redis  │                  │
│              └────────┘                  │
└──────────────────────────────────────────┘
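
The trust zones and the key-holding rule can also be written down as data and checked mechanically. The following is a minimal sketch, not code from the LQ.AI repository: the names `Zone`, `Component`, `COMPONENTS`, `EDGES`, `key_holders`, and `boundary_crossings` are hypothetical, and the edge list contains only the arrows drawn in the diagram above (connections for redis, the ingest-worker, and the IdP are not specified in this section, so none are assumed).

```python
from dataclasses import dataclass
from enum import Enum


class Zone(Enum):
    OPERATOR = "operator deployment"
    PROVIDER = "LLM providers"
    IDP = "operator IdP"


@dataclass(frozen=True)
class Component:
    name: str
    zone: Zone
    holds_provider_keys: bool = False


# The seven services named in this section, plus the two external parties.
COMPONENTS = {
    c.name: c
    for c in [
        Component("web", Zone.OPERATOR),
        Component("api", Zone.OPERATOR),
        Component("gateway", Zone.OPERATOR, holds_provider_keys=True),
        Component("postgres", Zone.OPERATOR),
        Component("minio", Zone.OPERATOR),
        Component("redis", Zone.OPERATOR),
        Component("ingest-worker", Zone.OPERATOR),
        Component("providers", Zone.PROVIDER),
        Component("idp", Zone.IDP),
    ]
}

# Assumption: only the arrows drawn in the diagram are listed here.
EDGES = [
    ("web", "api"),
    ("api", "gateway"),
    ("gateway", "providers"),
    ("api", "postgres"),
    ("gateway", "minio"),
]


def key_holders():
    """Names of components that hold plaintext provider API keys."""
    return sorted(c.name for c in COMPONENTS.values() if c.holds_provider_keys)


def boundary_crossings():
    """Edges whose endpoints sit in different trust zones."""
    return [
        (src, dst)
        for src, dst in EDGES
        if COMPONENTS[src].zone is not COMPONENTS[dst].zone
    ]


if __name__ == "__main__":
    print("key holders:", key_holders())
    print("zone-crossing edges:", boundary_crossings())
```

Under these assumptions the only key holder is the gateway and the only zone-crossing edge is gateway to providers, which is the property the trust boundary is meant to guarantee. A check like this could run in CI against a real service inventory, though how that inventory is obtained is outside this sketch.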

The five rows in the STRIDE table below cover the production-facing services (api, gateway, web, postgres, minio). Redis and the ingest-worker are cluster-internal and inherit the postgres-tier mitigations (least-privilege role, no external listener, operator-managed secret), so neither gets a row of its own.