Resolves the tension between "no shared in-process code" and a shared error module by giving each subsystem its own app.errors package with the same base LQAIError shape, while the shared contract is the error-code enum + structured error body in the two OpenAPI sketches, verified by a cross-subsystem conformance test. It accepts ~50 lines of duplicated base class per subsystem to honor subsystem isolation, and deliberately keeps the two HTTP envelope shapes different (backend {"detail"} vs gateway {"error"}) while the inner code/message/details shape is identical. You're raising or mapping errors across api/ and gateway/, or wondering why the two services use different envelope keys.
ADR 0003 — Error handling: parallel lq_ai.errors packages with an OpenAPI-defined contract
Status: Accepted (2026-05-08)
Decision-makers: Kevin Keller (initial maintainer)
Affected components: api/, gateway/, tests/ (cross-subsystem conformance)
Related: ADR 0002 — Backend-owned auth, docs/M1-PROGRESS.md "Deferred to B5", CONTRIBUTING.md "Code style → Exceptions"
Context
CONTRIBUTING.md has carried — since A1 — the line "use the project's lq_ai.errors exception hierarchy; do not raise bare Exception". The package was never actually built. B1 and B3 each surfaced the absence in their reports; B5 is the natural landing spot because B5 is the first task where backend↔gateway error semantics need to align (every gateway error becomes a backend error of some shape).
The deferred-items table in docs/M1-PROGRESS.md describes the package as living "in a shared module that both api/ and gateway/ import from." Read literally, this contradicts CLAUDE.md:
Each subsystem (
api/,gateway/,web/) is a self-contained service. They talk over HTTP using OpenAPI-defined contracts. There is no shared in-process code — adapters cross the boundary explicitly.
The two statements can be reconciled three ways:
- Option A — Shared Python distribution. Put the hierarchy in
packages/lq-ai-errors/with its ownpyproject.toml; install it as an editable dependency in bothapi/andgateway/. Honors the deferred-items hypothesis but violates CLAUDE.md's hard rule on shared in-process code. - Option B — Parallel hierarchies + OpenAPI-defined contract. Each subsystem owns its own
app.errorsPython package with the same canonical error-code enum and the same exception-to-HTTP mapping. The shared artifact is the error-code enum + structured error-body schema in the OpenAPI sketches (docs/api/backend-openapi.yaml,docs/api/gateway-openapi.yaml). A cross-subsystem conformance test undertests/verifies the two enums stay in sync. Honors CLAUDE.md's hard rule and matches how the rest of the system already works (api/ and gateway/ communicate over HTTP, not via shared imports). - Option C — Single subsystem only. Accept the inconsistency that gateway has its own (already-existing)
ProviderAdapterErrorhierarchy and letapi/invent its own. Already the de-facto state; the cost is architectural drift over time and no enforcement that the two error vocabularies line up.
Decision
Adopt Option B. Each subsystem ships its own app.errors Python package; the contract that ties them together is documented in the OpenAPI sketches and verified by a cross-subsystem conformance test.
Concretely:
-
api/app/errors.py — backend exception hierarchy.
LQAIError(code, message, http_status, details)base; subclasses for the codes the backend emits (unauthorized,forbidden,not_found,validation_error,rate_limited,internal_error,password_change_required,gateway_unreachable,gateway_timeout,gateway_invalid_response,provider_unavailable,tier_below_minimum). FastAPI exception handler in api/app/main.py translates these to the response shape{"detail": {"code": ..., "message": ..., "details": ...}}documented indocs/api/backend-openapi.yamlas theErrorschema. -
gateway/app/errors.py — gateway exception hierarchy. Same base class shape, codes from the gateway-openapi
GatewayError.codeenum (tier_below_minimum,tier_disallowed_globally,anonymization_failed,invalid_model,provider_unavailable,rate_limit_exceeded,invalid_request,not_implemented,unauthorized). FastAPI exception handler translates to the response shape{"error": {"code": ..., "message": ..., "details": ...}}per the existingGatewayErrorschema. The gateway's existingProviderAdapterErrorhierarchy in gateway/app/providers/base.py continues to exist for adapter-internal errors and is mapped toLQAIErrorsubclasses at the route boundary; B5 does not refactor adapters. -
OpenAPI sketches — the canonical contract.
docs/api/backend-openapi.yamlgains an explicitErrorschema (it was missing — a real gap).docs/api/gateway-openapi.yaml'sGatewayErroris already canonical and does not change. -
Cross-subsystem conformance test — tests/test_error_code_contract.py imports both subsystems' code enums and asserts: (1) every code an
api/exception emits that originates from the gateway is also a code the gateway side declares, (2) the canonical messages defined for shared codes match. This is the binding contract; it lives in the cross-cuttingtests/directory per CLAUDE.md.
The two response wrapper shapes are deliberately different: backend uses {"detail": ...} (matches FastAPI's native HTTPException response, matches the B2 forced-password-change pattern, matches what the OpenWebUI fork's auth-delegation glue already needs to read), and gateway uses {"error": ...} (matches the existing GatewayError schema, matches the gateway adapter tests that already assert on body["error"]["code"]). The inner shape is identical (code + message + details) and is what the cross-subsystem code enum actually constrains. We do not unify the wrappers because doing so would either (a) churn the gateway's shipped tests and OpenAPI without value, or (b) push a non-FastAPI-native shape onto the backend. The wrapper difference is documented in both OpenAPI sketches.
The pre-existing A4 _stub.not_implemented 501 envelope uses {"error": {"code": "not_implemented", ...}} for stub endpoints. Since the stubs are transitional (each one is replaced as its task lands) and since 47 stub-conformance tests assert on the existing body["error"] shape, the stub envelope stays as-is until each stub is replaced by a real handler. New typed errors raised from real handlers go through the lq_ai.errors hierarchy and produce the canonical {"detail": {...}} shape.
Consequences
Positive:
- Honors CLAUDE.md's hard rule on subsystem isolation. No shared Python distribution, no shared imports across
api/andgateway/. - The contract that matters (the error-code vocabulary) is anchored in the same artifact that already governs cross-subsystem semantics — the OpenAPI sketches.
- Each subsystem's exception hierarchy can evolve independently for codes that don't cross the boundary (e.g.,
password_change_requiredis backend-only;tier_below_minimumis gateway-only and propagates through the backend as a passthrough). - The cross-subsystem conformance test makes drift loud rather than silent.
Negative:
- Two parallel implementations of the base class and the FastAPI exception handler. The duplication is small (~50 lines per subsystem) and the conformance test catches divergence on the values that actually matter; this is a tolerable cost.
- Operators reading both OpenAPI sketches see two different wrapper keys (
detailvs.error). Documented in both sketches; not a usability hazard at the level of programmatic clients (each client calls one of the two services).
Mitigations:
- The conformance test under
tests/is the load-bearing artifact. If a code-name appears in one subsystem and not the other, the test fails. - If a future task argues for a single Python package (e.g., M3 introduces a Word add-in service that needs shared exceptions), this ADR is reopened. Until then, the cost-benefit is clearly on Option B's side.
Implementation notes
- B5 lands the package in both subsystems plus the conformance test.
- Existing
HTTPException(detail="string")raises in api/app/api/auth.py are not mass-rewritten in B5; they are mapped at handler-level only when B5 touches that handler for other reasons. Per CLAUDE.md "don't churn the codebase for the sake of it." - New code raises
LQAIErrorsubclasses directly. The exception handler does the translation. - The 501 stub helper (
api/app/api/_stub.py::not_implemented,gateway/app/api/inference.py::_not_implemented) is unchanged.
Updates to docs/M1-PROGRESS.md reflect that this is the chosen path and that the "lives in a shared module" phrasing in the deferred-items table was a hypothesis, not a directive.