Document review & citations, end to end

How a contract becomes reviewed, cited, and trustworthy: the verification cascade, where citations are stored, the playbook and tabular surfaces that ride on it, and the honest edges where verification stops.

message_citations

Defined by M2-A2 (migration 0025_create_message_citations.py). The table holds one row per model-emitted citation, written by the chat-send path after the assistant message is persisted and the Citation Engine has run its verification cascade.

CREATE TABLE message_citations (
    id                        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    message_id                UUID NOT NULL REFERENCES messages(id) ON DELETE CASCADE,
    source_file_id            UUID NOT NULL REFERENCES files(id) ON DELETE CASCADE,
    source_offset_start       INTEGER NOT NULL,
    source_offset_end         INTEGER NOT NULL,
    source_page               INTEGER,
    source_text               TEXT NOT NULL,
    verified                  BOOLEAN NOT NULL DEFAULT FALSE,
    verification_method       TEXT,  -- enum below
    verification_confidence   NUMERIC(3,2),
    created_at                TIMESTAMPTZ NOT NULL DEFAULT now(),

    CONSTRAINT chk_message_citations_offset_start_nonneg
        CHECK (source_offset_start >= 0),
    CONSTRAINT chk_message_citations_offset_end_gt_start
        CHECK (source_offset_end > source_offset_start),
    CONSTRAINT chk_message_citations_method_values
        CHECK (
            verification_method IS NULL
            OR verification_method IN (
                'exact_match', 'tolerant_match', 'llm_judge', 'ensemble', 'failed'
            )
        ),
    CONSTRAINT chk_message_citations_confidence_range
        CHECK (
            verification_confidence IS NULL
            OR (verification_confidence >= 0 AND verification_confidence <= 1)
        ),
    CONSTRAINT chk_message_citations_verified_has_method
        CHECK ((verified = false) OR (verification_method IS NOT NULL))
);

CREATE INDEX idx_message_citations_message ON message_citations(message_id);
CREATE INDEX idx_message_citations_file ON message_citations(source_file_id);

The verification_method enum records the stage that produced the verdict. Every stage writes into the same row shape, so neither the persistence layer nor the UI needs to switch on stage:

| Value | Stage | Confidence | Lands in |
|---|---|---|---|
| 'exact_match' | Stage 1: byte-for-byte against documents.normalized_content[start:end] | always 1.0 | M2-A2 (here) |
| 'tolerant_match' | Stage 2: whitespace + OCR-artefact + smart-quote normalization | similarity-based | M2-B1 |
| 'llm_judge' | Stage 3: LLM paraphrase judge | judge-reported | M2-C1 |
| 'ensemble' | Stage 4: multi-model agreement for high-stakes ops | quorum-derived | M2-D1 |
| 'failed' | Every stage rejected; rendered as unverified | NULL | M2-C2 wiring |
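
The shared row shape can be illustrated with a small cascade runner. This is a minimal sketch, not the Citation Engine's actual code: the Verdict type, the Stage signature, and run_cascade are hypothetical names introduced here, and only Stage 1 is modelled. It also returns a 'failed' verdict when every stage declines, which matches the table above but not M2-A2 itself, where failing candidates are dropped rather than persisted.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Verdict:
    """Hypothetical mirror of the three verification columns."""
    verified: bool
    method: str                  # one of the verification_method values
    confidence: Optional[float]  # None maps to SQL NULL


# Assumption: a stage returns a Verdict when it accepts, or None to defer.
Stage = Callable[[str, str], Optional[Verdict]]


def exact_match(quote: str, source_span: str) -> Optional[Verdict]:
    """Stage 1: accept only when the quote equals the source span exactly."""
    if quote == source_span:
        return Verdict(True, "exact_match", 1.0)
    return None


def run_cascade(quote: str, source_span: str, stages: Sequence[Stage]) -> Verdict:
    """Try each stage in order; the first accepting stage decides."""
    for stage in stages:
        verdict = stage(quote, source_span)
        if verdict is not None:
            return verdict
    # Every stage declined: same shape, NULL confidence.
    return Verdict(False, "failed", None)
```

Because every outcome is a Verdict, a caller can write the three columns without inspecting which stage ran.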

The verified=true ⇒ verification_method IS NOT NULL CHECK constraint prevents a row from claiming verification without naming which stage passed.
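
The effect of that constraint can be tried out locally. The sketch below uses Python's sqlite3 module with a deliberately cut-down table, so it is an illustration only: the production table is the one shown above, the table name message_citations_demo and the helper try_insert are hypothetical, and SQLite stores the boolean as 0/1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE message_citations_demo (
        verified            INTEGER NOT NULL DEFAULT 0,
        verification_method TEXT,
        CONSTRAINT chk_verified_has_method
            CHECK ((verified = 0) OR (verification_method IS NOT NULL))
    )
    """
)


def try_insert(verified: bool, method):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO message_citations_demo "
                "(verified, verification_method) VALUES (?, ?)",
                (int(verified), method),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

In this sketch, try_insert(True, None) is rejected, while an unverified row with no method and a verified row naming 'exact_match' are both accepted.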

M2-A2 ships Stage 1 only. Extraction (app.citation.extraction) finds "..." (Source: [N]) pairs in the assistant response, locates the quote inside the cited retrieved chunk's content, and derives byte-precise document offsets. The verifier (app.citation.verification) confirms normalized_content[start:end] == source_text byte-for-byte. Candidates that fail Stage 1 are dropped (not persisted) until later stages ship; the M2-C2 UI work decides what to render for "model emitted but we couldn't verify."
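
The Stage 1 flow described above can be sketched in a few lines. Everything named here (CITATION_RE, Chunk, Candidate, extract_candidates, verify_exact) is hypothetical and is not the API of app.citation.extraction or app.citation.verification. The regular expression is one guess at the "..." (Source: [N]) pattern, and the sketch indexes Python str values, which coincides with byte offsets only for ASCII content.

```python
import re
from typing import Dict, List, NamedTuple

# Assumed pattern: a straight-quoted span followed by (Source: [N]).
CITATION_RE = re.compile(r'"([^"]+)"\s*\(Source:\s*\[(\d+)\]\)')


class Chunk(NamedTuple):
    content: str
    doc_offset: int  # where the chunk starts inside normalized_content


class Candidate(NamedTuple):
    source_text: str
    start: int
    end: int


def extract_candidates(response: str, chunks: Dict[int, Chunk]) -> List[Candidate]:
    """Find quote/source pairs and map each quote to document offsets."""
    found = []
    for match in CITATION_RE.finditer(response):
        quote, n = match.group(1), int(match.group(2))
        chunk = chunks.get(n)
        if chunk is None:
            continue  # cites a chunk that was not retrieved
        idx = chunk.content.find(quote)
        if idx < 0:
            continue  # quote does not appear verbatim in the cited chunk
        start = chunk.doc_offset + idx
        found.append(Candidate(quote, start, start + len(quote)))
    return found


def verify_exact(normalized_content: str, candidate: Candidate) -> bool:
    """Stage 1 check: the stored span must equal the quoted text exactly."""
    return normalized_content[candidate.start:candidate.end] == candidate.source_text
```

A short usage example under the same assumptions:

```python
doc = "Preamble. The term is five years. Closing."
chunks = {1: Chunk(content="The term is five years.", doc_offset=10)}
response = 'The contract says "term is five years" (Source: [1]).'

for cand in extract_candidates(response, chunks):
    if verify_exact(doc, cand):
        print(cand.start, cand.end, cand.source_text)
```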