Document review & citations, end to end
How a contract becomes reviewed, cited, and trustworthy: the verification cascade, where citations are stored, the playbook and tabular surfaces that ride on it, and the honest edges where verification stops.
The workflow
The executor (api/app/tabular/executor.py) is a three-node LangGraph graph run by the ARQ worker (api/app/workers/tabular_worker.py) on the shared playbook queue (arq:m3a6, per Phase C Decision C-3 — the same queue as the Easy Playbook wizard). The nodes live in api/app/tabular/nodes.py:
load_documents_node— resolvesdocument_idstoDocumentrows in the operator's selection order (missing / soft-deleted documents are skipped silently; the row is preserved as audit but the result set is honest about which sources resolved). It joins the parentFileso each row's display name is the operator-uploaded filename, not the document UUID.extract_cells_node— for each(document, column)pair: runs a lexical full-text search (FTS) over that document'sdocument_chunksusing the column'squeryas keyword input (top-K = 4 chunks; falls back to the first N chunks when FTS yields nothing, so the LLM still sees document content), then dispatches one structured-output LLM call. The model returns{value, cited_chunk_indices, confidence, justification}; the node mapscited_chunk_indices→cited_chunk_idsagainst the retrieved chunks. Every cell is wrapped in its own try/except: any failure (no chunks retrieved, gateway error, malformed response, empty value) lands asconfidence='failed'with a populatederror(Decision C-10) rather than aborting the run. Dispatch is sequential in v0.3.0; per-cell parallelism is a follow-on if the 2,000-cell latency forces it.aggregate_node— groups per-cell results by document into theresultsJSONB and flips status tocompleted(orfailedif a prior node setstate['error']). It writescost_actual_usdas the sum of per-cell costs — which is currently always 0 (see Per-cell cost & tier).
Retrieval is lexical FTS, not vector search — the column query is treated as keyword input over the document's chunks. This is a deliberate v0.3.0 choice: it needs no embedding pre-pass and keeps each cell's context small. It also means a column query whose wording diverges from the contract's vocabulary may retrieve weaker chunks; the FTS-falls-back-to-first-N behavior keeps the cell from failing outright, but the operator should treat low-confidence cells accordingly.