Project Visuals
Pipeline, LLM, and configuration architecture diagrams
LangGraph Pipeline
graph_v2.md
Node flow: v3 ML-first routing with tool steps
ML router or LLM router → route_audience → specialist_generate → evaluate → tool nodes → persist_attempt
After persist_attempt, the graph branches to revise (retry loop) or persist_results (done).
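The node order and the retry branch can be sketched in plain Python. This is an illustration only, not the LangGraph wiring itself: `run_pipeline`, the `evaluate` callable, the `max_retries` parameter, and the assumption that revise loops back to specialist_generate are all hypothetical stand-ins.

```python
from typing import Callable


def run_pipeline(evaluate: Callable[[int], bool], max_retries: int = 2) -> list[str]:
    """Return the ordered node names visited for one job.

    `evaluate(attempt_no)` is a hypothetical stand-in for the evaluate node;
    it returns True when the attempt passes.
    """
    # Routing happens once per job (ML router or LLM router, then route_audience).
    trace = ["router", "route_audience"]
    attempt_no = 1
    while True:
        trace.append("specialist_generate")
        trace.append("evaluate")
        passed = evaluate(attempt_no)
        trace.append("tool_nodes")
        trace.append("persist_attempt")

        retries_used = attempt_no - 1
        if passed or retries_used >= max_retries:
            # Done: either the attempt passed or the retry budget is spent.
            trace.append("persist_results")
            return trace

        # Assumption: revise feeds back into specialist_generate.
        trace.append("revise")
        attempt_no += 1
```

Calling `run_pipeline(lambda attempt_no: attempt_no == 2)` walks the loop once through revise before reaching persist_results.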
LLM Call Flow
llm_file_visual.md
Flow: check mock → (mock result) OR (OpenAI call → extract text → parse JSON) → dict
Mock mode activates when OPENAI_API_KEY is absent.
Real mode calls the OpenAI Responses API and parses the JSON reply.
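A minimal sketch of the mock/real switch, assuming the real request is injected as a callable so the example stays free of any SDK dependency. `call_llm`, `send`, and the shape of the mock result are hypothetical; the real module presumably wraps the OpenAI client where `send` appears here.

```python
import json
import os
from typing import Any, Callable, Optional


def call_llm(prompt: str, send: Optional[Callable[[str], str]] = None) -> dict[str, Any]:
    """Return a dict from either the mock path or the real path."""
    # Mock mode: no API key in the environment (an empty value counts as absent here).
    if not os.environ.get("OPENAI_API_KEY"):
        return {"mock": True, "prompt_chars": len(prompt)}

    if send is None:
        raise RuntimeError("real mode needs a `send` callable")

    # Real mode: `send` stands in for "OpenAI call → extract text".
    raw_text = send(prompt)

    # Parse the JSON reply and insist on a dict, matching the flow's final step.
    parsed = json.loads(raw_text)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object in the model reply")
    return parsed
```

Injecting `send` also makes the real path testable without network access.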
Agent Profiles Loader
config_file_visual.md
Flow: getter → load → validate → cache
Profiles are loaded once from disk and cached in _PROFILES.
Subsequent calls skip the file read entirely.
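The getter → load → validate → cache flow might look roughly like the sketch below. Only the `_PROFILES` name comes from the description above; `get_profiles`, `_load`, and `_validate` are hypothetical, and JSON is used as a stand-in for YAML so the example needs only the standard library.

```python
import json
from pathlib import Path
from typing import Any, Optional

# Module-level cache; None means "not loaded yet".
_PROFILES: Optional[dict[str, Any]] = None


def _load(path: Path) -> Any:
    # Stand-in parser: the real loader reads YAML, this sketch reads JSON.
    return json.loads(path.read_text(encoding="utf-8"))


def _validate(data: Any) -> None:
    # Placeholder check; the real validation enforces the full schema.
    if not isinstance(data, dict):
        raise ValueError("profiles file must contain a top-level dict")


def get_profiles(path: Path) -> dict[str, Any]:
    """Return the cached profiles, reading the file only on the first call."""
    global _PROFILES
    if _PROFILES is None:
        data = _load(path)
        _validate(data)
        _PROFILES = data  # cache only after validation succeeds
    return _PROFILES
```

Because the cache is filled only after validation, a malformed file raises on every call instead of caching a bad value.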
Agent Profiles Data Structure
config_file_dataShape_visual.md
YAML Configuration Schema
agent_profiles.yaml (top-level dict)
├── audiences (dict, required, non-empty)
│   ├── <audience_name_1> (dict)
│   │   ├── display_name (string, required)
│   │   ├── system_prompt (string, required)
│   │   ├── required_sections (list[string], required)
│   │   └── default_max_words (int > 0, required)
│   └── <audience_name_2> ...
│
├── routing (dict, required)
│   ├── auto_router_prompt (string, required)
│   ├── low_confidence_threshold (number 0..1, required)
│   ├── ml_router_threshold (number 0..1, optional)
│   ├── ml_router_margin (number 0..1, optional)
│   └── ml_router_min_samples (int >= 1, optional)
│
├── evaluation (dict, required)
│   └── evaluator_prompt (string, required)
│
└── generation (dict, required)
    └── prompt (string, required)
All top-level keys are required. audiences must contain at least one named entry.
The routing section supports optional ml_router_* keys for the ML-first routing path.
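The schema rules above can be expressed as a small validator. This is a sketch under the assumption that the YAML has already been parsed into a plain dict; `validate_profiles` and its helpers are hypothetical names, and the project's own validator may report errors differently.

```python
from typing import Any

TOP_LEVEL_KEYS = ("audiences", "routing", "evaluation", "generation")


def _is_int(value: Any) -> bool:
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def _require_str(section: dict, key: str, where: str) -> None:
    if not isinstance(section.get(key), str):
        raise ValueError(f"{where}.{key} must be a string")


def _check_unit_interval(section: dict, key: str, required: bool) -> None:
    if key not in section:
        if required:
            raise ValueError(f"routing.{key} is required")
        return
    value = section[key]
    if not (_is_int(value) or isinstance(value, float)) or not 0 <= value <= 1:
        raise ValueError(f"routing.{key} must be a number in 0..1")


def validate_profiles(cfg: dict[str, Any]) -> None:
    """Raise ValueError if cfg violates the schema tree; return None otherwise."""
    for key in TOP_LEVEL_KEYS:
        if not isinstance(cfg.get(key), dict):
            raise ValueError(f"top-level key '{key}' must be a dict")

    if not cfg["audiences"]:
        raise ValueError("audiences must contain at least one entry")
    for name, profile in cfg["audiences"].items():
        where = f"audiences.{name}"
        if not isinstance(profile, dict):
            raise ValueError(f"{where} must be a dict")
        _require_str(profile, "display_name", where)
        _require_str(profile, "system_prompt", where)
        sections = profile.get("required_sections")
        if not isinstance(sections, list) or not all(isinstance(s, str) for s in sections):
            raise ValueError(f"{where}.required_sections must be a list of strings")
        words = profile.get("default_max_words")
        if not _is_int(words) or words <= 0:
            raise ValueError(f"{where}.default_max_words must be an int > 0")

    routing = cfg["routing"]
    _require_str(routing, "auto_router_prompt", "routing")
    _check_unit_interval(routing, "low_confidence_threshold", required=True)
    # The ml_router_* keys are optional and only checked when present.
    _check_unit_interval(routing, "ml_router_threshold", required=False)
    _check_unit_interval(routing, "ml_router_margin", required=False)
    if "ml_router_min_samples" in routing:
        samples = routing["ml_router_min_samples"]
        if not _is_int(samples) or samples < 1:
            raise ValueError("routing.ml_router_min_samples must be an int >= 1")

    _require_str(cfg["evaluation"], "evaluator_prompt", "evaluation")
    _require_str(cfg["generation"], "prompt", "generation")
```

A config that omits every `ml_router_*` key still validates, which matches their optional status in the tree.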
Database Schema
Entity relationships, table structure, indexes, and job lifecycle
Entity Relationship Diagram
db_schema_visual_v2.md
PK = Primary Key | FK = Foreign Key
Table & Relationship Tree
db_schema_visual_v2.md
Full column & FK reference
Tables & Relationships
│
├── Document: Core content record
│   ├── id (PK)
│   ├── content
│   ├── source_type
│   ├── source_url
│   └── created_at
│   │
│   ├──< Job: One document → many jobs
│   │   ├── id (PK)
│   │   ├── document_id (FK → Document.id) [indexed]
│   │   ├── selected_audience
│   │   ├── audience
│   │   ├── routing_confidence
│   │   ├── routing_candidates_json
│   │   ├── routing_reasons_json
│   │   ├── routing_source
│   │   ├── router_version
│   │   ├── status
│   │   ├── attempt_count
│   │   ├── max_words
│   │   ├── max_retries
│   │   └── created_at
│   │   │
│   │   └──< JobAttempt: One job → many attempts
│   │       ├── id (PK)
│   │       ├── job_id (FK → Job.id) [indexed]
│   │       ├── attempt_no
│   │       ├── audience
│   │       ├── agent_used
│   │       ├── generated_one_line_summary
│   │       ├── generated_tags_json
│   │       ├── generated_clues_json
│   │       ├── generated_bullets_json
│   │       ├── generated_mindmap
│   │       ├── evaluator_json
│   │       ├── passed
│   │       └── created_at
│   │
│   ├──< DocumentClue: One document → many clues
│   │   ├── id (PK)
│   │   ├── document_id (FK → Document.id) [indexed]
│   │   ├── clue_text
│   │   └── created_at
│   │
│   ├──< DocumentTagSummary: One document → many tag summaries
│   │   ├── id (PK)
│   │   ├── document_id (FK → Document.id) [indexed]
│   │   ├── job_id (FK → Job.id) [indexed]
│   │   ├── domain [indexed]
│   │   ├── canonical_tags_json
│   │   └── created_at
│   │
│   ├──< DocumentClaim: One document → many citation claims
│   │   ├── id (PK)
│   │   ├── document_id (FK → Document.id) [indexed]
│   │   ├── job_id (FK → Job.id) [indexed]
│   │   ├── claim_text
│   │   ├── quote_text
│   │   ├── source_start
│   │   ├── source_end
│   │   ├── confidence
│   │   └── created_at
│   │
│   ├──< DocumentRiskFlag: One document → many risk flags
│   │   ├── id (PK)
│   │   ├── document_id (FK → Document.id) [indexed]
│   │   ├── job_id (FK → Job.id) [indexed]
│   │   ├── severity
│   │   ├── category
│   │   ├── text_span
│   │   ├── suggested_fix
│   │   └── created_at
│   │
│   └──< DocumentTag: Join table (many-to-many)
│       ├── document_id (PK, FK → Document.id) [indexed]
│       └── tag_id (PK, FK → Tag.id) [indexed]
│
├── Tag: Global tag registry
│   ├── id (PK)
│   └── name [unique, indexed]
│
└── TagAlias: Tag normalization map
    ├── id (PK)
    ├── alias [unique, indexed]
    ├── canonical [indexed]
    └── created_at
──< denotes a one-to-many relationship from parent to child
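To make the Document → Job → JobAttempt chain concrete, here is a hedged SQLite sketch covering a subset of the columns. The column types, NOT NULL constraints, and the `build_demo_db` and `attempts_for_document` helpers are assumptions for illustration; the tree above lists column names only, and the project may use a different database or ORM.

```python
import sqlite3

# Subset of the schema; types and defaults are illustrative assumptions.
DDL = """
CREATE TABLE document (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    source_type TEXT,
    source_url TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE job (
    id INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES document(id),
    status TEXT NOT NULL DEFAULT 'pending',
    attempt_count INTEGER NOT NULL DEFAULT 0,
    max_retries INTEGER NOT NULL DEFAULT 2,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX ix_job_document_id ON job (document_id);
CREATE TABLE jobattempt (
    id INTEGER PRIMARY KEY,
    job_id INTEGER NOT NULL REFERENCES job(id),
    attempt_no INTEGER NOT NULL,
    passed INTEGER,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX ix_jobattempt_job_id ON jobattempt (job_id);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(DDL)
    return conn


def attempts_for_document(conn: sqlite3.Connection, document_id: int) -> list[tuple]:
    """Walk Document → Job → JobAttempt and return (attempt_no, passed) rows."""
    rows = conn.execute(
        """
        SELECT a.attempt_no, a.passed
        FROM jobattempt AS a
        JOIN job AS j ON j.id = a.job_id
        WHERE j.document_id = ?
        ORDER BY a.attempt_no
        """,
        (document_id,),
    )
    return rows.fetchall()
```

The join in `attempts_for_document` filters on `job.document_id` and matches on `jobattempt.job_id`, the two foreign-key columns marked [indexed] in the tree.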
Indexes
db_schema_visual_v2.md
All DB indexes and their purpose
| Index Name | Table | Column(s) | Notes |
|---|---|---|---|
| ix_job_document_id | job | document_id | FK lookup |
| ix_jobattempt_job_id | jobattempt | job_id | FK lookup |
| ix_documentclue_document_id | documentclue | document_id | FK lookup |
| ix_documenttag_document_id | documenttag | document_id | FK lookup |
| ix_documenttag_tag_id | documenttag | tag_id | FK lookup |
| ix_tagalias_alias | tagalias | alias | Alias lookup |
| ix_tagalias_canonical | tagalias | canonical | Canonical lookup |
| ix_doc_tag_summary_doc | documenttagsummary | document_id | FK lookup |
| ix_doc_tag_summary_job | documenttagsummary | job_id | FK lookup |
| ix_doc_tag_summary_domain | documenttagsummary | domain | Filter/sort |
| ix_doc_claim_doc | documentclaim | document_id | FK lookup |
| ix_doc_claim_job | documentclaim | job_id | FK lookup |
| ix_doc_risk_doc | documentriskflag | document_id | FK lookup |
| ix_doc_risk_job | documentriskflag | job_id | FK lookup |
| (auto) | tag | name | Unique constraint |
| (auto) | tagalias | alias | Unique constraint |
Job Status Flow
db_schema_visual_v2.md
Lifecycle: pending → running → completed / failed
A job is retried up to max_retries times (default 2); once the retries are exhausted it stays in failed permanently.
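The lifecycle can be written down as a transition table. This sketch assumes a job stays in running while its attempts are retried and that completed and failed are terminal; `JobStatus`, `ALLOWED`, and `advance` are hypothetical names, not identifiers from the project.

```python
from enum import Enum


class JobStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed next states for each status; terminal states map to an empty set.
ALLOWED: dict[JobStatus, set[JobStatus]] = {
    JobStatus.PENDING: {JobStatus.RUNNING},
    JobStatus.RUNNING: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def advance(current: JobStatus, target: JobStatus) -> JobStatus:
    """Return `target` if the move is legal, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Guarding writes to the status column with a check like `advance` keeps a failed job from being silently moved back to running.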