Frontmatter the agent reads.
Every article in 10-cortex/ opens with a small set of YAML fields and ends with a typed list of sources. cortex-compile writes them; cortex-lint enforces them. This is the standard.
Nine fields. Each one earns its place by changing how the LLM picks, ranks, or refreshes the article.
| Field | Type | Why it's there |
|---|---|---|
| title | string | The display name. cortex-lint flags articles missing it as errors. |
| description | string | Two sentences. What this is, why the vault owner cares. Used in 10-cortex/_index.md and the harness's preview pane. |
| topic | slug | Routes the article into 10-cortex/<topic>/. cortex-compile uses this to plan merges vs new articles. |
| created | YYYY-MM-DD | First write date. Never updated. |
| last_compiled | YYYY-MM-DD | Last time cortex-compile ran on this article. Auto-updated. |
| verified_at | YYYY-MM-DD | Last time you confirmed the article still matches reality. cortex-lint flags articles where this is older than 90 days. |
| confidence | low \| medium \| high | high = primary research / official docs. medium = credible analysis. low = single source / speculation. |
| staleness_signal | string | A one-line condition that, if true, means the article is stale. cortex-lint makes a best-effort match of this against the LLM's world knowledge. |
| sources | list | Plain list of source paths. The body Sources section adds typed-edge prefixes (see below). |
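
As a rough illustration of how these fields could be checked, here is a minimal Python sketch. It is not cortex-lint's actual implementation: the function name `check_frontmatter` and the `(rule, severity)` return shape are hypothetical, and it assumes the frontmatter has already been parsed into a dict with dates as `datetime.date` values. The severities (title and topic as errors, everything else as warnings) and the 90-day window come from this page.

```python
from datetime import date, timedelta

# Field names are the nine listed in the table above.
REQUIRED_FIELDS = (
    "title", "description", "topic", "created", "last_compiled",
    "verified_at", "confidence", "staleness_signal", "sources",
)
ERROR_FIELDS = {"title", "topic"}       # missing -> error; all others -> warn
MAX_VERIFIED_AGE = timedelta(days=90)   # verified_at older than this is flagged


def check_frontmatter(fm: dict, today: date) -> list[tuple[str, str]]:
    """Return (rule, severity) pairs for an already-parsed frontmatter dict.

    Hypothetical helper: assumes date fields are datetime.date objects.
    """
    findings = []
    for field in REQUIRED_FIELDS:
        if not fm.get(field):
            severity = "error" if field in ERROR_FIELDS else "warn"
            findings.append((f"missing-field:{field}", severity))
    verified = fm.get("verified_at")
    if isinstance(verified, date) and today - verified > MAX_VERIFIED_AGE:
        findings.append(("stale-verified-at", "info"))
    return findings
```

Passing `today` explicitly, rather than calling `date.today()` inside the function, keeps the check deterministic and easy to test.
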
The Sources section uses five wikilink prefixes so the relationship between article and source is explicit. cortex-lint flags untyped sources as warnings.
- supports:: The source's findings back up the article's claims. Most common.
- contradicts:: The source disputes part of the article. Useful when you want to remember the disagreement.
- extends:: The source builds on the article -- newer or deeper material.
- mentions:: Passing reference. The article isn't the source's main subject but the source noted it.
- inspired-by:: The article exists because of this source. Often a seed thought from 40-raw/plain/.
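
To make the typed-edge format concrete, the following Python sketch parses one Sources list item into its prefix, wikilink target, and optional note. It is an illustration under assumptions, not the tool's real parser: the function name `parse_source`, the regular expression, and the ` -- note` suffix handling are all hypothetical, modeled on the item shape `- supports:: [[path]] -- note`.

```python
import re

# The five typed-edge prefixes described above.
EDGE_TYPES = ("supports", "contradicts", "extends", "mentions", "inspired-by")

# Assumed item shape: "- <edge>:: [[<target>]] -- <optional note>"
SOURCE_LINE = re.compile(
    r"^-\s*(?P<edge>[a-z-]+)::\s*\[\[(?P<target>[^\]]+)\]\]"
    r"(?:\s*--\s*(?P<note>.*))?$"
)


def parse_source(line: str):
    """Return (edge, target, note), or None if the item is untyped or malformed."""
    m = SOURCE_LINE.match(line.strip())
    if m is None or m.group("edge") not in EDGE_TYPES:
        return None
    return m.group("edge"), m.group("target"), m.group("note") or ""
```

An item for which this returns `None` is the kind of entry the untyped-source warning is meant to catch.
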
What a clean cortex article looks like end-to-end.

    ---
    title: "Retrieval-augmented generation"
    description: "Grounding LLM responses in your own documents. The standard architecture for memory-aware agents."
    topic: "ai"
    sources:
      - "40-raw/youtube/rag-explained.md"
      - "40-raw/papers/lewis-et-al-2020-rag.md"
    created: 2026-04-12
    last_compiled: 2026-05-08
    verified_at: 2026-05-08
    confidence: high
    staleness_signal: "RAG architecture moves to graph-RAG by default, or vector DBs are replaced by long-context models"
    ---

    # Retrieval-augmented generation

    LLMs answer better when they retrieve relevant documents first.

    ## TL;DR

    RAG pairs an LLM with a retriever that fetches relevant chunks from your own corpus before generation. The model's response is grounded in real documents instead of pure parametric memory, which reduces hallucination and lets you update knowledge without retraining. Cost: retrieval quality is now your bottleneck.

    ## Summary

    [2-3 paragraphs.]

    ## Key Facts

    - Lewis et al. (2020) introduced the term, pairing BART with a dense retriever.
    - Modern stacks pick top-k chunks via embedding similarity, then concatenate.
    - Retrieval quality dominates response quality once the LLM is good enough.

    ## Connections

    - Related: [[10-cortex/ai/embeddings]], [[10-cortex/ai/long-context]]
    - Used in: vault-side memory loop (cortex-compile + cortex-connect)
    - Contrasts with: [[10-cortex/ai/finetuning]] -- finetuning bakes knowledge in; RAG keeps it swappable.

    ## Sources

    - supports:: [[40-raw/papers/lewis-et-al-2020-rag.md]] -- original architecture description
    - extends:: [[40-raw/youtube/rag-explained.md]] -- modern stack walkthrough (top-k, reranking, hybrid search)

Read-only diagnostic. Run /cortex-lint to surface decay and drift across 10-cortex/. Never mutates a file; the report is for you to read and act on.
| Rule | Severity | What it catches |
|---|---|---|
| missing-tldr | error | Article body has no `## TL;DR` heading. Blocks re-compile until fixed. |
| missing-sources | error | Article body has no `## Sources` heading or the section has no items. |
| missing-field:* | error / warn | Required frontmatter fields. title and topic are errors; the rest are warnings. |
| untyped-source | warn | Sources list item that doesn't start with one of the 5 typed-edge prefixes. |
| tldr-out-of-range | warn | TL;DR word count is below 30 or above 150. Target is 50-100. |
| stale-verified-at | info | verified_at is older than 90 days. Surfaced for review; not blocking. |
| staleness-signal-triggered | info | Best-effort match -- the LLM noticed the staleness_signal mentions something it knows has changed. |
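
The body-level rules in this table can be sketched in a few lines of Python. This is a hedged illustration, not cortex-lint's code: `section_body` and `lint_body` are hypothetical names, the heading detection is a simple regular expression over level-2 headings, and word count is approximated by whitespace splitting. The thresholds (30 and 150 words) and severities are the ones listed above.

```python
import re


def section_body(markdown: str, heading: str):
    """Return the text under a level-2 heading, or None if the heading is absent."""
    pattern = rf"^## {re.escape(heading)}[ \t]*$\n?(.*?)(?=^## |\Z)"
    m = re.search(pattern, markdown, flags=re.MULTILINE | re.DOTALL)
    return m.group(1).strip() if m else None


def lint_body(markdown: str) -> list[tuple[str, str]]:
    """Return (rule, severity) pairs for the body rules; never modifies input."""
    findings = []

    tldr = section_body(markdown, "TL;DR")
    if tldr is None:
        findings.append(("missing-tldr", "error"))
    elif not 30 <= len(tldr.split()) <= 150:
        # Word count approximated by splitting on whitespace (an assumption).
        findings.append(("tldr-out-of-range", "warn"))

    sources = section_body(markdown, "Sources")
    items = [ln for ln in (sources or "").splitlines()
             if ln.lstrip().startswith("- ")]
    if not items:
        # Covers both a missing heading and a section with no list items.
        findings.append(("missing-sources", "error"))

    return findings
```

Like the tool it imitates, the sketch only reads its input and returns a report; acting on the findings is left to the caller.
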