10/Cortex frontmatter

Frontmatter the agent reads.

Every article in 10-cortex/ opens with a small set of YAML fields and ends with a typed list of sources. cortex-compile writes them; cortex-lint enforces them. This is the standard.

20/Required fields

Nine fields. Each one earns its place by changing how the LLM picks, ranks, or refreshes the article.

| Field | Type | Why it's there |
| --- | --- | --- |
| title | string | The display name. cortex-lint flags articles missing it as errors. |
| description | string | Two sentences. What this is, why the vault owner cares. Used in 10-cortex/_index.md and the harness's preview pane. |
| topic | slug | Routes the article into 10-cortex/<topic>/. cortex-compile uses this to plan merges vs new articles. |
| created | YYYY-MM-DD | First write date. Never updated. |
| last_compiled | YYYY-MM-DD | Last time cortex-compile ran on this article. Auto-updated. |
| verified_at | YYYY-MM-DD | Last time you confirmed the article still matches reality. cortex-lint flags articles where this is older than 90 days. |
| confidence | low, medium, or high | high = primary research / official docs. medium = credible analysis. low = single source / speculation. |
| staleness_signal | string | A one-line condition that, if true, means the article is stale. cortex-lint best-effort matches this against world knowledge. |
| sources | list | Plain list of source paths. The body Sources section adds typed-edge prefixes (see below). |
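
The field list can be turned into a quick shape check. The following is a minimal sketch, not cortex-lint's actual implementation: the names `REQUIRED_FIELDS` and `check_frontmatter` are hypothetical, and it assumes the frontmatter has already been parsed into a plain dict.

```python
from datetime import date

# The nine required fields named in the table above.
REQUIRED_FIELDS = (
    "title", "description", "topic", "created", "last_compiled",
    "verified_at", "confidence", "staleness_signal", "sources",
)
DATE_FIELDS = ("created", "last_compiled", "verified_at")
CONFIDENCE_LEVELS = {"low", "medium", "high"}


def check_frontmatter(meta: dict) -> list[str]:
    """Return human-readable problems for an already-parsed frontmatter dict."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat absent, empty-string, and empty-list values as missing.
        if field not in meta or meta[field] in ("", None, []):
            problems.append(f"missing field: {field}")
    for field in DATE_FIELDS:
        value = meta.get(field)
        if value in (None, ""):
            continue  # already reported as missing
        try:
            date.fromisoformat(str(value))
        except ValueError:
            problems.append(f"{field} is not YYYY-MM-DD: {value!r}")
    confidence = meta.get("confidence")
    if confidence not in (None, "") and confidence not in CONFIDENCE_LEVELS:
        problems.append(f"confidence must be low, medium, or high: {confidence!r}")
    return problems
```

An empty list means the dict has all nine fields in the expected shape; whether each problem counts as an error or a warning is decided by the lint rules, not by this sketch.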

30/Typed-edge sources

The Sources section uses five wikilink prefixes so the relationship between article and source is explicit. cortex-lint flags untyped sources as warnings.

supports::

The source's findings back up the article's claims. Most common.

contradicts::

The source disputes part of the article. Useful when you want to remember the disagreement.

extends::

The source builds on the article -- newer or deeper material.

mentions::

Passing reference. The source is mainly about something else but touches on the article's subject.

inspired-by::

The article exists because of this source. Often a seed thought from 40-raw/plain/.
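
A typed source line can be split into its edge type, target, and note with a single regular expression. This is a rough sketch under assumptions: the line shape (`- <prefix>:: [[target]] -- note`) is inferred from the example article's Sources section, and `parse_source_line` is a hypothetical helper, not part of cortex-lint.

```python
import re

# The five typed-edge prefixes listed above.
EDGE_TYPES = ("supports", "contradicts", "extends", "mentions", "inspired-by")

# Matches "- <edge>:: [[<target>]]" with an optional " -- <note>" tail.
SOURCE_LINE = re.compile(
    r"^-\s+(?P<edge>[a-z-]+)::\s+\[\[(?P<target>[^\]]+)\]\](?:\s+--\s+(?P<note>.*))?$"
)


def parse_source_line(line: str):
    """Return (edge, target, note) for a typed source line, or None if it is untyped."""
    match = SOURCE_LINE.match(line.strip())
    if match is None or match.group("edge") not in EDGE_TYPES:
        return None
    return match.group("edge"), match.group("target"), match.group("note")
```

A line that returns `None` here is what the untyped-source warning would be about: either the prefix is missing or it is not one of the five.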

40/Example article

What a clean cortex article looks like end-to-end.

---
title: "Retrieval-augmented generation"
description: "Grounding LLM responses in your own documents. The standard architecture for memory-aware agents."
topic: "ai"
sources:
  - "40-raw/youtube/rag-explained.md"
  - "40-raw/papers/lewis-et-al-2020-rag.md"
created: 2026-04-12
last_compiled: 2026-05-08
verified_at: 2026-05-08
confidence: high
staleness_signal: "RAG architecture moves to graph-RAG by default, or vector DBs are replaced by long-context models"
---

# Retrieval-augmented generation

LLMs answer better when they retrieve relevant documents first.

## TL;DR

RAG pairs an LLM with a retriever that fetches relevant chunks from your own
corpus before generation. The model's response is grounded in real documents
instead of pure parametric memory, which reduces hallucination and lets you
update knowledge without retraining. Cost: retrieval quality is now your
bottleneck.

## Summary

[2-3 paragraphs.]

## Key Facts

- Lewis et al. (2020) introduced the term, pairing BART with a dense retriever.
- Modern stacks pick top-k chunks via embedding similarity, then concatenate.
- Retrieval quality dominates response quality once the LLM is good enough.

## Connections

- Related: [[10-cortex/ai/embeddings]], [[10-cortex/ai/long-context]]
- Used in: vault-side memory loop (cortex-compile + cortex-connect)
- Contrasts with: [[10-cortex/ai/finetuning]] -- finetuning bakes knowledge in;
  RAG keeps it swappable.

## Sources

- supports:: [[40-raw/papers/lewis-et-al-2020-rag.md]] -- original architecture description
- extends:: [[40-raw/youtube/rag-explained.md]] -- modern stack walkthrough (top-k, reranking, hybrid search)
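
To work with an article like the one above programmatically, the frontmatter first has to be separated from the body. Here is a minimal standard-library sketch; `split_frontmatter` is a hypothetical helper that only handles flat `key: value` pairs and simple `  - item` lists, which is all the example uses. A real tool would more likely use a full YAML parser.

```python
def split_frontmatter(text: str):
    """Split an article into (frontmatter dict, body string).

    Handles only flat keys and two-space-indented list items; every value
    is returned as a string (or a list of strings).
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    try:
        end = lines[1:].index("---") + 1  # index of the closing delimiter
    except ValueError:
        return {}, text  # opening delimiter without a closing one
    meta = {}
    current_key = None
    for line in lines[1:end]:
        if line.startswith("  - ") and isinstance(meta.get(current_key), list):
            # List item belonging to the most recent key with an empty value.
            meta[current_key].append(line[4:].strip().strip('"'))
        elif ":" in line:
            key, _, value = line.partition(":")
            current_key = key.strip()
            value = value.strip().strip('"')
            meta[current_key] = value if value else []
    body = "\n".join(lines[end + 1:])
    return meta, body
```

Splitting on the first colon only keeps values such as the example's `staleness_signal`, which contain commas and further punctuation, intact.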

50/cortex-lint rules

Read-only diagnostic. Run /cortex-lint to surface decay and drift across 10-cortex/. Never mutates a file; the report is for you to read and act on.

| Rule | Severity | What it catches |
| --- | --- | --- |
| missing-tldr | error | Article body has no `## TL;DR` heading. Blocks re-compile until fixed. |
| missing-sources | error | Article body has no `## Sources` heading or the section has no items. |
| missing-field:* | error / warn | A required frontmatter field is missing. Missing title or topic is an error; any other missing field is a warning. |
| untyped-source | warn | Sources list item that doesn't start with one of the 5 typed-edge prefixes. |
| tldr-out-of-range | warn | TL;DR word count is below 30 or above 150. Target is 50-100. |
| stale-verified-at | info | verified_at is older than 90 days. Surfaced for review, not blocking. |
| staleness-signal-triggered | info | Best-effort match -- the LLM noticed the staleness_signal mentions something it knows has changed. |
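
Several of these rules are mechanical enough to sketch in a few lines. The code below is an illustrative approximation, not the real cortex-lint: `section` and `lint` are hypothetical names, the frontmatter is assumed to be an already-parsed dict, and only missing-tldr, tldr-out-of-range, missing-sources, and stale-verified-at are covered. It only reads its inputs, in keeping with the read-only contract.

```python
import re
from datetime import date


def section(body: str, heading: str):
    """Return the text under '## <heading>' up to the next '## ' heading, or None."""
    pattern = rf"^## {re.escape(heading)}[ \t]*$(.*?)(?=^## |\Z)"
    match = re.search(pattern, body, flags=re.MULTILINE | re.DOTALL)
    return match.group(1) if match else None


def lint(meta: dict, body: str, today: date):
    """Return a list of (rule, severity) findings; never modifies its inputs."""
    findings = []

    tldr = section(body, "TL;DR")
    if tldr is None:
        findings.append(("missing-tldr", "error"))
    else:
        words = len(tldr.split())
        if words < 30 or words > 150:
            findings.append(("tldr-out-of-range", "warn"))

    sources = section(body, "Sources")
    has_items = sources is not None and any(
        line.lstrip().startswith("- ") for line in sources.splitlines()
    )
    if not has_items:
        findings.append(("missing-sources", "error"))

    verified = meta.get("verified_at")
    if verified and (today - date.fromisoformat(str(verified))).days > 90:
        findings.append(("stale-verified-at", "info"))

    return findings
```

Passing `today` in explicitly keeps the 90-day check deterministic, which makes the rule easy to test. The staleness-signal-triggered rule is left out because it depends on an LLM's judgment rather than on anything computable from the file.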