Wiki Maintenance Schema
You are an LLM acting on the commodity_hindcast knowledge base. This file defines how the wiki is organised, what the conventions are, and what workflows you follow when ingesting sources, writing pages, or answering queries.
Architecture
Three layers:
- Raw sources (immutable): Python modules, configs, existing docs, git history, PR bodies inside `market_insights_models/src/commodity_hindcast/` and the GitHub repo. You read from these. You never modify them.
- The wiki (this directory): LLM-generated markdown. You own it.
- The schema: this file. Co-evolves with the user.
Page types
| Type | Directory | Purpose |
|---|---|---|
| source | `sources/code/`, `sources/configs/`, `sources/docs/`, `sources/prs/`, `sources/commits/` | Faithful summaries of an immutable source. Quote rather than paraphrase where claims matter. |
| entity | `entities/` | A domain entity (Commodity, SeasonYear, ExperimentResult, …). Synthesised from source pages with citations. |
| concept | `concepts/` | A cross-cutting idea (walk-forward CV, conformal modes, S3 path safety, …). |
| pipeline | `pipelines/` | A stage-by-stage walkthrough of one workflow (feature build, hindcast, forecast, …). |
| synthesis | `synthesis/` | Top-level overview, thesis, open questions. |
Page format
Every page begins with YAML frontmatter:
```yaml
---
name: <PageName>
description: <one-line hook used when the LLM scans the index for relevance>
type: <source|entity|concept|pipeline|synthesis|schema|plan>
last_updated: YYYY-MM-DD
sources:
  - relative/path/to/source-1.md
  - relative/path/to/source-2.md
---
```
Then the markdown body.
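A lint script could check the frontmatter mechanically. The following is a minimal sketch, not part of the wiki tooling: `split_frontmatter` and `validate_frontmatter` are hypothetical names, and the parser deliberately handles only the flat shape shown above (scalar `key: value` pairs plus one list) rather than full YAML, so that it needs nothing beyond the standard library.

```python
import re
from datetime import date

# Keys and page types taken from the frontmatter template above.
REQUIRED_KEYS = ("name", "description", "type", "last_updated", "sources")
PAGE_TYPES = ("source", "entity", "concept", "pipeline", "synthesis", "schema", "plan")


def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split a page into (frontmatter dict, markdown body).

    Assumption: frontmatter is flat `key: value` lines plus `- item` list
    entries under a key with an empty value. This is not a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("page does not start with a frontmatter block")
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter block is not closed")

    meta: dict = {}
    current_list = None
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and current_list is not None:
            current_list.append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        value = value.strip()
        if value in ("", "[]"):
            # Empty value: a list follows (or the list is explicitly empty).
            current_list = meta.setdefault(key.strip(), [])
        else:
            meta[key.strip()] = value
            current_list = None
    return meta, "\n".join(lines[end + 1:])


def validate_frontmatter(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the block looks valid."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in meta]
    page_type = meta.get("type")
    if page_type is not None and page_type not in PAGE_TYPES:
        problems.append(f"unknown type: {page_type}")
    updated = meta.get("last_updated")
    if updated is not None:
        try:
            if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", str(updated)):
                raise ValueError(updated)
            date.fromisoformat(str(updated))  # rejects impossible dates
        except ValueError:
            problems.append(f"last_updated is not a valid YYYY-MM-DD date: {updated}")
    return problems
```

Returning a list of problems rather than raising on the first one suits a lint report, where every issue on a page should be listed together.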
`sources` field semantics:
- For `type: source` pages (`sources/code/`, `sources/configs/`, `sources/docs/`, `sources/prs/`, `sources/commits/`): list the immutable upstream artefact(s) that the page summarises. Acceptable forms include repo-relative paths to Python or YAML files (e.g. `market_insights_models/src/commodity_hindcast/configs/soybeans_bra.yaml`), GitHub PR URLs (e.g. `https://github.com/treefera/treefera-market-insights/pull/372`), and external URLs. Use `sources: []` only when the page is a thematic synthesis of many upstream artefacts with no single source (e.g. `commits/timeline.md`).
- For all other page types (`entity`, `concept`, `pipeline`, `synthesis`, `schema`, `plan`): list relative paths to wiki source pages under `sources/`. These pages should not reference raw repo paths directly; go through the source-page wrapper.
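The two-tier rule above can be approximated with a small check. This is a heuristic sketch under stated assumptions: `check_sources` is a hypothetical helper, and "a wiki source page" is recognised only by having a `sources` path component and a `.md` suffix, which a real linter would want to tighten by resolving the path against the wiki directory.

```python
from pathlib import PurePosixPath


def check_sources(page_type: str, sources: list[str]) -> list[str]:
    """Return problems with a page's `sources` list, given its page type."""
    problems: list[str] = []
    if page_type == "source":
        # Source pages may cite repo paths, PR URLs, or external URLs,
        # so the shape of each entry is not constrained here.
        return problems
    for entry in sources:
        if "://" in entry:
            problems.append(f"{entry}: URL cited directly; go through a source page")
            continue
        path = PurePosixPath(entry)
        # Heuristic: wiki source pages live under sources/ and end in .md.
        if "sources" not in path.parts or path.suffix != ".md":
            problems.append(f"{entry}: expected a wiki page under sources/")
    return problems
```

For example, an entity page citing `market_insights_models/src/commodity_hindcast/configs/soybeans_bra.yaml` directly would be flagged, while the same path in the frontmatter of a `type: source` page would not.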
Hard formatting rules:
- No `---` standalone-line dividers in the body. Frontmatter `---` at top and bottom is fine.
- British English (programme, behaviour, organise, modelling, optimise).
- Code blocks tagged with language. Use `text` for ASCII trees, formula blocks, and tabular layouts.
- Cross-references use relative markdown links: `[Commodity](../entities/Commodity.md)`.
- File-and-line citations to Python source use `path/to/file.py:LINE` when the file is referenced for the first time on a page or the directory context is ambiguous. A short basename form (`run_fit.py:55`) is acceptable for repeat references on the same page once the full path has been established, or inside a section that anchors the directory in prose (e.g. "in `stages/`, `run_fit.py:55` …"). Prefer the full path when in doubt.
- YAML config citations may use either the line-number form (`configs/wheat_usa.yaml:42`) or a key-path form (`configs/wheat_usa.yaml:commodity.builders.yields.filepath`). Key paths are preferred when the value being cited is itself a config key whose location may shift across edits, since key paths are line-number-stable.
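Two of the rules above (no standalone `---` in the body, and language tags on every code block) lend themselves to automated checking. The sketch below is one possible approach, with `check_body` as a hypothetical name; it expects the page body with frontmatter already removed, and it tracks fence state so that a `---` line inside a code block is not reported.

```python
import re

# An opening or closing fence: three or more backticks or tildes,
# optionally followed by an info string (the language tag).
FENCE = re.compile(r"^\s*(`{3,}|~{3,})(.*)$")


def check_body(body: str) -> list[str]:
    """Flag standalone `---` dividers and untagged code fences in a page body."""
    problems: list[str] = []
    open_fence = None  # marker that opened the current code block, if any
    for number, line in enumerate(body.splitlines(), start=1):
        match = FENCE.match(line)
        if match:
            marker, info = match.group(1), match.group(2).strip()
            if open_fence is None:
                open_fence = marker
                if not info:
                    problems.append(f"line {number}: code block has no language tag")
            elif marker.startswith(open_fence) and not info:
                # Same fence character, at least as long, no info string: a close.
                open_fence = None
            continue
        if open_fence is None and line.strip() == "---":
            problems.append(f"line {number}: standalone --- divider")
    return problems
```

Line numbers in the messages are relative to the body, so a caller reporting against the whole file would need to add the length of the frontmatter block.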
index.md
Catalogue of every page in the wiki. Organised by category. Two acceptable layouts:
- Bullet form (default): `- [Title](relative/path.md) — one-line hook` (≤150 chars).
- Table form: a markdown table with at minimum `Page` and `Description` columns, used when extra structured columns (e.g. `Date`, `Theme`, `Status`) genuinely aid navigation. Tables must still link each page in the first column and keep the description ≤150 chars.
Pick whichever shape makes the index more useful for the LLM scanning it; do not mix forms within a single category. Updated on every ingest and after the lint pass.
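As an illustration of the bullet form, the following sketch renders one index entry and enforces the 150-character limit on the hook. `index_bullet` is a hypothetical helper name, and collapsing whitespace in the hook is an added assumption so that the entry always fits on one line.

```python
MAX_HOOK_CHARS = 150  # limit stated for index descriptions


def index_bullet(title: str, path: str, hook: str) -> str:
    """Render one bullet-form index entry, rejecting over-long hooks."""
    hook = " ".join(hook.split())  # collapse newlines and repeated spaces
    if len(hook) > MAX_HOOK_CHARS:
        raise ValueError(f"hook is {len(hook)} chars; limit is {MAX_HOOK_CHARS}")
    # \u2014 is the dash separator used in the bullet form.
    return f"- [{title}]({path}) \u2014 {hook}"
```

Raising on an over-long hook, rather than truncating it, keeps the decision about what to cut with whoever writes the description.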
log.md
Append-only chronological record. Each entry begins with a line of the form:
```text
## [YYYY-MM-DD] {ingest|query|lint} | <short title>
```
This makes the log greppable: `grep "^## \[" log.md | tail -10`.
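The same header format can be produced and parsed programmatically. This is a small sketch with hypothetical helper names (`log_header`, `parse_log_headers`); the regular expression mirrors the entry format described above and is stricter than the `grep` pattern, which matches any line starting with `## [`.

```python
import re
from datetime import date

LOG_KINDS = ("ingest", "query", "lint")
LOG_HEADER = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (ingest|query|lint) \| (.+)$")


def log_header(kind: str, title: str, when: date) -> str:
    """Format the first line of a log entry."""
    if kind not in LOG_KINDS:
        raise ValueError(f"unknown log entry kind: {kind}")
    return f"## [{when.isoformat()}] {kind} | {title}"


def parse_log_headers(log_text: str) -> list[tuple[str, str, str]]:
    """Return (date, kind, title) for every well-formed entry header."""
    headers = []
    for line in log_text.splitlines():
        match = LOG_HEADER.match(line)
        if match:
            headers.append(match.groups())
    return headers
```

Lines that start with `## [` but fail the stricter pattern are silently skipped here; a lint pass might prefer to report them.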
Workflows
Ingest a new source
- Read the source.
- Write a `sources/<category>/<name>.md` page with frontmatter + faithful summary + key quotes.
- Update entity / concept / pipeline pages that the source touches; add `sources` entries to their frontmatter.
- Append to `index.md` and `log.md`.
Answer a query
- Read `index.md` to locate relevant pages.
- Read the relevant pages.
- Synthesise an answer with citations.
- If the answer is novel and useful, file it back into the wiki.
Lint pass
- Find orphans (pages with no inbound link).
- Find broken relative links.
- Find contradictions between pages.
- Find concepts mentioned without their own page.
- Produce `LINT_REPORT.md`. Apply fixes.
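The first two lint steps are mechanical and can be sketched in a few lines. The code below is an illustration under several assumptions, not the project's linter: `lint_links` is a hypothetical name, `index.md` and `log.md` are treated as entry points that are never orphans, inbound links from `index.md` count like any other link, and links that appear inside code blocks are not filtered out.

```python
import re
from pathlib import Path

# Inline markdown links: [text](target). Reference-style links are ignored.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def lint_links(wiki_root: Path) -> tuple[set[Path], list[tuple[Path, str]]]:
    """Return (orphan pages, broken links) for every .md file under wiki_root."""
    wiki_root = wiki_root.resolve()
    pages = {p.resolve() for p in wiki_root.rglob("*.md")}
    linked: set[Path] = set()
    broken: list[tuple[Path, str]] = []
    for page in sorted(pages):
        for target in LINK.findall(page.read_text(encoding="utf-8")):
            if "://" in target or target.startswith(("#", "mailto:")):
                continue  # external link or same-page anchor
            resolved = (page.parent / target.split("#", 1)[0]).resolve()
            if not resolved.exists():
                broken.append((page.relative_to(wiki_root), target))
            elif resolved in pages and resolved != page:
                linked.add(resolved)
    # Assumption: the index and the log are entry points, never orphans.
    roots = {wiki_root / "index.md", wiki_root / "log.md"}
    orphans = {p.relative_to(wiki_root) for p in pages - linked - roots}
    return orphans, broken
```

Finding contradictions and concepts without a page requires reading for meaning, so those steps are left to the LLM rather than to a script like this.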
Constraints
- No edits to source code or to the existing in-package domain model.
- No commits, pushes, or PR creation by any agent reading this file.
- No GitHub releases.
- No Claude attribution.