
Full hindcast re-run for one commodity

(Figure: failure-mode flowchart for the full hindcast re-run.)

End-to-end re-run of the commodity_hindcast walk-forward pipeline for a single commodity. Each invocation mints a fresh timestamped run_root under config.run_dir_base, so this runbook is non-destructive with respect to prior runs.

For the orchestrator visual, see the explainer panel /data/processing/tmp/tmi-explainers/output/ch_01_hindcast.png.

1. When to use this runbook

  • A commodity YAML (e.g. configs/corn_usa.yaml) has changed — feature builders, experiment protocol, postprocess modes, or production_cumulative_threshold.
  • A model bug fix has landed and the historical CV metrics need to be regenerated to confirm the regression has been corrected.
  • Periodic refresh: a new harvest year of NASS / WASDE / weather data has been ingested and the hindcast must be re-cut to extend the history.
  • A/B comparison: producing two run_roots side-by-side under the same config family and diffing reports/metrics_table.csv between them.

2. Preconditions

The canonical machine-checked precondition list lives in market_insights_models/src/commodity_hindcast/run/preflight.py:59 (preflight_paths_for_hindcast). The hindcast stage invokes it before any compute starts and exits non-zero on the first failure. In addition, the following must hold on the operator's side before invocation:

  • INPUT_DATA_DIR is exported and points at the repo root (/data/processing/github/treefera-market-insights on the dev EC2). The config loader fails loudly via require_input_data_dir() when it is unset (DESIGN.md Clause 6, surfaced in wiki/commodity_hindcast/pipelines/hindcast.md).
  • CROP_YIELD_GEOBOUNDARIES_FILE is exported. For US commodities use /data/processing/yield_forecast/raw/boundaries/geometry.parquet; for non-US commodities use the all-countries boundaries file (/data/processing/boundaries/processed/geoboundaries/all_geoboundaries_processed.parquet).
  • Feature parquets exist on disk: {features_dir}/{experiment_key}/fit.parquet and pred.parquet. These are hard preconditions for preflight_paths_for_hindcast (run/preflight.py:71-73); they are produced by cli run features, never by cli run hindcast itself.
  • Every ResolvablePath field on the resolved config (raw NASS, WASDE/CONAB reference series, weather climo, indices zarrs) resolves to an existing path. Coverage is mechanical: anything typed ResolvablePath is preflight-checked by construction (DESIGN.md Clause 69).
  • AWS credentials are valid for the active profile if data_root is an S3 URI (QA / prod). On the dev EC2 with a local data_root, this is moot.
  • The MLflow tracking SQLite DB at the URI configured in the experiment YAML is writable, and no other hindcast for the same commodity is currently writing to it — concurrent same-commodity runs collide on the SQLite file (see Section 5; project MEMORY.md known issue).
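The operator-side checks above (env vars exported, feature parquets present) can be smoke-tested cheaply before invoking make. A minimal sketch — the helper name is mine, and the canonical gate remains preflight_paths_for_hindcast inside the hindcast stage:

```python
import os
from pathlib import Path

def operator_preflight(experiment_key: str, features_dir: str) -> list[str]:
    """Cheap operator-side smoke checks (hypothetical helper).
    The canonical machine-checked list is preflight_paths_for_hindcast."""
    problems = []
    # Env vars the config loader fails loudly on when unset.
    for var in ("INPUT_DATA_DIR", "CROP_YIELD_GEOBOUNDARIES_FILE"):
        if not os.environ.get(var):
            problems.append(f"{var} is not exported")
    # Feature parquets are hard preconditions; hindcast never builds them.
    for name in ("fit.parquet", "pred.parquet"):
        p = Path(features_dir) / experiment_key / name
        if not p.is_file():
            problems.append(f"missing feature parquet: {p}")
    return problems
```

An empty return list means the cheap checks pass; it does not replace the in-stage preflight, which also resolves every ResolvablePath field.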

3. Procedure

All commands below are run from the repo root (/data/processing/github/treefera-market-insights). The Make targets in market_insights_models/src/commodity_hindcast/Makefile already cd $(REPO_ROOT) internally so relative paths in configs resolve correctly.

3.1 Set the operator environment

export INPUT_DATA_DIR=/data/processing/github/treefera-market-insights
export CROP_YIELD_GEOBOUNDARIES_FILE=/data/processing/yield_forecast/raw/boundaries/geometry.parquet
export EXPERIMENT_KEY=corn_usa   # or soybeans_usa | wheat_usa | cotton_usa | soybeans_bra

EXPERIMENT_KEY is consumed by the Makefile and resolves to configs/$(EXPERIMENT_KEY).yaml (Makefile lines 10-17). The Makefile unexports it so pydantic-settings cannot misread it as a nested config field.

3.2 Build features (only if not already cached)

If {features_dir}/{experiment_key}/fit.parquet and pred.parquet are missing or stale, build them first. The hindcast stage will not do this for you.

make -C market_insights_models/src/commodity_hindcast features EXPERIMENT_KEY=$EXPERIMENT_KEY

The Make target wraps cli run features --config configs/$EXPERIMENT_KEY.yaml (Makefile lines 49-51). Use features-force (lines 54-56) to overwrite cached parquets when the feature recipe itself has changed.
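"Stale" here is a judgment call; one workable heuristic is to compare the commodity YAML's mtime against the cached parquets. A sketch — the helper is hypothetical, and when in doubt features-force is the safe choice:

```python
from pathlib import Path

def features_stale(config_yaml: str, features_dir: str, experiment_key: str) -> bool:
    """Heuristic only (hypothetical helper): treat cached parquets as stale
    when a parquet is missing or older than the commodity YAML. This does
    not detect upstream data changes — use features-force for those."""
    cfg_mtime = Path(config_yaml).stat().st_mtime
    for name in ("fit.parquet", "pred.parquet"):
        p = Path(features_dir) / experiment_key / name
        if not p.is_file() or p.stat().st_mtime < cfg_mtime:
            return True
    return False
```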

3.3 Run the hindcast

make -C market_insights_models/src/commodity_hindcast hindcast EXPERIMENT_KEY=$EXPERIMENT_KEY

This wraps cli run hindcast --config configs/$EXPERIMENT_KEY.yaml (Makefile lines 59-61), which lands in market_insights_models/src/commodity_hindcast/cli.py:209 (@run.command("hindcast")) and dispatches to stages/run_hindcast.py:197 (run()), the orchestrator entrypoint.

run() executes, in order: preflight → mint run_root → MLflow context → load + preprocess + select_by_production → persist included counties → walk-forward CV across config.experiment_protocol.test_years → production fit → investigate → postprocess → deliver → evaluate. The full stage DAG is documented in wiki/commodity_hindcast/pipelines/hindcast.md.
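The walk-forward phase can be pictured as one fold per test year, trained on history strictly before that year. An illustrative sketch only — the function name and the expanding-window assumption are mine; the real implementation lives in stages/run_hindcast.py:

```python
def walk_forward_folds(history_years: list[int], test_years: list[int]):
    """Illustrative: one fold per entry in test_years, fitting on all
    history strictly before the held-out year (expanding window assumed)."""
    for test_year in test_years:
        train = [y for y in history_years if y < test_year]
        yield train, test_year
```

The key invariant this pattern preserves is that no fold ever trains on data from its own test year or later.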

3.4 (Optional) Re-run a single downstream stage on the same run_root

Each post-fit stage has its own Make target that takes an existing RUN_DIR, useful when iterating on plots or delivery without retraining.

make -C market_insights_models/src/commodity_hindcast postprocess RUN_DIR=runs/20260508_103000_corn_usa
make -C market_insights_models/src/commodity_hindcast evaluate    RUN_DIR=runs/20260508_103000_corn_usa
make -C market_insights_models/src/commodity_hindcast plots       RUN_DIR=runs/20260508_103000_corn_usa
make -C market_insights_models/src/commodity_hindcast deliver     RUN_DIR=runs/20260508_103000_corn_usa

(Makefile lines 73-96.)

4. Verification

After cli run hindcast returns 0:

  • Run root exists. A directory matching {config.run_dir_base}/{YYYYMMDD_HHMMSS}_{experiment_key}/ is present and contains config_resolved.yaml (the audit snapshot of the resolved config — always diff this against the input YAML when investigating drift).
  • MLflow run is FINISHED, not FAILED. Open the MLflow UI (make mlflow in the top-level Makefile, line 104) and confirm the run named hindcast_<experiment_key>_<stamp> shows status FINISHED. The run id is also written to metadata_<stage>.yaml inside run_root (DESIGN.md Clause on MLflow / metadata co-persistence).
  • Delivery CSVs exist. Three files under <run_root>/delivery/ matching Treefera_<experiment_key>_ADM0_Hindcast_*.csv, Treefera_<experiment_key>_ADM1_Hindcast_*.csv, and Treefera_<experiment_key>_ADM2_Hindcast_*.csv (path contract from wiki/commodity_hindcast/pipelines/hindcast.md Outputs table).
  • Evaluate artefacts. <run_root>/reports/metrics_table.csv and <run_root>/reports/stage5_metrics*.txt are non-empty; per-fold PNGs under <run_root>/reports/hindcast/ follow the P##_<fold>_*.png naming convention (DESIGN.md Clause on plot filenames).
  • Sanity-check the national frame. Read <run_root>/postprocessed/national.parquet and confirm row count matches len(test_years) * len(init_dates) for the experiment, and that yield_predicted is finite and within the historical envelope.
  • Included-geo set persisted. <run_root>/included_geo_identifiers.txt exists (it is written before walk-forward; missing file means walk-forward could not have completed).
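The structural bullets above (file presence and naming, not statistical sanity) can be scripted. A sketch, assuming the path contracts quoted in this section — the helper itself is hypothetical:

```python
from pathlib import Path

def verify_run_root(run_root: str, experiment_key: str) -> list[str]:
    """Structural spot checks on a finished run_root (hypothetical helper).
    Checks file presence/naming only; the national.parquet sanity check
    still needs a parquet reader and the experiment protocol."""
    root = Path(run_root)
    problems = []
    if not (root / "config_resolved.yaml").is_file():
        problems.append("config_resolved.yaml missing (no audit snapshot)")
    if not (root / "included_geo_identifiers.txt").is_file():
        problems.append("included_geo_identifiers.txt missing "
                        "(walk-forward cannot have completed)")
    # One delivery CSV per admin level, per the hindcast.md Outputs table.
    for adm in ("ADM0", "ADM1", "ADM2"):
        pattern = f"Treefera_{experiment_key}_{adm}_Hindcast_*.csv"
        if not list((root / "delivery").glob(pattern)):
            problems.append(f"no delivery CSV matching {pattern}")
    if not (root / "reports" / "metrics_table.csv").is_file():
        problems.append("reports/metrics_table.csv missing")
    return problems
```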

5. Failure modes and recovery

Each entry below lists symptom, cause, and recovery.

  • Symptom: SystemExit: Critical preflight check failed on fit.parquet or pred.parquet (run/preflight.py:71-73). Cause: features not built for this commodity. Recovery: run Section 3.2 first (make features EXPERIMENT_KEY=...) — DESIGN.md Clause 31 forbids the hindcast stage from building features.
  • Symptom: SystemExit: Critical preflight check failed on a ResolvablePath field. Cause: data_root anchoring incorrect, or the upstream sync did not populate that file. Recovery: verify INPUT_DATA_DIR is the repo root (project MEMORY.md). The historical canonical case was a missing forecast.raw_obs_filepath ten minutes into the corn run (DESIGN.md Clause 69) — preflight is the correct place to catch this.
  • Symptom: RuntimeError: INPUT_DATA_DIR not set at config load. Cause: env var missing. Recovery: export it (Section 3.1). The config helper is intentionally fail-loud; there is no silent fallback (project MEMORY.md, market_insights_models/CLAUDE.md).
  • Symptom: sqlalchemy.exc.OperationalError from MLflow. Cause: two hindcast runs for the same commodity in flight on the same SQLite DB. Recovery: serialise — wait for the in-flight run, or kill it. Project MEMORY.md known issue; see the wiki/commodity_hindcast/pipelines/hindcast.md failure-modes table. Cross-reference the mlflow_db_recovery runbook ([PLACEHOLDER: link]).
  • Symptom: failure mid-walk-forward (e.g. transient disk / network). Cause: one fold wrote partial artefacts before the exception. Recovery: the run_root is left in place for inspection. Mint a fresh run by rerunning Section 3.3 — the prior run_root is untouched. Per-fold surgical recovery (delete the failed fold's directory under preds/{key}/ and rerun) is documented in wiki/commodity_hindcast/pipelines/hindcast.md but is brittle; the preferred path is a clean re-run.
  • Symptom: empty or near-empty delivery CSV. Cause: the included-geo cut was too aggressive — production_cumulative_threshold filtered out most of the universe. Recovery: check the <run_root>/included_geo_identifiers.txt row count against config.experiment_protocol.production_cumulative_threshold. DESIGN.md Clause 128 sets the per-commodity defaults: corn/cotton 0.95, soybean 0.90, wheat 0.9999. A misconfigured threshold is the canonical cause of an empty delivery.
  • Symptom: FileNotFoundError at postprocess on walk_forward_preds.parquet. Cause: walk-forward phase did not complete (an earlier exception was silently masked). Recovery: re-run the full hindcast (wiki/commodity_hindcast/pipelines/hindcast.md failure modes).
  • Symptom: KeyError in MedianImputer.transform at predict time. Cause: stale fold artefact written with a RangeIndex (pre-Clause-30 layout). Recovery: delete the affected fold directory under preds/{key}/ and re-run.
  • Symptom: CI widths look wrong (very narrow). Cause: in_sample_pooled calibration mode active. Recovery: switch forecast.residual_mode to an OOS mode and re-run cli run postprocess against the same run_root.
  • Symptom: TMI selection-bias-correction values look wrong in sign / magnitude. Cause: upstream NaN dropouts in weather features remove core corn-belt counties before training (project MEMORY.md known bug). Recovery: out of scope for this runbook; track the known-bugs ticket. The hindcast itself completes; only the SBC column is suspect.
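The included-geo cut behind the empty-delivery failure mode can be illustrated as a cumulative-production cutoff: sort geos by production descending and keep them until their cumulative share of the total reaches the threshold. A sketch under that assumption — the function is mine, not the pipeline's select_by_production implementation:

```python
def cut_by_cumulative_production(production_by_geo: dict[str, float],
                                 threshold: float) -> list[str]:
    """Illustrative sketch (hypothetical helper): keep the largest producers
    until they jointly account for `threshold` of total production. A low
    threshold keeps only the top producers — which is why a misconfigured
    value empties the delivery."""
    total = sum(production_by_geo.values())
    kept, cum = [], 0.0
    for geo, prod in sorted(production_by_geo.items(), key=lambda kv: -kv[1]):
        if cum >= threshold * total:
            break
        kept.append(geo)
        cum += prod
    return kept
```

With the Clause 128 defaults in mind, a wheat-style 0.9999 keeps essentially the whole universe, while an accidental 0.5 would keep only the heaviest producers.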

6. Rollback

There is nothing to roll back. _create_run_root mints a fresh timestamped directory per invocation (run_hindcast.py:77, documented in wiki/commodity_hindcast/pipelines/hindcast.md); the previous run_root stays untouched on disk and remains the canonical record in MLflow. To revert to a prior hindcast as the active artefact, point downstream consumers (forecast, delivery sync) at the older run_root directly — no file mutation is required.

Disk-cleanup policy for stale run_root directories: [PLACEHOLDER: define retention period and pruning cadence; the current pattern is to keep all runs indefinitely, which will eventually fill the data volume].

Citations

  • market_insights_models/src/commodity_hindcast/Makefile lines 8, 10-17, 49-51, 54-56, 59-61, 73-96 — CLI wrapping, target list, EXPERIMENT_KEY contract.
  • market_insights_models/src/commodity_hindcast/cli.py:209 (@run.command("hindcast")) — entrypoint decorator.
  • market_insights_models/src/commodity_hindcast/stages/run_hindcast.py:197 (run()) — orchestrator entrypoint and stage DAG.
  • market_insights_models/src/commodity_hindcast/run/preflight.py:59 (preflight_paths_for_hindcast) — canonical precondition list.
  • wiki/commodity_hindcast/pipelines/hindcast.md — pipeline reference page, inputs/outputs/invariants/failure modes.
  • market_insights_models/src/commodity_hindcast/DESIGN.md Clauses 6, 17, 23, 31, 69, 128 — env var contract, plot filenames, MLflow contract, features-as-precondition, ResolvablePath preflight rule, and production_cumulative_threshold defence-in-depth.
  • Project MEMORY.md — INPUT_DATA_DIR per-pipeline value, MLflow DB locking known issue, geoboundaries env var, TMI SBC known bug.
  • /data/processing/tmp/tmi-explainers/output/ch_01_hindcast.png — visual reference for the orchestrator.