
Full hindcast re-run for one commodity

(Figure: failure-mode flowchart for the full hindcast re-run.)

End-to-end re-run of the commodity_hindcast walk-forward pipeline for a single commodity. Each invocation mints a fresh timestamped run_root under config.run_dir_base, so this runbook is non-destructive with respect to prior runs.

For the orchestrator visual, see the explainer panel /data/processing/tmp/tmi-explainers/output/ch_01_hindcast.png.

1. When to use this runbook

  • A commodity YAML (e.g. configs/corn_usa.yaml) has changed — feature builders, experiment protocol, postprocess modes, or production_cumulative_threshold.
  • A model bug fix has landed and the historical CV metrics need to be regenerated to confirm the regression has been corrected.
  • Periodic refresh: a new harvest year of NASS / WASDE / weather data has been ingested and the hindcast must be re-cut to extend the history.
  • A/B comparison: producing two run_roots side-by-side under the same config family and diffing reports/metrics_table.csv between them.

2. Preconditions

The canonical machine-checked precondition list lives in market_insights_models/src/commodity_hindcast/run/preflight.py:59 (preflight_paths_for_hindcast). The hindcast stage invokes it before any compute starts and exits non-zero on the first failure. In addition, the following must hold on the operator's side before invocation:

  • INPUT_DATA_DIR is exported and points at the repo root (/data/processing/github/treefera-market-insights on the dev EC2). The config loader fails loudly via require_input_data_dir() when it is unset (DESIGN.md Clause 6, surfaced in wiki/commodity_hindcast/pipelines/hindcast.md).
  • CROP_YIELD_GEOBOUNDARIES_FILE is exported. For US commodities use /data/processing/yield_forecast/raw/boundaries/geometry.parquet; for non-US commodities use the all-countries boundaries file (/data/processing/boundaries/processed/geoboundaries/all_geoboundaries_processed.parquet).
  • Feature parquets exist on disk: {features_dir}/{experiment_key}/fit.parquet and pred.parquet. These are hard preconditions for preflight_paths_for_hindcast (run/preflight.py:71-73); they are produced by cli run features, never by cli run hindcast itself.
  • Every ResolvablePath field on the resolved config (raw NASS, WASDE/CONAB reference series, weather climo, indices zarrs) resolves to an existing path. Coverage is mechanical: anything typed ResolvablePath is preflight-checked by construction (DESIGN.md Clause 69).
  • AWS credentials are valid for the active profile if data_root is an S3 URI (QA / prod). On the dev EC2 with a local data_root, this is moot.
  • The MLflow tracking SQLite DB at the URI configured in the experiment YAML is writable, and no other hindcast for the same commodity is currently writing to it — concurrent same-commodity runs collide on the SQLite file (see Section 5; project MEMORY.md known issue).
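The operator-side checks above (env vars exported, feature parquets present) can be smoke-tested cheaply before invoking make. A minimal sketch — the helper name is mine, and the canonical gate remains preflight_paths_for_hindcast inside the hindcast stage:

```python
import os
from pathlib import Path

def operator_preflight(experiment_key: str, features_dir: str) -> list[str]:
    """Cheap operator-side smoke checks (hypothetical helper).
    The canonical machine-checked list is preflight_paths_for_hindcast."""
    problems = []
    # Env vars the config loader fails loudly on when unset.
    for var in ("INPUT_DATA_DIR", "CROP_YIELD_GEOBOUNDARIES_FILE"):
        if not os.environ.get(var):
            problems.append(f"{var} is not exported")
    # Feature parquets are hard preconditions; hindcast never builds them.
    for name in ("fit.parquet", "pred.parquet"):
        p = Path(features_dir) / experiment_key / name
        if not p.is_file():
            problems.append(f"missing feature parquet: {p}")
    return problems
```

An empty return list means the cheap checks pass; it does not replace the in-stage preflight, which also resolves every ResolvablePath field.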

3. Procedure

All commands below are run from the repo root (/data/processing/github/treefera-market-insights). The Make targets in market_insights_models/src/commodity_hindcast/Makefile already cd $(REPO_ROOT) internally so relative paths in configs resolve correctly.

3.1 Set the operator environment

export INPUT_DATA_DIR=/data/processing/github/treefera-market-insights
export CROP_YIELD_GEOBOUNDARIES_FILE=/data/processing/yield_forecast/raw/boundaries/geometry.parquet
export EXPERIMENT_KEY=corn_usa   # or soybeans_usa | wheat_usa | cotton_usa | soybeans_bra

EXPERIMENT_KEY is consumed by the Makefile and resolves to configs/$(EXPERIMENT_KEY).yaml (Makefile lines 10-17). The Makefile unexports it so pydantic-settings cannot misread it as a nested config field.

3.2 Build features (only if not already cached)

If {features_dir}/{experiment_key}/fit.parquet and pred.parquet are missing or stale, build them first. The hindcast stage will not do this for you.

make -C market_insights_models/src/commodity_hindcast features EXPERIMENT_KEY=$EXPERIMENT_KEY

The Make target wraps cli run features --config configs/$EXPERIMENT_KEY.yaml (Makefile lines 49-51). Use features-force (lines 54-56) to overwrite cached parquets when the feature recipe itself has changed.
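"Stale" here is a judgment call; one workable heuristic is to compare the commodity YAML's mtime against the cached parquets. A sketch — the helper is hypothetical, and when in doubt features-force is the safe choice:

```python
from pathlib import Path

def features_stale(config_yaml: str, features_dir: str, experiment_key: str) -> bool:
    """Heuristic only (hypothetical helper): treat cached parquets as stale
    when a parquet is missing or older than the commodity YAML. This does
    not detect upstream data changes — use features-force for those."""
    cfg_mtime = Path(config_yaml).stat().st_mtime
    for name in ("fit.parquet", "pred.parquet"):
        p = Path(features_dir) / experiment_key / name
        if not p.is_file() or p.stat().st_mtime < cfg_mtime:
            return True
    return False
```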

3.3 Run the hindcast

make -C market_insights_models/src/commodity_hindcast hindcast EXPERIMENT_KEY=$EXPERIMENT_KEY

This wraps cli run hindcast --config configs/$EXPERIMENT_KEY.yaml (Makefile lines 59-61), which lands in market_insights_models/src/commodity_hindcast/cli.py:209 (@run.command("hindcast")) and dispatches to stages/run_hindcast.py:197 (run()), the orchestrator entrypoint.

run() executes, in order: preflight → mint run_root → MLflow context → load + preprocess + select_by_production → persist included counties → walk-forward CV across config.experiment_protocol.test_years → production fit → investigate → postprocess → deliver → evaluate. The full stage DAG is documented in wiki/commodity_hindcast/pipelines/hindcast.md.
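The walk-forward phase can be pictured as one fold per test year, trained on history strictly before that year. An illustrative sketch only — the function name and the expanding-window assumption are mine; the real implementation lives in stages/run_hindcast.py:

```python
def walk_forward_folds(history_years: list[int], test_years: list[int]):
    """Illustrative: one fold per entry in test_years, fitting on all
    history strictly before the held-out year (expanding window assumed)."""
    for test_year in test_years:
        train = [y for y in history_years if y < test_year]
        yield train, test_year
```

The key invariant this pattern preserves is that no fold ever trains on data from its own test year or later.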

3.4 (Optional) Re-run a single downstream stage on the same run_root

Each post-fit stage has its own Make target that takes an existing RUN_DIR, useful when iterating on plots or delivery without retraining.

make -C market_insights_models/src/commodity_hindcast postprocess RUN_DIR=runs/20260508_103000_corn_usa
make -C market_insights_models/src/commodity_hindcast evaluate    RUN_DIR=runs/20260508_103000_corn_usa
make -C market_insights_models/src/commodity_hindcast plots       RUN_DIR=runs/20260508_103000_corn_usa
make -C market_insights_models/src/commodity_hindcast deliver     RUN_DIR=runs/20260508_103000_corn_usa

(Makefile lines 73-96.)

4. Verification

After cli run hindcast returns 0:

  • Run root exists. A directory matching {config.run_dir_base}/{YYYYMMDD_HHMMSS}_{experiment_key}/ is present and contains config_resolved.yaml (the audit snapshot of the resolved config — always diff this against the input YAML when investigating drift).
  • MLflow run is FINISHED, not FAILED. Open the MLflow UI (make mlflow in the top-level Makefile, line 104) and confirm the run named hindcast_<experiment_key>_<stamp> shows status FINISHED. The run id is also written to metadata_<stage>.yaml inside run_root (DESIGN.md Clause on MLflow / metadata co-persistence).
  • Delivery CSVs exist. Three files under <run_root>/delivery/ matching Treefera_<experiment_key>_ADM0_Hindcast_*.csv, Treefera_<experiment_key>_ADM1_Hindcast_*.csv, and Treefera_<experiment_key>_ADM2_Hindcast_*.csv (path contract from wiki/commodity_hindcast/pipelines/hindcast.md Outputs table).
  • Evaluate artefacts. <run_root>/reports/metrics_table.csv and <run_root>/reports/stage5_metrics*.txt are non-empty; per-fold PNGs under <run_root>/reports/hindcast/ follow the P##_<fold>_*.png naming convention (DESIGN.md Clause on plot filenames).
  • Sanity-check the national frame. Read <run_root>/postprocessed/national.parquet and confirm row count matches len(test_years) * len(init_dates) for the experiment, and that yield_predicted is finite and within the historical envelope.
  • Included-geo set persisted. <run_root>/included_geo_identifiers.txt exists (it is written before walk-forward; missing file means walk-forward could not have completed).
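The structural bullets above (file presence and naming, not statistical sanity) can be scripted. A sketch, assuming the path contracts quoted in this section — the helper itself is hypothetical:

```python
from pathlib import Path

def verify_run_root(run_root: str, experiment_key: str) -> list[str]:
    """Structural spot checks on a finished run_root (hypothetical helper).
    Checks file presence/naming only; the national.parquet sanity check
    still needs a parquet reader and the experiment protocol."""
    root = Path(run_root)
    problems = []
    if not (root / "config_resolved.yaml").is_file():
        problems.append("config_resolved.yaml missing (no audit snapshot)")
    if not (root / "included_geo_identifiers.txt").is_file():
        problems.append("included_geo_identifiers.txt missing "
                        "(walk-forward cannot have completed)")
    # One delivery CSV per admin level, per the hindcast.md Outputs table.
    for adm in ("ADM0", "ADM1", "ADM2"):
        pattern = f"Treefera_{experiment_key}_{adm}_Hindcast_*.csv"
        if not list((root / "delivery").glob(pattern)):
            problems.append(f"no delivery CSV matching {pattern}")
    if not (root / "reports" / "metrics_table.csv").is_file():
        problems.append("reports/metrics_table.csv missing")
    return problems
```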

5. Failure modes and recovery

Each entry below lists symptom, cause, and recovery.

  • Symptom: SystemExit: Critical preflight check failed on fit.parquet or pred.parquet (run/preflight.py:71-73). Cause: features not built for this commodity. Recovery: run Section 3.2 first (make features EXPERIMENT_KEY=...) — DESIGN.md Clause 31 forbids the hindcast stage from building features.
  • Symptom: SystemExit: Critical preflight check failed on a ResolvablePath field. Cause: data_root anchoring incorrect, or the upstream sync did not populate that file. Recovery: verify INPUT_DATA_DIR is the repo root (project MEMORY.md). The historical canonical case was a missing forecast.raw_obs_filepath ten minutes into the corn run (DESIGN.md Clause 69) — preflight is the correct place to catch this.
  • Symptom: RuntimeError: INPUT_DATA_DIR not set at config load. Cause: env var missing. Recovery: export it (Section 3.1). The config helper is intentionally fail-loud; there is no silent fallback (project MEMORY.md, market_insights_models/CLAUDE.md).
  • Symptom: sqlalchemy.exc.OperationalError from MLflow. Cause: two hindcast runs for the same commodity in flight on the same SQLite DB. Recovery: serialise — wait for the in-flight run, or kill it. Project MEMORY.md known issue; see the wiki/commodity_hindcast/pipelines/hindcast.md failure-modes table. Cross-reference the mlflow_db_recovery runbook ([PLACEHOLDER: link]).
  • Symptom: failure mid-walk-forward (e.g. transient disk / network). Cause: one fold wrote partial artefacts before the exception. Recovery: the run_root is left in place for inspection. Mint a fresh run by rerunning Section 3.3 — the prior run_root is untouched. Per-fold surgical recovery (delete the failed fold's directory under preds/{key}/ and rerun) is documented in wiki/commodity_hindcast/pipelines/hindcast.md but is brittle; the preferred path is a clean re-run.
  • Symptom: empty or near-empty delivery CSV. Cause: the included-geo cut was too aggressive — production_cumulative_threshold filtered out most of the universe. Recovery: check the <run_root>/included_geo_identifiers.txt row count against config.experiment_protocol.production_cumulative_threshold. DESIGN.md Clause 128 sets the per-commodity defaults: corn/cotton 0.95, soybean 0.90, wheat 0.9999. A misconfigured threshold is the canonical cause of an empty delivery.
  • Symptom: FileNotFoundError at postprocess on walk_forward_preds.parquet. Cause: walk-forward phase did not complete (an earlier exception was silently masked). Recovery: re-run the full hindcast (wiki/commodity_hindcast/pipelines/hindcast.md failure modes).
  • Symptom: KeyError in MedianImputer.transform at predict time. Cause: stale fold artefact written with a RangeIndex (pre-Clause-30 layout). Recovery: delete the affected fold directory under preds/{key}/ and re-run.
  • Symptom: CI widths look wrong (very narrow). Cause: in_sample_pooled calibration mode active. Recovery: switch forecast.residual_mode to an OOS mode and re-run cli run postprocess against the same run_root.
  • Symptom: TMI selection-bias-correction values look wrong in sign / magnitude. Cause: upstream NaN dropouts in weather features remove core corn-belt counties before training (project MEMORY.md known bug). Recovery: out of scope for this runbook; track the known-bugs ticket. The hindcast itself completes; only the SBC column is suspect.
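The included-geo cut behind the empty-delivery failure mode can be illustrated as a cumulative-production cutoff: sort geos by production descending and keep them until their cumulative share of the total reaches the threshold. A sketch under that assumption — the function is mine, not the pipeline's select_by_production implementation:

```python
def cut_by_cumulative_production(production_by_geo: dict[str, float],
                                 threshold: float) -> list[str]:
    """Illustrative sketch (hypothetical helper): keep the largest producers
    until they jointly account for `threshold` of total production. A low
    threshold keeps only the top producers — which is why a misconfigured
    value empties the delivery."""
    total = sum(production_by_geo.values())
    kept, cum = [], 0.0
    for geo, prod in sorted(production_by_geo.items(), key=lambda kv: -kv[1]):
        if cum >= threshold * total:
            break
        kept.append(geo)
        cum += prod
    return kept
```

With the Clause 128 defaults in mind, a wheat-style 0.9999 keeps essentially the whole universe, while an accidental 0.5 would keep only the heaviest producers.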

6. Rollback

There is nothing to roll back. _create_run_root mints a fresh timestamped directory per invocation (run_hindcast.py:77, documented in wiki/commodity_hindcast/pipelines/hindcast.md); the previous run_root stays untouched on disk and remains the canonical record in MLflow. To revert to a prior hindcast as the active artefact, point downstream consumers (forecast, delivery sync) at the older run_root directly — no file mutation is required.

Disk-cleanup policy for stale run_root directories: [PLACEHOLDER: define retention period and pruning cadence; the current pattern is to keep all runs indefinitely, which will eventually fill the data volume].

Citations

  • market_insights_models/src/commodity_hindcast/Makefile lines 8, 10-17, 49-51, 54-56, 59-61, 73-96 — CLI wrapping, target list, EXPERIMENT_KEY contract.
  • market_insights_models/src/commodity_hindcast/cli.py:209 (@run.command("hindcast")) — entrypoint decorator.
  • market_insights_models/src/commodity_hindcast/stages/run_hindcast.py:197 (run()) — orchestrator entrypoint and stage DAG.
  • market_insights_models/src/commodity_hindcast/run/preflight.py:59 (preflight_paths_for_hindcast) — canonical precondition list.
  • wiki/commodity_hindcast/pipelines/hindcast.md — pipeline reference page, inputs/outputs/invariants/failure modes.
  • market_insights_models/src/commodity_hindcast/DESIGN.md Clauses 6, 17, 23, 31, 69, 128 — env var contract, plot filenames, MLflow contract, features-as-precondition, ResolvablePath preflight rule, and production_cumulative_threshold defence-in-depth.
  • Project MEMORY.md — INPUT_DATA_DIR per-pipeline value, MLflow DB locking known issue, geoboundaries env var, TMI SBC known bug.
  • /data/processing/tmp/tmi-explainers/output/ch_01_hindcast.png — visual reference for the orchestrator.