Full hindcast re-run for one commodity¶

End-to-end re-run of the commodity_hindcast walk-forward pipeline for a single
commodity. Each invocation mints a fresh timestamped run_root under
config.run_dir_base, so this runbook is non-destructive with respect to prior
runs.
For the orchestrator visual, see the explainer panel
/data/processing/tmp/tmi-explainers/output/ch_01_hindcast.png.
1. When to use this runbook¶
- A commodity YAML (e.g. configs/corn_usa.yaml) has changed: feature builders, experiment protocol, postprocess modes, or production_cumulative_threshold.
- A model bug fix has landed and the historical CV metrics need to be regenerated to confirm the regression has been corrected.
- Periodic refresh: a new harvest year of NASS / WASDE / weather data has been ingested and the hindcast must be re-cut to extend the history.
- A/B comparison: producing two run_roots side-by-side under the same config family and diffing reports/metrics_table.csv between them.
2. Preconditions¶
The canonical machine-checked precondition list lives in
market_insights_models/src/commodity_hindcast/run/preflight.py:59
(preflight_paths_for_hindcast). The hindcast stage invokes it before any compute
starts and exits non-zero on the first failure. In addition, the following must
hold on the operator's side before invocation:
- INPUT_DATA_DIR is exported and points at the repo root (/data/processing/github/treefera-market-insights on the dev EC2). The config loader fails loud via require_input_data_dir() when it is unset (DESIGN.md Clause 6, surfaced in wiki/commodity_hindcast/pipelines/hindcast.md).
- CROP_YIELD_GEOBOUNDARIES_FILE is exported. For US commodities use /data/processing/yield_forecast/raw/boundaries/geometry.parquet; for non-US commodities use the all-countries boundaries file (/data/processing/boundaries/processed/geoboundaries/all_geoboundaries_processed.parquet).
- Feature parquets exist on disk: {features_dir}/{experiment_key}/fit.parquet and pred.parquet. These are hard preconditions for preflight_paths_for_hindcast (run/preflight.py:71-73); they are produced by cli run features, never by cli run hindcast itself.
- Every ResolvablePath field on the resolved config (raw NASS, WASDE/CONAB reference series, weather climo, indices zarrs) resolves to an existing path. Coverage is mechanical: anything typed ResolvablePath is preflight-checked by construction (DESIGN.md Clause 69).
- AWS credentials are valid for the active profile if data_root is an S3 URI (QA / prod). On the dev EC2 with a local data_root, this is moot.
- The MLflow tracking SQLite DB at the URI configured in the experiment YAML is writable, and no other hindcast for the same commodity is in flight (project MEMORY.md known issue; see Section 5, concurrent same-commodity runs collide on the SQLite file).
3. Procedure¶
All commands below are run from the repo root
(/data/processing/github/treefera-market-insights). The Make targets in
market_insights_models/src/commodity_hindcast/Makefile already cd $(REPO_ROOT)
internally so relative paths in configs resolve correctly.
3.1 Set the operator environment¶
export INPUT_DATA_DIR=/data/processing/github/treefera-market-insights
export CROP_YIELD_GEOBOUNDARIES_FILE=/data/processing/yield_forecast/raw/boundaries/geometry.parquet
export EXPERIMENT_KEY=corn_usa # or soybeans_usa | wheat_usa | cotton_usa | soybeans_bra
EXPERIMENT_KEY is consumed by the Makefile and resolves to
configs/$(EXPERIMENT_KEY).yaml (Makefile lines 10-17). The Makefile
unexports it so pydantic-settings cannot misread it as a nested config field.
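A quick operator-side check before invoking anything can confirm both required variables are set. check_operator_env below is a hypothetical sketch for this runbook (it mirrors the fail-loud spirit of require_input_data_dir(); it is not pipeline code):

```python
import os

def check_operator_env(env=None):
    """Return the names of required operator variables that are unset or
    empty, so they can be exported before the config loader raises.
    (Runbook sketch only, not pipeline code.)"""
    env = os.environ if env is None else env
    required = ("INPUT_DATA_DIR", "CROP_YIELD_GEOBOUNDARIES_FILE")
    return [name for name in required if not env.get(name)]
```

An empty return means Section 3.1 is complete; anything returned must be exported before cli run hindcast will get past config load.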
3.2 Build features (only if not already cached)¶
If {features_dir}/{experiment_key}/fit.parquet and pred.parquet are missing
or stale, build them first. The hindcast stage will not do this for you.
The Make target wraps cli run features --config configs/$EXPERIMENT_KEY.yaml
(Makefile lines 49-51). Use features-force (lines 54-56) to overwrite cached
parquets when the feature recipe itself has changed.
3.3 Run the hindcast¶
This wraps cli run hindcast --config configs/$EXPERIMENT_KEY.yaml
(Makefile lines 59-61), which lands in
market_insights_models/src/commodity_hindcast/cli.py:209
(@run.command("hindcast")) and dispatches to
stages/run_hindcast.py:197 (run()), the orchestrator entrypoint.
run() executes, in order: preflight → mint run_root → MLflow context →
load + preprocess + select_by_production → persist included counties →
walk-forward CV across config.experiment_protocol.test_years →
production fit → investigate → postprocess → deliver → evaluate. The full stage
DAG is documented in wiki/commodity_hindcast/pipelines/hindcast.md.
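The stage order can be written down as a plain list, which is handy when reasoning about what a mid-run failure invalidates. Stage names here paraphrase the runbook; the authoritative sequence is run() in stages/run_hindcast.py:

```python
# Paraphrased stage order of run(). Everything from the failed stage
# onward must be redone; earlier artefacts may still be inspectable.
HINDCAST_STAGES = [
    "preflight",
    "mint_run_root",
    "mlflow_context",
    "load_preprocess_select_by_production",
    "persist_included_counties",
    "walk_forward_cv",
    "production_fit",
    "investigate",
    "postprocess",
    "deliver",
    "evaluate",
]

def stages_invalidated_by(failed_stage):
    """Stages at and after the failure point, which a re-run must redo."""
    return HINDCAST_STAGES[HINDCAST_STAGES.index(failed_stage):]
```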
3.4 (Optional) Re-run a single downstream stage on the same run_root¶
Each post-fit stage has its own Make target that takes an existing RUN_DIR,
useful when iterating on plots or delivery without retraining.
make -C market_insights_models/src/commodity_hindcast postprocess RUN_DIR=runs/20260508_103000_corn_usa
make -C market_insights_models/src/commodity_hindcast evaluate RUN_DIR=runs/20260508_103000_corn_usa
make -C market_insights_models/src/commodity_hindcast plots RUN_DIR=runs/20260508_103000_corn_usa
make -C market_insights_models/src/commodity_hindcast deliver RUN_DIR=runs/20260508_103000_corn_usa
(Makefile lines 73-96.)
4. Verification¶
After cli run hindcast returns 0:
- Run root exists. A directory matching {config.run_dir_base}/{YYYYMMDD_HHMMSS}_{experiment_key}/ is present and contains config_resolved.yaml (the audit snapshot of the resolved config; always diff this against the input YAML when investigating drift).
- MLflow run is FINISHED, not FAILED. Open the MLflow UI (make mlflow in the top-level Makefile, line 104) and confirm the run named hindcast_<experiment_key>_<stamp> shows status FINISHED. The run id is also written to metadata_<stage>.yaml inside run_root (DESIGN.md Clause on MLflow / metadata co-persistence).
- Delivery CSVs exist. Three files under <run_root>/delivery/ matching Treefera_<experiment_key>_ADM0_Hindcast_*.csv, Treefera_<experiment_key>_ADM1_Hindcast_*.csv, and Treefera_<experiment_key>_ADM2_Hindcast_*.csv (path contract from the wiki/commodity_hindcast/pipelines/hindcast.md Outputs table).
- Evaluate artefacts. <run_root>/reports/metrics_table.csv and <run_root>/reports/stage5_metrics*.txt are non-empty; per-fold PNGs under <run_root>/reports/hindcast/ follow the P##_<fold>_*.png naming convention (DESIGN.md Clause on plot filenames).
- Sanity-check the national frame. Read <run_root>/postprocessed/national.parquet and confirm the row count matches len(test_years) * len(init_dates) for the experiment, and that yield_predicted is finite and within the historical envelope.
- Included-geo set persisted. <run_root>/included_geo_identifiers.txt exists (it is written before walk-forward; a missing file means walk-forward could not have completed).
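The row-count check on national.parquet is simple arithmetic, sketched here with illustrative values (the real test_years and init_dates come from the experiment YAML):

```python
def expected_national_rows(test_years, init_dates):
    """national.parquet should hold one row per (test_year, init_date)
    pair, so the expected count is the product of the two lengths."""
    return len(test_years) * len(init_dates)

# Illustrative values only: a 10-year hindcast with three init dates.
test_years = list(range(2015, 2025))
init_dates = ["04-01", "05-15", "07-01"]
assert expected_national_rows(test_years, init_dates) == 30
```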
5. Failure modes and recovery¶
| Symptom | Cause | Recovery |
|---|---|---|
| SystemExit: Critical preflight check failed on fit.parquet or pred.parquet (run/preflight.py:71-73) | Features not built for this commodity | Run Section 3.2 first (make features EXPERIMENT_KEY=...); DESIGN.md Clause 31 forbids the hindcast stage from building features. |
| SystemExit: Critical preflight check failed on a ResolvablePath field | data_root anchoring incorrect, or the upstream sync did not populate that file | Verify INPUT_DATA_DIR is the repo root (project MEMORY.md). The historical canonical case was a missing forecast.raw_obs_filepath ten minutes into the corn run (DESIGN.md Clause 69); preflight is the correct place to catch this. |
| RuntimeError: INPUT_DATA_DIR not set at config load | Env var missing | Export it (Section 3.1). The config helper is intentionally fail-loud; there is no silent fallback (project MEMORY.md, market_insights_models/CLAUDE.md). |
| sqlalchemy.exc.OperationalError from MLflow | Two hindcast runs for the same commodity in flight on the same SQLite DB | Serialise: wait for the in-flight run, or kill it. Project MEMORY.md known issue; see the wiki/commodity_hindcast/pipelines/hindcast.md failure-modes table. Cross-reference the mlflow_db_recovery runbook ([PLACEHOLDER: link]). |
| Failure mid-walk-forward (e.g. transient disk / network) | One fold wrote partial artefacts before the exception | The run_root is left in place for inspection. Mint a fresh run by rerunning Section 3.3; the prior run_root is untouched. Per-fold surgical recovery (delete the failed fold's directory under preds/{key}/ and rerun) is documented in wiki/commodity_hindcast/pipelines/hindcast.md but is brittle; the preferred path is a clean re-run. |
| Empty or near-empty delivery CSV | The included-geo cut was too aggressive: production_cumulative_threshold filtered out most of the universe | Check the <run_root>/included_geo_identifiers.txt row count against config.experiment_protocol.production_cumulative_threshold. DESIGN.md Clause 128 sets the per-commodity defaults: corn/cotton 0.95, soybean 0.90, wheat 0.9999. A misconfigured threshold is the canonical cause of an empty delivery. |
| FileNotFoundError at postprocess on walk_forward_preds.parquet | Walk-forward phase did not complete (an earlier exception was masked) | Re-run the full hindcast (wiki/commodity_hindcast/pipelines/hindcast.md failure modes). |
| KeyError in MedianImputer.transform at predict time | Stale fold artefact written with RangeIndex (pre-Clause-30 layout) | Delete the affected fold directory under preds/{key}/ and re-run. |
| CI widths look wrong (very narrow) | in_sample_pooled calibration mode active | Switch forecast.residual_mode to an OOS mode and re-run cli run postprocess against the same run_root. |
| TMI selection-bias-correction values look wrong in sign / magnitude | Upstream NaN dropouts in weather features remove core corn-belt counties before training (project MEMORY.md known bug) | Out of scope for this runbook; track the known-bugs ticket. The hindcast itself completes; only the SBC column is suspect. |
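The production_cumulative_threshold cut behind the empty-delivery row works, in outline, as follows. included_geos is an illustrative sketch of the idea, not the real select_by_production implementation:

```python
def included_geos(production_by_geo, threshold):
    """Keep geos in descending production order until their cumulative
    share of total production first reaches the threshold."""
    total = sum(production_by_geo.values())
    kept, cum_share = [], 0.0
    for geo, prod in sorted(production_by_geo.items(), key=lambda kv: -kv[1]):
        kept.append(geo)
        cum_share += prod / total
        if cum_share >= threshold:
            break
    return kept
```

Under this sketch, a mistyped threshold (e.g. 0.095 instead of 0.95) keeps only the largest geo or two, which is exactly the near-empty-delivery symptom in the table.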
6. Rollback¶
There is nothing to roll back. _create_run_root mints a fresh timestamped
directory per invocation (run_hindcast.py:77, documented in
wiki/commodity_hindcast/pipelines/hindcast.md); the previous run_root
remains untouched on disk and is still the canonical record in MLflow. To revert
to a prior hindcast as the active artefact, point downstream consumers
(forecast, delivery sync) at the older run_root directly; no file mutation
is required.
Disk-cleanup policy for stale run_root directories: [PLACEHOLDER: define
retention period and pruning cadence; the current pattern is to keep all runs
indefinitely, which will eventually fill the data volume].
Citations¶
- market_insights_models/src/commodity_hindcast/Makefile lines 8, 10-17, 49-51, 54-56, 59-61, 73-96: CLI wrapping, target list, EXPERIMENT_KEY contract.
- market_insights_models/src/commodity_hindcast/cli.py:209 (@run.command("hindcast")): entrypoint decorator.
- market_insights_models/src/commodity_hindcast/stages/run_hindcast.py:197 (run()): orchestrator entrypoint and stage DAG.
- market_insights_models/src/commodity_hindcast/run/preflight.py:59 (preflight_paths_for_hindcast): canonical precondition list.
- wiki/commodity_hindcast/pipelines/hindcast.md: pipeline reference page with inputs, outputs, invariants, and failure modes.
- market_insights_models/src/commodity_hindcast/DESIGN.md Clauses 6, 17, 23, 31, 69, 128: env var contract, plot filenames, MLflow contract, features-as-precondition, ResolvablePath preflight rule, and production_cumulative_threshold defence-in-depth.
- Project MEMORY.md: INPUT_DATA_DIR per-pipeline value, MLflow DB locking known issue, geoboundaries env var, TMI SBC known bug.
- /data/processing/tmp/tmi-explainers/output/ch_01_hindcast.png: visual reference for the orchestrator.