PR #372 — feat(commodity_hindcast): require forecast.residual_mode + gate forecast on run_dir compatibility

At a glance

  • Author: ai-tommytf
  • Merged: 2026-05-05
  • Branch: tl/make-ci-required-for-forecast-stage
  • Net effect: Makes forecast.residual_mode a mandatory field on ForecastConfig and adds a validate_residual_mode gate at the top of forecast.run() that rejects incompatible run_dir / config combinations before any feature build.
  • Why this matters: Before this PR, running make forecast without RUN_DIR crashed deep inside pandas with an opaque KeyError: 'fold_year' after about 2 minutes of compute; the failure now happens immediately, before any feature build, with an actionable message.

PR body (faithful extract)

## Summary

- Make `forecast.residual_mode` a required field on `ForecastConfig`; replace `postprocess.conformalise[0]` as the runtime calibration source so the two axes (which sidecars to fit during hindcast vs which mode to apply at runtime) are no longer conflated.
- Add `validate_residual_mode` at the top of `forecast.run()`. It rejects three failure modes with actionable next-step messages, *before* any feature build:
  - empty run_dir → run `make hindcast` or `cli run fit-production`
  - OOS mode + no CV folds → run hindcast OR change to `in_sample_pooled`
  - `in_sample_pooled` + no production fold → run `cli run fit-production` first
- Remove `cli run forecast --config <yaml>` (silently broken: fitted production then crashed in calibration). Forecast now requires `--run-dir`. Makefile's `ifeq RUN_DIR` ➜ single recipe.
- Defensive guard in `CalibrationResult.to_frame()` — raises with diagnostic context on empty rows instead of bare `KeyError: 'fold_year'`.
- Extract `ResidualMode` (and the other Literal aliases) to `models/meta_models/types.py` so `config.py` can declare the field without a circular import.

## Context

Today, `make forecast EXPERIMENT_KEY=corn_usa` (no RUN_DIR) crashes deep inside pandas with `KeyError: 'fold_year'` because the convenience `--config` shortcut fits a production model but the post-forecast pipeline blindly tries to calibrate against walk-forward residuals that don't exist in a fit-production-only run_dir. This change moves the failure to the door — the user gets a fast, actionable error pointing at the exact next command to run, instead of a 2-minute walk followed by an opaque pandas exception.

## Test plan

- [x] **18-case integration test** at `tests/integration/commodity_hindcast/test_forecast_residual_mode_validation.py` pins the contract: 16-case parametrised matrix (has_cv_folds × has_production × residual_mode) + 2 message-content checks. Builds synthetic real-shaped run_dirs on disk, no mocks, exercises actual `ExperimentResult.from_run_dir` + fold discovery + validator end-to-end.
- [x] Full commodity_hindcast unit + integration suite: **329 passed**.
- [x] Ruff + ruff-format clean.
- [x] Pre-commit hooks pass.

## Migration

Per the project's no-backwards-compat policy:
- All 4 production YAMLs with a `forecast:` block (`corn_usa`, `soybeans_usa`, `soybeans_bra`, `wheat_usa`) declare `residual_mode: "hindcast_oos_per_init_date"` — preserves prior behaviour for the canonical post-hindcast forecast workflow.
- Old run_dirs whose `config_resolved.yaml` predates this change cannot be replayed against the new code. That's correct — those run_dirs were already producing crashes for any forecast that needed calibration.

## Pipeline shape after this PR

```text
make features    →    PATH A: make hindcast       ┐
                      PATH B: cli run fit-prod    ┘ → run_dir → make forecast RUN_DIR=<above>
                                                          validate_residual_mode
                                                       forecast walk + postprocess + deliver
```

`forecast.residual_mode` declares which past errors to calibrate against; the validator rejects YAML / run_dir mismatches before any compute.
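
The three rejections listed in the Summary can be sketched as follows. This is a minimal illustration, not the code from the PR: `RunDirState`, `ResidualModeError`, the function signature, and the exact message wording are hypothetical, and the sketch assumes that every mode other than `in_sample_pooled` is an out-of-sample mode that needs CV folds.

```python
from dataclasses import dataclass


class ResidualModeError(ValueError):
    """Hypothetical error type for a run_dir / residual_mode mismatch."""


@dataclass(frozen=True)
class RunDirState:
    """Hypothetical summary of what fold discovery found in a run_dir."""

    has_cv_folds: bool    # walk-forward folds, as written by `make hindcast`
    has_production: bool  # production fold, as written by `cli run fit-production`


def validate_residual_mode(residual_mode: str, state: RunDirState) -> None:
    """Raise before any compute if the run_dir cannot serve residual_mode."""
    # Case 1: nothing has been fitted into this run_dir at all.
    if not state.has_cv_folds and not state.has_production:
        raise ResidualModeError(
            "run_dir is empty: run `make hindcast` or `cli run fit-production` first"
        )

    # Case 3: in-sample calibration needs the production fold.
    if residual_mode == "in_sample_pooled":
        if not state.has_production:
            raise ResidualModeError(
                "in_sample_pooled needs a production fold: "
                "run `cli run fit-production` first"
            )
        return

    # Case 2 (assumption: any other mode is out-of-sample and needs CV folds).
    if not state.has_cv_folds:
        raise ResidualModeError(
            f"{residual_mode} needs CV folds: run `make hindcast` "
            "or change forecast.residual_mode to in_sample_pooled"
        )


# Walk the has_cv_folds x has_production grid for the two modes named in the PR.
for mode in ("hindcast_oos_per_init_date", "in_sample_pooled"):
    for has_cv in (False, True):
        for has_prod in (False, True):
            try:
                validate_residual_mode(mode, RunDirState(has_cv, has_prod))
                outcome = "ok"
            except ResidualModeError as exc:
                outcome = f"rejected ({exc})"
            print(f"{mode:28} cv={has_cv!s:5} prod={has_prod!s:5} {outcome}")
```

Because the function only inspects two booleans and a string, it can run before any feature build, which is the point of placing the check at the top of `forecast.run()`.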

Files / lines touched

Additions  Deletions  File
+189       -0         tests/integration/commodity_hindcast/test_forecast_residual_mode_validation.py
+79        -20        market_insights_models/src/commodity_hindcast/stages/run_forecast.py
+10        -32        market_insights_models/src/commodity_hindcast/cli.py
+18        -11        market_insights_models/src/commodity_hindcast/models/meta_models/conformalise.py
+21        -7         market_insights_models/src/commodity_hindcast/config.py
+26        -0         market_insights_models/src/commodity_hindcast/models/meta_models/types.py
+12        -8         market_insights_models/src/commodity_hindcast/stages/run_meta_models.py
+1         -5         market_insights_models/src/commodity_hindcast/Makefile
+5         -0         tests/unit/commodity_hindcast/test_postprocess.py
+4         -0         market_insights_models/src/commodity_hindcast/configs/corn_usa.yaml

Lessons captured

  • forecast.residual_mode is now mandatory on ForecastConfig; there is no default.
  • validate_residual_mode must be the first call in forecast.run() — fail fast before any S3/feature I/O.
  • cli run forecast --config <yaml> was removed; all forecast invocations require --run-dir.
  • ResidualMode and other Literal aliases live in models/meta_models/types.py to avoid a circular import between config.py and conformalise.py.
  • The 18-case integration test (parametrised matrix, no mocks, real ExperimentResult.from_run_dir) is the contract for the validation logic.
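
The first and fourth lessons (a required field with no default, typed by an alias that lives in a dependency-free module) can be illustrated with a short sketch. It makes several assumptions: a plain dataclass stands in for whatever base class `ForecastConfig` really uses, the alias lists only the two mode names that appear in the PR body (the real `ResidualMode` may define more), and the runtime check in `__post_init__` is illustrative.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# Stand-in for models/meta_models/types.py: it imports nothing from the
# package, so config.py can import it without creating a cycle.
# Assumption: only the two mode names quoted in the PR body are listed here.
ResidualMode = Literal["hindcast_oos_per_init_date", "in_sample_pooled"]


# Stand-in for config.py.
@dataclass(frozen=True)
class ForecastConfig:
    # No default: a config without residual_mode cannot be constructed.
    residual_mode: ResidualMode

    def __post_init__(self) -> None:
        # Literal is not enforced at runtime by dataclasses, so check by hand.
        allowed = get_args(ResidualMode)
        if self.residual_mode not in allowed:
            raise ValueError(
                f"forecast.residual_mode must be one of {allowed}, "
                f"got {self.residual_mode!r}"
            )


cfg = ForecastConfig(residual_mode="hindcast_oos_per_init_date")
print(cfg)

try:
    ForecastConfig()  # missing required field
except TypeError as exc:
    print(f"rejected: {exc}")
```

Keeping the alias in its own module means both the config layer and the calibration code can depend on it without depending on each other.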