PR #372: feat(commodity_hindcast): require forecast.residual_mode + gate forecast on run_dir compatibility

At a glance

- Author: ai-tommytf
- Merged: 2026-05-05
- Branch: `tl/make-ci-required-for-forecast-stage`
- Net effect: Makes `forecast.residual_mode` a mandatory field on `ForecastConfig` and adds a `validate_residual_mode` gate at the top of `forecast.run()` that rejects incompatible run_dir / config combinations before any feature build.
- Why this matters: Before this PR, `make forecast` with no `RUN_DIR` would crash deep inside pandas with an opaque `KeyError: 'fold_year'` after ~2 minutes of compute; now the failure is immediate, at the door, with an actionable message.

PR body (faithful extract)

## Summary

- Make `forecast.residual_mode` a required field on `ForecastConfig`; replace `postprocess.conformalise[0]` as the runtime calibration source so the two axes (which sidecars to fit during hindcast vs which mode to apply at runtime) are no longer conflated.
- Add `validate_residual_mode` at the top of `forecast.run()`. It rejects three failure modes with actionable next-step messages, *before* any feature build:
  - empty run_dir → run `make hindcast` or `cli run fit-production`
  - OOS mode + no CV folds → run hindcast OR change to `in_sample_pooled`
  - `in_sample_pooled` + no production fold → run `cli run fit-production` first
- Remove `cli run forecast --config <yaml>` (silently broken: fitted production then crashed in calibration). Forecast now requires `--run-dir`. Makefile's `ifeq RUN_DIR` ➜ single recipe.
- Defensive guard in `CalibrationResult.to_frame()` — raises with diagnostic context on empty rows instead of bare `KeyError: 'fold_year'`.
- Extract `ResidualMode` (and the other Literal aliases) to `models/meta_models/types.py` so `config.py` can declare the field without a circular import.
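
As a rough illustration of the last bullet, the sketch below keeps the `Literal` alias in a leaf module that imports nothing from the package, so both the config module and the calibration module can import it without importing each other. Everything is shown in one file for brevity. The use of a dataclass is an assumption (the real `ForecastConfig` may be built differently), and only the two mode names mentioned in this PR are listed; the real alias may contain more.

```python
# --- models/meta_models/types.py (sketch): a leaf module with no package imports ---
from dataclasses import dataclass
from typing import Literal, get_args

# Only the two modes named in this PR; the real alias may list others.
ResidualMode = Literal["hindcast_oos_per_init_date", "in_sample_pooled"]


# --- config.py (sketch): imports the alias from types.py, not from conformalise.py ---
@dataclass(frozen=True)
class ForecastConfig:
    # No default value: omitting the field fails at construction time.
    residual_mode: ResidualMode

    def __post_init__(self) -> None:
        # Literal is not enforced at runtime, so check membership explicitly.
        if self.residual_mode not in get_args(ResidualMode):
            raise ValueError(
                f"forecast.residual_mode must be one of {get_args(ResidualMode)}, "
                f"got {self.residual_mode!r}"
            )
```

With this shape, `ForecastConfig()` raises a `TypeError` for the missing argument, and an unrecognised mode raises a `ValueError` that lists the accepted values.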

## Context

Today, `make forecast EXPERIMENT_KEY=corn_usa` (no RUN_DIR) crashes deep inside pandas with `KeyError: 'fold_year'` because the convenience `--config` shortcut fits a production model but the post-forecast pipeline blindly tries to calibrate against walk-forward residuals that don't exist in a fit-production-only run_dir. This change moves the failure to the door — the user gets a fast, actionable error pointing at the exact next command to run, instead of a 2-minute walk followed by an opaque pandas exception.
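
The opaque exception has a simple mechanical cause: a pandas frame built from an empty list of rows has no columns, so any later lookup of `fold_year` raises `KeyError`. A minimal sketch of the kind of guard described in the Summary follows. The field names on `CalibrationResult` are hypothetical, and the sort on `fold_year` only stands in for whatever the real method does with that column.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class CalibrationResult:
    # Hypothetical fields; the real class was introduced in an earlier PR.
    residual_mode: str
    rows: list[dict[str, Any]] = field(default_factory=list)

    def to_frame(self):
        """Return the rows as a DataFrame, refusing to build an empty one."""
        if not self.rows:
            # Fail here, with context, instead of letting pandas raise
            # KeyError: 'fold_year' on a frame that has no columns.
            raise ValueError(
                f"CalibrationResult has no rows for residual_mode="
                f"{self.residual_mode!r}; the run_dir probably lacks the folds "
                "this mode calibrates against."
            )
        import pandas as pd  # imported lazily so the guard itself needs no pandas

        return pd.DataFrame(self.rows).sort_values("fold_year").reset_index(drop=True)
```

The guard is a backstop only; per this PR, the primary check happens earlier, in `validate_residual_mode`.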

## Test plan

- [x] **18-case integration test** at `tests/integration/commodity_hindcast/test_forecast_residual_mode_validation.py` pins the contract: 16-case parametrised matrix (has_cv_folds × has_production × residual_mode) + 2 message-content checks. Builds synthetic real-shaped run_dirs on disk, no mocks, exercises actual `ExperimentResult.from_run_dir` + fold discovery + validator end-to-end.
- [x] Full commodity_hindcast unit + integration suite: **329 passed**.
- [x] Ruff + ruff-format clean.
- [x] Pre-commit hooks pass.
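
For readers who want a feel for the matrix in the first item, here is a standard-library sketch of the same idea: build synthetic run_dirs on disk, read fold presence back from the filesystem, and enumerate every combination. The directory layout (`folds/fold_<year>`, `folds/production`) is a guess, not the project's real layout, and only the two modes named in this PR are used, so the sketch yields 8 cases rather than the PR's 16.

```python
import itertools
import tempfile
from pathlib import Path

# Only the two modes named in this PR; the real test covers more.
MODES = ("hindcast_oos_per_init_date", "in_sample_pooled")


def build_run_dir(root: Path, has_cv_folds: bool, has_production: bool) -> Path:
    """Create a synthetic run_dir. The folder layout is hypothetical."""
    run_dir = root / f"cv{int(has_cv_folds)}_prod{int(has_production)}"
    run_dir.mkdir()
    if has_cv_folds:
        for year in (2021, 2022):
            (run_dir / "folds" / f"fold_{year}").mkdir(parents=True)
    if has_production:
        (run_dir / "folds" / "production").mkdir(parents=True, exist_ok=True)
    return run_dir


def discover(run_dir: Path) -> tuple[bool, bool]:
    """Read fold presence back from disk rather than trusting the inputs."""
    folds = run_dir / "folds"
    names = {p.name for p in folds.iterdir()} if folds.exists() else set()
    return any(n.startswith("fold_") for n in names), "production" in names


def expected_ok(has_cv_folds: bool, has_production: bool, mode: str) -> bool:
    """Expected verdict, derived from the three rejection rules in the Summary."""
    if mode == "in_sample_pooled":
        return has_production
    return has_cv_folds


def run_matrix() -> list[tuple[bool, bool, str, bool]]:
    """Return (has_cv_folds, has_production, mode, expected_ok) for every case."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        for has_cv, has_prod in itertools.product((False, True), repeat=2):
            run_dir = build_run_dir(Path(tmp), has_cv, has_prod)
            found_cv, found_prod = discover(run_dir)
            for mode in MODES:
                results.append((has_cv, has_prod, mode, expected_ok(found_cv, found_prod, mode)))
    return results
```

In the real test, each case would call the validator and compare its behaviour against the expected verdict; that call is left out here because the validator's actual signature is not shown in this PR summary.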

## Migration

Per the project's no-backwards-compat policy:
- All 4 production YAMLs with a `forecast:` block (`corn_usa`, `soybeans_usa`, `soybeans_bra`, `wheat_usa`) declare `residual_mode: "hindcast_oos_per_init_date"` — preserves prior behaviour for the canonical post-hindcast forecast workflow.
- Old run_dirs whose `config_resolved.yaml` predates this change cannot be replayed against the new code. That's correct — those run_dirs were already producing crashes for any forecast that needed calibration.
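
For a config that still lacks the field, the migration amounts to adding one key under the existing `forecast:` block. The fragment below shows only that key; any other keys in the block are omitted and left unchanged.

```yaml
# e.g. configs/corn_usa.yaml (other keys in the forecast block omitted)
forecast:
  residual_mode: "hindcast_oos_per_init_date"
```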

## Pipeline shape after this PR

```text
make features    →    PATH A: make hindcast       ┐
                      PATH B: cli run fit-prod    ┘ → run_dir → make forecast RUN_DIR=<above>
                                                                      │
                                                          validate_residual_mode
                                                                      │
                                                       forecast walk + postprocess + deliver
```

`forecast.residual_mode` declares which past errors to calibrate against; the validator rejects YAML / run_dir mismatches before any compute.
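
The gate can be pictured with a minimal sketch like the one below. It is not the code from `run_forecast.py`: `RunDirFolds`, `ResidualModeError`, the `run` signature, and the assumption that out-of-sample modes share the `hindcast_oos` prefix are all illustrative. What it does take from the PR is the three rejection rules and the ordering, with validation as the first call so that nothing expensive starts on a mismatch.

```python
from dataclasses import dataclass
from typing import Callable


class ResidualModeError(ValueError):
    """Raised when the run_dir cannot serve the configured residual_mode."""


@dataclass(frozen=True)
class RunDirFolds:
    """Hypothetical summary of fold discovery on a run_dir."""

    has_cv_folds: bool    # walk-forward folds, as written by `make hindcast`
    has_production: bool  # production fold, as written by `cli run fit-production`


def validate_residual_mode(folds: RunDirFolds, residual_mode: str) -> None:
    """Raise ResidualModeError for the three mismatches listed in the Summary."""
    if not (folds.has_cv_folds or folds.has_production):
        raise ResidualModeError(
            "run_dir is empty: run `make hindcast` or `cli run fit-production` first."
        )
    if residual_mode == "in_sample_pooled":
        if not folds.has_production:
            raise ResidualModeError(
                "in_sample_pooled needs a production fold: "
                "run `cli run fit-production` first."
            )
    elif residual_mode.startswith("hindcast_oos"):
        # Assumption: out-of-sample modes share the `hindcast_oos` prefix.
        if not folds.has_cv_folds:
            raise ResidualModeError(
                f"{residual_mode} needs CV folds: run `make hindcast`, "
                "or set forecast.residual_mode to in_sample_pooled."
            )
    else:
        raise ResidualModeError(f"unknown residual_mode: {residual_mode!r}")


def run(folds: RunDirFolds, residual_mode: str, build_features: Callable[[], str]) -> str:
    """Stand-in for forecast.run(): validation is the first call."""
    validate_residual_mode(folds, residual_mode)
    return build_features()  # only reached when the combination is valid
```

Calling `run(RunDirFolds(False, True), "hindcast_oos_per_init_date", ...)` raises before `build_features` is invoked, which is the fail-at-the-door behaviour the PR describes.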

Files / lines touched

| Additions | Deletions | File |
|---|---|---|
| +189 | -0 | `tests/integration/commodity_hindcast/test_forecast_residual_mode_validation.py` |
| +79 | -20 | `market_insights_models/src/commodity_hindcast/stages/run_forecast.py` |
| +10 | -32 | `market_insights_models/src/commodity_hindcast/cli.py` |
| +18 | -11 | `market_insights_models/src/commodity_hindcast/models/meta_models/conformalise.py` |
| +21 | -7 | `market_insights_models/src/commodity_hindcast/config.py` |
| +26 | -0 | `market_insights_models/src/commodity_hindcast/models/meta_models/types.py` |
| +12 | -8 | `market_insights_models/src/commodity_hindcast/stages/run_meta_models.py` |
| +1 | -5 | `market_insights_models/src/commodity_hindcast/Makefile` |
| +5 | -0 | `tests/unit/commodity_hindcast/test_postprocess.py` |
| +4 | -0 | `market_insights_models/src/commodity_hindcast/configs/corn_usa.yaml` |

Cross-references

- Related entity pages: CalibrationResult, ForecastConfig
- Related concept pages: conformal calibration modes
- Related code pages: meta_models, stages (run_forecast)
- Directly follows: PR-361 (which introduced CalibrationResult and residual_mode)

Lessons captured

- `forecast.residual_mode` is now mandatory on `ForecastConfig`; there is no default.
- `validate_residual_mode` must be the first call in `forecast.run()`, so that it fails fast before any S3/feature I/O.
- `cli run forecast --config <yaml>` was removed; all forecast invocations require `--run-dir`.
- `ResidualMode` and the other Literal aliases live in `models/meta_models/types.py` to avoid a circular import between `config.py` and `conformalise.py`.
- The 18-case integration test (parametrised matrix, no mocks, real `ExperimentResult.from_run_dir`) is the contract for the validation logic.
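
The lesson about the removed `--config` shortcut can be pictured as a parser in which `--run-dir` is the only way in. The sketch below uses `argparse` purely for illustration; the CLI framework the project actually uses is not stated here, and the parser layout is an assumption.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical `cli run forecast` parser with a mandatory --run-dir."""
    parser = argparse.ArgumentParser(prog="cli")
    commands = parser.add_subparsers(dest="command", required=True)
    run = commands.add_parser("run")
    stages = run.add_subparsers(dest="stage", required=True)
    forecast = stages.add_parser("forecast")
    # No --config option: a run_dir from hindcast or fit-production is required.
    forecast.add_argument(
        "--run-dir",
        type=Path,
        required=True,
        help="run_dir produced by `make hindcast` or `cli run fit-production`",
    )
    return parser
```

With this layout, `cli run forecast` without `--run-dir`, or with the old `--config` flag, exits with a usage error before any pipeline code runs.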