Concept: Hindcast vs Forecast

What it is

Hindcast is the retrospective evaluation pipeline. Given a complete set of pre-built features (fit.parquet, pred.parquet), it runs walk-forward cross-validation (one fold per season_year in config.experiment_protocol.test_years), trains a production model on all available data, then postprocesses the walk-forward predictions into calibrated delivery artefacts. Its output is a run_dir on disk that captures every artefact needed to understand the model's historical skill — CV fold models, conformal calibration sidecars, per-fold walk_forward_preds.parquet files, national aggregates, and delivery CSVs.
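As an illustration of "one fold per test year", the sketch below lays out walk-forward folds in plain Python. The expanding-window rule (train on every season strictly before the test year) is an assumption made for this example, and `walk_forward_folds` is a hypothetical helper rather than a function from the codebase; the actual fold strategy is the one documented in walk_forward_cv.md.

```python
from typing import Iterable


def walk_forward_folds(
    all_years: Iterable[int], test_years: Iterable[int]
) -> list[tuple[list[int], int]]:
    """Return (train_years, test_year) pairs, one per test year.

    Assumption: each fold trains on every season strictly earlier than
    its test year (an expanding window).
    """
    years = sorted(set(all_years))
    folds = []
    for test_year in sorted(test_years):
        train_years = [y for y in years if y < test_year]
        if not train_years:
            raise ValueError(f"no training seasons before {test_year}")
        folds.append((train_years, test_year))
    return folds


# Illustrative years only; real values come from the experiment config.
folds = walk_forward_folds(range(2015, 2024), test_years=[2021, 2022, 2023])
for train_years, test_year in folds:
    print(f"test={test_year} train={train_years[0]}..{train_years[-1]}")
```

Each fold never sees its own test season during training, which is what makes the resulting predictions usable as out-of-sample residuals for calibration.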

Forecast is the forward-looking prediction pipeline. It is a consumer of an already-complete hindcast run_dir. Given a (season_year, init_date) pair, it builds fresh features for that specific init_date, runs inference against the production fold's frozen model artefacts, postprocesses to a national frame using the configured conformal calibration, and emits delivery CSVs. It writes exclusively inside run_dir/forecast/{season_year}/{init_date}/ and never touches canonical hindcast paths.

The distinction matters practically: hindcast determines what is true about the past; forecast determines what can be said about the future given that truth. Running make forecast twice at the same init_date for different season_year values is safe because PR #369 restructured forecast artefact paths to forecast/{season_year}/{init_date}/, making the two subtrees disjoint.
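The path layout can be sketched with `pathlib`. Here `forecast_root` is a hypothetical stand-in for `ForecastSlice.root`, the `run_dir` value is a placeholder, and the ISO string form of `init_date` is an assumption made for this example.

```python
from pathlib import Path


def forecast_root(run_dir: Path, season_year: int, init_date: str) -> Path:
    """Hypothetical stand-in for ForecastSlice.root."""
    return run_dir / "forecast" / str(season_year) / init_date


run_dir = Path("runs/example_run")  # placeholder run_dir
root_a = forecast_root(run_dir, 2027, "2026-05-01")
root_b = forecast_root(run_dir, 2028, "2026-05-01")

# Two subtrees are disjoint when they differ and neither contains the other.
disjoint = (
    root_a != root_b
    and root_a not in root_b.parents
    and root_b not in root_a.parents
)
print(root_a)
print(root_b)
print(disjoint)
```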

Where it lives in the code

| Orchestrator | Entry point | Key function |
|---|---|---|
| Hindcast | `stages/run_hindcast.py:193` | `run(config_path)` |
| Forecast | `stages/run_forecast.py:143` | `run(run_dir, *, season_year, init_date, force)` |

The run_hindcast module also exposes fit_production(config_path) (run_hindcast.py:229), a fast path that trains only the production model and skips walk-forward CV, postprocessing, evaluation, and delivery. Use it when out-of-sample calibration is not required.

The read-only boundary is declared in DESIGN.md line 125:

"The forecast pipeline SHALL NOT write to canonical hindcast artefacts under {features_dir}/{experiment_key}/."

It is enforced structurally: ForecastSlice.root always resolves to run_dir/forecast/{season_year}/{init_date}/ (lib/results/results_slice.py:326), and validate_residual_mode (run_forecast.py:91) is the first call inside run_forecast.run(), so incompatible run_dir / config pairs are rejected before any feature I/O touches the disk.
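The fail-fast ordering can be sketched as follows. Everything here is illustrative: the check inside this `validate_residual_mode` (only requiring that `forecast.residual_mode` is set) is an assumption, `build_features` is a hypothetical placeholder for the expensive feature build, and the `run` signature is simplified to take a config dictionary.

```python
calls: list[str] = []  # records the order in which steps actually run


def validate_residual_mode(config: dict) -> None:
    """Illustrative check only: require forecast.residual_mode to be set."""
    calls.append("validate")
    if not config.get("forecast", {}).get("residual_mode"):
        raise ValueError("forecast.residual_mode must be set")


def build_features(init_date: str) -> None:
    """Hypothetical placeholder for the expensive feature build."""
    calls.append("build_features")


def run(config: dict, *, season_year: int, init_date: str) -> None:
    validate_residual_mode(config)  # gate first, before any feature I/O
    build_features(init_date)


try:
    run({"forecast": {}}, season_year=2027, init_date="2026-05-01")
except ValueError as exc:
    print(f"rejected early: {exc}")

print(calls)
```

Because the gate raises before `build_features` is reached, a misconfigured run costs one cheap check rather than a full feature build.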

Key invariants

Hindcast run_dir is the sole source of trained artefacts (model, detrender, fill values, conformal calibration sidecars). Forecast only reads from it.
All forecast outputs land under run_dir/forecast/{season_year}/{init_date}/. No forecast stage writes to {features_dir}/{experiment_key}/ (DESIGN.md:125).
validate_residual_mode (run_forecast.py:91) MUST be the first call in run_forecast.run() — fail fast before any S3/feature I/O (PR #372).
Feature paths (features_fit_path, features_pred_path) are read-only references on both HindcastSlice and ForecastSlice; they resolve to {features_dir}/{experiment_key}/fit.parquet and pred.parquet respectively.
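One way to make the write boundary checkable at runtime is a small path guard. This is a sketch under stated assumptions, not how the codebase enforces the invariant: `assert_inside_forecast_root` is a hypothetical helper, and the example paths are placeholders.

```python
from pathlib import Path


def assert_inside_forecast_root(target: Path, forecast_root: Path) -> Path:
    """Return the resolved target, or raise if it falls outside forecast_root."""
    resolved = target.resolve()
    if not resolved.is_relative_to(forecast_root.resolve()):
        raise PermissionError(f"forecast may not write to {resolved}")
    return resolved


root = Path("runs/example_run/forecast/2027/2026-05-01")  # placeholder root

# A write under the forecast subtree is accepted.
allowed = assert_inside_forecast_root(root / "features" / "pred.parquet", root)

# A write to a canonical feature path is rejected.
try:
    assert_inside_forecast_root(Path("features/example_key/fit.parquet"), root)
    blocked = False
except PermissionError:
    blocked = True

print(allowed.name, blocked)
```

Resolving both paths before comparing them means a target containing `..` segments cannot slip outside the forecast subtree. `Path.is_relative_to` requires Python 3.9 or later.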

Side-by-side comparison

| Aspect | Hindcast | Forecast |
|---|---|---|
| Entry point | `run_hindcast.run(config_path)` | `run_forecast.run(run_dir, *, season_year, init_date, force)` |
| Config source | YAML at `config_path`; writes `config_resolved.yaml` into a new `run_dir` | Reads `config_resolved.yaml` from an existing `run_dir`; no YAML argument |
| What is frozen | Nothing; all artefacts are produced by this run | Production model, detrender, feature fill values, conformal sidecars |
| What is recomputed | Everything under `run_dir` | `indices.zarr`, `features/pred.parquet`, `walk_forward_preds.parquet`, `postprocessed/national.parquet`, delivery CSVs, all isolated under `forecast/{season_year}/{init_date}/` |
| Output paths | `run_dir/models/`, `run_dir/preds/`, `run_dir/conformal/`, `run_dir/postprocessed/`, `run_dir/delivery/` | `run_dir/forecast/{season_year}/{init_date}/` only |
| Walk-forward CV | Yes, one fold per `test_year` | No, a single `(season_year, init_date)` per call |
| Production fit | Yes, via `_run_production_fit_phase` | Reuses the existing production fold; raises if absent |
| Conformal calibration | Fits and saves all configured modes to `run_dir/conformal/{mode}.parquet` | Loads the primary mode (or fits on demand via `get_or_fit_calibration`) |
| Feature build | Reads pre-built `fit.parquet` / `pred.parquet` from `features_dir` | Materialises fresh `indices.zarr` + `pred.parquet` for the single `init_date` |
| Multi-year support | All `test_years` in a single run | Each `season_year` at a given `init_date` is a separate invocation; PR #369 made paths disjoint |

How it interacts with the pipeline

The hindcast pipeline is described end-to-end in ../pipelines/hindcast.md, and the forecast pipeline in ../pipelines/forecast.md. The walk-forward CV mechanism used inside hindcast is documented in walk_forward_cv.md.

Conformal calibration — which sidecars the hindcast writes and which mode the forecast consumes — is the key coupling point between the two pipelines. See conformal_modes.md and residual_modes.md.

Pitfalls and historical bugs

PR #369 — single directory per init_date (fixed 2026-05-05): Before PR #369, all forecast artefacts for a given init_date lived at run_dir/forecast/{init_date}/. Calling forecast for season_year=2027 and then season_year=2028 at the same init_date silently overwrote all artefacts from the first call. The second call then raised ValueError: No rows in pred.parquet for season_year=2027. PR #369 fixed this by keying on (season_year, init_date) — ForecastSlice.root is the single source of truth (results_slice.py:326).
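The overwrite can be reproduced in miniature with a temporary directory. This is a simulation under stated assumptions: `write_preds` and the `pred.txt` file are hypothetical stand-ins for the real forecast stages and artefacts, and the `forecast_old` directory name exists only so both layouts can sit side by side in one temporary tree.

```python
import tempfile
from pathlib import Path


def write_preds(root: Path, season_year: int) -> None:
    """Write a stand-in prediction file recording which season produced it."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "pred.txt").write_text(str(season_year))


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    init_date = "2026-05-01"

    # Old layout: keyed on init_date only, so both seasons share one directory.
    for season_year in (2027, 2028):
        write_preds(run_dir / "forecast_old" / init_date, season_year)
    old_survivors = sorted(
        p.read_text() for p in (run_dir / "forecast_old").rglob("pred.txt")
    )

    # New layout: keyed on (season_year, init_date), one directory per season.
    for season_year in (2027, 2028):
        write_preds(run_dir / "forecast" / str(season_year) / init_date, season_year)
    new_survivors = sorted(
        p.read_text() for p in (run_dir / "forecast").rglob("pred.txt")
    )

print(old_survivors)  # only the second season's file remains
print(new_survivors)  # both seasons' files remain
```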

PR #372 — opaque pandas crash from unvalidated run_dir (fixed 2026-05-05): Before PR #372, invoking make forecast with no RUN_DIR triggered production model training but then crashed inside pandas with KeyError: 'fold_year' after ~2 minutes of compute, because the calibration code expected walk-forward CV residuals that did not exist. validate_residual_mode now gates the entire run before any feature I/O.

Related concepts

ForecastSlice: path handle for one (season_year, init_date) forecast
HindcastSlice: path handle for one CV fold
ExperimentResult: aggregate root for the whole run_dir
CalibrationResult: conformal calibration sidecar produced by hindcast, consumed by forecast
walk_forward_cv.md: fold strategy within hindcast
residual_modes.md: forecast.residual_mode field and validate_residual_mode gate

PRs and commits

| PR | Relevance |
|---|---|
| PR #369 | Restructured forecast paths to `forecast/{season_year}/{init_date}/`; enabled multi-season_year invocations |
| PR #372 | Made `forecast.residual_mode` mandatory; added `validate_residual_mode` gate |

Open questions

DESIGN.md line 125 describes the read-only invariant for {features_dir}/{experiment_key}/ but does not explicitly mandate that conformal/{mode}.parquet sidecars inside run_dir are also read-only from the forecast perspective. The code currently treats them as read-or-fit-on-demand via get_or_fit_calibration, which can write a new sidecar if one is missing — this is a write to run_dir (the hindcast tree), not to features_dir. Whether this constitutes a violation of the read-only invariant is undocumented.
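To make the concern concrete, here is a minimal load-or-fit sketch. It is not the signature or behaviour of `get_or_fit_calibration`: `load_or_fit` is a hypothetical helper, the sidecar is stored as JSON instead of parquet to keep the example dependency-free, and the `quantile` payload is invented for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_fit(sidecar: Path, fit: Callable[[], dict]) -> tuple[dict, bool]:
    """Return (calibration, wrote_file); fit and persist only if missing."""
    if sidecar.exists():
        return json.loads(sidecar.read_text()), False
    calibration = fit()
    sidecar.parent.mkdir(parents=True, exist_ok=True)
    sidecar.write_text(json.dumps(calibration))  # this is the write in question
    return calibration, True


with tempfile.TemporaryDirectory() as tmp:
    sidecar = Path(tmp) / "conformal" / "example_mode.json"
    # First call: the sidecar is missing, so it is fitted and written.
    first, wrote_first = load_or_fit(sidecar, lambda: {"quantile": 1.0})
    # Second call: the sidecar exists, so the new fit function is never used.
    second, wrote_second = load_or_fit(sidecar, lambda: {"quantile": 2.0})

print(wrote_first, wrote_second, second)
```

In this pattern the only write happens on the first call, and it lands next to the hindcast sidecars, not under the forecast subtree. That is the behaviour the open question asks DESIGN.md to either permit or forbid.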