Concept: Hindcast vs Forecast
What it is
Hindcast is the retrospective evaluation pipeline. Given a complete set of
pre-built features (fit.parquet, pred.parquet), it runs walk-forward
cross-validation (one fold per season_year in config.experiment_protocol.test_years),
trains a production model on all available data, then postprocesses the walk-forward
predictions into calibrated delivery artefacts. Its output is a run_dir on disk
that captures every artefact needed to understand the model's historical skill —
CV fold models, conformal calibration sidecars, per-fold walk_forward_preds.parquet
files, national aggregates, and delivery CSVs.
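The fold structure can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Fold` class, the `walk_forward_folds` helper, and the expanding-window rule (train on every season strictly before the test season) are assumptions made for the example. The actual fold strategy is documented in walk_forward_cv.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fold:
    """One walk-forward fold: train on earlier seasons, test on one season."""
    test_year: int
    train_years: tuple


def walk_forward_folds(available_years, test_years):
    """Build one Fold per test year.

    Assumption for illustration: each fold trains on every available
    season strictly earlier than its test season (expanding window).
    """
    years = sorted(set(available_years))
    folds = []
    for test_year in sorted(test_years):
        train = tuple(y for y in years if y < test_year)
        if not train:
            raise ValueError(f"no training years before test_year={test_year}")
        folds.append(Fold(test_year=test_year, train_years=train))
    return folds


# Hypothetical season range and test_years, chosen only for the demo.
folds = walk_forward_folds(range(2015, 2024), test_years=[2021, 2022, 2023])
for fold in folds:
    print(fold.test_year, fold.train_years[0], fold.train_years[-1])
```

Each fold sees only seasons that precede its test season, which is what makes the resulting predictions usable as out-of-sample residuals for calibration.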
Forecast is the forward-looking prediction pipeline. It is a consumer of an
already-complete hindcast run_dir. Given a (season_year, init_date) pair, it
builds fresh features for that specific init_date, runs inference against the
production fold's frozen model artefacts, postprocesses to a national frame using
the configured conformal calibration, and emits delivery CSVs. It writes exclusively
inside run_dir/forecast/{season_year}/{init_date}/ and never touches canonical
hindcast paths.
The distinction matters practically: hindcast determines what is true about the
past; forecast determines what can be said about the future given that truth. Running
make forecast twice at the same init_date for different season_year values is
safe because PR #369 restructured forecast artefact paths to forecast/{season_year}/{init_date}/,
making the two subtrees disjoint.
Where it lives in the code
| Orchestrator | Entry point | Key function |
|---|---|---|
| Hindcast | stages/run_hindcast.py:193 | run(config_path) |
| Forecast | stages/run_forecast.py:143 | run(run_dir, *, season_year, init_date, force) |
The run_hindcast module also exposes fit_production(config_path) (run_hindcast.py:229), a
fast path that trains only the production model (no walk-forward CV, no
postprocess, no evaluate, no deliver). Use this entry point when out-of-sample
calibration is not required.
The read-only boundary is declared in DESIGN.md line 125:
"The forecast pipeline SHALL NOT write to canonical hindcast artefacts under
{features_dir}/{experiment_key}/."
It is enforced structurally: ForecastSlice.root always resolves to
run_dir/forecast/{season_year}/{init_date}/ (lib/results/results_slice.py:326),
and validate_residual_mode at run_forecast.py:91 is the first call inside
run_forecast.run(), rejecting incompatible run_dir / config pairs before any
feature I/O touches the disk.
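A minimal sketch of how a path handle can enforce this boundary structurally. `ForecastSliceSketch` and its `output_path` guard are hypothetical stand-ins, not the real `ForecastSlice` API. Only the `run_dir/forecast/{season_year}/{init_date}/` layout comes from the text above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ForecastSliceSketch:
    """Hypothetical path handle for one (season_year, init_date) forecast."""
    run_dir: Path
    season_year: int
    init_date: str

    @property
    def root(self) -> Path:
        # Layout taken from the text: run_dir/forecast/{season_year}/{init_date}/
        return self.run_dir / "forecast" / str(self.season_year) / self.init_date

    def output_path(self, relative: str) -> Path:
        """Resolve a write target and refuse anything outside root.

        This guard is an illustrative assumption, not a documented method.
        """
        root = self.root.resolve()
        target = (self.root / relative).resolve()
        if target != root and root not in target.parents:
            raise ValueError(f"{target} escapes forecast root {root}")
        return target


# Hypothetical run_dir and dates, chosen only for the demo.
slice_ = ForecastSliceSketch(Path("runs/exp1"), 2027, "2026-05-01")
print(slice_.root)
print(slice_.output_path("delivery/national.csv"))

try:
    slice_.output_path("../../../models/model.pkl")
except ValueError:
    print("rejected write outside the forecast subtree")
```

Because every write target is derived from `root`, a stage that only ever asks the handle for paths cannot reach the canonical hindcast directories by accident.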
Key invariants
- Hindcast run_dir is the sole source of trained artefacts (model, detrender, fill values, conformal calibration sidecars). Forecast only reads from it.
- All forecast outputs land under run_dir/forecast/{season_year}/{init_date}/. No forecast stage writes to {features_dir}/{experiment_key}/ (DESIGN.md:125).
- validate_residual_mode (run_forecast.py:91) MUST be the first call in run_forecast.run(): fail fast before any S3/feature I/O (PR #372).
- Feature paths (features_fit_path, features_pred_path) are read-only references on both HindcastSlice and ForecastSlice; they resolve to {features_dir}/{experiment_key}/fit.parquet and pred.parquet respectively.
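The fail-fast ordering in the third invariant can be illustrated as below. This is a sketch under stated assumptions: the mode name `"walk_forward"`, the `preds/` lookup, and the exception type are hypothetical, and the real `validate_residual_mode` may check different conditions. The point shown is only that validation runs before any I/O-heavy step.

```python
import tempfile
from pathlib import Path

# Hypothetical mode name; the real forecast.residual_mode values are
# described in residual_modes.md, not here.
MODES_NEEDING_CV_RESIDUALS = {"walk_forward"}


class IncompatibleRunDirError(ValueError):
    """Raised when run_dir cannot support the configured residual mode."""


def validate_residual_mode_sketch(run_dir: Path, residual_mode: str) -> None:
    """Reject run_dir / residual_mode pairs that would fail later on."""
    if residual_mode not in MODES_NEEDING_CV_RESIDUALS:
        return
    preds_dir = run_dir / "preds"
    has_preds = preds_dir.is_dir() and any(
        preds_dir.rglob("walk_forward_preds.parquet")
    )
    if not has_preds:
        raise IncompatibleRunDirError(
            f"residual_mode={residual_mode!r} needs walk-forward predictions, "
            f"but none were found under {preds_dir}"
        )


def run_forecast_sketch(run_dir: Path, residual_mode: str, io_log: list) -> None:
    # Validation is the first call, so nothing below runs on a bad run_dir.
    validate_residual_mode_sketch(run_dir, residual_mode)
    io_log.append("build_features")
    io_log.append("inference")


with tempfile.TemporaryDirectory() as tmp:
    io_log = []
    try:
        run_forecast_sketch(Path(tmp), "walk_forward", io_log)
    except IncompatibleRunDirError as exc:
        outcome = f"rejected: {exc}"
    else:
        outcome = "ran"

print(outcome)
print(io_log)
```

With an empty run_dir the call is rejected immediately and `io_log` stays empty, which is the behaviour the invariant asks for: no compute or feature I/O is spent before the incompatibility is reported.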
Side-by-side comparison
| Aspect | Hindcast | Forecast |
|---|---|---|
| Entry point | run_hindcast.run(config_path) | run_forecast.run(run_dir, season_year, init_date) |
| Config source | YAML at config_path; writes config_resolved.yaml into new run_dir | Reads config_resolved.yaml from existing run_dir; no YAML argument |
| What is frozen | Nothing: all artefacts are produced by this run | Production model, detrender, feature fill values, conformal sidecars |
| What is recomputed | Everything under run_dir | indices.zarr, features/pred.parquet, walk_forward_preds.parquet, postprocessed/national.parquet, delivery CSVs; all isolated under forecast/{season_year}/{init_date}/ |
| Output paths | run_dir/models/, run_dir/preds/, run_dir/conformal/, run_dir/postprocessed/, run_dir/delivery/ | run_dir/forecast/{season_year}/{init_date}/ only |
| Walk-forward CV | Yes: one fold per test_year | No: single (season_year, init_date) per call |
| Production fit | Yes: _run_production_fit_phase | Reuses existing production fold; raises if absent |
| Conformal calibration | Fits + saves all configured modes to run_dir/conformal/{mode}.parquet | Loads primary mode (or fits on demand via get_or_fit_calibration) |
| Feature build | Reads pre-built fit.parquet / pred.parquet from features_dir | Materialises fresh indices.zarr + pred.parquet for the single init_date |
| Multi-year support | All test_years in a single run | Each season_year at a given init_date is a separate invocation; PR #369 made paths disjoint |
How it interacts with the pipeline
The hindcast pipeline is described end-to-end in ../pipelines/hindcast.md, and
the forecast pipeline in ../pipelines/forecast.md. The walk-forward CV mechanism
used inside hindcast is documented in walk_forward_cv.md.
Conformal calibration — which sidecars the hindcast writes and which mode the
forecast consumes — is the key coupling point between the two pipelines. See
conformal_modes.md and residual_modes.md.
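The load-or-fit coupling can be sketched as follows. Everything here is an assumption made for illustration: the function names, the `"primary"` mode label, the JSON sidecar (standing in for `conformal/{mode}.parquet` so the example needs only the standard library), and the calibration itself, which is a plain split-conformal quantile of absolute residuals rather than the project's method.

```python
import json
import math
import tempfile
from pathlib import Path


def fit_absolute_residual_quantile(residuals, alpha=0.1):
    """Split-conformal half-width from absolute residuals (illustrative)."""
    scores = sorted(abs(r) for r in residuals)
    n = len(scores)
    # Finite-sample rank ceil((n + 1) * (1 - alpha)), clipped to n.
    rank = min(n, math.ceil((n + 1) * (1 - alpha)))
    return {"alpha": alpha, "half_width": scores[rank - 1]}


def get_or_fit_calibration_sketch(run_dir: Path, mode: str, fit_fn):
    """Return (calibration, wrote_sidecar).

    Loads the sidecar if present; otherwise fits and writes a new one.
    The JSON file is a stand-in for the parquet sidecar named in the text.
    """
    sidecar = run_dir / "conformal" / f"{mode}.json"
    if sidecar.exists():
        return json.loads(sidecar.read_text()), False
    calibration = fit_fn()
    sidecar.parent.mkdir(parents=True, exist_ok=True)
    sidecar.write_text(json.dumps(calibration))
    return calibration, True


# Made-up residuals, used only to exercise the sketch.
residuals = [0.5, -1.0, 2.0, -0.2, 1.5, 0.1, -0.7, 0.9, -1.2]

with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    fit = lambda: fit_absolute_residual_quantile(residuals)
    first, wrote_first = get_or_fit_calibration_sketch(run_dir, "primary", fit)
    second, wrote_second = get_or_fit_calibration_sketch(run_dir, "primary", fit)

print(first, wrote_first)
print(second, wrote_second)
```

The `wrote_sidecar` flag makes the side effect explicit: the first call on a run_dir without a sidecar writes into `run_dir/conformal/`, while later calls only read.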
Pitfalls and historical bugs
PR #369 — single directory per init_date (fixed 2026-05-05): Before PR #369,
all forecast artefacts for a given init_date lived at run_dir/forecast/{init_date}/.
Calling forecast for season_year=2027 and then season_year=2028 at the same
init_date silently overwrote all artefacts from the first call. The second call
then raised ValueError: No rows in pred.parquet for season_year=2027. PR #369
fixed this by keying on (season_year, init_date) — ForecastSlice.root is the
single source of truth (results_slice.py:326).
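The collision is easy to reproduce with path arithmetic alone. In this sketch `old_root` and `new_root` are hypothetical helpers that mirror the two layouts described above. They are not functions from the codebase.

```python
from pathlib import Path


def old_root(run_dir: Path, season_year: int, init_date: str) -> Path:
    # Pre-PR #369 layout: season_year is not part of the path.
    return run_dir / "forecast" / init_date


def new_root(run_dir: Path, season_year: int, init_date: str) -> Path:
    # Post-PR #369 layout: keyed on (season_year, init_date).
    return run_dir / "forecast" / str(season_year) / init_date


# Hypothetical run_dir and init_date, chosen only for the demo.
run_dir = Path("runs/exp1")
calls = [(2027, "2026-05-01"), (2028, "2026-05-01")]

old_dirs = {old_root(run_dir, sy, d) for sy, d in calls}
new_dirs = {new_root(run_dir, sy, d) for sy, d in calls}

print(len(old_dirs), "distinct directory under the old layout")
print(len(new_dirs), "distinct directories under the new layout")
```

Under the old layout both calls map to the same directory, so the second overwrites the first. Under the new layout each (season_year, init_date) pair gets its own subtree.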
PR #372 — opaque pandas crash from unvalidated run_dir (fixed 2026-05-05):
Before PR #372, invoking make forecast with no RUN_DIR triggered production
model training but then crashed inside pandas with KeyError: 'fold_year' after
~2 minutes of compute, because the calibration code expected walk-forward CV
residuals that did not exist. validate_residual_mode now gates the entire run
before any feature I/O.
Related entities and concepts
- ForecastSlice: path handle for one (season_year, init_date) forecast
- HindcastSlice: path handle for one CV fold
- ExperimentResult: aggregate root for the whole run_dir
- CalibrationResult: conformal calibration sidecar produced by hindcast, consumed by forecast
- walk_forward_cv.md: fold strategy within hindcast
- residual_modes.md: forecast.residual_mode field and validate_residual_mode gate
PRs and commits
| PR | Relevance |
|---|---|
| PR-369 | Restructured forecast paths to forecast/{season_year}/{init_date}/; enabled multi-season_year invocations |
| PR-372 | Made forecast.residual_mode mandatory; added validate_residual_mode gate |
Open questions
- DESIGN.md line 125 describes the read-only invariant for {features_dir}/{experiment_key}/ but does not explicitly mandate that conformal/{mode}.parquet sidecars inside run_dir are also read-only from the forecast perspective. The code currently treats them as read-or-fit-on-demand via get_or_fit_calibration, which can write a new sidecar if one is missing. That is a write to run_dir (the hindcast tree), not to features_dir. Whether this constitutes a violation of the read-only invariant is undocumented.