
Entity: ForecastConfig

Definition

ForecastConfig is the optional frozen Pydantic model that activates forecast mode in a pipeline run. Its presence as ExperimentConfig.forecast (non-None) switches every stage from the hindcast weekly-grid path to the single-init-date forecast path. It holds three YAML-declared fields — the raw observations zarr path, the materialised climatology zarr path, and the residual_mode for conformal calibration — plus the runtime-injected init_date (set to None in YAML, then populated by build_forecast_features).

ForecastConfig is a subordinate config, not a root aggregate. It has no independent lifecycle outside ExperimentConfig.

Kind

Pydantic BaseModel, frozen=True. Nested inside ExperimentConfig as an optional field (forecast: ForecastConfig | None = None).

Source of truth

market_insights_models/src/commodity_hindcast/config.py:579

Key attributes

| Field | Type | Default | Meaning | YAML example |
| --- | --- | --- | --- | --- |
| raw_obs_filepath | ResolvablePath | required | Path to the raw observation zarr (weather data for the current season up to init_date). Resolved against data_root. | s3://{env}-treefera-greenprint-data/weather/processed/areal_aggregation/conus_adm2.zarr |
| materialised_climo_filepath | ResolvablePath | required | Path to the pre-materialised climatology zarr (all DOYs pre-computed for fast forecast feature build). Resolved against data_root. See the centralised-climo feedback note in project memory. | s3://{env}-treefera-greenprint-data/weather/processed/climatology/conus_adm2_baseline_1980_2025_w31_materialised.zarr |
| residual_mode | ResidualMode | required (no default) | Which past residuals to calibrate conformal intervals against at forecast time. Must correspond to a sidecar present in run_dir/conformal/{residual_mode}.parquet. Made mandatory in PR #372. | hindcast_oos_per_init_date |
| init_date | date \| None | None | The target calendar date for this forecast. None in YAML / hindcast mode. Injected at runtime by build_forecast_features via model_copy(update={"init_date": <date>}) from the CLI --init-date argument. | none (runtime only) |
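
The field table above can be sketched as a frozen Pydantic v2 model. This is a minimal illustration rather than the code in config.py: ResolvablePath is replaced by a plain str alias (an assumption, since its real definition is not shown here), and the file paths are placeholders.

```python
from datetime import date
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, ValidationError

ResidualMode = Literal[
    "hindcast_oos_per_init_date",
    "hindcast_oos_per_year",
    "hindcast_oos_fully_pooled",
    "in_sample_pooled",
]

# Assumption: stand-in for the project's ResolvablePath type.
ResolvablePath = str


class ForecastConfig(BaseModel):
    # frozen=True rejects attribute assignment after construction.
    model_config = ConfigDict(frozen=True)

    raw_obs_filepath: ResolvablePath
    materialised_climo_filepath: ResolvablePath
    residual_mode: ResidualMode  # required, no default
    init_date: Optional[date] = None  # left as None in YAML, injected at runtime


cfg = ForecastConfig(
    raw_obs_filepath="weather/obs.zarr",  # placeholder path
    materialised_climo_filepath="weather/climo.zarr",  # placeholder path
    residual_mode="hindcast_oos_per_init_date",
)

# Omitting residual_mode fails validation because the field has no default.
try:
    ForecastConfig(
        raw_obs_filepath="weather/obs.zarr",
        materialised_climo_filepath="weather/climo.zarr",
    )
except ValidationError as exc:
    missing_fields = [err["loc"][0] for err in exc.errors()]
```

Because the model is frozen, the only way to obtain a config with init_date set is to build a new instance, which is why the runtime injection goes through model_copy.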

residual_mode values

ResidualMode is defined in models/meta_models/types.py:16:

ResidualMode = Literal[
    "hindcast_oos_per_init_date",
    "hindcast_oos_per_year",
    "hindcast_oos_fully_pooled",
    "in_sample_pooled",
]

All four production YAMLs with a forecast: block use "hindcast_oos_per_init_date".

"in_sample_pooled" is the only mode that works after make fit-production alone (no prior hindcast CV). The other three modes require a hindcast run to produce walk-forward OOS residuals.

residual_mode is mandatory — PR #372

Before PR #372, residual_mode defaulted to postprocess.conformalise[0], conflating "which sidecars to fit during hindcast" with "which sidecar to apply at forecast time". This caused opaque crashes (KeyError: 'fold_year') when the run_dir lacked the needed sidecar.

PR #372 made residual_mode a required field with no default. The validate_residual_mode gate at the top of stages/run_forecast.run() now rejects three failure modes before any feature I/O:

  1. Empty run_dir (no hindcast or fit-production) → action: run make hindcast or cli run fit-production.
  2. OOS mode + no CV folds in run_dir → action: run hindcast, or switch to "in_sample_pooled".
  3. "in_sample_pooled" + no production fold → action: run cli run fit-production first.
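
The three rejections above amount to a small decision table. The sketch below is hypothetical: check_residual_mode, ResidualModeError, and the two boolean inputs are illustrative names, not the signature of validate_residual_mode, and how the real gate detects CV folds or a production fold inside run_dir is not shown here.

```python
OOS_MODES = frozenset(
    {
        "hindcast_oos_per_init_date",
        "hindcast_oos_per_year",
        "hindcast_oos_fully_pooled",
    }
)


class ResidualModeError(ValueError):
    """Raised when run_dir cannot serve the requested residual_mode."""


def check_residual_mode(
    residual_mode: str,
    has_cv_folds: bool,
    has_production_fold: bool,
) -> None:
    """Reject the three failure modes before any feature I/O happens.

    The booleans are assumed to come from inspecting run_dir; that
    inspection is deliberately left out of this sketch.
    """
    # 1. Nothing has been fitted into run_dir yet.
    if not has_cv_folds and not has_production_fold:
        raise ResidualModeError(
            "run_dir is empty: run `make hindcast` or `cli run fit-production`"
        )
    # 2. OOS residuals only exist after a hindcast run.
    if residual_mode in OOS_MODES and not has_cv_folds:
        raise ResidualModeError(
            f"{residual_mode!r} needs CV folds: run hindcast, "
            "or switch to 'in_sample_pooled'"
        )
    # 3. In-sample residuals come from the production fit.
    if residual_mode == "in_sample_pooled" and not has_production_fold:
        raise ResidualModeError(
            "'in_sample_pooled' needs a production fold: "
            "run `cli run fit-production` first"
        )


# A run_dir with only a production fit can serve in_sample_pooled.
check_residual_mode("in_sample_pooled", has_cv_folds=False, has_production_fold=True)
```

Raising a dedicated exception with the remediation in the message is one way to replace an opaque downstream KeyError with an actionable error at the stage entry.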

ResidualMode was also extracted to a leaf module (models/meta_models/types.py) to break a circular import between config.py and conformalise.py.

Lifecycle

  1. Present in YAML only for forecast-capable commodity configs (corn, soybeans USA, soybeans BRA, wheat USA). Omitted entirely for hindcast-only runs.
  2. At YAML load time, raw_obs_filepath, materialised_climo_filepath, and residual_mode are validated; init_date defaults to None.
  3. ExperimentConfig._resolve_data_paths resolves raw_obs_filepath and materialised_climo_filepath against data_root.
  4. At forecast runtime, build_forecast_features calls config.model_copy(update={"forecast": config.forecast.model_copy(update={"init_date": init_date})}) to inject the CLI-supplied --init-date.
  5. ExperimentConfig.init_dates_for(season_year) returns [self.forecast.init_date] when forecast.init_date is set, overriding the hindcast weekly grid.
  6. validate_residual_mode (stages/run_forecast.py) gates the forecast stage before any S3 reads.
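
Steps 4 and 5 can be illustrated with stripped-down stand-ins for both models. Everything except the nested model_copy call and the init_dates_for behaviour is an assumption: inject_init_date and placeholder_weekly_grid are hypothetical helpers, and the weekly grid shown is a placeholder rather than the project's hindcast grid.

```python
from datetime import date, timedelta
from typing import Optional

from pydantic import BaseModel, ConfigDict


class ForecastConfig(BaseModel):
    model_config = ConfigDict(frozen=True)

    residual_mode: str
    init_date: Optional[date] = None


def placeholder_weekly_grid(season_year: int) -> list[date]:
    # Placeholder only: four weekly dates starting 1 June.
    start = date(season_year, 6, 1)
    return [start + timedelta(weeks=i) for i in range(4)]


class ExperimentConfig(BaseModel):
    # Stand-in with only the field needed for this sketch.
    forecast: Optional[ForecastConfig] = None

    def init_dates_for(self, season_year: int) -> list[date]:
        # A set forecast.init_date overrides the hindcast grid.
        if self.forecast is not None and self.forecast.init_date is not None:
            return [self.forecast.init_date]
        return placeholder_weekly_grid(season_year)


def inject_init_date(config: ExperimentConfig, init_date: date) -> ExperimentConfig:
    """Return a new config with forecast.init_date set; the input is untouched."""
    if config.forecast is None:
        raise ValueError("config has no forecast block")
    # model_copy(update=...) does not re-run validation, so init_date
    # must already be a datetime.date when it is passed in.
    return config.model_copy(
        update={"forecast": config.forecast.model_copy(update={"init_date": init_date})}
    )


loaded = ExperimentConfig(forecast=ForecastConfig(residual_mode="in_sample_pooled"))
ready = inject_init_date(loaded, date(2025, 7, 14))
```

The copy is nested because ForecastConfig is frozen: the inner copy produces a new ForecastConfig, and the outer copy swaps it into a new ExperimentConfig.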

Relationships

  • Owned by ExperimentConfig.forecast (0:1 — None in hindcast mode).
  • Presence switches pipeline mode: forecast is None → hindcast; forecast is not None → forecast.
  • References PostprocessConfig.conformalise indirectly: residual_mode must match a mode that was listed in conformalise so the sidecar parquet exists in run_dir/conformal/.
  • Drives ForecastSlice artefact layout: per-(season_year, init_date) subtree under run_dir/forecast/.
  • Driven by CLI flags --init-date and --season-year in cli.run_forecast_cmd.

Concepts and pipelines that touch this entity

  • Pipeline (forecast): ForecastConfig is the activation switch; validate_residual_mode is the entry gate.
  • PostprocessConfig: conformalise declares which sidecars to write; residual_mode declares which to read.
  • Concept (conformal calibration): residual_mode selects the pooling strategy.
  • Concept (materialised climo): materialised_climo_filepath must point to the pre-computed climo zarr produced by materialise_for_forecast().

PRs and commits

  • PR #361 (PR-361.md) — ForecastConfig gained a residual_mode placeholder field (then optional, defaulting to postprocess.conformalise[0]).
  • PR #372 (PR-372.md) — residual_mode made mandatory (no default). ResidualMode extracted to models/meta_models/types.py. validate_residual_mode gate added at forecast entry. cli run forecast --config <yaml> removed; all forecasts now require --run-dir.

Open questions

  • No Pydantic validator checks that residual_mode is present in PostprocessConfig.conformalise. The validate_residual_mode runtime gate catches the mismatch, but only when the forecast stage is invoked, not at config-load time.
  • init_date being None at YAML load time means ExperimentConfig is technically in an incomplete state until build_forecast_features injects it. A future refactor could separate "forecast config at load time" from "forecast invocation params" to make the injection explicit.
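
For the first open question, a load-time check could look roughly like the sketch below. It is not part of the codebase: the model shapes are reduced stand-ins, the type of conformalise is assumed to be a list of ResidualMode, and the validator name is hypothetical. Whether such a check is desirable is left open here.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ValidationError, model_validator

ResidualMode = Literal[
    "hindcast_oos_per_init_date",
    "hindcast_oos_per_year",
    "hindcast_oos_fully_pooled",
    "in_sample_pooled",
]


class PostprocessConfig(BaseModel):
    # Assumption: conformalise is a list of modes to write sidecars for.
    conformalise: list[ResidualMode]


class ForecastConfig(BaseModel):
    residual_mode: ResidualMode


class ExperimentConfig(BaseModel):
    postprocess: PostprocessConfig
    forecast: Optional[ForecastConfig] = None

    @model_validator(mode="after")
    def _residual_mode_is_conformalised(self) -> "ExperimentConfig":
        # Hindcast-only configs have no forecast block, so nothing to check.
        if self.forecast is None:
            return self
        if self.forecast.residual_mode not in self.postprocess.conformalise:
            raise ValueError(
                f"forecast.residual_mode={self.forecast.residual_mode!r} "
                "is not listed in postprocess.conformalise"
            )
        return self


try:
    ExperimentConfig(
        postprocess=PostprocessConfig(conformalise=["hindcast_oos_per_year"]),
        forecast=ForecastConfig(residual_mode="in_sample_pooled"),
    )
    mismatch_rejected = False
except ValidationError:
    mismatch_rejected = True
```

A validator like this only confirms that the config is internally consistent; it cannot tell whether the sidecar parquet actually exists in run_dir, so the runtime gate would still be needed.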