ForecastSlice¶

Definition¶

ForecastSlice is the per-(season_year, init_date) artefact handle for the commodity_hindcast forecast pipeline. It is a frozen dataclass that exposes all paths and loaders for one in-season forecast. It satisfies the AbstractSlice protocol alongside HindcastSlice. Its distinguishing design principle is artefact isolation: all forecast-specific files land under run_dir/forecast/{season_year}/{init_date}/ and never touch canonical hindcast paths. Trained artefacts (model, detrender, fill values) are delegated to the production HindcastSlice via the training property.

Kind: Frozen dataclass (@dataclass(frozen=True, kw_only=True)). Aggregate root within ExperimentResult; not independently addressable without its run_dir.

Source of truth: market_insights_models/src/commodity_hindcast/lib/results/results_slice.py:301.

Path layout (introduced in PR #369)¶

Before PR #369, forecast artefacts lived at run_dir/forecast/{init_date}/, so forecasts for different season_year values issued at the same init_date overwrote each other. PR #369 restructured the layout to:

run_dir/forecast/{season_year}/{init_date}/
├── indices.zarr                          ← spliced obs+climo daily indices
├── features/
│   └── pred.parquet                      ← per-init feature matrix (never touches canonical hindcast copy)
├── preds/
│   ├── walk_forward_preds.parquet        ← county-level rolling predictions
│   └── year_data.parquet                 ← raw prediction slice for diagnostics
├── postprocessed/
│   └── national.parquet                  ← ADM0-aggregated predictions with CI
└── delivery/
    ├── Treefera_{key}_ADM0_Forecast_{init_date}.csv
    ├── Treefera_{key}_ADM1_Forecast_{init_date}.csv
    └── Treefera_{key}_ADM2_Forecast_{init_date}.csv

ForecastSlice.root is the canonical source of truth for this layout (results_slice.py:326):

@property
def root(self) -> Path | CloudPath:
    return self.run_dir / "forecast" / str(self.season_year) / f"{self.init_date:%Y-%m-%d}"
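As a quick illustration of how the root resolves, the same expression can be written as a free function. This is a minimal sketch: forecast_root and the runs/corn_usa directory are hypothetical names used only for this example, and only pathlib.Path is handled (not CloudPath).

```python
from datetime import date
from pathlib import Path


def forecast_root(run_dir: Path, season_year: int, init_date: date) -> Path:
    """Hypothetical free-function mirror of the root property."""
    # season_year comes first, then the ISO-formatted init_date.
    return run_dir / "forecast" / str(season_year) / f"{init_date:%Y-%m-%d}"


root = forecast_root(Path("runs/corn_usa"), 2027, date(2026, 5, 5))
print(root.as_posix())  # runs/corn_usa/forecast/2027/2026-05-05
```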

Key attributes¶

Identity fields¶

Field	Type	Notes
`run_dir`	`Path | CloudPath`	Experiment root; must be an existing directory (validated in `__post_init__`)
`experiment_key`	`str`	Commodity key (e.g. `"corn_usa"`)
`season_year`	`int`	The harvest year being forecast (e.g. `2027`)
`init_date`	`date`	Calendar date of forecast issuance

__post_init__ (results_slice.py:319) raises FileNotFoundError if run_dir is not an existing directory. This is the only validation performed at construction; in particular, it does not check that a production model exists.

Path properties (all `Path | CloudPath`)¶

Property	Path	Notes
`root`	`run_dir/forecast/{season_year}/{init_date}/`	Per-`(season_year, init_date)` root (`results_slice.py:326`)
`indices_zarr`	`root/indices.zarr`	Spliced obs+climo daily indices zarr
`features_dir`	`root/features/`	Directory for this init's features
`features_parquet`	`root/features/pred.parquet`	Per-init prediction feature matrix
`preds_dir`	`root/preds/`	Directory for prediction outputs
`walk_forward_preds_path`	`root/preds/walk_forward_preds.parquet`	County-level rolling predictions
`year_data_path`	`root/preds/year_data.parquet`	Raw prediction slice for diagnostics
`postprocessed_dir`	`root/postprocessed/`	Directory for postprocessed national output
`postprocessed_national_path`	`root/postprocessed/national.parquet`	ADM0 postprocessed parquet
`delivery_dir`	`root/delivery/`	Directory for client-facing CSVs
`bias_corrector_path`	`root/postprocessed/{key}/production/bias_corrector.pkl`	Production bias corrector

delivery_csv(level) (results_slice.py:375) constructs Treefera_{key}_{level}_Forecast_{init_date:%Y-%m-%d}.csv.
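A small sketch of the filename pattern follows. It assumes that {key} is the slice's experiment_key and that level takes the values ADM0, ADM1, and ADM2 seen in the path layout. delivery_csv_name is a hypothetical helper, not the method itself.

```python
from datetime import date


def delivery_csv_name(experiment_key: str, level: str, init_date: date) -> str:
    """Hypothetical helper reproducing the delivery CSV filename pattern."""
    return f"Treefera_{experiment_key}_{level}_Forecast_{init_date:%Y-%m-%d}.csv"


names = [
    delivery_csv_name("corn_usa", level, date(2026, 5, 5))
    for level in ("ADM0", "ADM1", "ADM2")
]
print(names[0])  # Treefera_corn_usa_ADM0_Forecast_2026-05-05.csv
```

Note that the filename encodes init_date but not season_year; the season is carried by the parent directory.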

Feature paths (shared, via lazy config load)¶

Property	Returns	Notes
`features_fit_path`	`Path | CloudPath`	`features_dir/{key}/fit.parquet` — canonical hindcast fit matrix
`features_pred_path`	`Path | CloudPath`	`features_dir/{key}/pred.parquet` — canonical hindcast pred matrix

These are read-only references to the canonical hindcast feature matrices (DESIGN.md: forecast SHALL NOT write to these).

Cutoff¶

@property
def cutoff(self) -> date:
    return self.init_date

results_slice.py:383. Symmetric with HindcastSlice.cutoff — the AbstractSlice protocol requires a uniform cutoff surface regardless of the slice type.

Trained artefact delegation¶

ForecastSlice does not own trained artefacts. It reaches them via the production HindcastSlice:

@property
def training(self) -> HindcastSlice:
    production = production_hindcast_slice(self.run_dir, self.experiment_key)
    if production is None:
        raise FileNotFoundError(
            f"No production hindcast slice found under {self.run_dir}; "
            "cannot access trained artefacts for this forecast."
        )
    return production

results_slice.py:411. The three delegation methods load_model(), load_detrender(config), and load_feature_fill_values() all call self.training.<method>().

The training property uses production_hindcast_slice() directly (the shared free function at results_slice.py:271) rather than routing through ExperimentResult.from_run_dir, avoiding the historical import cycle (PR #339, Phase 5).

Lifecycle¶

Created: By ForecastSlice(run_dir=..., experiment_key=..., season_year=..., init_date=...) inside run_forecast.run_features() and run_forecast.run_predict(). Also discovered lazily by ExperimentResult.from_run_dir at run_result.py:96–122.

Populated (stage by stage within run_forecast.run()):

1. indices_zarr: written by materialise_forecast_indices (obs+climo splice).
2. features_parquet: written by _build_forecast_features (feature assembly from zarr).
3. walk_forward_preds_path, year_data_path: written by run_predict_stage.
4. postprocessed_national_path: written by _postprocess_forecast.
5. delivery/ CSVs: written by _deliver_forecast.
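Since each stage leaves a distinct artefact under root, progress can be inferred from the filesystem. The helper below is a hypothetical diagnostic, not part of the pipeline. The stage labels are invented for this sketch, and the relative paths are taken from the path properties table.

```python
import tempfile
from pathlib import Path

# (label, glob pattern relative to root), in pipeline order.
# Labels are hypothetical; patterns follow the documented path layout.
STAGE_OUTPUTS = [
    ("indices", "indices.zarr"),
    ("features", "features/pred.parquet"),
    ("predict", "preds/walk_forward_preds.parquet"),
    ("postprocess", "postprocessed/national.parquet"),
    ("deliver", "delivery/*.csv"),
]


def completed_stages(root: Path) -> list[str]:
    """Return the leading run of stages whose outputs exist under root."""
    done: list[str] = []
    for label, pattern in STAGE_OUTPUTS:
        if not any(root.glob(pattern)):
            break  # later stages depend on this one, so stop here
        done.append(label)
    return done


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "forecast" / "2027" / "2026-05-05"
    (root / "indices.zarr").mkdir(parents=True)
    (root / "features").mkdir()
    (root / "features" / "pred.parquet").touch()
    stages = completed_stages(root)

print(stages)  # ['indices', 'features']
```

Existence of a file says nothing about whether it was written completely, so this is only a coarse progress indicator.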

Consumed: Delivery layer reads walk_forward_preds_path and postprocessed_national_path. Dashboard queries delivery_csv(level). Export stage globs forecast/*/*/delivery/... (delivery/export.py).

Destroyed: Never explicitly; the run_dir/forecast/{season_year}/{init_date}/ subtree persists until the operator prunes old run_dirs.

Multi-season_year support (PR #369)¶

The restructured root path means forecasts for 2027 and 2028 at the same init_date write to disjoint subtrees:

run_dir/forecast/2027/2026-05-05/  ← season 2027 forecast
run_dir/forecast/2028/2026-05-05/  ← season 2028 forecast

Before PR #369 both would have written to run_dir/forecast/2026-05-05/, with the second silently overwriting the first.
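With two directory levels, discovery has to walk season_year first and init_date second. The function below is a minimal sketch of such a walk under stated assumptions: discover_forecasts is a hypothetical name, it handles local paths only, and it is not the logic in run_result.py. Skipping directories whose names do not parse is a choice made for this example.

```python
import tempfile
from datetime import date
from pathlib import Path


def discover_forecasts(run_dir: Path) -> list[tuple[int, date]]:
    """Hypothetical two-level walk over run_dir/forecast/{season_year}/{init_date}/."""
    found: list[tuple[int, date]] = []
    forecast_dir = run_dir / "forecast"
    if not forecast_dir.is_dir():
        return found
    for year_dir in sorted(forecast_dir.iterdir()):
        # First level must be a numeric season_year; this also skips
        # pre-PR-369 style directories named by init_date.
        if not (year_dir.is_dir() and year_dir.name.isdigit()):
            continue
        for init_dir in sorted(year_dir.iterdir()):
            if not init_dir.is_dir():
                continue
            try:
                init = date.fromisoformat(init_dir.name)
            except ValueError:
                continue  # not an ISO date directory
            found.append((int(year_dir.name), init))
    return found


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "forecast" / "2027" / "2026-05-05").mkdir(parents=True)
    (run_dir / "forecast" / "2028" / "2026-05-05").mkdir(parents=True)
    (run_dir / "forecast" / "2026-05-05").mkdir()  # old single-level layout
    slices = discover_forecasts(run_dir)
```

Here slices holds one entry per season for the shared init_date, and the old-style directory is ignored.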

Relationships¶

Relationship	Entity	Notes
Child of	`ExperimentResult`	`run_dir` IS the identity anchor; not independently addressable
Implements	`AbstractSlice` protocol	Symmetric surface shared with `HindcastSlice`
Delegates trained artefacts to	`HindcastSlice` (production)	`self.training` property; raises if production model absent
Consumes	`CalibrationResult`	Loaded from `conformal/{mode}.parquet` by `_postprocess_forecast`
Generates	`HindcastDelivery` rows	`_deliver_forecast` builds delivery rows from `walk_forward_preds`

Concepts and pipelines (forward refs to P5)¶

Concept: Forecast path layout — (season_year, init_date) keying
Concept: Long-range climo stub — fills missing z-score features for future season_year values
Concept: AbstractSlice protocol — symmetric surface with HindcastSlice
Pipeline: Forecast pipeline — full run_forecast.run() walkthrough

PRs and commits¶

PR / commit	Relevance
PR-339	Created `lib/results/results_slice.py`; co-located `ForecastSlice` with `HindcastSlice` to break import cycle; introduced `production_hindcast_slice` helper
PR-369	Restructured `root` from `forecast/{init_date}/` to `forecast/{season_year}/{init_date}/`; enabled multi-season_year forecasting from a single `init_date`; updated `run_result.py` discovery to iterate two levels
PR-372	Made `forecast.residual_mode` mandatory on `ForecastConfig`; added `validate_residual_mode` gate at the start of `run_forecast.run()` to fail fast before any feature I/O

Open questions¶

The bias corrector path for ForecastSlice is root/postprocessed/{key}/production/bias_corrector.pkl, which differs in structure from HindcastSlice's postprocessed/{key}/{fold}/bias_corrector.pkl. This asymmetry is undocumented in DESIGN.md and may cause confusion when writing consumers that handle both slice types.
ForecastSlice.__post_init__ validates that run_dir is an existing directory but does not check that the production model exists. A consumer calling self.training before production is trained will get a FileNotFoundError at delegation time, not at construction time. validate_residual_mode in run_forecast.run() catches this for the CLI path, but library consumers bypass it.
Long-range forecasting (season_year beyond climo coverage) relies on a stub (forecast_long_range_stub.py) that is explicitly marked temporary; the slice itself has no field recording whether climo fill was synthetic.