ExperimentResult

Definition

ExperimentResult is the aggregate root for the Experiment bounded context. It is a frozen dataclass that acts as a lazy handle to every artefact under one RunDir: the resolved ExperimentConfig, the discovered set of HindcastSlice objects (one per numeric CV fold, with the production fold exposed separately through the production property), and the discovered set of ForecastSlice objects (one per (season_year, init_date) pair). It carries no computed data in memory: the on-disk artefact tree is the sole contract.

Kind: Frozen dataclass (@dataclass(frozen=True, kw_only=True)). Aggregate root.

Source of truth: market_insights_models/src/commodity_hindcast/lib/results/run_result.py:31.

Key attributes

| Field / property | Type | Purpose |
| --- | --- | --- |
| config | ExperimentConfig | Resolved config loaded from config_resolved.yaml; embedded at construction |
| hindcast_slices | tuple[HindcastSlice, ...] | Numeric CV folds discovered from preds/{key}/{fold}/train_preds.parquet; excludes production |
| forecast_slices | tuple[ForecastSlice, ...] | Per-(season_year, init_date) handles; discovered from forecast/{sy}/{id}/preds/walk_forward_preds.parquet |
| run_dir | Path \| CloudPath | Resolved absolute path; primary identity |
| features_fit_path | property | features_dir/{key}/fit.parquet (via _features_path) |
| features_pred_path | property | features_dir/{key}/pred.parquet (via _features_path) |
| has_postprocessed | property | True if postprocessed/national.parquet exists |
| has_walk_forward_preds | property | True if all non-production folds have walk_forward_preds.parquet |
| postprocessed_national_path | property | run_dir/postprocessed/national.parquet (run_result.py:155) |
| production | property → HindcastSlice \| None | Production fold handle; None if models/{key}/production/detrender.pkl absent (run_result.py:176) |

Constructor

@classmethod
def from_run_dir(cls, run_dir: Path | AnyPath | str) -> ExperimentResult:
    ...

run_result.py:40. This is the only legitimate constructor. It:

  1. Resolves run_dir via AnyPath(run_dir).resolve().
  2. Loads config_resolved.yaml (raises FileNotFoundError if absent).
  3. Iterates preds/{commodity}/{fold}/ — discovers numeric fold dirs that have train_preds.parquet; builds HindcastSlice handles. Excludes production (exposed via the production property instead).
  4. Iterates forecast/{season_year}/{init_date}/preds/walk_forward_preds.parquet — builds ForecastSlice handles (run_result.py:96).
  5. Returns a frozen instance.
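
Discovery steps 3 and 4 can be sketched with plain pathlib. This is an illustrative approximation, not the code in run_result.py. The function names, the sorted ordering, the commodity name "corn", the fold names and the init_date format are all assumptions. The functions return bare identifiers rather than HindcastSlice or ForecastSlice handles.

```python
import tempfile
from pathlib import Path


def discover_hindcast_folds(run_dir: Path, commodity: str) -> tuple[int, ...]:
    """Numeric fold ids under preds/{commodity}/ that hold train_preds.parquet."""
    base = run_dir / "preds" / commodity
    if not base.is_dir():
        return ()
    folds = [
        int(child.name)
        for child in base.iterdir()
        if child.is_dir()
        and child.name.isdecimal()  # skips the "production" directory
        and (child / "train_preds.parquet").is_file()
    ]
    return tuple(sorted(folds))  # ordering is an assumption


def discover_forecasts(run_dir: Path) -> tuple[tuple[str, str], ...]:
    """(season_year, init_date) pairs that hold walk_forward_preds.parquet."""
    base = run_dir / "forecast"
    if not base.is_dir():
        return ()
    found: list[tuple[str, str]] = []
    for season_dir in sorted(p for p in base.iterdir() if p.is_dir()):
        for init_dir in sorted(p for p in season_dir.iterdir() if p.is_dir()):
            if (init_dir / "preds" / "walk_forward_preds.parquet").is_file():
                found.append((season_dir.name, init_dir.name))
    return tuple(found)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    empty_folds = discover_hindcast_folds(root, "corn")  # fresh run_dir

    for rel in [
        "preds/corn/0/train_preds.parquet",
        "preds/corn/1/train_preds.parquet",
        "preds/corn/production/train_preds.parquet",
        "forecast/2024/2024-06-01/preds/walk_forward_preds.parquet",
    ]:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    (root / "preds" / "corn" / "2").mkdir()  # fold dir without predictions

    folds = discover_hindcast_folds(root, "corn")
    forecasts = discover_forecasts(root)
```

In this sketch the production directory and the fold directory without train_preds.parquet are both skipped, and a run_dir with no preds/ tree yields an empty tuple.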

An empty preds/ directory yields an empty hindcast_slices tuple — the object is constructible on a fresh run_dir holding only config_resolved.yaml (the "lazy handle" invariant, run_result.py:55–60).

Included-geo handover

Two methods mediate the county-set handover between FIT and PREDICT phases:

  • save_included_geo_identifiers(geo_identifiers: frozenset[str]) -> Path — writes sorted newline-separated IDs to run_dir/included_geo_identifiers.txt (run_result.py:157).
  • load_included_geo_identifiers() -> frozenset[str] — reads and reconstitutes; raises FileNotFoundError if the file is absent, enforcing the ordering constraint that FIT must run before PREDICT (run_result.py:164).
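
A minimal sketch of the handover, written as free functions that take run_dir explicitly (the real versions are methods on ExperimentResult). The file name, the sorted newline-separated format and the FileNotFoundError behaviour come from the description above. The trailing newline, the UTF-8 encoding and the sample identifiers are assumptions.

```python
import tempfile
from pathlib import Path

FILENAME = "included_geo_identifiers.txt"


def save_included_geo_identifiers(run_dir: Path, geo_identifiers: frozenset[str]) -> Path:
    """Write the IDs sorted, one per line, and return the file path."""
    path = run_dir / FILENAME
    # Trailing newline is an assumption, not confirmed by the source.
    path.write_text("\n".join(sorted(geo_identifiers)) + "\n", encoding="utf-8")
    return path


def load_included_geo_identifiers(run_dir: Path) -> frozenset[str]:
    """Read the IDs back; read_text raises FileNotFoundError if FIT has not run."""
    text = (run_dir / FILENAME).read_text(encoding="utf-8")
    return frozenset(line for line in text.splitlines() if line)


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)

    # PREDICT before FIT: the file does not exist yet.
    try:
        load_included_geo_identifiers(run_dir)
        predict_before_fit_failed = False
    except FileNotFoundError:
        predict_before_fit_failed = True

    # FIT writes the set, PREDICT reads it back.
    written = save_included_geo_identifiers(run_dir, frozenset({"19153", "17031"}))
    file_text = written.read_text(encoding="utf-8")
    round_trip = load_included_geo_identifiers(run_dir)
```

Sorting on write makes the file deterministic for a given set, so reruns with the same counties produce byte-identical output.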

Lifecycle

Created: By ExperimentResult.from_run_dir(run_dir). Called at the start of every stage module (FIT, PREDICT, POSTPROCESS, DELIVER, EVALUATE, FORECAST) and by the dashboard.

Valid states:

| hindcast_slices | forecast_slices | State |
| --- | --- | --- |
| empty | empty | Fresh run_dir: features built, no modelling yet |
| non-empty | empty | Hindcast complete; production fit done; no forecasts issued |
| non-empty | non-empty | Full: production model trained and forecasts issued |
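
The state table can be read as a function of the two tuple lengths. The sketch below is a hypothetical helper (RunState and classify_run are not part of the codebase). It treats the unlisted combination, empty hindcast_slices with non-empty forecast_slices, as an error, which is an assumption based on that combination being absent from the table.

```python
from enum import Enum


class RunState(Enum):
    FRESH = "fresh"
    HINDCAST_COMPLETE = "hindcast_complete"
    FULL = "full"


def classify_run(n_hindcast_slices: int, n_forecast_slices: int) -> RunState:
    """Map slice counts to one of the three listed states."""
    if n_hindcast_slices == 0 and n_forecast_slices == 0:
        return RunState.FRESH
    if n_hindcast_slices > 0 and n_forecast_slices == 0:
        return RunState.HINDCAST_COMPLETE
    if n_hindcast_slices > 0 and n_forecast_slices > 0:
        return RunState.FULL
    # Forecasts without hindcast folds are not listed as a valid state.
    raise ValueError("forecast slices present but no hindcast slices")
```

A dashboard could call classify_run(len(result.hindcast_slices), len(result.forecast_slices)) to pick a status label.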

Consumed: Passed as experiment (or result) into every stage function. None of its data is mutated; each stage writes to disk and the next stage constructs a new ExperimentResult handle.

Destroyed: Not explicitly. The in-memory handle is reclaimed by Python garbage collection; the on-disk run_dir persists indefinitely.

Key invariants (from AGGREGATES.md)

  • If forecast_slices is non-empty, a production HindcastSlice must exist under run_dir/models/{commodity}/production/; enforced at access time by ForecastSlice.training raising FileNotFoundError (results_slice.py:418).
  • included_geo_identifiers.txt must be written by FIT before PREDICT reads it (run_result.py:164).
  • All slice artefacts are either fully present or treated as not-yet-computed — no partial artefact state is meaningful.

Relationships

| Relationship | Entity | Notes |
| --- | --- | --- |
| Primary identity | RunDir | run_dir field; the path IS the identity |
| Embeds | ExperimentConfig | Loaded from config_resolved.yaml; read-only after load |
| Aggregates | HindcastSlice × N | hindcast_slices tuple (numeric folds); production via .production |
| Aggregates | ForecastSlice × M | forecast_slices tuple |
| Produces | CalibrationResult | Sidecars at conformal/{mode}.parquet; not a field, loaded on demand |
| Referenced by | HindcastDelivery | walk_forward_preds_to_delivery_rows(experiment, ...) takes the result as parameter |

Concepts and pipelines (forward refs to P5)

  • Concept: Stage isolation — why the lazy-handle pattern is the correct design
  • Concept: Walk-forward CV — fold discovery loop
  • Pipeline: Hindcast pipeline (from_run_dir called at each stage)
  • Pipeline: Forecast pipeline (from_run_dir called by run_forecast.run())

PRs and commits

| PR / commit | Relevance |
| --- | --- |
| PR-339 | Created lib/results/run_result.py; moved ExperimentResult from steps/experiment_result.py; broke the import cycle via production_hindcast_slice helper |
| PR-369 | Extended from_run_dir discovery to iterate two levels (season_year then init_date) for forecast slices (run_result.py:96–122) |

Open questions

  • ExperimentResult.from_run_dir discovers slices from the preds directory; a run that completes FIT but crashes before PREDICT will have model artefacts but no slices. A has_models guard would be useful for dashboard display.
  • The features_fit_path and features_pred_path properties load config lazily on first access; if config_resolved.yaml is absent these raise FileNotFoundError rather than returning None — callers should guard with has_postprocessed / has_walk_forward_preds before accessing feature paths.
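
One possible shape for the has_models guard raised in the first open question is sketched below. It is entirely hypothetical: no such property exists today, and the rule used here (models/{commodity}/ exists and contains at least one entry) is an assumption about what "has models" should mean.

```python
import tempfile
from pathlib import Path


def has_models(run_dir: Path, commodity: str) -> bool:
    """Hypothetical guard: True if models/{commodity}/ holds any entry."""
    models_dir = run_dir / "models" / commodity
    return models_dir.is_dir() and any(models_dir.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    before_fit = has_models(run_dir, "corn")

    # Simulate FIT completing for one fold, with PREDICT never running.
    fold_dir = run_dir / "models" / "corn" / "0"
    fold_dir.mkdir(parents=True)
    (fold_dir / "detrender.pkl").touch()
    after_fit = has_models(run_dir, "corn")
```

A dashboard could combine such a guard with an empty hindcast_slices tuple to distinguish "FIT finished, PREDICT pending or crashed" from a genuinely fresh run_dir.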