ExperimentResult
Definition
ExperimentResult is the aggregate root for the Experiment bounded context. It is a frozen dataclass that acts as a lazy handle to every artefact under one RunDir: the resolved ExperimentConfig, the discovered set of HindcastSlice objects (one per CV fold plus production), and the discovered set of ForecastSlice objects (one per (season_year, init_date) pair). It carries no computed data in memory — the disk artefact tree is the sole contract.
Kind: Frozen dataclass (@dataclass(frozen=True, kw_only=True)). Aggregate root.
Source of truth: market_insights_models/src/commodity_hindcast/lib/results/run_result.py:31.
Key attributes
| Field / property | Type | Purpose |
|---|---|---|
| config | ExperimentConfig | Resolved config loaded from config_resolved.yaml; embedded at construction |
| hindcast_slices | tuple[HindcastSlice, ...] | Numeric CV folds discovered from preds/{key}/{fold}/train_preds.parquet; excludes production |
| forecast_slices | tuple[ForecastSlice, ...] | Per-(season_year, init_date) handles; discovered from forecast/{sy}/{id}/preds/walk_forward_preds.parquet |
| run_dir | Path \| CloudPath | Resolved absolute path; primary identity |
| features_fit_path | property | features_dir/{key}/fit.parquet (via _features_path) |
| features_pred_path | property | features_dir/{key}/pred.parquet (via _features_path) |
| has_postprocessed | property | True if postprocessed/national.parquet exists |
| has_walk_forward_preds | property | True if all non-production folds have walk_forward_preds.parquet |
| postprocessed_national_path | property | run_dir/postprocessed/national.parquet (run_result.py:155) |
| production | property → HindcastSlice \| None | Production fold handle; None if models/{key}/production/detrender.pkl absent (run_result.py:176) |
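The existence-based properties in the table can be sketched as plain functions over a run directory. This is an illustration under stated assumptions: the function names, the free-function form and the `"corn"` key are hypothetical, and only the file locations come from the table.

```python
from pathlib import Path
from typing import Optional
import tempfile


def has_postprocessed(run_dir: Path) -> bool:
    # True only once the national parquet exists on disk.
    return (run_dir / "postprocessed" / "national.parquet").exists()


def production_dir(run_dir: Path, key: str) -> Optional[Path]:
    # The detrender pickle is the marker file named in the table.
    prod = run_dir / "models" / key / "production"
    return prod if (prod / "detrender.pkl").exists() else None


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    before = (has_postprocessed(run_dir), production_dir(run_dir, "corn"))

    # Simulate the artefacts appearing on disk.
    (run_dir / "postprocessed").mkdir()
    (run_dir / "postprocessed" / "national.parquet").touch()
    prod = run_dir / "models" / "corn" / "production"
    prod.mkdir(parents=True)
    (prod / "detrender.pkl").touch()
    after = (has_postprocessed(run_dir), production_dir(run_dir, "corn") == prod)
```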
Constructor
ExperimentResult.from_run_dir(run_dir) (run_result.py:40) is the only legitimate constructor. It:
- Resolves run_dir via AnyPath(run_dir).resolve().
- Loads config_resolved.yaml (raises FileNotFoundError if absent).
- Iterates preds/{commodity}/{fold}/: discovers numeric fold dirs that have train_preds.parquet and builds HindcastSlice handles. Excludes production (exposed via the production property instead).
- Iterates forecast/{season_year}/{init_date}/preds/walk_forward_preds.parquet and builds ForecastSlice handles (run_result.py:96).
- Returns a frozen instance.
An empty preds/ directory yields an empty hindcast_slices tuple — the object is constructible on a fresh run_dir holding only config_resolved.yaml (the "lazy handle" invariant, run_result.py:55–60).
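The two discovery passes can be approximated with pathlib alone. This is a sketch, not the code in run_result.py: the function names are hypothetical, "numeric" is assumed to mean an all-digit directory name, the sort order is an assumption, and the real constructor builds slice objects rather than returning paths.

```python
from pathlib import Path
import tempfile


def discover_fold_dirs(run_dir: Path, commodity: str) -> tuple[Path, ...]:
    """Numeric fold dirs under preds/{commodity}/ that hold train_preds.parquet."""
    base = run_dir / "preds" / commodity
    if not base.is_dir():
        return ()  # fresh run_dir: no folds yet, still constructible
    folds = [
        d for d in base.iterdir()
        if d.is_dir() and d.name.isdigit() and (d / "train_preds.parquet").exists()
    ]
    return tuple(sorted(folds, key=lambda d: int(d.name)))


def discover_forecast_pairs(run_dir: Path) -> tuple[tuple[str, str], ...]:
    """(season_year, init_date) pairs that have walk_forward_preds.parquet."""
    pairs = []
    for p in run_dir.glob("forecast/*/*/preds/walk_forward_preds.parquet"):
        init_dir = p.parent.parent  # .../forecast/{season_year}/{init_date}
        pairs.append((init_dir.parent.name, init_dir.name))
    return tuple(sorted(pairs))


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    empty = discover_fold_dirs(run_dir, "corn")  # "corn" is a made-up key

    for fold in ("2019", "2020", "production"):
        d = run_dir / "preds" / "corn" / fold
        d.mkdir(parents=True)
        (d / "train_preds.parquet").touch()
    (run_dir / "preds" / "corn" / "2021").mkdir()  # no parquet, so skipped

    fold_names = [d.name for d in discover_fold_dirs(run_dir, "corn")]

    fc = run_dir / "forecast" / "2024" / "2024-06-01" / "preds"
    fc.mkdir(parents=True)
    (fc / "walk_forward_preds.parquet").touch()
    forecast_pairs = discover_forecast_pairs(run_dir)
```

The production directory is filtered out by the numeric-name check, and the fold without a parquet file is treated as not yet computed.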
Included-geo handover
Two methods mediate the county-set handover between FIT and PREDICT phases:
- save_included_geo_identifiers(geo_identifiers: frozenset[str]) -> Path: writes sorted newline-separated IDs to run_dir/included_geo_identifiers.txt (run_result.py:157).
- load_included_geo_identifiers() -> frozenset[str]: reads and reconstitutes; raises FileNotFoundError if the file is absent, enforcing the ordering constraint that FIT must run before PREDICT (run_result.py:164).
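A round trip through the handover file might look like the following. It is a sketch written as free functions taking run_dir, whereas the real methods live on ExperimentResult; the trailing newline, the blank-line filtering and the sample identifiers are assumptions.

```python
from pathlib import Path
import tempfile

FILENAME = "included_geo_identifiers.txt"


def save_included_geo_identifiers(run_dir: Path, geo_identifiers: frozenset[str]) -> Path:
    path = run_dir / FILENAME
    # Sorting makes the file deterministic regardless of set iteration order.
    path.write_text("\n".join(sorted(geo_identifiers)) + "\n")
    return path


def load_included_geo_identifiers(run_dir: Path) -> frozenset[str]:
    # read_text raises FileNotFoundError if the file has not been written yet.
    text = (run_dir / FILENAME).read_text()
    return frozenset(line for line in text.splitlines() if line)


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    try:
        load_included_geo_identifiers(run_dir)
        raised_before_fit = False
    except FileNotFoundError:
        raised_before_fit = True

    ids = frozenset({"19001", "17031"})  # made-up identifiers
    written = save_included_geo_identifiers(run_dir, ids)
    file_text = written.read_text()
    round_trip = load_included_geo_identifiers(run_dir)
```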
Lifecycle
Created: By ExperimentResult.from_run_dir(run_dir). Called at the start of every stage module (FIT, PREDICT, POSTPROCESS, DELIVER, EVALUATE, FORECAST) and by the dashboard.
Valid states:
| hindcast_slices | forecast_slices | State |
|---|---|---|
| empty | empty | Fresh run_dir — features built, no modelling yet |
| non-empty | empty | Hindcast complete; production fit done; no forecasts issued |
| non-empty | non-empty | Full: production model trained and forecasts issued |
Consumed: Passed as experiment (or result) into every stage function. None of its data is mutated; each stage writes to disk and the next stage constructs a new ExperimentResult handle.
Destroyed: Not explicitly — Python garbage collection; the on-disk run_dir persists indefinitely.
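The valid-states table can be read as a small classifier over the two tuple lengths. The function name and the state labels below are hypothetical; the fourth combination (forecasts without hindcast slices) does not appear in the table, so this sketch simply reports it as undocumented rather than guessing how the real code treats it.

```python
def classify_state(n_hindcast: int, n_forecast: int) -> str:
    """Map slice counts to the lifecycle states listed in the table."""
    if n_hindcast == 0 and n_forecast == 0:
        return "fresh"  # features built, no modelling yet
    if n_hindcast > 0 and n_forecast == 0:
        return "hindcast-complete"  # no forecasts issued
    if n_hindcast > 0 and n_forecast > 0:
        return "full"  # production model trained and forecasts issued
    return "undocumented"  # forecasts without hindcast slices: not in the table


states = [classify_state(h, f) for h, f in [(0, 0), (5, 0), (5, 3), (0, 3)]]
```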
Key invariants (from AGGREGATES.md)
- If forecast_slices is non-empty, a production HindcastSlice must exist under run_dir/models/{commodity}/production/; enforced at access time by ForecastSlice.training raising FileNotFoundError (results_slice.py:418).
- included_geo_identifiers.txt must be written by FIT before PREDICT reads it (run_result.py:164).
- All slice artefacts are either fully present or treated as not-yet-computed; no partial artefact state is meaningful.
Relationships
| Relationship | Entity | Notes |
|---|---|---|
| Primary identity | RunDir | run_dir field; the path IS the identity |
| Embeds | ExperimentConfig | Loaded from config_resolved.yaml; read-only after load |
| Aggregates | HindcastSlice × N | hindcast_slices tuple (numeric folds); production via .production |
| Aggregates | ForecastSlice × M | forecast_slices tuple |
| Produces | CalibrationResult | Sidecars at conformal/{mode}.parquet; not a field, loaded on demand |
| Referenced by | HindcastDelivery | walk_forward_preds_to_delivery_rows(experiment, ...) takes the result as parameter |
Concepts and pipelines (forward refs to P5)
- Concept: Stage isolation (why the lazy-handle pattern is the correct design)
- Concept: Walk-forward CV (fold discovery loop)
- Pipeline: Hindcast pipeline (from_run_dir called at each stage)
- Pipeline: Forecast pipeline (from_run_dir called by run_forecast.run())
PRs and commits
| PR / commit | Relevance |
|---|---|
| PR-339 | Created lib/results/run_result.py; moved ExperimentResult from steps/experiment_result.py; broke the import cycle via production_hindcast_slice helper |
| PR-369 | Extended from_run_dir discovery to iterate two levels (season_year then init_date) for forecast slices (run_result.py:96–122) |
Open questions
- ExperimentResult.from_run_dir discovers slices from the preds directory; a run that completes FIT but crashes before PREDICT will have model artefacts but no slices. A has_models guard would be useful for dashboard display.
- The features_fit_path and features_pred_path properties load config lazily on first access; if config_resolved.yaml is absent these raise FileNotFoundError rather than returning None. Callers should guard with has_postprocessed / has_walk_forward_preds before accessing feature paths.