Entity: ExperimentProtocolConfig
Definition
`ExperimentProtocolConfig` is the Pydantic model that declares the cross-validation schedule for a commodity run. It specifies which years are held out for walk-forward evaluation (`test_years`), the CV strategy name (`cv_strategy`, always `"expanding"` today), and the production-based inclusion threshold used for county selection (`production_cumulative_threshold`, ranked over `production_recent_years`). It is consumed by `ExpandingFoldGenerator` in run/experiment_protocol.py to generate `(fold_label, train_data, test_data, year_data, references_fold)` tuples.
FoldSchedule in the dashboard layer (app/_dashboard_config.py) is a derived read-only view of this schedule and is out of scope for the core pipeline domain model.
Kind
Pydantic `BaseModel`. `frozen=True` is not set in source: the class does not declare `model_config`, so it inherits Pydantic's defaults and instances are mutable. Nested inside `ExperimentConfig`.
Source of truth
market_insights_models/src/commodity_hindcast/config.py:483
Key attributes
| Field | Type | Default | Meaning | YAML example |
|---|---|---|---|---|
| `cv_strategy` | `str` | required | Walk-forward CV variant. Currently always `"expanding"`. Declared as a string (not a `Literal`) to allow future extension without a schema break | `expanding` |
| `test_years` | `list[int]` | required | Ordered list of harvest years held out in sequence. Each entry produces one numeric `fold_label` (e.g. `"2020"`) and its corresponding `HindcastSlice` | `[2020, 2021, 2022, 2023, 2024]` |
| `production_cumulative_threshold` | `float` | `1.0` | Top-N% of counties by recent production retained in the `included_geo_identifiers` universe. 0.95 → top 95%; 1.0 → all counties | `0.95` |
| `production_recent_years` | `int` | `5` | Number of most-recent years used to rank counties for the cumulative threshold | `5` |
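To make the fields and defaults in the table concrete, here is a minimal stand-in. It deliberately uses a stdlib dataclass rather than Pydantic so that it runs without dependencies; the class name `ProtocolConfigSketch` is hypothetical, and unlike the real `BaseModel` it performs no type validation. Only the field names, types, and defaults are taken from the table.

```python
from dataclasses import dataclass


@dataclass
class ProtocolConfigSketch:
    """Hypothetical stand-in for ExperimentProtocolConfig (the real class is a Pydantic BaseModel)."""

    cv_strategy: str                              # required; "expanding" today
    test_years: list[int]                         # required; one numeric fold per entry
    production_cumulative_threshold: float = 1.0  # 1.0 keeps every county
    production_recent_years: int = 5              # years used to rank counties


# Mapping shaped like the YAML examples in the table.
raw = {
    "cv_strategy": "expanding",
    "test_years": [2020, 2021, 2022, 2023, 2024],
    "production_cumulative_threshold": 0.95,
}

config = ProtocolConfigSketch(**raw)

# Omitting the optional fields falls back to the class-level defaults.
defaults = ProtocolConfigSketch(cv_strategy="expanding", test_years=[2024])
```

Here `config.production_recent_years` falls back to `5`, and `defaults.production_cumulative_threshold` falls back to `1.0`, which keeps every county.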
Fold generation
`ExpandingFoldGenerator` (run/experiment_protocol.py:110) iterates `test_years` in sorted order. For each `test_year`:
- Training data: `fit_df[year < test_year]` (all years strictly before the test year).
- Test data: `pred_df[year == test_year]` (the hold-out year only).
- Fold label: `str(test_year)` (e.g. `"2020"`).
- References fold: subset of each reference series for `marketing_year == test_year`.
After all numeric folds, a "production" fold is also generated, trained on all available data up to feature_end_year. The production fold has no test-year holdout.
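The expanding-window rule can be sketched in plain Python. This is an illustration under stated assumptions, not the `ExpandingFoldGenerator` implementation: the function name `expanding_folds` is hypothetical, rows are plain dicts standing in for `fit_df` and `pred_df`, the yielded tuple is reduced to three elements, and "up to `feature_end_year`" is assumed to be inclusive.

```python
def expanding_folds(rows, test_years, feature_end_year):
    """Yield (fold_label, train_rows, test_rows) for an expanding window.

    rows is a list of dicts with a "year" key (hypothetical stand-in for
    the fit/pred data frames).
    """
    for test_year in sorted(test_years):
        # Train on every year strictly before the hold-out year.
        train = [r for r in rows if r["year"] < test_year]
        # Test on the hold-out year only.
        test = [r for r in rows if r["year"] == test_year]
        yield str(test_year), train, test

    # Production fold: all data up to feature_end_year (assumed inclusive),
    # with no test-year holdout.
    train = [r for r in rows if r["year"] <= feature_end_year]
    yield "production", train, []


rows = [{"year": y} for y in range(2015, 2023)]  # 2015..2022
folds = list(expanding_folds(rows, [2021, 2020], feature_end_year=2022))
labels = [label for label, _, _ in folds]
train_sizes = [len(train) for _, train, _ in folds]
```

With this toy data the training window grows from five years (fold `"2020"`) to six (fold `"2021"`) to all eight (production), which is what "expanding" refers to.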
The `cutoff` for a numeric fold is `date(int(fold_label), 1, 1)`; for the production fold it is `date(feature_end_year + 1, 1, 1)` (see lib/results/results_slice.py:151).
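The two cutoff rules fit in a few lines. The helper name `fold_cutoff` is hypothetical, and the sketch assumes the production fold is identified by the literal label `"production"`; the date expressions themselves are the ones quoted above.

```python
from datetime import date


def fold_cutoff(fold_label: str, feature_end_year: int) -> date:
    """Return the cutoff date for a fold label (hypothetical helper)."""
    if fold_label == "production":
        # The production fold may use data through the end of feature_end_year.
        return date(feature_end_year + 1, 1, 1)
    # A numeric fold stops at the start of its test year.
    return date(int(fold_label), 1, 1)


numeric_cutoff = fold_cutoff("2020", feature_end_year=2024)
production_cutoff = fold_cutoff("production", feature_end_year=2024)
```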
Lifecycle
- Constructed as part of `ExperimentConfig`.
- Consumed by `run/runner.run_walk_forward()`, which passes a `DataFoldGenerator` derived from this config.
- `production_cumulative_threshold` and `production_recent_years` are used during the FIT stage to select `included_geo_identifiers`, the county universe written to `run_dir/included_geo_identifiers.txt`.
- Persisted only as part of `config_resolved.yaml`; not a standalone artefact.
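The county selection driven by `production_cumulative_threshold` and `production_recent_years` might look roughly like the sketch below. This is one plausible reading, not the pipeline's code: it assumes "cumulative" means the largest producers are kept until their combined share of recent production reaches the threshold. The function name `select_counties` and the `(county_id, year) -> volume` input layout are hypothetical.

```python
from collections import defaultdict


def select_counties(production, threshold=1.0, recent_years=5):
    """Return county ids kept under a cumulative-production threshold.

    production maps (county_id, year) -> production volume. This layout is
    hypothetical and chosen only to keep the sketch dependency-free.
    """
    latest_year = max(year for _, year in production)
    first_year = latest_year - recent_years + 1

    # Sum each county's production over the most recent years only.
    totals = defaultdict(float)
    for (county, year), volume in production.items():
        if year >= first_year:
            totals[county] += volume

    # Rank counties from largest to smallest recent production.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    if threshold >= 1.0:
        return [county for county, _ in ranked]

    grand_total = sum(totals.values())
    selected, running = [], 0.0
    for county, volume in ranked:
        # Stop once the counties kept so far already cover the threshold share.
        if grand_total > 0 and running / grand_total >= threshold:
            break
        selected.append(county)
        running += volume
    return selected


production = {
    ("A", 2023): 30.0, ("A", 2024): 30.0,
    ("B", 2023): 15.0, ("B", 2024): 15.0,
    ("C", 2023): 5.0, ("C", 2024): 5.0,
    ("C", 2010): 1000.0,  # outside the recent window, so ignored
}
top = select_counties(production, threshold=0.85, recent_years=2)
everyone = select_counties(production, threshold=1.0, recent_years=2)
```

In this toy data, county C's large 2010 harvest does not rescue it, because only the most recent two years count toward the ranking.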
Relationships
- Owned by `ExperimentConfig` (config.py:682).
- Drives `ExpandingFoldGenerator`: one generator instance per run, constructed in `run/runner.run_walk_forward`.
- Drives `Fold`/`HindcastSlice` cardinality: `len(test_years) + 1` slices per run (numeric folds + production).
- Inspected by `FoldSchedule` (dashboard layer only) to map season dates to available fold labels.
Concepts and pipelines that touch this entity
- Concept: walk-forward CV (the expanding-window design and cutoff semantics are documented there).
- Pipeline: hindcast (the fold loop is driven by this config).
- Entity: HindcastSlice (one slice per fold label in `test_years` plus one production slice).
PRs and commits
- PR #331 (PR-331.md): `production_cumulative_threshold` and `production_recent_years` added to control the included-geo-identifiers selection.
Open questions
- `cv_strategy` is declared as a plain `str` rather than a `Literal["expanding"]`. A future strategy (e.g. sliding window) would add a value here and a corresponding `AbstractFoldGenerator` subclass. No implementation exists yet.
- `production_cumulative_threshold` defaults to `1.0` (all counties) at the class level, but all production YAMLs set it to `0.90` or `0.95`. The default silently keeps all counties, which may inflate uncertainty at the tail; the intent was to make the threshold explicit in every config.
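One lightweight way to surface the silent-default concern is to check the raw protocol sections before they reach the model. The sketch below is only an illustration of that idea: the helper `find_implicit_thresholds` and the file names are hypothetical, and nothing here is claimed to exist in the repository.

```python
def find_implicit_thresholds(protocol_sections):
    """Return names of configs that omit production_cumulative_threshold.

    protocol_sections maps a config name to its raw protocol mapping
    (for example, the parsed YAML section). Both are hypothetical inputs.
    """
    return [
        name
        for name, section in protocol_sections.items()
        if "production_cumulative_threshold" not in section
    ]


sections = {
    "explicit.yaml": {
        "cv_strategy": "expanding",
        "test_years": [2023, 2024],
        "production_cumulative_threshold": 0.95,
    },
    "implicit.yaml": {
        "cv_strategy": "expanding",
        "test_years": [2023, 2024],
    },
}
relying_on_default = find_implicit_thresholds(sections)
```

A check like this could run in CI or at config load time, so that a config falling back to `1.0` is a visible decision rather than an omission.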