Entity: InitDate
Definition
A specific calendar date on which a within-season crop forecast is issued, stored as an ISO YYYY-MM-DD string in delivery CSVs and as a column in parquet feature tables. Features included in pred.parquet are known up to init_date − lag_days (default lag_days = 1). In hindcast mode, init dates form a weekly grid derived from CommodityConfig.hindcast_init_season_doys; in forecast mode a single runtime-injected date overrides the grid.
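The visibility rule (features known up to init_date − lag_days) can be illustrated with a minimal sketch. The helper name `last_visible_day` is hypothetical and does not come from the codebase; only the default `lag_days = 1` and the ISO string form are taken from the definition above.

```python
from datetime import date, timedelta


def last_visible_day(init_date: date, lag_days: int = 1) -> date:
    """Return the last calendar day whose observations may enter the features.

    Hypothetical helper for illustration; not part of the documented API.
    """
    if lag_days < 0:
        raise ValueError("lag_days must be non-negative")
    return init_date - timedelta(days=lag_days)


# Persisted artefacts store the init date as an ISO YYYY-MM-DD string.
init = date.fromisoformat("2024-07-15")

print(last_visible_day(init).isoformat())              # 2024-07-14 (default lag of 1)
print(last_visible_day(init, lag_days=0).isoformat())  # 2024-07-15 (no lag)
```

With the default lag, observations dated on the init date itself are not visible to the forecast.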
Kind
Conceptual identifier (Python datetime.date in memory; str in persisted artefacts). No dedicated class or NewType. Represented as init_date: date on ForecastSlice and as the init_date column (str) in every feature parquet and delivery CSV row.
Source of truth
market_insights_models/src/commodity_hindcast/config.py:384 — CommodityConfig.hindcast_init_dates(season_year) returns the full weekly grid as a list[date].
config.py:381 — CommodityConfig.to_date(season_doy, season_year) converts a season DOY to a calendar date.
config.py:706 — ExperimentConfig.init_dates_for(season_year) is the single dispatch point: returns [forecast.init_date] in forecast mode, or the full weekly grid in hindcast mode.
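The relationship between these three entry points can be sketched as follows. This is an illustration under stated assumptions, not the code in config.py: the `SketchCommodityConfig` class, its `season_start_month` / `season_start_day` fields, the 1-based season DOY convention, and the free-function form of `init_dates_for` are all hypothetical. Only the method names and the forecast/hindcast dispatch behaviour come from the descriptions above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class SketchCommodityConfig:
    """Stand-in for CommodityConfig; the season-start fields are assumptions."""

    season_start_month: int
    season_start_day: int
    hindcast_init_season_doys: tuple[int, ...]

    def to_date(self, season_doy: int, season_year: int) -> date:
        # Assumption: season DOY 1 is the season start day within season_year.
        start = date(season_year, self.season_start_month, self.season_start_day)
        return start + timedelta(days=season_doy - 1)

    def hindcast_init_dates(self, season_year: int) -> list[date]:
        # One calendar date per configured season DOY.
        return [self.to_date(doy, season_year) for doy in self.hindcast_init_season_doys]


def init_dates_for(
    cfg: SketchCommodityConfig,
    season_year: int,
    forecast_init_date: Optional[date] = None,
) -> list[date]:
    """Single dispatch point: one injected date in forecast mode, else the grid."""
    if forecast_init_date is not None:
        return [forecast_init_date]
    return cfg.hindcast_init_dates(season_year)


# A weekly grid of 30 season DOYs: 1, 8, ..., 204.
cfg = SketchCommodityConfig(4, 1, tuple(range(1, 211, 7)))

grid = init_dates_for(cfg, 2023)
print(len(grid), grid[0], grid[1])  # 30 2023-04-01 2023-04-08

single = init_dates_for(cfg, 2024, forecast_init_date=date(2024, 6, 3))
print(single)  # [datetime.date(2024, 6, 3)]
```

Keeping the mode switch in one function means downstream loops do not need to know whether they are running a hindcast or a forecast.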
Key attributes / structure
| Attribute | Type | Notes |
|---|---|---|
| Calendar date value | date (in memory) / str (persisted) | ISO YYYY-MM-DD |
| lag_days | int | Default 1; harvest-init training rows override to 0 |
| Season DOY | int | CommodityConfig.to_season_doy(init_date, season_year) inverts to_date(), mapping a calendar date back to a season DOY |
| Filesystem role | path component | run_dir/forecast/{season_year}/{init_date}/ (PR-369 structure) |
Cardinality: Multiple per season year in hindcast mode — typically ~30 per crop season, matching the length of hindcast_init_season_doys. One per ForecastSlice in forecast mode. In forecast mode the same init_date can coexist with multiple season_year values under the same run_dir (introduced in PR-369).
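The filesystem role and the date/str duality can be sketched together. The function name `forecast_slice_dir` and the example `run_dir` value are hypothetical; the directory layout is the PR-369 structure described above.

```python
from datetime import date
from pathlib import Path


def forecast_slice_dir(run_dir: Path, season_year: int, init_date: date) -> Path:
    """Build run_dir/forecast/{season_year}/{init_date}/ (hypothetical helper)."""
    # The in-memory date is rendered as its ISO YYYY-MM-DD string for the path.
    return run_dir / "forecast" / str(season_year) / init_date.isoformat()


run_dir = Path("runs/exp1")  # illustrative location only
init = date(2024, 6, 3)

# The same init date can sit under several season years in one run_dir.
for season_year in (2024, 2025):
    print(forecast_slice_dir(run_dir, season_year, init).as_posix())
# runs/exp1/forecast/2024/2024-06-03
# runs/exp1/forecast/2025/2024-06-03

# The last path component round-trips back to a date.
leaf = forecast_slice_dir(run_dir, 2024, init)
assert date.fromisoformat(leaf.name) == init
```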
Lifecycle
Created: In hindcast mode, enumerated at feature-build time from CommodityConfig.hindcast_init_season_doys via init_dates_for(season_year). In forecast mode, injected at runtime via the --init-date CLI flag; build_forecast_features sets ForecastConfig.init_date.
Consumed:
- Feature assembly — the init_date column in fit.parquet / pred.parquet controls which weather and climo observations are visible (up to init_date − lag_days).
- Walk-forward prediction — _predict_fold_rolling iterates over init dates to accumulate predictions (run/runner.py:86).
- Delivery — DeliveryRow.init_date (ISO string) is an identity column in every delivery CSV; validated by _validate_init_date_format and _validate_init_date_year.
- Conformal calibration — CalibrationResult with residual_mode = "hindcast_oos_per_init_date" keys half-widths by (month, day) of the init date.
Destroyed: Never destroyed; immutable once written to a parquet or CSV.
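As an illustration of the kind of check a delivery-row format validator performs, here is a minimal sketch of strict ISO parsing. The function `parse_init_date` is hypothetical and is not the implementation of `_validate_init_date_format`, whose actual rules are not shown in this page.

```python
import re
from datetime import date

# Require exactly YYYY-MM-DD. Recent Python versions let date.fromisoformat
# accept other ISO 8601 spellings (for example "20240603"), so the shape is
# checked explicitly first.
_ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_init_date(value: str) -> date:
    """Parse a persisted init_date string, rejecting anything but YYYY-MM-DD."""
    if not _ISO_DATE.fullmatch(value):
        raise ValueError(f"init_date must be YYYY-MM-DD, got {value!r}")
    # fromisoformat raises ValueError for impossible dates such as 2024-02-30.
    return date.fromisoformat(value)


print(parse_init_date("2024-06-03"))  # 2024-06-03

for bad in ("2024-6-3", "20240603", "2024-02-30"):
    try:
        parse_init_date(bad)
    except ValueError as exc:
        print("rejected:", exc)
```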
Relationships to other entities
- SeasonYear (partitioned by): a single season year has many init dates; CommodityConfig.hindcast_init_dates(season_year) returns the grid
- Commodity (generated by): hindcast_init_season_doys on CommodityConfig defines the weekly grid; to_date() converts season DOY to calendar date
- Yield (indexes): every yield prediction is keyed by (geo_identifier, season_year, init_date); later init dates carry more complete weather information
- Fold (scoped within): all init dates for a season year fall within one fold's test window
Concepts and pipelines that touch this entity
- Pipeline: forecast (P5). ForecastSlice is identified by (season_year, init_date); its path lives at run_dir/forecast/{season_year}/{init_date}/
- Pipeline: hindcast (P5). The weekly init-date grid drives the feature assembly loop
- Concept: conformal calibration (P5). The hindcast_oos_per_init_date mode keys conformal half-widths by (month, day) of init_date
PRs and commits
- PR-369: Restructures the forecast path from run_dir/forecast/{init_date}/ to run_dir/forecast/{season_year}/{init_date}/ to support multiple season years per init date; introduces a long-range climo stub for distant future seasons
- PR-372: Makes ForecastConfig.residual_mode mandatory; adds a validate_residual_mode gate that inspects CalibrationResult availability for the given init date before any feature compute
Open questions
- lag_days = 1 means features at init_date itself are excluded; should this default be surfaced more prominently in configs, as it silently affects which weather day is the last visible observation?
- The _validate_init_date_year validator on DeliveryRow allows init dates up to LONG_RANGE_HORIZON_YEARS = 10 years before the target season year. Is this horizon documented and understood by consumers of the delivery CSV?
- There is no validation that an InitDate falls within the crop's growing season window; an init date after harvest is accepted by the config but produces meaningless features.
- The conformal mode hindcast_oos_per_init_date keys on (month, day). If a forecast init_date does not match any hindcast grid date exactly, the nearest key is used; the nearest-key selection rule should be documented.
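To make the last question concrete, here is one possible nearest-key rule. It is purely an assumption for discussion and not the rule the codebase applies: distance is measured in days on a circular non-leap calendar, Feb 29 is folded onto Feb 28, and ties go to the earlier (month, day). The names `nearest_key`, `_doy`, and `REFERENCE_YEAR` are hypothetical.

```python
from datetime import date

REFERENCE_YEAR = 2001  # assumption: any non-leap year, used only to measure gaps


def _doy(month: int, day: int) -> int:
    """Day of year of (month, day) in the non-leap reference year."""
    if (month, day) == (2, 29):
        day = 28  # assumption: fold leap day onto Feb 28
    return date(REFERENCE_YEAR, month, day).timetuple().tm_yday


def nearest_key(init_date: date, keys: list[tuple[int, int]]) -> tuple[int, int]:
    """Pick the (month, day) key closest to init_date on a circular calendar."""
    if not keys:
        raise ValueError("no calibration keys available")
    target = _doy(init_date.month, init_date.day)

    def distance(key: tuple[int, int]) -> int:
        gap = abs(_doy(*key) - target)
        return min(gap, 365 - gap)  # wrap around the year boundary

    # min() keeps the first minimum, so sorting breaks ties toward the earlier key.
    return min(sorted(keys), key=distance)


weekly_keys = [(6, 1), (6, 8), (6, 15)]
print(nearest_key(date(2025, 6, 10), weekly_keys))           # (6, 8)
print(nearest_key(date(2025, 1, 2), [(12, 28), (6, 1)]))     # (12, 28)
```

Whatever rule is actually in use, writing it down at this level of detail (distance measure, leap-day handling, tie-breaking) would answer the question.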