DeliveryRow¶
Definition¶
DeliveryRow is a frozen Pydantic model that represents one row in the client-facing hindcast or forecast delivery CSV. It is the typed contract at the delivery boundary: every value written to a Treefera_*_Hindcast_*.csv or Treefera_*_Forecast_*.csv must pass through a DeliveryRow constructor. All yield values are in delivery units (bu/ac for grains; lbs/ac for cotton) — the conversion from internal kg/ha has already occurred before construction.
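As a rough illustration of the conversion that must already have happened before a row is built, the sketch below converts kg/ha to bu/ac and to lbs/ac. The function name, the commodity keys, and the use of standard US bushel weights (56 lb for corn, 60 lb for soybeans and wheat) are assumptions made for this example; the pipeline's actual conversion code is not shown on this page.

```python
# Hypothetical helper: the name and the bushel-weight table are illustrative
# assumptions, not the pipeline's own conversion code.
LB_PER_KG = 1 / 0.45359237         # pounds per kilogram (exact definition)
HA_PER_ACRE = 0.40468564224        # hectares per international acre (exact)

# Standard US bushel weights in pounds (assumed to apply here).
BUSHEL_WEIGHT_LB = {"CORN": 56.0, "SOYBEAN": 60.0, "WHEAT": 60.0}


def kg_ha_to_delivery_units(commodity: str, kg_per_ha: float) -> float:
    """Convert kg/ha to bu/ac for grains, or to lbs/ac for cotton."""
    lb_per_acre = kg_per_ha * LB_PER_KG * HA_PER_ACRE
    key = commodity.upper()
    if key == "COTTON":
        return lb_per_acre
    return lb_per_acre / BUSHEL_WEIGHT_LB[key]


corn_bu_ac = kg_ha_to_delivery_units("CORN", 10_000.0)
cotton_lb_ac = kg_ha_to_delivery_units("cotton", 1_000.0)
```

With these factors, 10,000 kg/ha of corn is roughly 159.3 bu/ac.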
Kind¶
Value object (frozen Pydantic BaseModel, extra="forbid"). There is one class for all ADM levels. ADM0, ADM1, and ADM2 rows share the same schema; the geo_identifier field carries the level information (e.g. "adm0:usa" vs "adm1:usa/Iowa" vs "adm2:usa/Iowa/Story"). An earlier design considered separate ADM0Row, ADM1Row, ADM2Row classes; the code never implemented them.
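Because the level lives in the identifier rather than in the class, a consumer that needs the ADM level has to read it from the prefix. A minimal sketch of that parsing follows. The helper name is hypothetical, and the rule that an ADM-n identifier has n + 1 path segments is inferred from the three examples above rather than taken from the codebase.

```python
def parse_geo_identifier(geo_identifier: str) -> tuple[int, list[str]]:
    """Split e.g. 'adm1:usa/Iowa' into (1, ['usa', 'Iowa']).

    Hypothetical helper for illustration only.
    """
    prefix, sep, path = geo_identifier.partition(":")
    if not sep or not prefix.startswith("adm") or not prefix[3:].isdigit():
        raise ValueError(f"not a canonical ADM identifier: {geo_identifier!r}")
    level = int(prefix[3:])
    parts = path.split("/")
    # Assumption: ADM-n names n + 1 nested regions (country, state, county).
    if len(parts) != level + 1 or not all(parts):
        raise ValueError(f"path depth does not match level: {geo_identifier!r}")
    return level, parts


national = parse_geo_identifier("adm0:usa")
state = parse_geo_identifier("adm1:usa/Iowa")
county = parse_geo_identifier("adm2:usa/Iowa/Story")
```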
Source of truth¶
delivery/schemas.py:109 — class DeliveryRow.
Key attributes¶
Field table¶
| Field | Type | Optional | Unit | Semantics |
|---|---|---|---|---|
| commodity | str | No | n/a | Upper-cased commodity name (e.g. "CORN") |
| year | int | No | n/a | Harvest season year |
| init_date | str | No | n/a | ISO-8601 date YYYY-MM-DD; the forecast issue date |
| geo_identifier | str | No (default "adm0:usa") | n/a | Canonical ADM identifier; encodes level by prefix |
| variable | str | No (default "yield_bu_acre") | n/a | Output variable name |
| model | str | No (default "commodity_hindcast") | n/a | Model identifier for the warehouse |
| mean | float | No | bu/ac (or lbs/ac for cotton) | Area-weighted predicted yield in delivery units |
| weather_correction_bu_ac | float | No | bu/ac always | Detrended component sim_yield_kg_ha_detrended × scale; never converted back to kg/ha in export (see PR-331) |
| nass_actual | float \| None | Yes | bu/ac | Area-weighted observed NASS yield from predicted counties (model-subset) |
| nass_actual_area_weighted_all | float \| None | Yes | bu/ac | Full-universe NASS area-weighted yield (closest to USDA headline figure; populated at ADM0 only) |
| nass_actual_prod_div_area_all | float \| None | Yes | bu/ac | Full-universe production ÷ area (ADM0 only) |
| wasde_in_season | float \| None | Yes | bu/ac | As-of-init_date WASDE in-season estimate |
| conab_final_in_season | float \| None | Yes | bu/ac | As-of-init_date CONAB final estimate (Brazil soybean) |
| conab_lev_in_season | float \| None | Yes | bu/ac | As-of-init_date CONAB levantamento estimate (Brazil soybean) |
| lower_50 … lower_95 | float \| None | Yes | bu/ac | Conformal lower CI bands at 50/68/80/90/95 % levels |
| upper_50 … upper_95 | float \| None | Yes | bu/ac | Conformal upper CI bands at 50/68/80/90/95 % levels |
Field count: 24 (6 identity + 1 mean + 1 weather correction + 3 NASS benchmarks + 3 in-season references + 10 CI band columns, i.e. 5 lower + 5 upper).
extra="forbid" strictness¶
ConfigDict(frozen=True, extra="forbid") is set at schemas.py:126. This is load-bearing: walk_forward_preds_to_delivery_rows materialises one column per cfg.reference_data spec named f"{spec.name}_in_season". Only wasde_in_season, conab_final_in_season, and conab_lev_in_season are declared. Without forbid, Pydantic's default extra="ignore" would silently discard any benchmark from a spec whose YAML name does not match a declared field. With forbid, the same situation raises a ValidationError immediately.
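The difference between the two settings can be seen in a small stand-alone sketch. The two classes below are cut-down stand-ins, not the real DeliveryRow, and `usda_in_season` is a made-up example of a spec name that matches no declared field. The sketch assumes Pydantic v2.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class LenientRow(BaseModel):
    """Stand-in model using Pydantic's default extra='ignore'."""

    model_config = ConfigDict(frozen=True)
    mean: float
    wasde_in_season: Optional[float] = None


class StrictRow(BaseModel):
    """Stand-in model using the same config as DeliveryRow."""

    model_config = ConfigDict(frozen=True, extra="forbid")
    mean: float
    wasde_in_season: Optional[float] = None


# Hypothetical payload: the benchmark column name matches no declared field.
payload = {"mean": 180.0, "usda_in_season": 179.5}

lenient = LenientRow(**payload)
# The benchmark value is dropped without any error.
lenient_kept_extra = "usda_in_season" in lenient.model_dump()

try:
    StrictRow(**payload)
    strict_error_types = []
except ValidationError as exc:
    # Collect the error type codes reported for the rejected payload.
    strict_error_types = [err["type"] for err in exc.errors()]
```

Under the default, the row is built and the benchmark is simply missing from the output; under `forbid`, construction fails and the error names the offending key.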
Lifecycle¶
- Assembly: delivery/conversions.py:walk_forward_preds_to_delivery_rows aggregates the walk-forward predictions to the requested ADM level, converts units, attaches conformal half-widths, and constructs DeliveryRow objects. Field validators and model validators fire at construction.
- Validation: Pydantic runs _validate_init_date_format (field validator), then _validate_ci_ordering and _validate_init_date_year (model validators) on every row.
- Collection: Validated rows are collected into a list[DeliveryRow] passed to HindcastDelivery.
- Serialisation: delivery/conversions.py:delivery_to_dataframe converts the list to a Polars DataFrame in canonical column order from build_delivery_column_order.
- Persistence: Written to a CSV under run_dir/delivery/ (hindcast) or run_dir/forecast/{season_year}/{init_date}/delivery/ (forecast).
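The two output locations in the persistence step can be sketched as a small path helper. The function name and signature are hypothetical; only the directory layout comes from the description above.

```python
from pathlib import PurePosixPath
from typing import Optional


def delivery_dir(
    run_dir: str,
    season_year: Optional[int] = None,
    init_date: Optional[str] = None,
) -> PurePosixPath:
    """Return the delivery directory for a hindcast or a forecast run.

    Hypothetical helper: a hindcast passes neither optional argument,
    a forecast passes both.
    """
    root = PurePosixPath(run_dir)
    if season_year is None and init_date is None:
        return root / "delivery"
    if season_year is None or init_date is None:
        raise ValueError("forecast runs need both season_year and init_date")
    return root / "forecast" / str(season_year) / init_date / "delivery"


hindcast_dir = delivery_dir("runs/demo")
forecast_dir = delivery_dir("runs/demo", season_year=2025, init_date="2025-06-01")
```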
CI ordering invariant¶
The model validator at schemas.py:176 enforces that all present (non-None) CI bands nest correctly around the mean:

    @model_validator(mode="after")
    def _validate_ci_ordering(self) -> DeliveryRow:
        """Enforce lower_95 <= ... <= mean <= ... <= upper_95 for present bands."""
        chain: list[tuple[str, float]] = []
        for field_name in _CI_LOWER_FIELDS:
            val = getattr(self, field_name)
            if val is not None:
                chain.append((field_name, val))
        chain.append(("mean", self.mean))
        for field_name in _CI_UPPER_FIELDS:
            val = getattr(self, field_name)
            if val is not None:
                chain.append((field_name, val))
        for i in range(len(chain) - 1):
            name_a, val_a = chain[i]
            name_b, val_b = chain[i + 1]
            if val_a > val_b:
                msg = f"CI ordering violation: {name_a}={val_a:.3f} > {name_b}={val_b:.3f}"
                raise ValueError(msg)
        return self

_CI_LOWER_FIELDS is ordered outermost-to-innermost (lower_95, lower_90, …, lower_50) and _CI_UPPER_FIELDS is ordered innermost-to-outermost (upper_50, …, upper_95). Absent bands are skipped, so partial CI subsets (e.g. only 90/95 levels) still pass. Comparisons are non-strict, so equal adjacent values are accepted. A violation raises ValueError inside the validator, which Pydantic reports as a ValidationError at row construction time, so a row with misordered bands can never be constructed.
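The same chain check can be tried outside Pydantic with a stand-alone function. This is a sketch that mirrors the validator's logic: the function name is hypothetical, and the two tuples are rebuilt here from the five levels rather than imported from schemas.py.

```python
from typing import Optional

CI_LEVELS = (50, 68, 80, 90, 95)
# Outermost-to-innermost for lower bands, innermost-to-outermost for upper.
CI_LOWER_FIELDS = tuple(f"lower_{lvl}" for lvl in reversed(CI_LEVELS))
CI_UPPER_FIELDS = tuple(f"upper_{lvl}" for lvl in CI_LEVELS)


def first_ci_violation(
    mean: float, bands: dict[str, Optional[float]]
) -> Optional[str]:
    """Return a description of the first ordering violation, or None."""
    chain = [(n, bands[n]) for n in CI_LOWER_FIELDS if bands.get(n) is not None]
    chain.append(("mean", mean))
    chain += [(n, bands[n]) for n in CI_UPPER_FIELDS if bands.get(n) is not None]
    # Compare each present value with its right-hand neighbour.
    for (name_a, val_a), (name_b, val_b) in zip(chain, chain[1:]):
        if val_a > val_b:
            return f"{name_a}={val_a:.3f} > {name_b}={val_b:.3f}"
    return None


# Only the 90/95 levels are present: the partial subset still passes.
partial_ok = first_ci_violation(
    180.0,
    {"lower_95": 160.0, "lower_90": 165.0, "upper_90": 195.0, "upper_95": 200.0},
)
# lower_95 sits above lower_90, so the bands cross.
crossed = first_ci_violation(180.0, {"lower_95": 170.0, "lower_90": 165.0})
```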
Validators summary¶
| Validator | Kind | Location | Invariant |
|---|---|---|---|
| _validate_init_date_format | field validator | schemas.py:164 | init_date matches ^\d{4}-\d{2}-\d{2}$ |
| _validate_ci_ordering | model validator | schemas.py:176 | lower_95 ≤ … ≤ lower_50 ≤ mean ≤ upper_50 ≤ … ≤ upper_95 for present bands |
| _validate_init_date_year | model validator | schemas.py:197 | init_date calendar year in [year − 10, year + 1]; long-range horizon constant LONG_RANGE_HORIZON_YEARS = 10 at schemas.py:106 |
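The two init_date invariants can be exercised with a plain-Python sketch. The function names are hypothetical and the bodies are a reconstruction from the invariants stated in the table, not a copy of the validators in schemas.py.

```python
import re

LONG_RANGE_HORIZON_YEARS = 10  # value quoted in the table above
_INIT_DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def check_init_date(init_date: str, year: int) -> None:
    """Raise ValueError if init_date breaks the format or year-window rule."""
    if not _INIT_DATE_RE.match(init_date):
        raise ValueError(f"init_date must be YYYY-MM-DD, got {init_date!r}")
    init_year = int(init_date[:4])
    earliest, latest = year - LONG_RANGE_HORIZON_YEARS, year + 1
    if not earliest <= init_year <= latest:
        raise ValueError(
            f"init_date year {init_year} outside [{earliest}, {latest}]"
        )


def is_valid_init_date(init_date: str, year: int) -> bool:
    """Boolean wrapper around check_init_date for quick experiments."""
    try:
        check_init_date(init_date, year)
    except ValueError:
        return False
    return True
```

The pattern as stated checks shape only: a string such as 2025-13-40 matches it even though it is not a real calendar date.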
Relationships¶
- Contained by: HindcastDelivery, which holds list[DeliveryRow]
- Produced by: delivery/conversions.py:walk_forward_preds_to_delivery_rows
- Serialised by: delivery/conversions.py:delivery_to_dataframe
- Column order governed by: schemas.py:build_delivery_column_order (prefix + CI columns + suffix)
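The "prefix + CI columns + suffix" pattern can be sketched as a simple concatenation. Everything specific below is an assumption: the function name, which fields sit in the prefix and suffix, and the order of the CI columns are placeholders, since this page does not state them.

```python
CI_LEVELS = (50, 68, 80, 90, 95)


def build_column_order_sketch(
    prefix: list[str],
    suffix: list[str],
    levels: tuple[int, ...] = CI_LEVELS,
) -> list[str]:
    """Concatenate prefix, generated CI columns, and suffix.

    Hypothetical stand-in for build_delivery_column_order; the lower/upper
    interleaving used here is an assumption.
    """
    ci_columns = [f"{side}_{lvl}" for lvl in levels for side in ("lower", "upper")]
    return [*prefix, *ci_columns, *suffix]


# Placeholder split of fields, chosen only to make the example concrete.
order = build_column_order_sketch(
    prefix=["commodity", "year", "init_date", "geo_identifier", "mean"],
    suffix=["nass_actual", "wasde_in_season"],
)
```

Generating the CI columns from the level tuple keeps the column list and the set of supported levels from drifting apart.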
Concepts and pipelines¶
- Delivery pipeline: end-to-end CSV production
- Unit conventions: all delivery-boundary fields in bu/ac; weather_correction_bu_ac stays in bu/ac even during warehouse re-ingestion (export.py:56)
- nass_actual_area_weighted_all was added alongside weather_correction_bu_ac to expose the full-universe NASS figure (see PR-340)
PRs and commits¶
- PR-331: introduced weather_correction_bu_ac as a required float (previously float | None, always null); added P90 CI bands to wheat, cotton, and soybean configs
- PR-340: dashboard changes that surfaced nass_actual_area_weighted_all side-by-side with nass_actual
Open questions¶
- nass_actual_area_weighted_all and nass_actual_prod_div_area_all are only meaningful at ADM0 level but the schema permits them at ADM1/ADM2. No validator currently enforces that they are None below national level.
- variable defaults to "yield_bu_acre" even for cotton (which uses lbs/ac). Whether the default should be overridden per commodity is unresolved in the codebase.
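If the ADM0-only question were resolved by adding a check, one possible shape is sketched below as a plain function. Nothing like this exists in the codebase as described here; the function name and the use of the "adm0:" prefix as the national-level test are assumptions for illustration.

```python
ADM0_ONLY_FIELDS = (
    "nass_actual_area_weighted_all",
    "nass_actual_prod_div_area_all",
)


def adm0_only_violations(geo_identifier: str, values: dict) -> list[str]:
    """List ADM0-only fields that are populated on a sub-national row.

    Hypothetical check; a model validator could raise if this is non-empty.
    """
    if geo_identifier.startswith("adm0:"):
        return []
    return [name for name in ADM0_ONLY_FIELDS if values.get(name) is not None]


national_issues = adm0_only_violations(
    "adm0:usa", {"nass_actual_area_weighted_all": 177.3}
)
state_issues = adm0_only_violations(
    "adm1:usa/Iowa", {"nass_actual_area_weighted_all": 177.3}
)
```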