Entity: Commodity
Definition
An agricultural crop being modelled — corn, soybean, wheat, cotton, or brazil_soybean. Commodity is the root discriminator for all crop-specific constants: crop calendar, feature columns, delivery units, plausibility bounds, and the ADM0 country scope. It is not represented as a standalone class; its identity lives as the CommodityConfig.commodity: str field, and all derived constants are co-located on CommodityConfig.
Kind
Value object (string field on a frozen Pydantic model). The canonical container is CommodityConfig (config.py:274). Commodity itself has no class; it is the name for the identity dimension of CommodityConfig.
Source of truth
market_insights_models/src/commodity_hindcast/config.py:274 — CommodityConfig class declaration. The commodity identity field is at config.py:279. The country_code that pairs with it to form experiment_key is at config.py:289.
Key attributes / structure
| Field | Type | Notes |
|---|---|---|
| `commodity` | `str` | Primary identity; lower-case slug, e.g. `"corn"`, `"soybeans"`, `"wheat"`, `"cotton"` |
| `country_code` | `str` | ISO-3 country code (validated, uppercased); default `"USA"`; pairs with `commodity` to form `experiment_key` |
| `season_start` | `MonthDay` | First calendar day of the growing season |
| `season_start_year_offset` | `int` | `0` for same-year crops; `-1` for cross-year crops (winter wheat, brazil soy) |
| `harvest_season_doy` | `int` | Season DOY of harvest; upper bound of the growing window |
| `hindcast_init_season_doys` | `tuple[int, …]` | Season DOYs for the weekly init-date grid |
| `yield_range` | `tuple[float, float]` | Plausibility bounds in delivery units; enforced at `delivery/` boundary |
| `delivery_unit` | `str` | `"bu_acre"` for grains; `"lbs_acre"` for cotton |
| `bushel_weight_lbs` | `float` | Commodity-specific bushel weight for kg/ha ↔ bu/ac conversion; `1.0` for cotton |
| `feature_cols` | `list[str]` | Ordered list of weather/climo/NDVI columns fed to the regressor |
| `target_col` | `str` | Training target column (typically `"yield_kg_ha"`) |
| `actuals_source_short` | `str` | Label for the truth source (default `"NASS"`; `"CONAB"` for Brazil) |
Known instances (5 active commodity configs):
| YAML config | `commodity` | `country_code` | Delivery unit |
|---|---|---|---|
| `configs/corn_usa.yaml` | `corn` | `USA` | `bu_acre` |
| `configs/soybeans_usa.yaml` | `soybeans` | `USA` | `bu_acre` |
| `configs/wheat_usa.yaml` | `wheat` | `USA` | `bu_acre` |
| `configs/cotton_usa.yaml` | `cotton` | `USA` | `lbs_acre` |
| `configs/soybeans_bra.yaml` | `soybeans` | `BRA` | `bu_acre` |
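Because two of these configs share the commodity string `"soybeans"`, a lookup needs both `commodity` and `country_code`. The sketch below illustrates that with the rows of the table above. The `experiment_key` format used here (`<commodity>_<country_code lowercased>`, mirroring the YAML filenames) is an assumption for illustration; the real format is not specified in this document.

```python
# Known instances from the table above, as (commodity, country_code, delivery_unit).
KNOWN_INSTANCES = [
    ("corn", "USA", "bu_acre"),
    ("soybeans", "USA", "bu_acre"),
    ("wheat", "USA", "bu_acre"),
    ("cotton", "USA", "lbs_acre"),
    ("soybeans", "BRA", "bu_acre"),
]


def experiment_key(commodity: str, country_code: str) -> str:
    """Hypothetical key format; the real experiment_key may differ."""
    return f"{commodity}_{country_code.lower()}"


# Filtering on the commodity string alone is ambiguous for soybeans.
soy_only = [row for row in KNOWN_INSTANCES if row[0] == "soybeans"]

# Pairing commodity with country_code yields one distinct key per config.
keys = [experiment_key(c, cc) for c, cc, _ in KNOWN_INSTANCES]
```

Here `soy_only` holds two rows while `keys` holds five distinct values, which is why the pair, not the commodity string, identifies a run.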
Lifecycle
Created: Parsed from a commodity YAML config at ExperimentConfig construction time. The _inject_builder_type_from_key validator (config.py:354) auto-wires the builders dict keys. country_code is validated and uppercased at load time.
Consumed: Every pipeline stage receives the parent ExperimentConfig; all commodity-specific dispatch (calendar, unit conversion, feature column selection) keys on CommodityConfig. make_geo_identifier (lib/geo/identifiers.py:207) uses country_code to anchor the ADM0 segment of every GeoIdentifier.
Destroyed: Never destroyed; CommodityConfig is frozen and immutable for the lifetime of a run.
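The load-time normalisation and the immutability described above can be sketched with a frozen stdlib dataclass. This is a stand-in, not the real class: `CommodityConfig` is a frozen Pydantic model, and the validation rule shown (three alphabetic characters, uppercased) is an assumption about what "validated" means here.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class CommodityConfigSketch:
    """Stdlib stand-in for the frozen Pydantic model; not the real class."""

    commodity: str
    country_code: str = "USA"

    def __post_init__(self) -> None:
        # Assumed validation: three alphabetic characters, stored uppercased.
        code = self.country_code.strip().upper()
        if len(code) != 3 or not code.isalpha():
            raise ValueError(f"not an ISO-3 country code: {self.country_code!r}")
        # Frozen dataclasses block normal assignment, so bypass it once here.
        object.__setattr__(self, "country_code", code)


cfg = CommodityConfigSketch(commodity="soybeans", country_code="bra")

try:
    cfg.commodity = "corn"  # any mutation after construction is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```

After construction `cfg.country_code` is `"BRA"`, and the attempted assignment leaves `mutated` as `False`.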
Relationships to other entities
- SeasonYear (governs): `CommodityConfig.season_start_date(season_year)` converts a `SeasonYear` integer to a calendar anchor
- InitDate (generates): `CommodityConfig.hindcast_init_dates(season_year)` returns the full weekly init-date grid for any season year
- Region (scopes): `country_code` determines the ADM0 segment of every `GeoIdentifier` minted for this commodity's features
- Yield (defines bounds for): `yield_range` and `delivery_unit` govern unit conversion and plausibility clamping at delivery
- Fold (implicitly scoped by): each fold's filesystem path includes `CommodityConfig.experiment_key` as a directory component
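The SeasonYear and InitDate relationships above amount to simple date arithmetic. The sketch below shows one plausible reading, written as free functions rather than the real `CommodityConfig` methods. It assumes that season DOY 1 falls on the season start date and that `season_start_year_offset` is added to the season year; neither convention is confirmed here, and the sample values are illustrative only.

```python
from datetime import date, timedelta


def season_start_date(season_year: int, month: int, day: int, year_offset: int) -> date:
    """Calendar anchor for a season year; year_offset is 0 or -1."""
    return date(season_year + year_offset, month, day)


def hindcast_init_dates(start: date, init_season_doys: tuple[int, ...]) -> list[date]:
    """Map season DOYs to calendar dates, assuming season DOY 1 is the start date."""
    return [start + timedelta(days=doy - 1) for doy in init_season_doys]


# Illustrative values only: a cross-year crop whose season starts on 1 October.
start = season_start_date(2024, month=10, day=1, year_offset=-1)
grid = hindcast_init_dates(start, (1, 8, 15))
```

With these inputs, season year 2024 anchors to 1 October 2023, and the three season DOYs map to weekly dates starting from that anchor.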
Concepts and pipelines that touch this entity
- Pipeline: hindcast (P5) - commodity is the top-level discriminator for every stage
- Pipeline: forecast (P5) - `ExperimentConfig.init_dates_for` dispatches on commodity's `hindcast_init_season_doys`
- Concept: unit conversion (P5) - `lib/unit_utils.py` uses `bushel_weight_lbs` from `CommodityConfig`
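The unit conversion that depends on `bushel_weight_lbs` can be sketched as follows. The function names are hypothetical and are not taken from `lib/unit_utils.py`. The 56 lb corn bushel is the standard US figure and is assumed here; the actual config value is not shown in this document.

```python
LB_PER_KG = 2.2046226218     # pounds per kilogram
ACRES_PER_HA = 2.4710538147  # acres per hectare


def kg_ha_to_delivery(yield_kg_ha: float, bushel_weight_lbs: float) -> float:
    """kg/ha to bu/acre; with bushel_weight_lbs=1.0 the result is lbs/acre."""
    lbs_per_acre = yield_kg_ha * LB_PER_KG / ACRES_PER_HA
    return lbs_per_acre / bushel_weight_lbs


def delivery_to_kg_ha(yield_delivery: float, bushel_weight_lbs: float) -> float:
    """Inverse of kg_ha_to_delivery."""
    return yield_delivery * bushel_weight_lbs * ACRES_PER_HA / LB_PER_KG


# Corn with an assumed 56 lb bushel; cotton uses 1.0 so the unit stays lbs/acre.
corn_bu_acre = kg_ha_to_delivery(10_000.0, bushel_weight_lbs=56.0)
cotton_lbs_acre = kg_ha_to_delivery(1_000.0, bushel_weight_lbs=1.0)
```

Setting `bushel_weight_lbs` to `1.0` for cotton lets one code path serve both delivery units, since dividing by one leaves the pounds-per-acre figure unchanged.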
PRs and commits
- PR-360: Adds `configs/soybeans_bra.yaml`; introduces `country_code` as a required field on `CommodityConfig` to disambiguate US vs Brazil soybean runs; fixes a silent factor-67 unit bug in CONAB yield loading
- PR-339: Package restructure that moved `CommodityConfig` to its canonical location in `config.py`
Open questions
- Should `Commodity` be promoted to a proper `NewType` or `Enum` so that downstream code receives type-safe dispatch rather than raw `str` comparisons?
- The `actuals_source_short` / `actuals_source_label` split is partly cosmetic; is there a cleaner encoding that ties the label to the `ReferenceYieldSpec` discriminator?
- Wheat sub-type labels (`WINTER_WHEAT`, etc.) appear in config notes but are never produced by the NASS preprocessor; this is an open issue documented in `DOMAIN_MODEL2.md §9`.
- `brazil_soybean` vs `soybeans` + `BRA` naming: two configs use the commodity string `"soybeans"` with different `country_code` values, rather than a separate commodity slug. This may cause confusion when filtering by commodity string alone.
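For the first question, one possible shape of an `Enum`-based `Commodity` is sketched below. It is hypothetical and does not exist in the codebase: the members are the four slugs used by the active configs, and `parse_commodity` is an illustrative helper name. Mixing in `str` keeps existing string comparisons working while rejecting unknown slugs at parse time.

```python
from enum import Enum


class Commodity(str, Enum):
    """Hypothetical enum of the commodity slugs listed in this document."""

    CORN = "corn"
    SOYBEANS = "soybeans"
    WHEAT = "wheat"
    COTTON = "cotton"


def parse_commodity(raw: str) -> Commodity:
    """Convert a raw YAML string into a Commodity, failing loudly on typos."""
    try:
        return Commodity(raw)
    except ValueError:
        valid = ", ".join(c.value for c in Commodity)
        raise ValueError(f"unknown commodity {raw!r}; expected one of: {valid}") from None


parsed = parse_commodity("soybeans")
```

An enum alone would not resolve the `soybeans` + `BRA` ambiguity in the last question, since both soybean configs would still map to the same member; `country_code` remains part of the identity.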