
Requirements — New Commodity Hindcast Model

Purpose

This document is the contract a new commodity model must satisfy to fit into the commodity_hindcast pipeline. It serves two audiences:

  • Builder — you are adding a new commodity, region, or modelling approach. Use this as a checklist.
  • Validator — someone has proposed an external approach and you need to decide whether it fits. Skip to Section 11 — Compatibility checklist for a quick first pass.

Every requirement uses EARS notation and carries a stable ID R-<section>-<n> for cross-reference. Citations point at existing wiki pages and source code; the wiki is the authoritative reference for any detail not stated here.

Document conventions

Every requirement uses one of five EARS forms:

  • Ubiquitous — "The system shall..." (always true)
  • Event-driven — "When [trigger], the system shall..." (triggered by an event)
  • State-driven — "While [state], the system shall..." (true while in a given state)
  • Unwanted — "The system shall not..." (forbidden)
  • Optional — "Where [feature is enabled], the system shall..." (configuration-gated)

"The system" refers to the proposed new model end-to-end: config + features + training + inference + delivery.

Section 1 — Configuration contract

R-CFG-1. The system shall declare its configuration in a single Pydantic-validated YAML file at market_insights_models/src/commodity_hindcast/configs/<commodity>_<region>.yaml.

R-CFG-2. The system shall be loadable as a valid ExperimentConfig via pydantic-settings, validated atomically at load time. (See wiki/commodity_hindcast/entities/ExperimentConfig.md.)

R-CFG-3. The system shall not accept a data_root field in user-facing YAML. (DESIGN.md Clause 6; wiki/commodity_hindcast/concepts/input_data_dir_contract.md.)

R-CFG-4. When the INPUT_DATA_DIR environment variable is unset, the system shall raise RuntimeError at config-load time with an actionable message naming the missing variable.
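The fail-fast behaviour in R-CFG-4 can be sketched as below. This is a minimal illustration, not the pipeline's loader: `resolve_input_data_dir` is a hypothetical helper name, and treating an empty string as "unset" is an assumption.

```python
import os


def resolve_input_data_dir(env=None):
    """Return the INPUT_DATA_DIR value, or raise RuntimeError naming the variable.

    Hypothetical helper; the real check runs inside config loading.
    """
    env = os.environ if env is None else env
    value = (env.get("INPUT_DATA_DIR") or "").strip()
    if not value:  # assumption: an empty value counts as unset
        raise RuntimeError(
            "INPUT_DATA_DIR is not set. Export it before loading a config, "
            "for example: export INPUT_DATA_DIR=/path/to/data"
        )
    # Returned as a plain string; real code would wrap it in cloudpathlib.AnyPath.
    return value
```

Passing `env` explicitly keeps the check testable without mutating the process environment.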

R-CFG-5. The system shall declare the following top-level config fields:

  • commodity — name, region (country_code), season_start, season_end, harvest dates
  • feature_start_year, feature_end_year — bounding window
  • builders — ordered list (yields, weather, climo, NDVI, stress)
  • model — regression + detrending hyperparameters
  • postprocess.conformalise — tuple of supported residual modes
  • forecast.residual_mode — single residual mode used at forecast time (mandatory since PR #372)
  • reference_data — discriminated union (WasdeRefSpec | ConabFinalRefSpec | ConabLevantamentoRefSpec)
  • experiment_protocol — fold-schedule configuration

R-CFG-6. While forecast.residual_mode is unset or absent, the system shall reject the config at load time. (PR #372; wiki/commodity_hindcast/concepts/residual_modes.md.)
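As an illustration of R-CFG-3 and R-CFG-6 together, the sketch below checks an already-parsed YAML dict. It is a plain-Python stand-in for the Pydantic validation the pipeline actually uses; `check_config` is a hypothetical name, and the mode names are the four listed in R-CI-3.

```python
RESIDUAL_MODES = (
    "hindcast_oos_per_init_date",
    "hindcast_oos_per_year",
    "hindcast_oos_fully_pooled",
    "in_sample_pooled",
)


def check_config(raw: dict) -> list[str]:
    """Collect load-time violations of R-CFG-3 and R-CFG-6 (hypothetical helper)."""
    errors = []
    if "data_root" in raw:
        errors.append("data_root is not allowed in user-facing YAML (R-CFG-3)")
    # `forecast:` may be missing or null in the YAML, so fall back to an empty dict.
    mode = (raw.get("forecast") or {}).get("residual_mode")
    if mode is None:
        errors.append("forecast.residual_mode is required (R-CFG-6)")
    elif mode not in RESIDUAL_MODES:
        errors.append(f"forecast.residual_mode={mode!r} is not a supported mode")
    return errors
```

Returning every violation at once, rather than stopping at the first, mirrors the "validated atomically at load time" wording of R-CFG-2.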

R-CFG-7. The system shall persist config_resolved.yaml at the run-dir root for reproducibility.

Section 2 — Data input contract

R-IN-1. The system shall consume reference yield benchmarks via ReferenceYieldSpec selecting one of {WasdeRefSpec, ConabFinalRefSpec, ConabLevantamentoRefSpec}. (wiki/commodity_hindcast/entities/ReferenceYieldSpec.md.)

R-IN-2. Where the commodity is United States-based, the system shall use WasdeRefSpec as the primary reference benchmark.

R-IN-3. Where the commodity is Brazil-based, the system shall use both ConabFinalRefSpec and ConabLevantamentoRefSpec reference benchmarks.

R-IN-4. The system shall produce yield-history features via the YieldsBuilder from NASS or an equivalent national statistical source.

R-IN-5. The system shall produce weather features via the WeatherBuilder and ClimoBuilder against zarr indices conforming to the existing schema.

R-IN-6. When weather observations are unavailable for a forecast init_date, the system shall splice climatology values via materialise_forecast_indices. (wiki/commodity_hindcast/concepts/climo_materialisation.md.)

R-IN-7. The system shall not call .as_posix() on cloudpathlib.AnyPath objects. (wiki/commodity_hindcast/concepts/s3_path_safety.md.)

R-IN-8. While running with an S3-backed data_root, the system shall not anchor SQLite or lockfiles under the data root.

Section 2a — YieldsBuilder contract

This section spells out the contract that R-IN-4 resolves to. Here geo denotes any geographical unit (county, NUTS-3, microrregião, district, etc.) and parent_geo its administrative parent. Source: features/builders/yields.py:_load_nass, lib/reference_data/nass.py:load_nass_obs, config schema at config.py:YieldsBuilder.

R-YB-1. The system shall expose yield history as a single parquet file at commodity.builders.yields.filepath, resolved against INPUT_DATA_DIR.

R-YB-2. The system shall guarantee uniqueness on (year, geo, parent_geo) after any crop_type filter; duplicates resolve to first occurrence via pivot_table(aggfunc="first").

R-YB-3. The system shall include the following columns in the input parquet:

  • year — integer-coercible calendar year (literal column name, not configurable)
  • column named by builders.yields.county_col — the geo identifier (lower unit)
  • column named by builders.yields.state_col — the parent_geo identifier
  • column named by commodity.yield_col — yield in kg/ha (NASS-native units, even for non-US commodities)
  • column named by commodity.area_col — harvested area in ha

R-YB-4. Where builders.yields.production_col is set, the system shall include that column carrying production in kg, and the builder shall raise ValueError if the column is absent at load time.

R-YB-5. Where builders.yields.crop_type is set, the system shall include a crop_type column (literal name), and the builder shall filter rows to the configured value before the pivot.

R-YB-6. The system shall not include a geo_identifier column in the input parquet — the builder constructs it as make_geo_identifier(geo, parent_geo, country_code).

R-YB-7. The system shall not drop rows where yield is NaN but production and area are non-null — Fellegi-Holt rules declared in builders.yields.edits impute yield from production / area post-pivot.

R-YB-8. When year fails integer coercion, the system shall drop the affected row silently.

R-YB-9. The system shall report yield, area, and production in NASS-native SI units (kg/ha, ha, kg) at ingest, regardless of commodity.delivery_unit (unit conversion happens in the delivery layer, not the builder).

R-YB-10. The system shall not require contiguous year coverage per geo — gap years are tolerated and _nanmean_last_k skips them when computing trailing-mean features.
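One possible reading of the gap-tolerant trailing mean in R-YB-10 is sketched below. The real `_nanmean_last_k` is not reproduced here; this version assumes gap years are skipped so that the window always reaches back to the k most recent observed values, which may differ from the builder's actual windowing.

```python
import math


def nanmean_last_k(values, k):
    """Mean of the k most recent non-NaN values, ordered oldest to newest.

    Assumption: NaN gap years are skipped rather than counted inside the window.
    Returns NaN when no observation is available.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    observed = [v for v in values if not math.isnan(v)]
    window = observed[-k:]
    return sum(window) / len(window) if window else math.nan
```

For a geo with yields `[5.0, nan, 7.0, nan, 9.0]` and `k=2`, the window is `[7.0, 9.0]`.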

R-YB-11. The system shall provide geo and parent_geo values that, after make_geo_identifier(geo, parent_geo, country_code) (which uppercases the ISO-3 code and _normalise_names the rest — lowercase, ASCII-fold, whitespace-collapse), yield a geo_identifier of the form ADM0:<ISO3>/ADM1:<adm1>/ADM2:<adm2> that exists verbatim in the project geoboundaries geometry.parquet.

R-YB-12. While the source statistical agency publishes under a non-ADM scheme (NUTS, ITS, microrregião, IBGE codes, FIPS, district codes, etc.), the system shall reconcile those identifiers to the geoboundaries ADM1/ADM2 names in an upstream preprocessor before writing the yields parquet — YieldsBuilder performs no lookup against geometry.parquet itself.

R-YB-13. The system shall not emit a yield row whose constructed geo_identifier has no match in geometry.parquet — unmatched rows surface downstream as silent join drops in delivery/geo_normalise.py and corrupt ADM1/ADM2 aggregates.
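The normalisation described in R-YB-11 and the match check implied by R-YB-13 can be sketched as follows. This is a re-implementation from the prose above, not the pipeline's `make_geo_identifier`; the `_sketch` suffix and the `unmatched_identifiers` helper are hypothetical, and the exact ASCII-folding rules are an assumption (NFKD decomposition, then dropping non-ASCII marks).

```python
import re
import unicodedata


def normalise_name(name: str) -> str:
    """Lowercase, ASCII-fold, and collapse runs of whitespace."""
    folded = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"\s+", " ", folded).strip().lower()


def make_geo_identifier_sketch(geo: str, parent_geo: str, country_code: str) -> str:
    """Build ADM0:<ISO3>/ADM1:<adm1>/ADM2:<adm2> from a (geo, parent_geo) pair."""
    return (
        f"ADM0:{country_code.upper()}"
        f"/ADM1:{normalise_name(parent_geo)}"
        f"/ADM2:{normalise_name(geo)}"
    )


def unmatched_identifiers(pairs, known_ids, country_code):
    """Return built identifiers that are absent from the known geometry ids."""
    built = {make_geo_identifier_sketch(geo, parent, country_code) for geo, parent in pairs}
    return sorted(built - set(known_ids))
```

Running a check like `unmatched_identifiers` in the upstream preprocessor would surface mismatches before they become silent join drops downstream.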

Example — UK wheat (wheat_gbr.yaml)

Hypothetical UK wheat. DEFRA publishes by NUTS-1 region; geoboundaries geometry.parquet for GBR carries ADM1 = country/region (england, scotland, wales, northern ireland) and ADM2 = unitary authority / district names. An upstream preprocessor must map NUTS codes to those exact names — UKC → north east (england) (or whatever literal string geometry.parquet carries for GBR/ADM1=…), and similarly for the lower-tier unit — before writing the yields parquet. The output of that preprocessor lives at data/defra/preprocessed_wheat.parquet:

commodity:
    country_code: GBR
    yield_col: yield_kg_ha
    area_col: area_harvested_ha
    target_col: yield_kg_ha
    builders:
        yields:
            filepath: data/defra/preprocessed_wheat.parquet
            # Values in these columns are geoboundaries ADM names, NOT raw NUTS codes:
            county_col: adm2_name      # legacy YAML key — the `geo`        (e.g. "leeds")
            state_col:  adm1_name      # legacy YAML key — the `parent_geo` (e.g. "england")
            production_col: production_kg
            crop_type: WHEAT
            edits:
                - name: impute_null_yield_from_prod_area
                  kind: deductive_impute
                  target: yield_kg_ha
                  expression: production_kg / area_harvested_ha

Required parquet columns: year, adm1_name, adm2_name, yield_kg_ha, area_harvested_ha, production_kg, crop_type. The builder pivots to a (geo × year) matrix and emits geo_identifier of the form ADM0:GBR/ADM1:<adm1_name>/ADM2:<adm2_name> via make_geo_identifier (ADM0 upper-case, ADM1/ADM2 lower-case + ASCII-folded). The county_col/state_col YAML keys are legacy NASS-USA naming and bind to whatever administrative pair the source publishes — they are not restricted to county/state, but the values must already match geometry.parquet.
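The required-column list can be derived mechanically from the config. The sketch below assumes the nesting shown in the example YAML (builders under commodity) and uses hypothetical helper names; it also flags the geo_identifier column that R-YB-6 forbids.

```python
def required_columns(commodity: dict) -> list[str]:
    """Columns the yields parquet must carry, per R-YB-3 to R-YB-5."""
    yields = commodity["builders"]["yields"]
    cols = [
        "year",                    # literal name, not configurable
        yields["state_col"],       # parent_geo
        yields["county_col"],      # geo
        commodity["yield_col"],
        commodity["area_col"],
    ]
    if yields.get("production_col"):
        cols.append(yields["production_col"])
    if yields.get("crop_type"):
        cols.append("crop_type")   # literal name when a crop_type filter is set
    return cols


def check_columns(present, commodity):
    """Return (missing, forbidden) column names for a parquet's column list."""
    missing = [c for c in required_columns(commodity) if c not in present]
    forbidden = [c for c in ("geo_identifier",) if c in present]
    return missing, forbidden
```

Applied to the wheat_gbr.yaml example, `required_columns` returns the seven columns listed above.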

Section 3 — Yield data output contract

R-OUT-1. The system shall produce CSV deliverables conforming to the DeliveryRow schema with extra="forbid". (wiki/commodity_hindcast/entities/DeliveryRow.md.)

R-OUT-2. The system shall include in every delivery row the following identity fields: commodity, year, init_date, geo_identifier, variable, model.

R-OUT-3. The system shall encode the ADM level inside geo_identifier (e.g. ADM0:USA, ADM0:USA/ADM1:iowa, ADM0:USA/ADM1:iowa/ADM2:johnson) rather than as a separate column. (See R-YB-11 and wiki/commodity_hindcast/entities/Region.md for the canonical pattern.)
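Because the level is encoded in the identifier, a consumer can recover it without a separate column. A minimal sketch, assuming the three patterns in R-OUT-3 are exhaustive and that names never contain "/"; `adm_level` is a hypothetical helper.

```python
def adm_level(geo_identifier: str) -> int:
    """Return 0, 1, or 2 for an ADM0 / ADM1 / ADM2 identifier; raise on malformed input."""
    parts = geo_identifier.split("/")
    if len(parts) > 3:
        raise ValueError(f"too many segments: {geo_identifier!r}")
    for depth, part in enumerate(parts):
        prefix, sep, value = part.partition(":")
        # Segment i must be "ADM<i>:<non-empty value>".
        if prefix != f"ADM{depth}" or not sep or not value:
            raise ValueError(f"malformed geo_identifier: {geo_identifier!r}")
    return len(parts) - 1
```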

R-OUT-4. The system shall produce three CSVs per delivery event — one each at ADM0, ADM1, and ADM2.

R-OUT-5. Where the commodity is United States-based, the system shall report yield in bu/ac.

R-OUT-6. Where the commodity is non-US, the system shall report yield in kg/ha.
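For US commodities the delivery layer has to convert kg/ha to bu/ac. The arithmetic is sketched below using the exact pound and acre definitions and the standard US bushel weights; which weight table the delivery layer actually uses, and the commodity keys shown, are assumptions.

```python
LB_TO_KG = 0.45359237        # exact: international avoirdupois pound
ACRE_TO_HA = 0.40468564224   # exact: international acre

# Standard US bushel weights in pounds (assumed table; keys are illustrative).
BUSHEL_LB = {"corn": 56.0, "soybeans": 60.0, "wheat": 60.0}


def kg_ha_to_bu_ac(yield_kg_ha: float, commodity: str) -> float:
    """Convert a yield from kg/ha to bu/ac for the given commodity."""
    kg_per_bushel = BUSHEL_LB[commodity] * LB_TO_KG
    # kg/ha * ha/ac gives kg/ac; dividing by kg/bu gives bu/ac.
    return yield_kg_ha * ACRE_TO_HA / kg_per_bushel
```

For corn this works out to roughly 62.77 kg/ha per bu/ac, and for wheat or soybeans roughly 67.25 kg/ha per bu/ac.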

R-OUT-7. The system shall populate weather_correction_bu_ac as the structural identity (sim_yield_kg_ha_detrended × scale), not as a vintage delta. (PR #331; wiki/commodity_hindcast/concepts/weather_correction.md.)

R-OUT-8. The system shall not include any column in a delivery row whose name is not declared in the Pydantic schema.

R-OUT-9. While CI bands are written, the system shall enforce the monotonic ordering invariant lower_95 ≤ … ≤ lower_50 ≤ mean ≤ upper_50 ≤ … ≤ upper_95 defined in _validate_ci_ordering at delivery/schemas.py:176.
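The ordering invariant can be checked per row as sketched below. This is not the `_validate_ci_ordering` implementation: the column names follow the `lower_<level>` / `upper_<level>` pattern in the invariant (real columns may carry unit suffixes), and skipping NaN bands is an assumption motivated by R-CI-6.

```python
import math

LEVELS = (50, 68, 80, 95)


def ci_ordering_ok(row: dict, levels=LEVELS) -> bool:
    """True when lower_95 <= ... <= lower_50 <= mean <= upper_50 <= ... <= upper_95."""
    ordered = (
        [row[f"lower_{lvl}"] for lvl in sorted(levels, reverse=True)]
        + [row["mean"]]
        + [row[f"upper_{lvl}"] for lvl in sorted(levels)]
    )
    # Assumption: NaN bands are ignored, so only finite neighbours are compared.
    finite = [v for v in ordered if not math.isnan(v)]
    return all(a <= b for a, b in zip(finite, finite[1:]))
```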

R-OUT-10. The system shall preserve (commodity, year, init_date, geo_identifier) uniqueness across all rows of a single delivery file.

Section 4 — Hindcast deliverable (5-year walk-forward)

R-HIND-1. The system shall produce a walk-forward hindcast covering at least the most recent 5 complete season_years.

R-HIND-2. The system shall produce one HindcastSlice per fold, each comprising a separate train/test partition where training uses only data with season_year < fold_year.

R-HIND-3. The system shall produce a separate production fold trained on all available data and used as the inference artefact at forecast time.
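The fold layout in R-HIND-1 to R-HIND-3 can be sketched as a schedule over season years. The function name and the dict shape are hypothetical; the actual schedule is driven by experiment_protocol.

```python
def walk_forward_schedule(season_years, n_folds=5):
    """Return (folds, production) for a walk-forward hindcast.

    Each fold trains strictly on season_year < fold_year and tests on fold_year.
    The production fold trains on every available year and has no test set.
    """
    years = sorted(set(season_years))
    folds = [
        {"fold_year": fy, "train": [y for y in years if y < fy], "test": [fy]}
        for fy in years[-n_folds:]
    ]
    folds = [f for f in folds if f["train"]]  # a fold needs some history to train on
    production = {"fold_year": None, "train": years, "test": []}
    return folds, production
```

With season years 2010 to 2024 this yields folds for 2020 through 2024, the first trained on 2010 to 2019.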

R-HIND-4. When walk-forward CV completes, the system shall persist walk_forward_preds.parquet per fold at <run_dir>/preds/<experiment_key>/<fold_label>/.

R-HIND-5. When the hindcast deliver stage completes, the system shall produce CSVs at:

  • <run_dir>/delivery/Treefera_<experiment_key>_ADM0_Hindcast_<YYYYMMDD>.csv
  • <run_dir>/delivery/Treefera_<experiment_key>_ADM1_Hindcast_<YYYYMMDD>.csv
  • <run_dir>/delivery/Treefera_<experiment_key>_ADM2_Hindcast_<YYYYMMDD>.csv

R-HIND-6. The system shall include in each hindcast CSV, for every season_year, at least one init_date per WASDE-publication month within the season window. (wiki/commodity_hindcast/pipelines/hindcast.md.)

R-HIND-7. The system shall not emit a hindcast row for a season_year lacking a finalised reference yield benchmark.

R-HIND-8. The system shall produce, per fold, a train_preds.parquet and a walk_forward_preds.parquet with identical column schemas.

Section 5 — Forecast deliverable (produced at forecast time)

R-FCST-1. When cli run forecast is invoked with --season-year and --init-date, the system shall produce a forecast for that pair without retraining.

R-FCST-2. The system shall persist forecast artefacts under <run_dir>/forecast/<season_year>/<init_date>/. (PR #369; wiki/commodity_hindcast/pipelines/forecast.md.)

R-FCST-3. When the forecast deliver stage completes, the system shall produce CSVs at:

  • <run_dir>/forecast/<season_year>/<init_date>/delivery/Treefera_<experiment_key>_ADM0_Forecast_<init_date>.csv
  • …/ADM1_Forecast_<init_date>.csv
  • …/ADM2_Forecast_<init_date>.csv
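The forecast layout in R-FCST-2 and R-FCST-3 can be expressed as a small path builder. This sketch assumes the ADM1 and ADM2 files follow the same naming pattern as the ADM0 file, treats init_date as an opaque string (its exact format is not stated here), and uses string joins so that S3-style roots are not mangled; `forecast_delivery_paths` is a hypothetical helper.

```python
def forecast_delivery_paths(run_dir, experiment_key, season_year, init_date):
    """Return the three forecast CSV paths (ADM0, ADM1, ADM2) as strings."""
    base = f"{run_dir.rstrip('/')}/forecast/{season_year}/{init_date}/delivery"
    return [
        f"{base}/Treefera_{experiment_key}_{level}_Forecast_{init_date}.csv"
        for level in ("ADM0", "ADM1", "ADM2")
    ]
```

Every path sits under forecast/<season_year>/<init_date>/, so nothing is written into the canonical hindcast delivery/ directory.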

R-FCST-4. The system shall reuse the production-fit model artefacts from the canonical hindcast RunDir rather than retraining at forecast time.

R-FCST-5. The system shall not write to canonical hindcast artefact paths during forecast execution. (DESIGN.md; wiki/commodity_hindcast/concepts/hindcast_vs_forecast.md.)

R-FCST-6. Where multi-year forecasting is requested (multiple season_years per init_date), the system shall splice climatology via forecast_long_range_stub for years beyond zarr coverage. (PR #369.)

R-FCST-7. While weather observations are unavailable for the requested forecast horizon, the system shall fall back to climatological inputs and emit a trend-only signal rather than failing.

R-FCST-8. The system shall produce a forecast CSV whose row schema is identical to the hindcast CSV row schema.

Section 6 — Uncertainty intervals

R-CI-1. The system shall produce conformal prediction intervals via the CalibrationResult meta-model. (PR #361; wiki/commodity_hindcast/entities/CalibrationResult.md.)

R-CI-2. The system shall persist a per-residual-mode parquet sidecar at <run_dir>/conformal/<mode>.parquet.

R-CI-3. The system shall support exactly the four residual modes defined in models/meta_models/types.py:16:

  • hindcast_oos_per_init_date
  • hindcast_oos_per_year
  • hindcast_oos_fully_pooled
  • in_sample_pooled

R-CI-4. The system shall include CI bands in every delivery row at the coverage levels: 50%, 68%, 80%, 95%.

R-CI-5. Where 90 appears in delivery.ci_levels, the system shall additionally include 90% bands. (PR #331.)

R-CI-6. While the calibration set for a residual mode is empty (e.g. early walk-forward folds), the system shall emit NaN for the affected CI columns rather than raise.
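To make the empty-calibration rule concrete, the sketch below uses generic split-conformal intervals built from absolute residuals. It is not the CalibrationResult meta-model, whose internals are not described here; the function names and the symmetric-band construction are assumptions.

```python
import math


def conformal_half_width(abs_residuals, coverage):
    """Split-conformal half-width: the ceil((n + 1) * coverage)-th smallest residual.

    Returns NaN for an empty calibration set (R-CI-6) and inf when n is too
    small to support the requested coverage.
    """
    n = len(abs_residuals)
    if n == 0:
        return math.nan
    k = math.ceil((n + 1) * coverage)
    if k > n:
        return math.inf
    return sorted(abs_residuals)[k - 1]


def ci_bands(mean, abs_residuals, levels=(0.50, 0.68, 0.80, 0.95)):
    """Symmetric bands around the point forecast, one pair per coverage level."""
    bands = {}
    for level in levels:
        width = conformal_half_width(abs_residuals, level)
        pct = round(level * 100)
        bands[f"lower_{pct}"] = mean - width
        bands[f"upper_{pct}"] = mean + width
    return bands
```

Because the half-width is non-decreasing in the coverage level, bands built this way satisfy the monotonic ordering invariant by construction.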

R-CI-7. The system shall not emit CI bands that violate the monotonic ordering invariant in _validate_ci_ordering.

R-CI-8. The system shall log the chosen residual_mode and the on-disk path of the calibration sidecar to MLflow.

Section 7 — Behavioural-role contracts

R-ROLE-1. The system shall implement regression via a class deriving from AbstractRegressionImpl with the four abstract methods fit, predict, save_model, load_model. (wiki/commodity_hindcast/entities/Regressor.md.)
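The shape of the regressor role can be illustrated with a toy implementation. `RegressionImplSketch` below is a stand-in for AbstractRegressionImpl: only the four method names come from R-ROLE-1, while the signatures, the classmethod form of `load_model`, and the JSON artefact format are assumptions.

```python
import json
from abc import ABC, abstractmethod


class RegressionImplSketch(ABC):
    """Stand-in for AbstractRegressionImpl; signatures are assumed."""

    @abstractmethod
    def fit(self, X, y): ...

    @abstractmethod
    def predict(self, X): ...

    @abstractmethod
    def save_model(self, path): ...

    @classmethod
    @abstractmethod
    def load_model(cls, path): ...


class MeanRegressor(RegressionImplSketch):
    """Toy regressor that predicts the training mean for every row."""

    def __init__(self, mean=0.0):
        self.mean = mean

    def fit(self, X, y):
        self.mean = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean for _ in X]

    def save_model(self, path):
        # Single self-describing artefact, in the spirit of R-ROLE-3.
        with open(path, "w", encoding="utf-8") as fh:
            json.dump({"mean": self.mean}, fh)

    @classmethod
    def load_model(cls, path):
        # Rebuilt from the artefact alone, with no global state (R-ROLE-4).
        with open(path, encoding="utf-8") as fh:
            return cls(**json.load(fh))
```

An external model that can be wrapped this way, with a save and load round trip that needs nothing but the artefact path, satisfies item 4 of the compatibility checklist.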

R-ROLE-2. The system shall implement detrending via a class deriving from AbstractDetrend. (wiki/commodity_hindcast/entities/Detrender.md.)

R-ROLE-3. When persisting a regressor, the system shall write to a single artefact path resolvable via cloudpathlib.AnyPath.

R-ROLE-4. When loading a regressor, the system shall produce a working instance from the persisted artefact alone (no implicit reliance on global state, environment variables beyond INPUT_DATA_DIR, or shared in-process caches).

R-ROLE-5. The system shall implement reference-yield loading via a class deriving from ReferenceYieldLoader and registering itself for one kind: value in the discriminated union. (wiki/commodity_hindcast/entities/ReferenceYieldLoader.md.)

R-ROLE-6. Where bias correction is enabled, the system shall implement a BiasCorrector deriving from AbstractBiasCorrector.

R-ROLE-7. The system shall implement feature builders conforming to the BuilderFn Protocol at features/builders/interface.py:25.

Section 8 — Pipeline isolation invariants

R-ISO-1. The system shall execute the canonical hindcast pipeline as a single self-contained cli run hindcast invocation.

R-ISO-2. The system shall not modify or overwrite hindcast artefacts during forecast execution.

R-ISO-3. While two pipelines target the same commodity simultaneously, the system shall not corrupt the shared MLflow tracking database. (Known issue: parallel runs cause SQLAlchemy OperationalError; same-commodity runs must be serialised.)

R-ISO-4. The system shall log every fold's hyperparameters, metrics, and artefact paths to MLflow with run tags identifying the commodity and experiment_key.

R-ISO-5. The system shall produce RunDir outputs inside <INPUT_DATA_DIR>/runs/<YYYYMMDD_HHMMSS>_<experiment_name>/. (wiki/commodity_hindcast/entities/RunDir.md.)

R-ISO-6. The system shall not depend on artefacts from a different RunDir at runtime, except where the forecast pipeline reuses production-fit artefacts as declared by the user.

Section 9 — Validation and preflight

R-VAL-1. When cli run is invoked, the system shall execute the appropriate preflight check-set before any compute starts. (wiki/commodity_hindcast/pipelines/preflight.md.)

R-VAL-2. While any required input path is missing, the system shall halt with an actionable error message naming the missing path.

R-VAL-3. The system shall produce metrics including MAE, RMSE, and detrended MAE per fold. (wiki/commodity_hindcast/pipelines/evaluate.md.)

R-VAL-4. The system shall produce diagnostic plots — at minimum: per-fold scatter, rolling national vs WASDE, ADM1 error heatmap, residual-predictability scatter, and partial-dependence plots.

R-VAL-5. When evaluation completes, the system shall emit <run_dir>/reports/stage5_metrics.txt, stage5_metrics_ADM1.txt, stage5_metrics_ADM2.txt.

R-VAL-6. The system shall include a smoke-test target in the package Makefile that runs end-to-end on a small subset of years for the new commodity.

R-VAL-7. The system shall ship at least one unit test per new public class added under tests/unit/commodity_hindcast/.

Section 10 — Optional capabilities (nice-to-haves)

10.1 Management practices (irrigation, planted area, rotation)

R-OPT-1. Where management-practice features are enabled, the system shall include the chosen indicators (e.g. irrigation share, planted area, crop rotation flag) as exogenous columns in pred.parquet.

R-OPT-2. Where management-practice features are enabled, the system shall not assume the indicators are available for every county-year — gaps shall be imputed via a declared EditOperation (typically PanelTrailingMedian or DeductiveImpute). (wiki/commodity_hindcast/entities/EditRuleConfig.md.)

R-OPT-3. Where management-practice features are enabled, the system shall expose a per-fold marginal-effect estimate of each practice (e.g. "irrigated counties yield X bu/ac higher on average") as a row in <run_dir>/reports/management_effects.csv.

R-OPT-4. The system shall not allow a management-practice feature to leak future information (e.g. end-of-season planted area used as a within-season feature).

10.2 Explainability

R-OPT-5. Where explainability is enabled, the system shall emit per-fold partial-dependence plots and feature-importance rankings to <run_dir>/reports/explainability/.

R-OPT-6. Where explainability is enabled and the underlying regressor supports it, the system shall persist SHAP values or equivalent attribution per delivered prediction.

R-OPT-7. Where explainability is enabled, the system shall produce a stable, documented schema for the attribution outputs so risk-intelligence consumers can rely on them.

10.3 Scenario shocking (risk-intelligence reuse)

R-OPT-8. Where scenario-shock mode is enabled, the system shall accept a scenario_overrides payload mapping (season_year, init_date, geo_identifier) → feature_name → value and produce a parallel set of forecast CSVs labelled …_Forecast_<init_date>_<scenario_id>.csv.

R-OPT-9. Where scenario-shock mode is enabled, the system shall not overwrite the canonical (no-shock) forecast outputs. Scenario outputs shall coexist alongside the canonical CSVs.

R-OPT-10. Where scenario-shock mode is enabled, the system shall log the scenario_id and override payload to MLflow tags for traceability.

R-OPT-11. Where the scenario implies a climate shock (e.g. uniform +2°C, drought, EDD multiplier), the system shall accept a parametric WeatherShock config that drives materialise_forecast_indices to apply the shock at zarr-read time rather than mutating cached parquet outputs.

R-OPT-12. While running in scenario-shock mode, the system shall mark every output row with a scenario_id column (or equivalent metadata) so downstream consumers can join shocked vs canonical forecasts.

R-OPT-13. The system shall not produce scenario-shock outputs that are mathematically impossible (e.g. negative yields, CI bands violating monotonicity) — shocks shall be clipped or the row marked invalid.

R-OPT-14. Where scenario-shock mode is enabled, the system shall expose the canonical-vs-shocked yield delta as a column in the scenario CSV (yield_delta_<scenario_id>_bu_ac or equivalent in kg_ha).

Section 11 — Compatibility checklist

Use when evaluating whether an external/3rd-party approach can fit:

| #  | Question | If "no" then... |
|----|----------|-----------------|
| 1  | Does it produce yield predictions at ADM0/ADM1/ADM2 resolution? | Will not fit delivery layer; fundamental incompatibility. |
| 2  | Does it produce 5+ years of walk-forward hindcast? | Cannot be benchmarked against the existing client deliverables. |
| 3  | Does it produce uncertainty intervals at multiple coverage levels? | Cannot meet the client deliverable contract. |
| 4  | Can its regressor implement AbstractRegressionImpl (fit/predict/save/load)? | Wrap in an adapter or rewrite. |
| 5  | Can its detrending implement AbstractDetrend? | Use LinearStateDetrend as fallback. |
| 6  | Does it accept INPUT_DATA_DIR-anchored paths via cloudpathlib.AnyPath? | Path-handling rewrite required. |
| 7  | Does it accept a Pydantic-settings YAML config (no data_root key)? | Config-loading shim required. |
| 8  | Does it integrate with MLflow tracking (run tags + artefact logging)? | Lose lineage; not a hard blocker. |
| 9  | Does it emit forecasts under forecast/<season_year>/<init_date>/? | Path adapter required. |
| 10 | Is it isolated from canonical hindcast artefacts at forecast time? | Refactor required to enforce isolation. |
| 11 | Does it support all four residual modes for conformal calibration? | Must declare which mode it supports; downgrade behaviour explicitly. |
| 12 | Does it support the extra="forbid" strict-schema delivery contract? | CSV post-processor required. |

A proposal that fails on items 1–3 is fundamentally incompatible. Items 4–10 typically need adapter code. Items 11–12 are usually solvable with a thin wrapper.

Section 12 — Stress-test scenarios (for reviewers)

Walk through each of these "what if" prompts against the proposed approach. For each, the proposal should have a defined failure mode and recovery story.

Network and infrastructure

  • S3 outage — what if the bucket holding weather indices is unreachable for an hour mid-forecast?
  • MLflow DB locked — what if a parallel same-commodity run has the SQLite backend locked?
  • INPUT_DATA_DIR typo — what if the env var points at a sibling experiment's run-dir?

Data quality

  • Garbage data — what if NASS publishes a county yield with a missing decimal (e.g. 1700 instead of 170.0)?
  • Missing reference benchmark — what if WASDE skips a publication month?
  • Sparse counties — what if 30% of ADM2 units have <3 years of yield history?

Scale and traffic

  • 100x traffic — what if 50 forecasts are requested for the same season_year simultaneously?
  • Long-range zarr exhaustion — what if a forecast asks for season_year = current + 5?

Adversarial

  • Malicious config — what if a user submits a YAML that smuggles in a data_root key (forbidden by R-CFG-3) pointing at another experiment's run-dir?
  • Schema-evasion — what if a builder produces a column outside the declared feature_cols?

Modelling failure modes

  • Climate shock — what does the model output if EDD is 3σ above climatology for the entire season?
  • Calibration empty — what if the chosen residual_mode has zero residuals because of fold-window edge effects?
  • Production-fit collapse — what if production-fit fails silently and downstream stages run on stale artefacts?
  • Detrender divergence — what if the partial-pooling detrender fails to converge for a specific commodity?
  • Sign-instability — what if the PCA-Ridge pipeline produces sign-flipped components on different folds?

Stakeholder trust

  • "If this system failed spectacularly, what would be the consequences for client trust?"
  • "What would make a client demand a refund or unsubscribe?"
  • "What's the worst single-row delivery error a client could spot in a published CSV?"

Open questions

  • Should management-practice features be a first-class FeatureBuilder variant or a separate post-hoc augmentation?
  • Should scenario-shock outputs share the canonical RunDir or live in a sibling scenarios/ directory?
  • Is 5 years the right minimum hindcast horizon, or should it be tied to the WASDE history depth per commodity?
  • Should explainability outputs have a public schema versioned independently of DeliveryRow?
  • Should WeatherShock be a CLI-level abstraction or a Python-API-only abstraction?
  • For multi-year forecasts where output collapses to trend-only, should the system emit a degraded-confidence flag?
  • Should R-OPT-14 (canonical-vs-shocked delta) be promoted to mandatory if scenario-shock mode becomes a primary product?

Cross-references

  • Master plan: IMPLEMENTATION_PLAN.md
  • Wiki entry point: wiki/commodity_hindcast/synthesis/overview.md
  • Domain model: domain_model/README.md
  • DESIGN.md (in-package, contractual): market_insights_models/src/commodity_hindcast/DESIGN.md
  • Pipeline pages: wiki/commodity_hindcast/pipelines/{hindcast,forecast,deliver,evaluate,preflight}.md
  • Concept pages: wiki/commodity_hindcast/concepts/{hindcast_vs_forecast,conformal_modes,residual_modes,weather_correction,s3_path_safety,input_data_dir_contract,climo_materialisation,adm_levels}.md
  • Entity pages: wiki/commodity_hindcast/entities/{ExperimentConfig,DeliveryRow,HindcastDelivery,CalibrationResult,Regressor,Detrender,ReferenceYieldSpec,ReferenceYieldLoader,Region,RunDir,EditRuleConfig}.md