ADR-002: CalibrationResult is a persistable aggregate sidecar
Status: Accepted (retroactively documented 2026-05-08)
Date: 2026-04-30 (commit f83eed5d, "polymorphic apply_conformal(experiment, ...) with CalibrationResult"); landed on main via PR #361 on 2026-05-02 (commit 2e88cd85)
Authors: ai-tommytf (per git log on models/meta_models/conformalise.py)
Context
Before PR #361, conformal calibration was a transient computation that lived
inside the postprocess walk-forward loop. Half-widths were computed from
residuals, embedded directly into run_dir/postprocessed/national.parquet as
lower/upper columns, and discarded. There was no first-class on-disk
representation of "the calibration that was fitted from this run".
That design failed for two concrete reasons:
- Forecast had no residuals to fit from. At forecast time the production
  fold's obs_yield_kg_ha is NaN (the harvest has not happened), so a
  forecast-time call site cannot reconstruct the OOS residuals that
  postprocess used. The CV folds it needs to scan are the same artefacts
  postprocess already scanned. Re-fitting per init_date duplicated the
  walk-forward residual scan that postprocess had just done.
- Multiple residual recipes co-exist. PR #361 introduced four
  residual_mode values
  (market_insights_models/src/commodity_hindcast/models/meta_models/types.py:16)
  describing which residuals feed the quantile: per init_md, per year,
  fully pooled, and an in-sample fallback. Embedding "the calibration" in
  national.parquet collapses all four to whichever one happened to run;
  consumers who want a different mode have to re-run postprocess.
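To make the difference between the residual recipes concrete, the sketch below computes a half-width from the same residuals under three groupings. It is a minimal illustration only: the record layout, the function names, the sample numbers, and the finite-sample quantile rule are all assumptions and are not taken from conformalise.py.

```python
import math
from collections import defaultdict

# Hypothetical OOS residuals: (fold_year, init_md, observed - predicted).
RESIDUALS = [
    (2019, "06-01", -0.30), (2019, "08-01", 0.10),
    (2020, "06-01", 0.50), (2020, "08-01", -0.20),
    (2021, "06-01", -0.40), (2021, "08-01", 0.05),
]


def half_width(residuals, level):
    """Split-conformal style half-width: a conservative quantile of |residual|."""
    scores = sorted(abs(r) for r in residuals)
    n = len(scores)
    # Finite-sample rank ceil((n + 1) * level), clipped to the largest score.
    rank = min(n, math.ceil((n + 1) * level))
    return scores[rank - 1]


def grouped_half_widths(rows, key, level):
    """Group residuals by key(row) and compute one half-width per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {k: half_width(v, level) for k, v in groups.items()}


LEVEL = 0.5  # deliberately low so the tiny sample gives distinct answers
per_init_md = grouped_half_widths(RESIDUALS, lambda r: r[1], LEVEL)
per_year = grouped_half_widths(RESIDUALS, lambda r: r[0], LEVEL)
pooled = half_width([r[2] for r in RESIDUALS], LEVEL)
```

The three groupings give different widths from identical residuals, which is why a single lower/upper pair embedded in the national frame cannot serve a consumer that wants another mode.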
The wiki entity page records that an earlier domain_model/AGGREGATES.md
entry described CalibrationResult as "a transient value returned from
stages/run_meta_models.py"
(wiki/commodity_hindcast/entities/CalibrationResult.md "Correction to
AGGREGATES.md" section, lines 24-35). That description has been corrected;
wiki/domain_model/AGGREGATES.md:214 now calls it "a persisted aggregate
sidecar, not a transient value".
Decision
CalibrationResult is a frozen dataclass with first-class persistence,
written once per configured mode during postprocess and loaded by every
downstream consumer.
- Frozen dataclass declaration at
  market_insights_models/src/commodity_hindcast/models/meta_models/conformalise.py:106-117.
- Long-format parquet serialisation via to_frame (conformalise.py:136),
  save (conformalise.py:215), and load (conformalise.py:225). I/O routes
  through
  treefera_market_insights.shared.utils.dataframes.read_dataframe /
  write_dataframe and accepts cloudpathlib.AnyPath for S3.
- One sidecar per mode at the canonical path
  run_dir/conformal/{mode}.parquet, computed by calibration_path
  (stages/run_meta_models.py:60).
- Postprocess persists every mode listed in
  config.postprocess.conformalise via fit_and_save_all_configured
  (stages/run_meta_models.py:85); the first-listed mode is the primary
  calibration applied at runtime by primary_calibration
  (stages/run_meta_models.py:119).
- Application surface:
  predict_interval(sim, *, fold_year=None, init_md=None) dispatches by
  populated field (conformalise.py:360-398). The four mode handlers
  (conformalise.py:470, :493, :526, :546) each populate exactly one of
  per_init_md / per_year / pooled, which is the one-of invariant noted in
  the dataclass docstring (conformalise.py:108-117).
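The one-of invariant and the dispatch by populated field can be sketched as follows. This is a hedged illustration, not the declaration in conformalise.py: the class name CalibrationSketch, the nested-dict field shapes, the {level: (lower, upper)} return value, and the __post_init__ check are assumptions. The source only says the invariant is noted in the docstring and upheld by the mode handlers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CalibrationSketch:
    """Illustrative stand-in for CalibrationResult; field shapes are assumed."""

    residual_mode: str
    # Exactly one of the three fields below is expected to be populated.
    per_init_md: Optional[dict] = None  # {init_md: {level: half_width}}
    per_year: Optional[dict] = None     # {fold_year: {level: half_width}}
    pooled: Optional[dict] = None       # {level: half_width}

    def __post_init__(self):
        # Assumption: enforce the one-of invariant eagerly for the sketch.
        fields = (self.per_init_md, self.per_year, self.pooled)
        if sum(f is not None for f in fields) != 1:
            raise ValueError(
                "exactly one of per_init_md / per_year / pooled must be set"
            )

    def predict_interval(self, sim, *, fold_year=None, init_md=None):
        """Return {level: (lower, upper)} around the point value sim."""
        if self.per_init_md is not None:
            widths = self.per_init_md[init_md]
        elif self.per_year is not None:
            widths = self.per_year[fold_year]
        else:
            widths = self.pooled
        return {level: (sim - w, sim + w) for level, w in widths.items()}


by_init = CalibrationSketch("per_init_md", per_init_md={"06-01": {0.9: 2.0}})
interval_by_init = by_init.predict_interval(10.0, init_md="06-01")

everything = CalibrationSketch("pooled", pooled={0.9: 1.5})
interval_pooled = everything.predict_interval(10.0)
```

Because only one field is ever populated, callers can pass both fold_year and init_md unconditionally and let the object decide which one is relevant.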
Consequences
Positive
- Forecast, delivery, and diagnostics read a CV-derived calibration
  without re-running postprocess; get_or_fit_calibration
  (stages/run_meta_models.py:100-116) loads the sidecar when present and
  only fits on demand if it is missing.
- A hindcast run can pre-fit several candidate calibrations without
  forcing the forecast to use any specific one
  (stages/run_meta_models.py:119-135); per-mode sidecars allow comparing
  modes side-by-side from the same run_dir
  (wiki/commodity_hindcast/sources/prs/PR-361.md "95% half-width
  comparison across modes").
- Round-trip safety is enforced: to_frame raises ValueError rather than
  serialising an empty sidecar (conformalise.py:187-197).
- Self-describing: residual_mode, method, experiment_key, n_residuals,
  per_year_fallback persist as columns (conformalise.py:138-212); load
  reconstructs without out-of-band metadata.
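The load-when-present, fit-when-missing behaviour described above can be sketched with the standard library alone. The helper names get_or_fit and save_calibration, the "pooled" mode label, and the use of JSON text in place of the real parquet payload are assumptions made to keep the example self-contained; only the run_dir/conformal/{mode}.parquet layout comes from this ADR.

```python
import json
import tempfile
from pathlib import Path


def calibration_path(run_dir, mode):
    """Canonical sidecar location: run_dir/conformal/{mode}.parquet."""
    return Path(run_dir) / "conformal" / f"{mode}.parquet"


def save_calibration(run_dir, mode, calibration):
    """Persist one sidecar. JSON text stands in for the parquet payload."""
    path = calibration_path(run_dir, mode)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(calibration))
    return path


def get_or_fit(run_dir, mode, fit):
    """Load the sidecar when present; call fit() only when it is missing."""
    path = calibration_path(run_dir, mode)
    if path.exists():
        return json.loads(path.read_text())
    return fit()


fit_calls = []


def expensive_fit():
    """Hypothetical stand-in for the CV residual scan."""
    fit_calls.append("fit")
    return {"0.95": 1.5}


with tempfile.TemporaryDirectory() as run_dir:
    fitted = get_or_fit(run_dir, "pooled", expensive_fit)  # no sidecar: fits
    save_calibration(run_dir, "pooled", fitted)            # postprocess's job
    loaded = get_or_fit(run_dir, "pooled", expensive_fit)  # sidecar: loads
```

After the sidecar exists, the second call returns the same calibration without invoking the fit again, which is the saving the forecast path relies on.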
Negative
- One additional artefact per configured mode to persist (extra disk).
- Schema changes to CalibrationResult columns require migrating existing
  sidecars; consumers must not hardcode the old paths
  (wiki/commodity_hindcast/sources/prs/PR-361.md "Lessons captured").
Neutral
- The commodity_ filename prefix was dropped in PR #361 because run_dir
  already encodes the commodity-region
  (wiki/commodity_hindcast/sources/prs/PR-361.md "Filename layout").
Alternatives considered
- Keep transient (status quo ante). Rejected because forecast had to
  re-fit calibration per init_date, duplicating the CV residual scan
  postprocess had just done; and because forecast time has no
  production-fold residuals to fit from at all (production obs is NaN).
- Embed in postprocessed/national.parquet. Rejected because that
  collapses calibration to whichever single mode postprocess ran with;
  storing four modes side-by-side per (year, init_date) row would bloat
  the wide national frame and tie consumers to its schema.
- In-sample-only fallback as the default. Rejected on statistical
  grounds: _in_sample_pooled (conformalise.py:546-568) emits a loguru
  warning that intervals are biased narrow because in-sample residuals
  under-estimate true OOS uncertainty; it is retained only as the
  fallback when no CV folds exist.
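The narrow bias of in-sample residuals can be seen in a toy case. The sketch below uses a plain mean as the "model" and made-up observations; both are assumptions chosen for brevity and have nothing to do with the models used in the repository.

```python
def in_sample_abs_residuals(values):
    """Residuals against a mean fitted on all points, including the scored one."""
    mean_all = sum(values) / len(values)
    return [abs(v - mean_all) for v in values]


def leave_one_out_abs_residuals(values):
    """Residuals against a mean fitted on every point except the scored one."""
    total, n = sum(values), len(values)
    return [abs(v - (total - v) / (n - 1)) for v in values]


yields = [2.0, 4.0, 9.0, 5.0]  # made-up observations
in_sample = in_sample_abs_residuals(yields)
out_of_sample = leave_one_out_abs_residuals(yields)
```

For a mean predictor each leave-one-out residual equals n/(n-1) times its in-sample counterpart, so every quantile of the in-sample residuals is at most as wide as the out-of-sample one. Intervals built from in-sample residuals therefore tend to under-cover.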
Verification
- Round-trip property:
  tests/unit/commodity_hindcast/test_apply_conformal_experiment.py:146
  (test_apply_conformal_hindcast_oos_per_init_date_round_trips_brackets_and_orders)
  exercises cal.save(path) then CalibrationResult.load(path) and asserts
  dictionary equality across per_init_md keys and levels.
- Empty-row guard: to_frame() raises ValueError
  (conformalise.py:187-197); dedicated test coverage is
  [PLACEHOLDER: not located in the audit window].
- One-of invariant encoded by the four mode handlers
  (conformalise.py:470, :493, :526, :546).
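The round-trip property and the empty-row guard can be illustrated without the real I/O stack. In this sketch CSV text in memory stands in for the long-format parquet file, and the column names init_md, level, and half_width as well as the helper names are assumptions rather than the actual schema.

```python
import csv
import io

COLUMNS = ["init_md", "level", "half_width"]


def to_long_rows(per_init_md):
    """Flatten {init_md: {level: half_width}} into one row per entry."""
    rows = [
        {"init_md": init_md, "level": level, "half_width": width}
        for init_md, widths in per_init_md.items()
        for level, width in widths.items()
    ]
    if not rows:
        # Mirrors the guard described above: never write an empty sidecar.
        raise ValueError("refusing to serialise an empty calibration")
    return rows


def from_long_rows(rows):
    """Rebuild the nested mapping from long-format rows of strings."""
    nested = {}
    for row in rows:
        level = float(row["level"])
        nested.setdefault(row["init_md"], {})[level] = float(row["half_width"])
    return nested


def round_trip(per_init_md):
    """Serialise to CSV text and parse it back."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(to_long_rows(per_init_md))
    buffer.seek(0)
    return from_long_rows(csv.DictReader(buffer))


original = {
    "06-01": {0.8: 1.25, 0.95: 2.5},
    "08-01": {0.8: 0.75, 0.95: 1.5},
}
restored = round_trip(original)
```

A test in this style asserts restored == original and separately checks that an empty mapping raises ValueError instead of producing a file.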
References
- market_insights_models/src/commodity_hindcast/models/meta_models/conformalise.py:106-117
  (frozen dataclass declaration)
- models/meta_models/conformalise.py:136 (to_frame)
- models/meta_models/conformalise.py:187-197 (empty-row guard)
- models/meta_models/conformalise.py:215 (save)
- models/meta_models/conformalise.py:225 (load classmethod)
- models/meta_models/conformalise.py:360-398 (predict_interval dispatch)
- models/meta_models/conformalise.py:470, :493, :526, :546 (four mode
  handlers)
- models/meta_models/types.py:16 (ResidualMode literal)
- stages/run_meta_models.py:60 (calibration_path)
- stages/run_meta_models.py:85 (fit_and_save_all_configured)
- stages/run_meta_models.py:100-116 (get_or_fit_calibration)
- stages/run_meta_models.py:119 (primary_calibration)
- tests/unit/commodity_hindcast/test_apply_conformal_experiment.py:146
  (round-trip test)
- wiki/commodity_hindcast/entities/CalibrationResult.md
- wiki/commodity_hindcast/sources/prs/PR-361.md
- wiki/commodity_hindcast/concepts/conformal_modes.md
- wiki/commodity_hindcast/concepts/residual_modes.md
- wiki/domain_model/AGGREGATES.md:214
- PR #361 (commit 2e88cd85, merged 2026-05-02)