Bias Correction
What it is
The bias corrector adjusts the national-level simulated yield to account for the gap between the counties the model covers (included_geo_identifiers, the top-N% production universe) and the full NASS county panel. Counties outside the model's training universe differ systematically from modelled counties: they are typically lower-yielding and located in states with thinner coverage. As a result, the raw production-weighted average over modelled counties is a biased estimate of the true NASS national yield. It overestimates when the excluded counties are lower-yielding and underestimates when they are higher-yielding.
AbstractBiasCorrector
models/meta_models/bias_correction.py (bias_correction.py:19) defines an ABC with the interface:
| Method | Signature | Notes |
|---|---|---|
| fit | (nass_panel, included_geo_identifiers, test_year) → None | Computes internal bias estimate for the fold |
| bias_kg_ha | @property → float | Raises RuntimeError if fit not called first |
| apply_national | (sim_kg_ha: float) → float | Returns sim - bias_kg_ha |
| apply_frame | (df, *, sim_col) → pd.DataFrame | Subtracts bias_kg_ha from sim_col on a copy |
| save | (path) → None | Pickles self to bias_corrector.pkl; creates parent dirs |
| load | classmethod (path) → AbstractBiasCorrector | Raises FileNotFoundError if missing |
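The interface in the table can be pictured with a minimal sketch. This is not the code in bias_correction.py: the class names AbstractBiasCorrectorSketch and ConstantBiasCorrector are hypothetical, the persistence methods are left out, and the only behaviour carried over is what the table states (fit before reading bias_kg_ha, subtraction in apply_national and apply_frame).

```python
from abc import ABC, abstractmethod


class AbstractBiasCorrectorSketch(ABC):
    """Hypothetical stand-in that mirrors the documented interface."""

    @abstractmethod
    def fit(self, nass_panel, included_geo_identifiers, test_year) -> None:
        """Compute the internal bias estimate for one fold."""

    @property
    @abstractmethod
    def bias_kg_ha(self) -> float:
        """Fitted bias in kg/ha; must raise RuntimeError before fit()."""

    def apply_national(self, sim_kg_ha: float) -> float:
        # Documented behaviour: simulated yield minus the fitted bias.
        return sim_kg_ha - self.bias_kg_ha

    def apply_frame(self, df, *, sim_col):
        # Works on any frame-like object with .copy() and column access,
        # for example a pandas DataFrame. The input is left untouched.
        out = df.copy()
        out[sim_col] = out[sim_col] - self.bias_kg_ha
        return out


class ConstantBiasCorrector(AbstractBiasCorrectorSketch):
    """Toy subclass (an assumption for illustration): a fixed bias value."""

    def __init__(self, value_kg_ha: float) -> None:
        self._value = value_kg_ha
        self._fitted = False

    def fit(self, nass_panel, included_geo_identifiers, test_year) -> None:
        self._fitted = True

    @property
    def bias_kg_ha(self) -> float:
        if not self._fitted:
            raise RuntimeError("call fit() before reading bias_kg_ha")
        return self._value


corrector = ConstantBiasCorrector(150.0)
corrector.fit(nass_panel=None, included_geo_identifiers=[], test_year=2020)
corrected = corrector.apply_national(9000.0)  # 9000.0 - 150.0 = 8850.0
```

Keeping apply_national and apply_frame on the base class means a concrete corrector only has to supply fit and bias_kg_ha, which is one plausible reason for the split shown in the table.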
NoBiasCorrector
Degenerate pass-through: bias_kg_ha = 0.0, fit is a no-op. Used when BiasCorrectorConfig.kind == "none", which is the default for corn and other commodities where the 95-percentile production universe is already broad enough that the out-of-universe gap is negligible.
CoverageBiasCorrector
Accounts for the coverage gap by computing for each lookback year y:
where in denotes the modelled counties (those in included_geo_identifiers), out denotes the remaining counties, and all yields are area-weighted. The annual bias_y values are then reduced to a single scalar by the configured reduction_method (default "median").
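CoverageBiasCorrector.fit(nass_panel, included_geo_identifiers, test_year) iterates over the lookback years in the inclusive range [test_year - n_lookback_years, test_year - 1], so the test year itself never enters the estimate. For example, test_year = 2020 with n_lookback_years = 5 covers 2015 through 2019. Accessing bias_kg_ha before calling fit raises RuntimeError.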
See bias_correction.py:107 for the full implementation.
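As a rough illustration of the lookback-and-reduce procedure, the sketch below computes one bias value per lookback year and reduces them to a scalar. Several things here are assumptions rather than facts about bias_correction.py: the annual bias is taken to be the area-weighted yield of the modelled counties minus the area-weighted yield of all counties (a sign convention chosen so that subtracting the bias moves the modelled average toward the full-panel average), the panel is a plain list of dicts instead of a pandas frame, the field names year, geo_identifier, yield_kg_ha and area_ha are hypothetical, and "mean" is an assumed alternative reducer.

```python
from statistics import mean, median

# "median" is the documented default; "mean" is an assumed alternative.
REDUCERS = {"median": median, "mean": mean}


def area_weighted_yield(rows):
    """Area-weighted mean yield (kg/ha) over a list of county rows."""
    total_area = sum(r["area_ha"] for r in rows)
    return sum(r["yield_kg_ha"] * r["area_ha"] for r in rows) / total_area


def coverage_bias_kg_ha(rows, included_geo_identifiers, test_year,
                        n_lookback_years=5, reduction_method="median"):
    """Reduce per-year coverage bias over the lookback window to a scalar."""
    included = set(included_geo_identifiers)
    annual_bias = []
    # Years test_year - n_lookback_years .. test_year - 1, inclusive.
    for year in range(test_year - n_lookback_years, test_year):
        year_rows = [r for r in rows if r["year"] == year]
        in_rows = [r for r in year_rows if r["geo_identifier"] in included]
        if not in_rows:
            continue  # no modelled counties reported for this year
        # Assumed definition: modelled-county yield minus full-panel yield.
        annual_bias.append(
            area_weighted_yield(in_rows) - area_weighted_yield(year_rows)
        )
    if not annual_bias:
        raise ValueError("no lookback year contains modelled counties")
    return float(REDUCERS[reduction_method](annual_bias))


# Toy panel: county "A" is modelled, county "B" is outside the universe.
rows = []
for year in range(2015, 2021):
    rows.append({"year": year, "geo_identifier": "A",
                 "yield_kg_ha": 10000.0, "area_ha": 100.0})
    rows.append({"year": year, "geo_identifier": "B",
                 "yield_kg_ha": 5000.0, "area_ha": 100.0})

bias = coverage_bias_kg_ha(rows, ["A"], test_year=2020)  # 10000 - 7500 = 2500.0
```

In the toy panel the modelled county yields 10000 kg/ha while the full panel averages 7500 kg/ha, so the sketch reports a bias of 2500 kg/ha, and the 2020 rows are ignored because the window stops at test_year - 1.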
Per-fold persistence
One bias corrector per fold, stored at:
Forecast delivery reloads the corrector at results.bias_corrector_path. Presence is guarded by HindcastSlice.has_bias_corrector.
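The save and reload cycle can be sketched as follows. The helper names save_corrector and load_corrector, the fold_2020 directory, and the SimpleNamespace stand-in for a fitted corrector are all hypothetical; the sketch only mirrors the documented behaviour of pickling to bias_corrector.pkl, creating parent directories, and raising FileNotFoundError when the file is missing.

```python
import pickle
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_corrector(corrector, path):
    """Pickle a corrector to `path`, creating parent directories first."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(corrector, fh)


def load_corrector(path):
    """Unpickle a corrector, failing loudly if the file is absent."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"no bias corrector at {path}")
    with path.open("rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical per-fold layout; the real location may differ.
    fold_path = Path(tmp) / "fold_2020" / "bias_corrector.pkl"
    # Stand-in object; a fitted corrector would be pickled the same way.
    save_corrector(SimpleNamespace(bias_kg_ha=120.0), fold_path)
    has_corrector = fold_path.is_file()  # analogous to a presence guard
    restored_bias = load_corrector(fold_path).bias_kg_ha
```

Checking for the file before loading plays the same role as the has_bias_corrector guard: delivery code can skip the correction instead of failing when a fold has no stored corrector.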
Configuration
Controlled by the PostprocessConfig.bias_corrector field, which is of type BiasCorrectorConfig:
postprocess:
  bias_corrector:
    kind: coverage # or "none"
    n_lookback_years: 5
    reduction_method: median
See BiasCorrectorConfig for the full config schema.
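To show how such a block might be validated once parsed, here is a minimal sketch using a frozen dataclass. It is not the real BiasCorrectorConfig: the class name BiasCorrectorConfigSketch, the parse helper, the default of 5 for n_lookback_years, and the validation rules are assumptions, and the input is an already-parsed dict so that no YAML library is needed.

```python
from dataclasses import dataclass

ALLOWED_KINDS = ("coverage", "none")


@dataclass(frozen=True)
class BiasCorrectorConfigSketch:
    """Hypothetical mirror of the three documented config keys."""

    kind: str = "none"
    n_lookback_years: int = 5  # assumed default, taken from the example
    reduction_method: str = "median"

    def __post_init__(self):
        # Reject values outside the two kinds named in the documentation.
        if self.kind not in ALLOWED_KINDS:
            raise ValueError(f"kind must be one of {ALLOWED_KINDS}, got {self.kind!r}")
        if self.n_lookback_years < 1:
            raise ValueError("n_lookback_years must be at least 1")


def parse_bias_corrector_config(raw: dict) -> BiasCorrectorConfigSketch:
    """Pull postprocess.bias_corrector out of a parsed config mapping."""
    section = raw.get("postprocess", {}).get("bias_corrector", {})
    return BiasCorrectorConfigSketch(**section)


raw = {
    "postprocess": {
        "bias_corrector": {
            "kind": "coverage",
            "n_lookback_years": 5,
            "reduction_method": "median",
        }
    }
}
config = parse_bias_corrector_config(raw)
```

Validating in __post_init__ surfaces a mistyped kind at load time instead of at the first fold, which is usually the cheaper place to fail.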
Cross-references
- MetaModel: the bias corrector sits in the MetaModel layer alongside conformal calibration
- BiasCorrectorConfig: configuration schema
- Pipeline: postprocess, where bias correctors are fitted per fold
- Source: meta_models, a detailed code walkthrough of bias_correction.py