
Bias Correction

What it is

The bias corrector adjusts the national-level simulated yield to account for the gap between the counties the model covers (included_geo_identifiers, the top-N% production universe) and the full NASS county panel. Counties outside the model's training universe differ systematically from modelled counties: they are typically lower-yielding and sit in states with thinner coverage. As a result, the raw production-weighted average over modelled counties misstates the true NASS national yield, overestimating it when the excluded counties yield less and underestimating it when they yield more.

AbstractBiasCorrector

models/meta_models/bias_correction.py (bias_correction.py:19) defines an ABC with the interface:

| Method | Signature | Notes |
| --- | --- | --- |
| fit | (nass_panel, included_geo_identifiers, test_year) → None | Computes internal bias estimate for the fold |
| bias_kg_ha | @property → float | Raises RuntimeError if fit not called first |
| apply_national | (sim_kg_ha: float) → float | Returns sim - bias_kg_ha |
| apply_frame | (df, *, sim_col) → pd.DataFrame | Subtracts bias_kg_ha from sim_col on a copy |
| save | (path) → None | Pickles self to bias_corrector.pkl; creates parent dirs |
| load | classmethod (path) → AbstractBiasCorrector | Raises FileNotFoundError if missing |
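The fit-then-read contract in the table can be illustrated with a minimal sketch. The class names SketchBiasCorrector and FixedBiasCorrector are hypothetical stand-ins, not the project's code, and only fit, bias_kg_ha and apply_national are modelled so that the example stays dependency-free.

```python
from abc import ABC, abstractmethod
from typing import Iterable, Optional


class SketchBiasCorrector(ABC):
    """Illustrative stand-in for AbstractBiasCorrector; not the project's code."""

    def __init__(self) -> None:
        self._bias_kg_ha: Optional[float] = None  # unset until fit() runs

    @abstractmethod
    def fit(self, nass_panel, included_geo_identifiers: Iterable[str], test_year: int) -> None:
        """Compute and store the bias estimate for one fold."""

    @property
    def bias_kg_ha(self) -> float:
        # Mirrors the documented guard: reading before fit() is an error.
        if self._bias_kg_ha is None:
            raise RuntimeError("call fit() before reading bias_kg_ha")
        return self._bias_kg_ha

    def apply_national(self, sim_kg_ha: float) -> float:
        # Documented behaviour: simulated yield minus the fitted bias.
        return sim_kg_ha - self.bias_kg_ha


class FixedBiasCorrector(SketchBiasCorrector):
    """Hypothetical subclass whose fit() simply stores a constant."""

    def __init__(self, constant_kg_ha: float) -> None:
        super().__init__()
        self._constant_kg_ha = constant_kg_ha

    def fit(self, nass_panel, included_geo_identifiers, test_year) -> None:
        self._bias_kg_ha = self._constant_kg_ha


corrector = FixedBiasCorrector(constant_kg_ha=200.0)
try:
    corrector.bias_kg_ha
    guard_triggered = False
except RuntimeError:
    guard_triggered = True

corrector.fit(nass_panel=None, included_geo_identifiers=[], test_year=2024)
corrected = corrector.apply_national(11000.0)
```

Keeping the guard in the property rather than in each apply method means every consumer of the bias fails in the same way when fit was skipped.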

NoBiasCorrector

Degenerate pass-through: bias_kg_ha = 0.0, fit is a no-op. Used when BiasCorrectorConfig.kind == "none", which is the default for corn and other commodities where the 95-percentile production universe is already broad enough that the out-of-universe gap is negligible.

CoverageBiasCorrector

Accounts for the coverage gap by computing for each lookback year y:

bias_y = (area_out / area_total) × (yield_in − yield_out)

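where in = modelled counties (in included_geo_identifiers), out = remaining counties, and yields are area-weighted. Algebraically, bias_y equals yield_in minus the area-weighted yield over all counties, so it is positive when the modelled counties out-yield the excluded ones. Annual bias_y values are then reduced to a scalar by the configured reduction_method (default "median").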
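A worked instance of the formula for a single year may help. The county areas and yields below are invented for illustration, and the helper names area_weighted_yield and coverage_bias are hypothetical, not functions from bias_correction.py.

```python
import math


def area_weighted_yield(counties):
    """counties: list of (area_ha, yield_kg_ha) pairs."""
    total_area = sum(area for area, _ in counties)
    return sum(area * y for area, y in counties) / total_area


def coverage_bias(counties_in, counties_out):
    """bias_y = (area_out / area_total) * (yield_in - yield_out)."""
    area_in = sum(area for area, _ in counties_in)
    area_out = sum(area for area, _ in counties_out)
    yield_in = area_weighted_yield(counties_in)
    yield_out = area_weighted_yield(counties_out)
    return (area_out / (area_in + area_out)) * (yield_in - yield_out)


# Invented numbers: two modelled counties, one excluded county.
modelled = [(60.0, 11500.0), (30.0, 10000.0)]  # area-weighted yield: 11000 kg/ha
excluded = [(10.0, 9000.0)]                    # 10% of total area, lower-yielding

bias = coverage_bias(modelled, excluded)             # about 200 kg/ha
national = area_weighted_yield(modelled + excluded)  # about 10800 kg/ha

# Subtracting the bias from the modelled-only yield recovers the full-panel yield.
assert math.isclose(area_weighted_yield(modelled) - bias, national)
```

The final assertion is the identity that motivates the formula: the in-universe yield minus bias_y is the yield over the full panel.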

CoverageBiasCorrector.fit(nass_panel, included_geo_identifiers, test_year) iterates over the inclusive year range [test_year - n_lookback_years, test_year - 1], so the test year itself never contributes to the estimate. Accessing bias_kg_ha before calling fit raises RuntimeError.
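The lookback window and the reduction step can be sketched as follows. The function names and the annual bias values are hypothetical, and only "median" is confirmed by this page as a reduction_method; the "mean" entry is an assumption added to show how a second reducer might slot in.

```python
from statistics import mean, median


def lookback_years(test_year, n_lookback_years):
    """Inclusive window [test_year - n_lookback_years, test_year - 1]."""
    return list(range(test_year - n_lookback_years, test_year))


# "median" is the documented default; "mean" is an assumed extra option.
REDUCERS = {"median": median, "mean": mean}


def reduce_annual_biases(annual_biases, reduction_method="median"):
    """Collapse {year: bias_y} into a single scalar bias in kg/ha."""
    if reduction_method not in REDUCERS:
        raise ValueError(f"unknown reduction_method: {reduction_method!r}")
    return float(REDUCERS[reduction_method](list(annual_biases.values())))


years = lookback_years(test_year=2024, n_lookback_years=5)

# Invented annual biases; 2023 is an outlier year.
annual = dict(zip(years, [180.0, 200.0, 210.0, 190.0, 400.0]))

bias_median = reduce_annual_biases(annual)        # robust to the 2023 outlier
bias_mean = reduce_annual_biases(annual, "mean")  # pulled upward by 2023
```

With these numbers the median stays at 200 kg/ha while the mean rises to 236 kg/ha, which suggests why a median is a reasonable default when a single unusual year sits in the window.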

The full implementation is at bias_correction.py:107.

Per-fold persistence

One bias corrector is saved per fold, at:

{run_dir}/postprocessed/{experiment_key}/{fold_label}/bias_corrector.pkl

Forecast delivery reloads the corrector at results.bias_corrector_path. Presence is guarded by HindcastSlice.has_bias_corrector.
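The save, presence check and reload cycle might look roughly like the sketch below. The helper functions, the experiment key "corn_baseline" and the fold label "fold_2024" are all hypothetical, a SimpleNamespace stands in for a fitted corrector, and the sketch assumes the helpers receive the full file path.

```python
import pickle
import tempfile
from pathlib import Path
from types import SimpleNamespace


def bias_corrector_path(run_dir, experiment_key, fold_label):
    """Build the documented per-fold location of the pickle."""
    return Path(run_dir) / "postprocessed" / experiment_key / fold_label / "bias_corrector.pkl"


def save_corrector(corrector, path):
    # Create missing parent directories before writing, as save() is documented to do.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(corrector, fh)


def load_corrector(path):
    # Fail loudly when the fold has no corrector on disk.
    if not path.exists():
        raise FileNotFoundError(path)
    with path.open("rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as run_dir:
    path = bias_corrector_path(run_dir, "corn_baseline", "fold_2024")
    present_before = path.exists()  # analogous to a has_bias_corrector check

    stand_in = SimpleNamespace(bias_kg_ha=200.0)  # stand-in for a fitted corrector
    save_corrector(stand_in, path)

    present_after = path.exists()
    reloaded = load_corrector(path)
```

Checking for the file before loading lets a delivery step skip correction cleanly for folds that were run with kind "none" or were never postprocessed.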

Configuration

Bias correction is controlled by the PostprocessConfig.bias_corrector field, which is of type BiasCorrectorConfig:

postprocess:
  bias_corrector:
    kind: coverage        # or "none"
    n_lookback_years: 5
    reduction_method: median

See BiasCorrectorConfig for the full config schema.
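As a rough picture of how the three keys above could map onto a typed object, here is a hypothetical dataclass. SketchBiasCorrectorConfig is not the real BiasCorrectorConfig; the default of 5 for n_lookback_years is an assumption taken from the example value, and the validation rules are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchBiasCorrectorConfig:
    """Hypothetical mirror of the three documented keys."""

    kind: str = "none"                # "none" or "coverage"
    n_lookback_years: int = 5         # assumed default, taken from the example
    reduction_method: str = "median"  # documented default

    def __post_init__(self):
        if self.kind not in ("none", "coverage"):
            raise ValueError(f"unknown kind: {self.kind!r}")
        if self.n_lookback_years < 1:
            raise ValueError("n_lookback_years must be at least 1")


# The YAML fragment above, already parsed into nested dicts.
raw = {
    "postprocess": {
        "bias_corrector": {
            "kind": "coverage",
            "n_lookback_years": 5,
            "reduction_method": "median",
        }
    }
}

config = SketchBiasCorrectorConfig(**raw["postprocess"]["bias_corrector"])
default_config = SketchBiasCorrectorConfig()  # pass-through unless overridden
```

Validating kind at construction time surfaces a misspelled value when the config is loaded rather than midway through a run.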

Cross-references