Source: Streamlit Dashboard (app/)
Overview
The commodity-hindcast dashboard is a Streamlit web application that reads saved hindcast run artefacts directly from disk (no API layer) and presents five interactive chart sections for comparing Treefera seasonal-yield forecasts against USDA NASS actuals and WASDE in-season estimates.
Launched by hand from a developer machine:
The app is commodity-agnostic: it discovers available runs by scanning the runs directory, populates the sidebar selectors, and adapts every chart, metric card, and table to the selected commodity's phenology calendar. Four commodities are supported: corn, wheat, soybeans, and cotton.
Key design facts:
- Reads RunDir artefacts (delivery/Treefera_<commodity>_ADM0_Hindcast_*.csv) directly, with no database or API intermediary.
- PR #363 fixed three startup bugs (wrong _CONFIGS_DIR depth, renamed WasdeLoader API, missing data/ prefix in WASDE path).
- PR #340 added window-aware MAPE cards that restrict the WASDE comparison to folds where WASDE actively forecasts (May onwards for all four commodities), plus a configurable truth source selector and a generic vintage-subset decision-point table.
Modules
app.py (~905 lines): main entrypoint
Entry function: the module-level code is the Streamlit entrypoint; there is no main() guard.
Startup sequence:
- st.set_page_config then inject_font_styles() (app.py:53–59).
- _discover_available_commodities() (app.py:64–73) calls list_runs() and collects the set of commodity strings; returns [] on FileNotFoundError so the sidebar shows a graceful error rather than a traceback.
- Sidebar: commodity selectbox (wheat default), truth-source selectbox, model filter text input + model selectbox (app.py:94–173).
- _load_predictions(commodity) (app.py:79–89, @st.cache_data(ttl=60)) calls load_all_deliveries; returns None on FileNotFoundError.
- load_results(selected_model, commodity) (app.py:181–211, @st.cache_data(ttl=60)) slices the concatenated frame to one model, strips retired folds, renames columns to canonical names (mean → treefera_forecast, wasde_in_season → wasde_estimate), looks up wasde_jan from load_wasde_jan_actuals, then calls score_hindcast.
Truth-source recomputation (app.py:237–253): score_hindcast scores against NASS by default. When the user selects a different truth source (e.g. wasde_jan), treefera_error and wasde_error are recomputed in-place against results["truth"] so every downstream metric and chart reflects the chosen benchmark.
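The recomputation step can be illustrated with a minimal sketch. It uses plain dicts instead of the dashboard's DataFrame, and it assumes errors are stored as absolute differences; the source does not state the sign convention, so treat `recompute_errors` and that choice as hypothetical.

```python
def recompute_errors(rows, truth_key):
    """Return copies of rows with errors recomputed against rows[truth_key].

    Assumption: errors are absolute differences (not confirmed by the source).
    """
    out = []
    for row in rows:
        truth = row[truth_key]
        out.append({
            **row,
            "truth": truth,
            "treefera_error": abs(row["treefera_forecast"] - truth),
            "wasde_error": abs(row["wasde_estimate"] - truth),
        })
    return out


# Dummy values, for illustration only.
rows = [{"treefera_forecast": 180.0, "wasde_estimate": 175.0,
         "nass_actual": 178.0, "wasde_jan": 177.0}]
rescored = recompute_errors(rows, "wasde_jan")
```

Because every downstream metric reads `treefera_error` and `wasde_error`, overwriting these two columns is enough to switch the whole page to the chosen benchmark.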
WASDE-comparable fold filtering (app.py:289–320): only folds where WASDE actively publishes a new-crop forecast are included in MAPE headline cards and in WASDE-side bar charts. For all four commodities wasde_comparison_start_month = 5 (May). Fold positions are compared via schedule.fold_order rather than by calendar month, so cross-year crops (winter wheat) correctly exclude the November/January dormancy folds; a plain month ≥ 5 test would wrongly keep the November fold, which precedes May in season order.
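The difference between a calendar-month test and a season-position test can be sketched as follows. The fold labels, the `first_wasde_fold` argument, and both function names are hypothetical; the real code derives positions from schedule.fold_order.

```python
def comparable_by_month(fold_order, start_month=5):
    """Naive filter: keep folds whose calendar month is >= start_month."""
    return [f for f in fold_order if int(f[:2]) >= start_month]


def comparable_by_position(fold_order, first_wasde_fold):
    """Season-order filter: keep folds at or after the first WASDE fold."""
    start = fold_order.index(first_wasde_fold)
    return fold_order[start:]


# Hypothetical winter-wheat schedule, listed in season order.
wheat_order = ["11-01", "01-10", "03-01", "05-01", "06-01"]

naive = comparable_by_month(wheat_order)                   # keeps "11-01" by mistake
by_position = comparable_by_position(wheat_order, "05-01")
```

The naive filter admits the November dormancy fold because 11 ≥ 5, while the positional filter drops it because November precedes May in season order.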
Pre/post-survey split (app.py:323–333): folds with cutoff_doy ≤ wasde_survey_doy are "pre-survey" (before the USDA field survey). The dashboard renders two additional metric cards — Pre-Survey Edge (% error reduction vs WASDE) and Pre-Survey Wins (forecast windows where Treefera is closer to truth) — only when pre-survey folds exist in the data.
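A minimal sketch of the split and the two cards follows. The formulas for Pre-Survey Edge (relative MAE reduction) and Pre-Survey Wins (count of rows where Treefera's absolute error is smaller) are assumptions inferred from the card descriptions, and `pre_survey_metrics` is a hypothetical name.

```python
def pre_survey_metrics(rows, wasde_survey_doy):
    """Return edge/wins for pre-survey folds, or None when there are none."""
    pre = [r for r in rows if r["cutoff_doy"] <= wasde_survey_doy]
    if not pre:
        return None  # the dashboard hides both cards in this case
    t_mae = sum(abs(r["treefera_error"]) for r in pre) / len(pre)
    w_mae = sum(abs(r["wasde_error"]) for r in pre) / len(pre)
    # Assumed definition: percentage reduction in MAE relative to WASDE.
    edge_pct = (w_mae - t_mae) / w_mae * 100.0 if w_mae else 0.0
    wins = sum(1 for r in pre
               if abs(r["treefera_error"]) < abs(r["wasde_error"]))
    return {"edge_pct": edge_pct, "wins": wins, "windows": len(pre)}


# Dummy rows; 224 is the corn/soybeans/cotton survey DOY from COMMODITY_CONFIG.
rows = [
    {"cutoff_doy": 150, "treefera_error": 2.0, "wasde_error": 4.0},
    {"cutoff_doy": 200, "treefera_error": 3.0, "wasde_error": 2.0},
    {"cutoff_doy": 230, "treefera_error": 1.0, "wasde_error": 1.0},
]
metrics = pre_survey_metrics(rows, wasde_survey_doy=224)
```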
Chart sections rendered (in order):
| Section | Builder | Description |
|---|---|---|
| Performance Summary | metric_card | Treefera MAPE, WASDE MAPE, Pre-Survey Edge, Pre-Survey Wins |
| Seasonal Accuracy | build_fold_breakdown_chart | Per-fold MAE grouped bars |
| Forecast Edge | build_improvement_heatmap | Year × fold heatmap (% of yield) |
| Advantage Over WASDE | build_information_advantage_chart | Signed advantage per fold, mean line |
| Forecast Evolution | build_forecast_evolution_chart | Continuous timeline with stars |
| Direction Correctness | build_direction_correctness_chart | Pre→harvest trajectory check |
Expanders: Detailed Performance Comparison, Vintage Accuracy Table (Subset / Full tabs), Per-Year Accuracy Breakdown, Leave-One-Out Stability, Detailed Results.
Vintage Accuracy Table (app.py:527–632): derives eight "decision points" anchored to the commodity's growing season (pre-planting −30 d, planting, 20/40/60/80% through season, pre-harvest −14 d, harvest). Each anchor snaps to the nearest configured fold by absolute day distance using schedule.fold_to_init_date. Works correctly for cross-year crops by detecting gs_start_doy > gs_end_doy and offsetting the planting year by −1.
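The anchor derivation and nearest-fold snapping can be sketched with the standard library. The DOY values and fold dates below are hypothetical, and the helper names are illustrative; the dashboard itself resolves fold dates through schedule.fold_to_init_date.

```python
import datetime as dt


def season_bounds(gs_start_doy, gs_end_doy, harvest_year):
    """Planting/harvest dates; cross-year crops plant in harvest_year - 1."""
    plant_year = harvest_year - 1 if gs_start_doy > gs_end_doy else harvest_year
    planting = dt.date(plant_year, 1, 1) + dt.timedelta(days=gs_start_doy - 1)
    harvest = dt.date(harvest_year, 1, 1) + dt.timedelta(days=gs_end_doy - 1)
    return planting, harvest


def decision_points(planting, harvest):
    """Eight anchors: pre-planting, planting, 20/40/60/80%, pre-harvest, harvest."""
    season_days = (harvest - planting).days
    points = {"pre-planting": planting - dt.timedelta(days=30),
              "planting": planting}
    for pct in (20, 40, 60, 80):
        offset = dt.timedelta(days=round(season_days * pct / 100))
        points[f"{pct}% season"] = planting + offset
    points["pre-harvest"] = harvest - dt.timedelta(days=14)
    points["harvest"] = harvest
    return points


def snap_to_fold(anchor, fold_dates):
    """Pick the fold whose date is closest to the anchor (absolute days)."""
    return min(fold_dates, key=lambda fold: abs((fold_dates[fold] - anchor).days))


planting, harvest = season_bounds(274, 182, 2024)  # hypothetical cross-year crop
points = decision_points(planting, harvest)
nearest = snap_to_fold(dt.date(2024, 5, 20),
                       {"05-01": dt.date(2024, 5, 1), "06-01": dt.date(2024, 6, 1)})
```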
Leave-One-Out Stability (app.py:818–855): for each year, removes it from the WASDE-comparable rows and reports the change in overall MAE, labelling years as "hardest" (removing improves MAE) or "best" (removing raises MAE).
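As a rough illustration of the leave-one-out logic, the sketch below works on a list of dicts and assumes the MAE in question is Treefera's. The "neutral" label for an exact tie is an addition for completeness and is not part of the dashboard's vocabulary.

```python
def mae(rows):
    """Mean absolute Treefera error over the given rows."""
    return sum(abs(r["treefera_error"]) for r in rows) / len(rows)


def leave_one_out(rows):
    """Map each year to (change in MAE when that year is removed, label)."""
    overall = mae(rows)
    report = {}
    for year in sorted({r["year"] for r in rows}):
        rest = [r for r in rows if r["year"] != year]
        if not rest:
            continue  # a single-year dataset has nothing left to score
        delta = mae(rest) - overall
        if delta < 0:
            label = "hardest"   # removing the year improves MAE
        elif delta > 0:
            label = "best"      # removing the year raises MAE
        else:
            label = "neutral"   # tie case, an assumption of this sketch
        report[year] = (delta, label)
    return report


# Dummy errors, two folds per year.
rows = [{"year": y, "treefera_error": e}
        for y, e in [(2020, 1.0), (2020, 1.0), (2021, 5.0),
                     (2021, 5.0), (2022, 3.0), (2022, 3.0)]]
report = leave_one_out(rows)
```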
run_loader.py (~155 lines): run discovery
RunDescriptor (run_loader.py:56–64): frozen dataclass holding commodity, timestamp, model, csv_path (Path), and run_dir (Path).
list_runs(runs_dir) (run_loader.py:102–137): scans HINDCAST_RUNS_DIR for directories matching the regex ^(\d{8}_\d{6})_([a-z]+)_yield_prediction$, locates the ADM0 delivery CSV in <run_dir>/delivery/, reads the model name from the first row, and returns a list of RunDescriptor sorted newest-first by timestamp. Directories with no matching CSV are skipped with a warning.
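The directory-name pattern quoted above can be exercised on its own. The sketch below only parses names and sorts them; `parse_run_dir` and `newest_first` are hypothetical helpers, and locating the CSV or reading the model name is left out.

```python
import re

# Regex as quoted in the description of list_runs.
RUN_DIR_RE = re.compile(r"^(\d{8}_\d{6})_([a-z]+)_yield_prediction$")


def parse_run_dir(name):
    """Return (timestamp, commodity) for a run directory name, else None."""
    match = RUN_DIR_RE.match(name)
    return (match.group(1), match.group(2)) if match else None


def newest_first(names):
    """Keep matching names and sort them newest-first by timestamp."""
    parsed = [(parse_run_dir(n), n) for n in names]
    matching = [(p[0], n) for p, n in parsed if p is not None]
    # YYYYMMDD_HHMMSS timestamps sort chronologically as plain strings.
    return [n for _, n in sorted(matching, reverse=True)]


names = ["20260422_170936_corn_yield_prediction",
         "20260501_090000_wheat_yield_prediction",
         "notes"]
ordered = newest_first(names)
```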
load_all_deliveries(commodity) (run_loader.py:140–155): concatenates every run for a commodity into one DataFrame, overriding the model column with the run directory name (e.g. 20260422_170936_corn_yield_prediction) so individual runs populate the "Select Model" picker.
load_delivery_csv(csv_path, commodity) (run_loader.py:26–48): parses init_date as datetime, optionally filters by commodity (case-insensitive), then derives a fold column via FoldSchedule.init_date_to_fold. The fold derivation uses season-DOY arithmetic rather than strftime("%m-%d") to avoid off-by-one drift on seasons that cross 29 February.
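The leap-year drift can be reproduced in a few lines. This is a simplified sketch that uses calendar day-of-year for a same-year crop; the real code works in season-DOY via CommodityConfig, and the fold DOYs here are hypothetical.

```python
import datetime as dt

REFERENCE_YEAR = 2024  # fixed reference year used for "MM-DD" fold labels


def doy_to_date(doy, year):
    """Calendar date for a 1-based day-of-year."""
    return dt.date(year, 1, 1) + dt.timedelta(days=doy - 1)


def build_labels(fold_doys):
    """Map each fold DOY to its "MM-DD" label in the reference year."""
    return {doy: doy_to_date(doy, REFERENCE_YEAR).strftime("%m-%d")
            for doy in fold_doys}


def init_date_to_fold(init_date, labels):
    """Look the fold up by DOY; raises KeyError for dates that match no fold."""
    return labels[init_date.timetuple().tm_yday]


labels = build_labels([61, 153])      # hypothetical fold DOYs
init_2023 = doy_to_date(61, 2023)     # 2 March in a non-leap year
naive_label = init_2023.strftime("%m-%d")
fold = init_date_to_fold(init_2023, labels)
```

Here the strftime label for the 2023 init date is "03-02", which matches no configured fold, while the DOY lookup recovers "03-01".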
HINDCAST_RUNS_DIR (_dashboard_config.py:42–49): resolved at module import from $HINDCAST_RUNS_DIR (override) or require_input_data_dir() / "runs". Uses AnyPath so S3 paths are supported.
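The override-then-default resolution might look like the sketch below. It uses pathlib.Path rather than AnyPath, reads the environment through an injectable mapping for testability, and stands in for require_input_data_dir() with a plain lookup; `resolve_runs_dir` is a hypothetical name.

```python
import os
from pathlib import Path


def resolve_runs_dir(env=None):
    """Return $HINDCAST_RUNS_DIR if set, else $INPUT_DATA_DIR / "runs"."""
    env = os.environ if env is None else env
    override = env.get("HINDCAST_RUNS_DIR")
    if override:
        return Path(override)
    input_dir = env.get("INPUT_DATA_DIR")
    if not input_dir:
        # Stand-in for the failure mode of require_input_data_dir().
        raise RuntimeError("INPUT_DATA_DIR is not set")
    return Path(input_dir) / "runs"
```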
_dashboard_config.py (~319 lines): display configuration
COMMODITY_CONFIG (_dashboard_config.py:127–192): dict keyed by lowercase commodity name. Per-commodity fields:
| Field | Purpose |
|---|---|
| display_name | Human-readable name shown in UI titles |
| region_label | Spatial coverage label (e.g. "US Corn Belt") |
| wasde_comparison_start_month | First month WASDE publishes a new-crop forecast (May = 5 for all four) |
| wasde_survey_doy | DOY of the first NASS field survey (224 for corn/soybeans/cotton, 152 for wheat) |
| first_survey_month | Calendar month of first survey |
| retired_folds | Fold names excluded from comparison tables (corn: {"03-01", "01-10"}) |
| yield_range | (lo, hi) sourced from CommodityConfig.yield_range (canonical YAML, not duplicated) |
| phenology_labels | {"MM-DD": "Mon — Stage"} used for chart axis labels |
| gs_start_doy, gs_end_doy | Growing season bounds; gs_end_doy < gs_start_doy signals a cross-year crop |
| wasde_final_at_reveal | True for wheat: pins the WASDE-Jan gold star at the season reveal date, not mid-January of year+1 |
FoldSchedule (_dashboard_config.py:198–251): frozen dataclass. Provides:
- fold_to_init_date(fold, season_year) -> str: ISO date for (fold, year); delegates to CommodityConfig.to_date(sdoy, season_year) for correct cross-year arithmetic.
- init_date_to_fold(init_date, season_year) -> str: inverse lookup via season-DOY; raises KeyError if the date does not correspond to any configured fold.
build_fold_schedule(commodity) (_dashboard_config.py:261–298): constructs a FoldSchedule from the commodity's CommodityConfig.hindcast_init_season_doys. Fold labels are "MM-DD" derived from a fixed reference year (2024). Duplicate MM-DDs (rare) are collapsed to the first occurrence in season order.
_load_commodity_config(commodity) (_dashboard_config.py:64–75): @functools.lru_cache(maxsize=8). Parses only the commodity subtree of the experiment YAML, bypassing ExperimentConfig's BaseSettings machinery; the dashboard needs only calendar and metadata fields.
inject_font_styles() (_dashboard_config.py:304–319): injects Inter font CSS via st.markdown. Self-contained replacement for the previously required streamlit_applets_common sibling-repo dependency (removed in PR #363).
charts.py (~446 lines): accuracy and comparison charts
All functions are pure: they accept a results-shaped DataFrame and keyword arguments, and return a go.Figure.
build_fold_breakdown_chart (charts.py:196–313): grouped bar chart of per-fold MAE (Treefera vs WASDE). Adapts to fold count: >12 folds → date x-axis with monthly ticks; ≤12 folds → categorical x-axis with phenology labels. WASDE bars are blanked (set to NaN) for folds outside wasde_comparable_folds.
build_improvement_heatmap (charts.py:316–446): year × season-stage heatmap. Cell value is (wasde_error − treefera_error) / truth × 100 (% of yield). Green = Treefera closer; red = WASDE closer. Uses explicit categorical axes to prevent Plotly from interpreting year labels as dates. WASDE milestone vline positioned by _nearest_fold_index for categorical axes, or by _WASDE_VLINE_OFFSET-adjusted index for date axes.
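The cell formula can be made concrete with a small sketch that builds a year × fold grid from dict rows. The helper names and the dummy values are hypothetical; only the formula itself comes from the description above.

```python
def edge_pct_of_yield(wasde_error, treefera_error, truth):
    """(wasde_error - treefera_error) / truth * 100, i.e. % of yield."""
    return (wasde_error - treefera_error) / truth * 100.0


def build_grid(rows):
    """Nested dict: grid[year][fold] -> heatmap cell value."""
    grid = {}
    for r in rows:
        cell = edge_pct_of_yield(r["wasde_error"], r["treefera_error"], r["truth"])
        grid.setdefault(r["year"], {})[r["fold"]] = cell
    return grid


rows = [
    {"year": 2022, "fold": "06-01", "wasde_error": 4.0,
     "treefera_error": 2.0, "truth": 200.0},
    {"year": 2022, "fold": "07-01", "wasde_error": 1.0,
     "treefera_error": 3.0, "truth": 200.0},
]
grid = build_grid(rows)
```

A positive cell (green) means Treefera was closer to truth; a negative cell (red) means WASDE was closer.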
build_information_advantage_chart (charts.py:32–193): signed advantage (WASDE |error| − Treefera |error|) per fold. Individual year traces in light grey behind a thick mean line. Green/red fill between mean and y=0, with interpolated zero-crossings computed by _split_fill.
charts_evolution.py (~409 lines): evolution and direction charts
build_forecast_evolution_chart (charts_evolution.py:29–242): continuous multi-year timeline. Per-year traces:
- Treefera forecast: solid green line.
- WASDE in-season estimate: grey dotted diamonds.
- WASDE-Jan final: gold star. For same-year crops, a trailing segment extends to mid-January of year+1. For cross-year crops (wasde_final_at_reveal = True, e.g. wheat), the star is pinned at the season's last fold date.
- Truth: blue star at the last fold, drawn from the user-selected truth_col.
A "WASDE starts" dotted vline and annotation appear at the first non-NaN WASDE point when fold count >12.
build_direction_correctness_chart (charts_evolution.py:245–409): checks whether Treefera's forecast moves towards truth over the season. For each year: dashed black line from WASDE planting value to truth (the "required direction"), solid green Treefera trajectory, truth star. A year is "correct" when |t_last − truth| < |t_first − truth|. Returns (fig, n_correct, n_years).
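The correctness criterion reduces to one comparison per year. The sketch below applies it to a hypothetical mapping of year to (forecast trajectory, truth) and returns the two counts; figure construction is omitted and the values are dummies.

```python
def direction_correct(forecasts, truth):
    """True when the last forecast is closer to truth than the first."""
    return abs(forecasts[-1] - truth) < abs(forecasts[0] - truth)


def count_correct(by_year):
    """Return (n_correct, n_years) over {year: (forecasts, truth)}."""
    n_correct = sum(1 for forecasts, truth in by_year.values()
                    if direction_correct(forecasts, truth))
    return n_correct, len(by_year)


by_year = {
    2020: ([170.0, 175.0, 178.0], 180.0),  # moves towards truth
    2021: ([182.0, 185.0], 180.0),         # moves away from truth
}
counts = count_correct(by_year)
```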
app_utils.py (~176 lines): UI helpers and re-exports
metric_card(label, value, status, risk_level) (app_utils.py:123–144): renders a self-contained HTML card with a large numeric value and a traffic-light dot (green/yellow/red). No external CSS dependency.
text_box(text) (app_utils.py:147–153): rounded grey panel for inline HTML narrative; used for pre/post-survey interpretation text.
apply_plotly_fonts(fig) (app_utils.py:159–170): applies Inter font at consistent sizes to all Plotly figure axes and legend.
MODEL_INFO (app_utils.py:36–114): static dict keyed by model name (matching the model column in delivery CSVs). Each entry has label, description, and change_from_b fields rendered in the info box above the metric cards. Covers Model A through Model B+ v6 and the 2F delivery variant.
Re-exports all chart builders from charts.py and charts_evolution.py, and constants/helpers from _chart_helpers.py, as a single import surface for app.py.
_chart_helpers.py (~272 lines): shared chart utilities
wasde_milestones_doy(commodity) (_chart_helpers.py:62–65): returns a list of (doy, label, y_paper) tuples for WASDE milestone vlines. Currently one vline per commodity at wasde_survey_doy − _WASDE_VLINE_OFFSET. _WASDE_VLINE_OFFSET = 4 places the line between the last pre-survey bar and the first post-survey bar (half a weekly bar width).
fold_labels_for_data(folds, schedule, commodity) (_chart_helpers.py:123–141): returns (fold→label map, ordered labels, ordered fold names). For cross-year crops (wheat), delegates to _display_fold_order which rotates the fold list so the active growing season (March onwards) comes first in chart display order, with dormancy/final folds at the right edge.
fold_to_date(fold, year, schedule) (_chart_helpers.py:147–164): converts a fold name to dt.datetime for chart x-axes via FoldSchedule.fold_to_init_date. Returns None for unknown folds.
_split_fill(xs, ys) (_chart_helpers.py:240–272): splits a line series into positive and negative segments with interpolated zero-crossings, enabling Plotly's fill="tonexty" to produce correct green/red shading in the information-advantage chart.
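One way to produce such a split is sketched below for numeric x values. It is not the dashboard's implementation: the return shape and the clipping approach are assumptions, and real chart x-values may be dates or categories.

```python
def split_fill(xs, ys):
    """Split a series into positive and negative parts, adding zero-crossings.

    Returns ((pos_x, pos_y), (neg_x, neg_y)); each part is clipped at y = 0.
    """
    pos_x, pos_y, neg_x, neg_y = [], [], [], []
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i > 0:
            x0, y0 = xs[i - 1], ys[i - 1]
            if y0 * y < 0:  # the segment crosses zero between x0 and x
                # Linear interpolation of the crossing point.
                xc = x0 + (x - x0) * (-y0) / (y - y0)
                pos_x.append(xc)
                pos_y.append(0.0)
                neg_x.append(xc)
                neg_y.append(0.0)
        pos_x.append(x)
        pos_y.append(max(y, 0.0))
        neg_x.append(x)
        neg_y.append(min(y, 0.0))
    return (pos_x, pos_y), (neg_x, neg_y)


(pos_x, pos_y), (neg_x, neg_y) = split_fill([0, 1, 2], [1.0, -1.0, -1.0])
```

Inserting the interpolated crossing into both parts keeps each filled area anchored on y = 0, so the green and red regions meet exactly where the mean line changes sign.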
_monthly_tick_config(data_folds, label_order, schedule) (_chart_helpers.py:209–237): computes (tick_vals, tick_text) for weekly-fold heatmaps, placing one month abbreviation label at the central fold of each month.
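A simplified version of the tick placement is sketched here. It assumes fold names are "MM-DD" strings in display order and takes the "central" fold to be the middle element of each month's group; both points are assumptions, and `monthly_ticks` is a hypothetical name.

```python
import calendar


def monthly_ticks(folds):
    """Return (tick_vals, tick_text): one tick at the middle fold of each month."""
    by_month = {}
    for fold in folds:
        by_month.setdefault(int(fold[:2]), []).append(fold)
    tick_vals, tick_text = [], []
    for month, members in by_month.items():  # dicts keep insertion order
        tick_vals.append(members[len(members) // 2])
        tick_text.append(calendar.month_abbr[month])
    return tick_vals, tick_text


tick_vals, tick_text = monthly_ticks(["05-03", "05-10", "05-17", "06-07", "06-14"])
```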
_eval_shim.py (~99 lines): dashboard-side evaluation adapter
load_wasde_jan_actuals(commodity) (_eval_shim.py:49–72): loads the WASDE yield CSV via WasdeLoader(WasdeRefSpec(...)), filters to releases strictly before Feb 1 of harvest_year + 1, converts from kg/ha to bu/ac via kg_ha_to_bu_acre_series, and returns a pd.Series indexed by harvest year. Path: require_input_data_dir() / "data" / "wasde" / "wasde_{commodity}_us_yield.csv" (PR #363 added the data/ prefix).
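The release cutoff can be illustrated without the loader. The sketch below takes (harvest_year, release_date, value) tuples and keeps, per harvest year, the latest release strictly before 1 February of the following year. Choosing the latest release is an assumption, unit conversion is left out, and the values are dummies.

```python
import datetime as dt


def wasde_jan_actuals(releases):
    """Map harvest_year -> value of the last release before Feb 1 of year + 1."""
    latest = {}
    for harvest_year, release_date, value in releases:
        cutoff = dt.date(harvest_year + 1, 2, 1)
        if release_date >= cutoff:
            continue  # strictly before the cutoff only
        previous = latest.get(harvest_year)
        if previous is None or release_date > previous[0]:
            latest[harvest_year] = (release_date, value)
    return {year: value for year, (_, value) in sorted(latest.items())}


releases = [
    (2023, dt.date(2023, 8, 11), 1.0),
    (2023, dt.date(2024, 1, 12), 2.0),
    (2023, dt.date(2024, 2, 8), 3.0),   # after the cutoff, ignored
]
actuals = wasde_jan_actuals(releases)
```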
score_hindcast(df) (_eval_shim.py:75–99): adds treefera_error, wasde_error, and improvement_pct columns (all vs nass_actual). The caller may overwrite treefera_error/wasde_error when a different truth source is selected; improvement_pct remains fixed against NASS as a stable reference in the Detailed Results table.
Cross-references
- orchestration: CLI that writes the RunDir artefacts this dashboard reads
- regression: models whose predictions populate delivery CSVs
Relationships
app.py
├── run_loader.py          (discovers RunDescriptors, loads delivery CSVs)
├── _dashboard_config.py   (COMMODITY_CONFIG, FoldSchedule, build_fold_schedule)
├── _eval_shim.py          (load_wasde_jan_actuals, score_hindcast)
└── app_utils.py           (metric_card, text_box, apply_plotly_fonts, MODEL_INFO)
    ├── charts.py              (build_fold_breakdown_chart, build_improvement_heatmap,
    │                           build_information_advantage_chart)
    └── charts_evolution.py    (build_forecast_evolution_chart,
                                build_direction_correctness_chart)
        └── _chart_helpers.py  (fold_to_date, wasde_milestones_doy,
                                fold_labels_for_data, _split_fill, …)
External pipeline dependencies (read-only):
- commodity_hindcast.config.CommodityConfig: yield range, init-date season-DOYs, bushel weight.
- commodity_hindcast.lib.reference_data.wasde.WasdeLoader: WASDE CSV reader (ported in PR #363 from a now-deleted free function).
- commodity_hindcast.lib.unit_utils.kg_ha_to_bu_acre_series: unit conversion.
- $INPUT_DATA_DIR env var (via require_input_data_dir()): locates WASDE CSVs and the runs directory.
- $HINDCAST_RUNS_DIR env var (optional override): points at an alternative runs tree.