Pipeline: Dashboard
Purpose
The commodity-hindcast dashboard is a Streamlit web application that reads saved hindcast run artefacts directly from disk (no API layer, no database) and renders six interactive chart sections comparing Treefera seasonal-yield forecasts against USDA NASS actuals and WASDE/CONAB in-season estimates. It supports four commodities (corn, wheat, soybeans, cotton) and adapts all charts, metric cards, and fold labels to the selected commodity's phenology calendar. It is a read-only consumer of the delivery pipeline outputs.
Launch command:
Inputs
| Artefact | Source | Reader |
|---|---|---|
| `delivery/Treefera_{commodity}_ADM0_Hindcast_*.csv` | Each `run_dir` | `run_loader.load_all_deliveries` |
| `config_resolved.yaml` | Each `run_dir` | `_dashboard_config._load_commodity_config` (only commodity subtree) |
| WASDE CSV | `$INPUT_DATA_DIR/data/wasde/wasde_{commodity}_us_yield.csv` | `_eval_shim.load_wasde_jan_actuals` |
| `$HINDCAST_RUNS_DIR` or `$INPUT_DATA_DIR/runs/` | env var | `_dashboard_config` (`_dashboard_config.py:42–49`) |
No write paths. The dashboard never mutates run_dir artefacts.
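The runs-root lookup listed in the Inputs table can be sketched as follows. This is a minimal sketch, not the code at `_dashboard_config.py:42–49`: the helper name `resolve_runs_dir` is hypothetical, and it assumes `$HINDCAST_RUNS_DIR` takes precedence over `$INPUT_DATA_DIR/runs/`.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_runs_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Hypothetical helper: pick the runs root from environment variables.

    Assumption: HINDCAST_RUNS_DIR wins when both variables are set.
    """
    env = os.environ if env is None else env

    explicit = env.get("HINDCAST_RUNS_DIR")
    if explicit:
        return Path(explicit)

    input_root = env.get("INPUT_DATA_DIR")
    if input_root:
        # Fallback location named in the Inputs table.
        return Path(input_root) / "runs"

    raise FileNotFoundError("Set HINDCAST_RUNS_DIR or INPUT_DATA_DIR")
```

Passing the environment as a mapping keeps the helper easy to exercise without touching the real process environment.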
Key Modules
| Module | Lines | Responsibility |
|---|---|---|
| `app.py` | ~905 | Main Streamlit entrypoint; sidebar, chart dispatch, metric cards |
| `run_loader.py` | ~155 | Run discovery (`list_runs`), CSV loading, `RunDescriptor` |
| `_dashboard_config.py` | ~319 | `COMMODITY_CONFIG`, `FoldSchedule`, font injection |
| `_eval_shim.py` | ~99 | `load_wasde_jan_actuals`, `score_hindcast` |
| `charts.py` | ~446 | Accuracy, heatmap, information-advantage Plotly figures |
| `charts_evolution.py` | ~409 | Forecast evolution timeline, direction-correctness chart |
| `app_utils.py` | ~176 | `metric_card`, `text_box`, `apply_plotly_fonts`, `MODEL_INFO` |
| `_chart_helpers.py` | ~272 | Shared fold-label, tick, fill utilities |
Step-by-step
1. Startup
- `st.set_page_config` + `inject_font_styles()` (`app.py:53–59`): Inter font injected via inline CSS; no external `streamlit_applets_common` dependency (removed in PR #363).
- `_discover_available_commodities()` (`app.py:64–73`): calls `list_runs(HINDCAST_RUNS_DIR)` and collects the commodity strings. Returns `[]` on `FileNotFoundError` so the sidebar shows a graceful error rather than a traceback.
- Sidebar rendered: commodity selectbox (wheat default), truth-source selectbox, model filter text input + model selectbox.
2. Run discovery (run_loader.py)
RunDescriptor (run_loader.py:56–64): frozen dataclass holding commodity, timestamp, model, csv_path (Path), and run_dir (Path).
list_runs(runs_dir) (run_loader.py:102–137) scans HINDCAST_RUNS_DIR for directories matching ^(\d{8}_\d{6})_([a-z]+)_yield_prediction$, locates the ADM0 delivery CSV in <run_dir>/delivery/, reads the model name from the first row, and returns list[RunDescriptor] sorted newest-first by timestamp.
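The directory-name matching and newest-first ordering can be illustrated with the regular expression quoted above. This is a sketch only: `parse_run_dir_name` and `newest_first` are hypothetical helpers, and the real `list_runs` additionally locates the ADM0 CSV and reads the model name, which is omitted here.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Pattern quoted in the text: <YYYYMMDD_HHMMSS>_<commodity>_yield_prediction
RUN_DIR_PATTERN = re.compile(r"^(\d{8}_\d{6})_([a-z]+)_yield_prediction$")


def parse_run_dir_name(name: str) -> Optional[Tuple[str, str]]:
    """Return (timestamp, commodity) for a matching directory name, else None."""
    match = RUN_DIR_PATTERN.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)


def newest_first(names: Iterable[str]) -> List[str]:
    """Keep only matching names, sorted newest-first by timestamp.

    The zero-padded YYYYMMDD_HHMMSS timestamp sorts chronologically as a
    plain string, so no datetime parsing is needed for ordering.
    """
    matched = []
    for name in names:
        parsed = parse_run_dir_name(name)
        if parsed is not None:
            matched.append((parsed[0], name))
    return [name for _, name in sorted(matched, reverse=True)]
```

Because the commodity group is `[a-z]+`, directory names with uppercase letters or digits in the commodity part are silently ignored.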
load_all_deliveries(commodity) (run_loader.py:140–155) concatenates every run for a commodity into one DataFrame, overriding the model column with the run directory name so individual runs populate the "Select Model" picker.
load_delivery_csv(csv_path, commodity) (run_loader.py:26–48) parses init_date as datetime and derives a fold column via FoldSchedule.init_date_to_fold. The fold derivation uses season-DOY arithmetic, not strftime, to avoid off-by-one drift on seasons that cross 29 February.
3. COMMODITY_CONFIG and FoldSchedule
COMMODITY_CONFIG (_dashboard_config.py:127–192) is a dict keyed by lowercase commodity name. Per-commodity fields include display_name, region_label, wasde_comparison_start_month (May = 5 for all four), wasde_survey_doy, retired_folds, phenology_labels, gs_start_doy, gs_end_doy, and wasde_final_at_reveal. The yield_range field is sourced from the canonical YAML CommodityConfig.yield_range and not duplicated.
FoldSchedule (_dashboard_config.py:198–251) provides fold_to_init_date and init_date_to_fold using CommodityConfig.to_date(sdoy, season_year) for correct cross-year arithmetic. build_fold_schedule(commodity) constructs a FoldSchedule from CommodityConfig.hindcast_init_season_doys, labelling folds as "MM-DD" derived from a fixed reference year (2024).
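To illustrate why day-offset arithmetic is preferred over month-day string formatting, the sketch below maps a 1-based day-of-season to a calendar date and back. It is an assumption-laden illustration: `season_doy_to_date` and `date_to_season_doy` are hypothetical stand-ins, the 1 October season start is made up, and the internals of `CommodityConfig.to_date` are not shown in this document.

```python
from datetime import date, timedelta


def season_doy_to_date(season_start: date, season_doy: int) -> date:
    """Map a 1-based day-of-season to a calendar date by pure day offset."""
    return season_start + timedelta(days=season_doy - 1)


def date_to_season_doy(season_start: date, day: date) -> int:
    """Inverse mapping: calendar date back to a 1-based day-of-season."""
    return (day - season_start).days + 1


# Hypothetical cross-year season starting on 1 October.
leap_season = date(2023, 10, 1)    # spans 29 February 2024
plain_season = date(2022, 10, 1)   # no leap day inside the season

# The same season day lands on different calendar dates in the two seasons,
# which is why a fixed "MM-DD" string cannot identify a fold reliably.
in_leap = season_doy_to_date(leap_season, 153)
in_plain = season_doy_to_date(plain_season, 153)
```

The round trip through both helpers is exact in either season, whereas matching on a formatted month-day label would shift by one day after 29 February.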
_load_commodity_config(commodity) (_dashboard_config.py:64–75) is @functools.lru_cache(maxsize=8). It parses only the commodity subtree of the experiment YAML, bypassing ExperimentConfig's BaseSettings machinery — the dashboard needs only calendar and metadata fields, not the full pipeline configuration.
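The caching behaviour can be sketched with `functools.lru_cache` as below. This is a simplified stand-in, not the real loader: an in-memory dict with made-up values replaces the YAML file, and `load_commodity_subtree` is a hypothetical name.

```python
import functools
from typing import Any, Dict

# Stand-in for parsed experiment YAML documents (made-up values).
_EXPERIMENTS: Dict[str, Dict[str, Any]] = {
    "corn": {"commodity": {"name": "corn"}, "training": {"epochs": 10}},
    "wheat": {"commodity": {"name": "wheat"}, "training": {"epochs": 12}},
}

# Counts how often the expensive "parse" step actually runs.
parse_count = {"n": 0}


@functools.lru_cache(maxsize=8)
def load_commodity_subtree(commodity: str) -> Dict[str, Any]:
    """Return only the commodity subtree; repeat calls are served from cache."""
    parse_count["n"] += 1
    # Everything outside the "commodity" key is deliberately ignored.
    return _EXPERIMENTS[commodity]["commodity"]
```

One caveat of this pattern is that the cached dict is shared between callers, so it should be treated as read-only. `load_commodity_subtree.cache_clear()` forces a reload when the underlying file changes.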
4. Window-aware MAPE (PR #340)
score_hindcast(df) (_eval_shim.py:75–99) adds treefera_error, wasde_error, and improvement_pct columns. treefera_error and wasde_error are recomputed against results["truth"] when the user selects a different truth source.
WASDE-comparable fold filtering (app.py:289–320): only folds where WASDE actively publishes a new-crop forecast are included in MAPE headline cards. For all four commodities wasde_comparison_start_month = 5 (May). Fold positions are compared via schedule.fold_order rather than by calendar month, so cross-year crops (winter wheat) correctly exclude the November and January dormancy folds, even though November's calendar month is ≥ 5.
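The difference between a position-based filter and a naive calendar-month filter can be shown with a small sketch. The fold labels are invented, both function names are hypothetical, and the sketch assumes the first WASDE-comparable fold is already known; how `app.py` derives that fold from `wasde_comparison_start_month` is not shown here.

```python
from typing import List, Sequence


def wasde_comparable_folds(fold_order: Sequence[str], first_comparable: str) -> List[str]:
    """Keep folds at or after `first_comparable` in season order."""
    folds = list(fold_order)
    start = folds.index(first_comparable)
    return folds[start:]


def naive_month_filter(fold_order: Sequence[str], start_month: int) -> List[str]:
    """Calendar-month test on "MM-DD" labels, shown only for contrast."""
    return [f for f in fold_order if int(f.split("-")[0]) >= start_month]


# Hypothetical winter-wheat schedule in season order (autumn-sown crop).
wheat_folds = ["11-01", "01-15", "03-01", "05-01", "06-15"]

by_position = wasde_comparable_folds(wheat_folds, "05-01")
by_month = naive_month_filter(wheat_folds, 5)
```

Here `by_month` wrongly admits the November dormancy fold, while `by_position` keeps only the May and June folds.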
Pre/post-survey split (app.py:323–333): folds with cutoff_doy ≤ wasde_survey_doy are "pre-survey". Two additional metric cards appear — Pre-Survey Edge (% error reduction vs WASDE) and Pre-Survey Wins (forecast windows where Treefera is closer to truth) — only when pre-survey folds exist in the data.
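A sketch of the pre-survey cards follows. The exact formulas in `app.py` are not given in this document, so the sketch assumes that Pre-Survey Edge is the relative reduction in mean absolute error versus WASDE and that a win is any window where the Treefera error is strictly smaller. The function name and row layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional


def pre_survey_cards(rows: List[Dict[str, float]], wasde_survey_doy: int) -> Optional[Dict[str, float]]:
    """Compute the two pre-survey cards, or None when no pre-survey folds exist."""
    # Pre-survey folds: cutoff on or before the survey day-of-year.
    pre = [r for r in rows if r["cutoff_doy"] <= wasde_survey_doy]
    if not pre:
        return None  # cards are hidden in this case

    treefera_mae = mean(r["treefera_error"] for r in pre)
    wasde_mae = mean(r["wasde_error"] for r in pre)

    # Assumed definition: % error reduction relative to WASDE.
    edge_pct = (wasde_mae - treefera_mae) / wasde_mae * 100.0 if wasde_mae else 0.0
    wins = sum(1 for r in pre if r["treefera_error"] < r["wasde_error"])
    return {"edge_pct": edge_pct, "wins": wins, "windows": len(pre)}
```

Returning `None` for an empty pre-survey set mirrors the rule that the two extra cards appear only when such folds exist in the data.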
5. Chart sections
| Section | Builder | Description |
|---|---|---|
| Performance Summary | `metric_card` (`app_utils.py:123`) | Treefera MAPE, WASDE MAPE, Pre-Survey Edge, Pre-Survey Wins |
| Seasonal Accuracy | `build_fold_breakdown_chart` (`charts.py:196`) | Per-fold MAE grouped bars |
| Forecast Edge | `build_improvement_heatmap` (`charts.py:316`) | Year × fold heatmap (% of yield); green = Treefera closer |
| Advantage Over WASDE | `build_information_advantage_chart` (`charts.py:32`) | Signed advantage per fold, mean line, green/red fill |
| Forecast Evolution | `build_forecast_evolution_chart` (`charts_evolution.py:29`) | Continuous timeline with truth stars and WASDE-Jan gold star |
| Direction Correctness | `build_direction_correctness_chart` (`charts_evolution.py:245`) | Pre→harvest trajectory: does Treefera move towards truth? |
Expanders (not in the main render path): Detailed Performance Comparison, Vintage Accuracy Table (Subset / Full tabs), Per-Year Accuracy Breakdown, Leave-One-Out Stability, Detailed Results.
Vintage Accuracy Table (app.py:527–632): derives eight decision-point anchors (pre-planting −30 d, planting, 20/40/60/80% through season, pre-harvest −14 d, harvest), each snapped to the nearest configured fold via schedule.fold_to_init_date. Cross-year crops are handled by detecting gs_start_doy > gs_end_doy and offsetting the planting year by −1.
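The eight decision-point anchors and the cross-year offset can be sketched as below. The helper names, the rounding of percentage points, and the nearest-date snapping are assumptions for illustration; the dashboard itself snaps anchors through `schedule.fold_to_init_date`.

```python
from datetime import date, timedelta
from typing import Dict, List


def doy_to_date(year: int, doy: int) -> date:
    """Convert a 1-based calendar day-of-year to a date."""
    return date(year, 1, 1) + timedelta(days=doy - 1)


def decision_anchors(gs_start_doy: int, gs_end_doy: int, harvest_year: int) -> Dict[str, date]:
    """Build the eight anchors from growing-season start and end days."""
    # Cross-year crop: planting falls in the calendar year before harvest.
    planting_year = harvest_year - 1 if gs_start_doy > gs_end_doy else harvest_year
    planting = doy_to_date(planting_year, gs_start_doy)
    harvest = doy_to_date(harvest_year, gs_end_doy)
    season_days = (harvest - planting).days

    anchors = {"pre-planting": planting - timedelta(days=30), "planting": planting}
    for pct in (20, 40, 60, 80):
        anchors[f"{pct}%"] = planting + timedelta(days=round(season_days * pct / 100))
    anchors["pre-harvest"] = harvest - timedelta(days=14)
    anchors["harvest"] = harvest
    return anchors


def snap_to_fold(anchor: date, fold_dates: List[date]) -> date:
    """Hypothetical snapping rule: pick the fold init date nearest the anchor."""
    return min(fold_dates, key=lambda d: abs((d - anchor).days))
```

For example, `decision_anchors(274, 182, 2024)` places planting in 2023 because the start day-of-year exceeds the end day-of-year.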
Leave-One-Out Stability (app.py:818–855): removes each year from the WASDE-comparable rows and reports the change in overall MAE, labelling years as "hardest" or "best".
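The leave-one-out calculation can be sketched as follows. The function name and input layout are hypothetical, and the reading of "hardest" (removing the year lowers MAE the most) and "best" (removing it raises MAE the most) is an assumption about the labelling.

```python
from statistics import mean
from typing import Dict, List, Tuple


def leave_one_out_mae(abs_errors_by_year: Dict[int, List[float]]) -> Tuple[float, Dict[int, float]]:
    """Return overall MAE and the change in MAE when each year is removed.

    Needs at least two years; with one year the remainder would be empty.
    """
    all_errors = [e for errs in abs_errors_by_year.values() for e in errs]
    overall = mean(all_errors)

    deltas: Dict[int, float] = {}
    for held_out in abs_errors_by_year:
        rest = [
            e
            for year, errs in abs_errors_by_year.items()
            if year != held_out
            for e in errs
        ]
        # Negative delta: the held-out year was pulling MAE up.
        deltas[held_out] = mean(rest) - overall
    return overall, deltas


# Made-up absolute errors per year.
overall_mae, mae_deltas = leave_one_out_mae(
    {2021: [1.0, 1.0], 2022: [5.0, 5.0], 2023: [3.0, 3.0]}
)
hardest_year = min(mae_deltas, key=mae_deltas.get)
best_year = max(mae_deltas, key=mae_deltas.get)
```

A year with a delta near zero has little influence on the headline figure, which is the stability signal the expander is meant to surface.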
Mermaid Flow
```mermaid
flowchart TD
    ENV["$HINDCAST_RUNS_DIR\nor $INPUT_DATA_DIR/runs/"]
    LR["list_runs(runs_dir)\nrun_loader.py:102\n→ list[RunDescriptor]"]
    LAD["load_all_deliveries(commodity)\nrun_loader.py:140\n→ concat DataFrame"]
    LCC["_load_commodity_config(commodity)\n_dashboard_config.py:64\n(lru_cached, YAML commodity subtree only)"]
    FS["build_fold_schedule(commodity)\n_dashboard_config.py:261\n→ FoldSchedule"]
    SCORE["score_hindcast(df)\n_eval_shim.py:75\ntreefera_error, wasde_error, improvement_pct"]
    WFILT["WASDE-comparable fold filter\napp.py:289\nwasde_comparison_start_month=5"]
    CARDS["metric_card()\napp_utils.py:123\nTreefera MAPE / WASDE MAPE\nPre-Survey Edge / Wins"]
    CHARTS["Chart sections × 6\ncharts.py + charts_evolution.py\nfold_breakdown | heatmap |\ninformation_advantage |\nevolution | direction"]
    EXP["Expanders\nVintage Accuracy Table\nLeave-One-Out Stability\nDetailed Results"]
    RUNS["run_dir/delivery/\nTreefera_*_ADM0_Hindcast_*.csv\n(read-only)"]
    ENV --> LR
    RUNS --> LR
    LR --> LAD
    LCC --> FS
    LAD --> SCORE
    FS --> SCORE
    SCORE --> WFILT
    WFILT --> CARDS
    WFILT --> CHARTS
    CHARTS --> EXP
```
Invariants
- The dashboard never writes to `run_dir`. All paths are read-only.
- `_load_commodity_config` is cached; it never reloads the YAML within a session unless the cache is evicted (`maxsize=8` covers all four commodities).
- `load_all_deliveries` reads only ADM0 CSVs. ADM1 and ADM2 files are present in `delivery/` but ignored by the dashboard.
- MAPE headline cards restrict to `wasde_comparison_start_month=5` folds for all four commodities. Pre-May folds have no WASDE estimate, so including them would distort the WASDE MAPE and make the comparison meaningless.
- `fold_to_init_date` and `init_date_to_fold` use season-DOY arithmetic, not `strftime`, to avoid February-29 drift on crops that span a calendar year boundary.
- `weather_correction_bu_ac` and `improvement_pct` reference NASS as the stable benchmark regardless of the user's truth-source selection. The user-selectable truth source only affects `treefera_error` and `wasde_error`.
Failure Modes
- `FileNotFoundError` on startup (`app.py:65–73`): `HINDCAST_RUNS_DIR` does not exist or is empty. The sidebar shows a graceful error rather than a traceback. Fix: set `$HINDCAST_RUNS_DIR` to the runs root.
- No ADM0 CSV in `delivery/`: `list_runs` skips the run with a warning log. The run will not appear in the model picker.
- WASDE CSV missing (`_eval_shim.py:49–72`): `load_wasde_jan_actuals` raises `FileNotFoundError`. The gold-star WASDE-Jan point will be absent from the forecast evolution chart. Fix: ensure `$INPUT_DATA_DIR/data/wasde/wasde_{commodity}_us_yield.csv` exists.
- Cross-year crop fold ordering: `_display_fold_order` in `_chart_helpers.py` rotates the fold list so the active growing season (March+) comes first in chart display order. If `gs_start_doy > gs_end_doy` is misconfigured, dormancy folds will appear in the wrong chart position.
- PR #363 startup regression: three startup bugs were fixed in PR #363 (wrong `_CONFIGS_DIR` depth, renamed `WasdeLoader` API, missing `data/` prefix in WASDE path). Reverting any of these three changes will break startup.
- `@st.cache_data(ttl=60)` staleness: `_load_predictions` and `load_results` are cached with a 60-second TTL. A newly completed hindcast run will not appear in the model picker until the cache expires.
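The cross-year fold-ordering failure mode can be made concrete with a sketch of the rotation. This is not the code in `_chart_helpers.py`: `display_fold_order` is a hypothetical reimplementation that assumes "MM-DD" fold labels, treats the earliest fold in March or later as the pivot, and rotates only when `gs_start_doy > gs_end_doy`.

```python
from typing import List, Sequence


def display_fold_order(
    fold_order: Sequence[str],
    gs_start_doy: int,
    gs_end_doy: int,
    first_active_month: int = 3,
) -> List[str]:
    """Rotate season-ordered folds so the active season leads the chart."""
    folds = list(fold_order)

    # Same-year crops (or a misconfigured cross-year crop) are left unrotated.
    if gs_start_doy <= gs_end_doy:
        return folds

    active = [f for f in folds if int(f[:2]) >= first_active_month]
    if not active:
        return folds

    # Zero-padded "MM-DD" labels sort chronologically as strings.
    pivot = folds.index(min(active))
    return folds[pivot:] + folds[:pivot]


# Hypothetical winter-wheat folds in season order.
wheat = ["11-01", "01-15", "03-01", "05-01", "06-15"]
rotated = display_fold_order(wheat, gs_start_doy=274, gs_end_doy=182)
misconfigured = display_fold_order(wheat, gs_start_doy=60, gs_end_doy=182)
```

With the misconfigured day-of-year pair the cross-year test fails, no rotation happens, and the dormancy folds stay at the front of the chart order.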
Cross-references
- dashboard.md: full source-level detail for all `app/` modules
- deliver.md: writes the `Treefera_*_ADM0_Hindcast_*.csv` files this dashboard reads
- evaluate.md: produces `metrics_table.csv` and text reports; the dashboard does not read these directly
PRs
- PR #363: fixed three startup bugs; removed the `streamlit_applets_common` dependency; ported the `WasdeLoader` API.
- PR #340: added window-aware MAPE cards restricting WASDE comparison to May+ folds; added configurable truth-source selector; added generic vintage-subset decision-point table.