Score forecasts

score_forecasts(
  forecasts,
  truth,
  return_format = "wide",
  metrics = c("abs_error", "wis", "wis_components", "interval_coverage",
    "quantile_coverage"),
  use_median_as_point = FALSE
)

Arguments

forecasts

required data.frame with forecasts in the format returned by load_forecasts

truth

required data.frame with observed truth data in the format returned by load_truth

return_format

string: "long" returns long format with a column for "score_name" and a column for "score_value"; "wide" returns wide format with a separate column for each score. Defaults to "wide".

metrics

character vector of the metrics to be returned, with options "abs_error", "wis", "wis_components", "interval_coverage", and "quantile_coverage". Defaults to all five.

use_median_as_point

logical: TRUE uses the median as the point forecast when scoring; FALSE uses the point forecasts from the data when scoring. Defaults to FALSE.
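The effect of use_median_as_point can be illustrated with a small base R sketch. The data frame layout (columns type, quantile, value), the numbers, and the helper toy_abs_error are hypothetical and only mimic the idea; this is not the package's internal code.

```r
# Toy forecast for one target: one point forecast plus three quantiles.
# Column names and values are assumptions made for illustration.
fc <- data.frame(
  type = c("point", "quantile", "quantile", "quantile"),
  quantile = c(NA, 0.25, 0.5, 0.75),
  value = c(11, 8, 10, 12)
)
y <- 13  # observed truth (made up)

# Hypothetical helper: absolute error against either the median or the point forecast.
toy_abs_error <- function(fc, y, use_median_as_point = FALSE) {
  if (use_median_as_point) {
    centre <- fc$value[fc$type == "quantile" & fc$quantile == 0.5]
  } else {
    centre <- fc$value[fc$type == "point"]
  }
  abs(y - centre)
}

err_point <- toy_abs_error(fc, y)                               # |13 - 11| = 2
err_median <- toy_abs_error(fc, y, use_median_as_point = TRUE)  # |13 - 10| = 3
```

The two settings agree only when the point forecast equals the median, so scores computed under different settings are not directly comparable.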

Value

data.frame with scores. The result will have columns that identify the observation, namely model, forecast_date, location, horizon, temporal_resolution, target_variable, and target_end_date. Other columns will contain scores, depending on the metrics argument:

  • true_value is the observed truth at that location and target_end_date (always returned)

  • abs_error is the absolute error based on median estimate if use_median_as_point is TRUE, and absolute error based on point forecast if use_median_as_point is FALSE

  • wis is the weighted interval score

  • dispersion is the component of WIS made up of interval widths

  • overprediction is the component of WIS made up of overprediction of intervals

  • underprediction is the component of WIS made up of underprediction of intervals

  • coverage_X is the prediction interval coverage at alpha level X

  • quantile_coverage_0.X is the one-sided quantile coverage at quantile level 0.X
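As a rough illustration of the scores listed above, the following base R sketch computes the WIS, its three components, and the coverage indicators for a single toy quantile forecast, following the formula in Bracher et al. (2020). The quantile levels, values, and observation are made up, and the package's internal computation may differ in details such as how the median is weighted.

```r
# Toy quantile forecast with symmetric levels (hypothetical numbers).
quantile_levels <- c(0.1, 0.25, 0.5, 0.75, 0.9)
quantile_values <- c(6, 8, 10, 12, 14)
y <- 13  # observed value

# Pair the quantiles into K central prediction intervals around the median.
K <- (length(quantile_levels) - 1) / 2
lower <- quantile_values[seq_len(K)]        # 6, 8
upper <- rev(quantile_values)[seq_len(K)]   # 14, 12
alpha <- 2 * quantile_levels[seq_len(K)]    # 0.2, 0.5
med <- quantile_values[K + 1]               # 10

# WIS components; interval k has weight alpha_k / 2, the median has weight 1 / 2.
norm <- K + 0.5
dispersion <- sum(alpha / 2 * (upper - lower)) / norm
overprediction <- (sum(pmax(lower - y, 0)) + 0.5 * max(med - y, 0)) / norm
underprediction <- (sum(pmax(y - upper, 0)) + 0.5 * max(y - med, 0)) / norm
wis <- dispersion + overprediction + underprediction

# Coverage indicators: 1 if the observation is covered, 0 otherwise.
interval_coverage <- as.numeric(lower <= y & y <= upper)  # 80% and 50% intervals
quantile_coverage <- as.numeric(y <= quantile_values)     # one-sided, per quantile
```

Here the observation lies above the median and above the 50% interval, so the penalty falls entirely on underprediction: dispersion is 0.72, underprediction is 1, overprediction is 0, and the WIS is 1.72.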

If return_format is "long", also contains columns score_name and score_value where score_name is the type of score calculated and score_value has the numeric value of the score. If return_format is "wide", each calculated score is in its own column.
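The difference between the two layouts can be sketched with a toy table in base R. The model names and score values below are invented, and the row order of the real output may differ; the sketch only shows how one column per score maps onto score_name and score_value.

```r
# Toy "wide" result: one column per score (hypothetical values).
wide <- data.frame(
  model = c("model_a", "model_b"),
  horizon = c("1", "1"),
  abs_error = c(3, 5),
  wis = c(1.72, 2.4),
  stringsAsFactors = FALSE
)

# Stack the score columns into "long" format: one row per model and score.
score_cols <- c("abs_error", "wis")
long <- do.call(rbind, lapply(score_cols, function(s) {
  data.frame(
    wide[c("model", "horizon")],
    score_name = s,
    score_value = wide[[s]],
    stringsAsFactors = FALSE
  )
}))
```

The long layout is usually more convenient for plotting or grouped summaries across scores, while the wide layout is easier to read as a table.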

References

Bracher J, Ray EL, Gneiting T, Reich NG. (2020) Evaluating epidemic forecasts in an interval format. arXiv:2005.12881. https://arxiv.org/abs/2005.12881.

Examples

library(covidHubUtils) # provides load_latest_forecasts, load_truth, and score_forecasts
library(scoringutils)
#> Note: scoringutils is currently undergoing major development changes (with an update planned for the first quarter of 2024). We would very much appreciate your opinions and feedback on what should be included in this major update: https://github.com/epiforecasts/scoringutils/discussions/333
forecasts <- load_latest_forecasts(
  models = c("COVIDhub-ensemble", "UMass-MechBayes"),
  last_forecast_date = "2020-12-14",
  forecast_date_window_size = 7,
  locations = c("US"),
  targets = paste(1:4, "wk ahead inc death"),
  source = "zoltar"
)
#> get_token(): POST: https://zoltardata.com/api-token-auth/
#> get_resource(): GET: https://zoltardata.com/api/projects/
#> get_resource(): GET: https://zoltardata.com/api/project/44/models/
#> get_resource(): GET: https://zoltardata.com/api/project/44/timezeros/
truth <- load_truth("JHU", target_variable = "inc death", locations = "US")
score_forecasts(forecasts, truth)
#> Warning: The following warnings were produced when checking inputs:
#> 1.  Some forecasts have different numbers of rows 
#> 2.  (e.g. quantiles or samples). 
#> 3.  scoringutils found: 
#> 4.  1, 23
#> 5.  . This may be a problem (it can potentially distort scores, 
#> 6.  making it more difficult to compare them), 
#> 7.  so make sure this is intended.
#> Warning: The following warnings were produced when checking inputs:
#> 1.  Some forecasts have different numbers of rows 
#> 2.  (e.g. quantiles or samples). 
#> 3.  scoringutils found: 
#> 4.  1, 23
#> 5.  . This may be a problem (it can potentially distort scores, 
#> 6.  making it more difficult to compare them), 
#> 7.  so make sure this is intended.
#> # A tibble: 8 × 49
#>   model       location horizon temporal_resolution target_variable forecast_date
#>   <chr>       <chr>    <chr>   <chr>               <chr>           <date>       
#> 1 COVIDhub-e… US       1       wk                  inc death       2020-12-14   
#> 2 COVIDhub-e… US       2       wk                  inc death       2020-12-14   
#> 3 COVIDhub-e… US       3       wk                  inc death       2020-12-14   
#> 4 COVIDhub-e… US       4       wk                  inc death       2020-12-14   
#> 5 UMass-Mech… US       1       wk                  inc death       2020-12-13   
#> 6 UMass-Mech… US       2       wk                  inc death       2020-12-13   
#> 7 UMass-Mech… US       3       wk                  inc death       2020-12-13   
#> 8 UMass-Mech… US       4       wk                  inc death       2020-12-13   
#> # ℹ 43 more variables: target_end_date <date>, quantile_coverage_0.01 <dbl>,
#> #   quantile_coverage_0.025 <dbl>, quantile_coverage_0.05 <dbl>,
#> #   quantile_coverage_0.1 <dbl>, quantile_coverage_0.15 <dbl>,
#> #   quantile_coverage_0.2 <dbl>, quantile_coverage_0.25 <dbl>,
#> #   quantile_coverage_0.3 <dbl>, quantile_coverage_0.35 <dbl>,
#> #   quantile_coverage_0.4 <dbl>, quantile_coverage_0.45 <dbl>,
#> #   quantile_coverage_0.5 <dbl>, quantile_coverage_0.55 <dbl>, …

forecasts <- load_latest_forecasts(
  models = c("ILM-EKF"),
  hub = c("ECDC", "US"), last_forecast_date = "2021-03-08",
  forecast_date_window_size = 0,
  locations = c("GB"),
  targets = paste(1:4, "wk ahead inc death"),
  source = "zoltar"
)
#> get_token(): POST: https://zoltardata.com/api-token-auth/
#> get_resource(): GET: https://zoltardata.com/api/projects/
#> get_resource(): GET: https://zoltardata.com/api/project/238/models/
#> get_resource(): GET: https://zoltardata.com/api/project/238/timezeros/
truth <- load_truth("JHU",
  hub = c("ECDC", "US"),
  target_variable = "inc death", locations = "GB"
)
score_forecasts(forecasts, truth)
#> Warning: The following warnings were produced when checking inputs:
#> 1.  Some forecasts have different numbers of rows 
#> 2.  (e.g. quantiles or samples). 
#> 3.  scoringutils found: 
#> 4.  1, 23
#> 5.  . This may be a problem (it can potentially distort scores, 
#> 6.  making it more difficult to compare them), 
#> 7.  so make sure this is intended.
#> Warning: The following warnings were produced when checking inputs:
#> 1.  Some forecasts have different numbers of rows 
#> 2.  (e.g. quantiles or samples). 
#> 3.  scoringutils found: 
#> 4.  1, 23
#> 5.  . This may be a problem (it can potentially distort scores, 
#> 6.  making it more difficult to compare them), 
#> 7.  so make sure this is intended.
#> # A tibble: 4 × 49
#>   model   location horizon temporal_resolution target_variable forecast_date
#>   <chr>   <chr>    <chr>   <chr>               <chr>           <date>       
#> 1 ILM-EKF GB       1       wk                  inc death       2021-03-08   
#> 2 ILM-EKF GB       2       wk                  inc death       2021-03-08   
#> 3 ILM-EKF GB       3       wk                  inc death       2021-03-08   
#> 4 ILM-EKF GB       4       wk                  inc death       2021-03-08   
#> # ℹ 43 more variables: target_end_date <date>, quantile_coverage_0.01 <dbl>,
#> #   quantile_coverage_0.025 <dbl>, quantile_coverage_0.05 <dbl>,
#> #   quantile_coverage_0.1 <dbl>, quantile_coverage_0.15 <dbl>,
#> #   quantile_coverage_0.2 <dbl>, quantile_coverage_0.25 <dbl>,
#> #   quantile_coverage_0.3 <dbl>, quantile_coverage_0.35 <dbl>,
#> #   quantile_coverage_0.4 <dbl>, quantile_coverage_0.45 <dbl>,
#> #   quantile_coverage_0.5 <dbl>, quantile_coverage_0.55 <dbl>, …