Source: R/as_covid_hub_forecasts.R
Reformat model outputs stored as a model_output_tbl class (or similar) to
that of a data.frame formatted according to standards of the COVID-19
Forecasting Hub which can be processed by functions from the covidHubUtils
package such as score_forecasts() or plot_forecasts(). The supplied
model_output_tbl should have columns defining properties akin to
reference dates, locations, horizons, and targets.
as_covid_hub_forecasts(
  model_outputs,
  model_id_col = "model_id",
  reference_date_col = "forecast_date",
  location_col = "location",
  horizon_col = "horizon",
  target_col = "target",
  output_type_col = "output_type",
  output_type_id_col = "output_type_id",
  value_col = "value",
  temp_res_col = "temporal_resolution",
  target_end_date_col = "target_end_date"
)
model_outputs
  an object of class model_output_tbl containing model outputs (e.g.,
  predictions). Should have columns containing the following information:
  model name, reference date or target end date, location, horizon, target,
  temporal resolution, output type, output type id, and value. The temporal
  resolution may be included in the target column instead of a column of
  its own.
model_id_col
  character string of the name of the column containing the model name(s)
  for the forecasts. Defaults to "model_id". Should be set to NULL if no
  such column exists, in which case a model_id column will be created
  populated with the value "model_id".
reference_date_col
  character string of the name of the column containing the reference dates
  for the forecasts. Defaults to "forecast_date". Should be set to NULL if
  no such column exists, in which case the column will be created using the
  following information: horizon, target end date, and temporal resolution.
location_col
  character string of the name of the column containing the locations for
  the forecasts. Defaults to "location".
horizon_col
  character string of the name of the column containing the horizons for
  the forecasts. Defaults to "horizon".
target_col
  character string of the name of the column containing the targets for
  the forecasts. Defaults to "target". If temp_res_col is NULL, the target
  column in model_outputs is assumed to contain targets of the form
  "temporal_resolution target" or "temporal_resolution ahead target", such
  as "wk ahead inc flu hosp" or "wk inc flu hosp".
output_type_col
  character string of the name of the column containing the output types
  for the forecasts. Defaults to "output_type".
output_type_id_col
  character string of the name of the column containing the output type ids
  for the forecasts. Defaults to "output_type_id".
value_col
  character string of the name of the column containing the values for the
  forecasts. Defaults to "value".
temp_res_col
  character string of the name of the column containing the temporal
  resolutions for the forecasts. Defaults to "temporal_resolution". Should
  be set to NULL if no such column exists, in which case the column will be
  created from the existing target column.
target_end_date_col
  character string of the name of the column containing the target end
  dates for the forecasts. Defaults to "target_end_date". Should be set to
  NULL if no such column exists, in which case the column will be created
  using the following information: horizon, forecast date, and temporal
  resolution.
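The arguments above describe three columns that can be derived when they are absent: the temporal resolution (from the target string) and the reference date or target end date (each from the other, together with horizon and temporal resolution). The following is a minimal base R sketch of what such derivations could look like, not the package's implementation. The names `split_target` and `days_per_unit` are hypothetical, and the date arithmetic assumes 7 days per "wk" and 1 per "day"; the package's own date logic may differ, for instance if weekly targets are aligned to particular weekdays.

```r
# Hypothetical helpers for illustration only; not part of covidHubUtils.

# Split "wk ahead inc flu hosp" or "wk inc flu hosp" into the temporal
# resolution ("wk") and the remaining target ("inc flu hosp").
split_target <- function(target) {
  pattern <- "^(\\S+) (?:ahead )?(.+)$"
  data.frame(
    temporal_resolution = sub(pattern, "\\1", target, perl = TRUE),
    target_variable = sub(pattern, "\\2", target, perl = TRUE),
    stringsAsFactors = FALSE
  )
}

# Assumed number of days per temporal resolution unit.
days_per_unit <- c(day = 1, wk = 7)

parts <- split_target(c("wk ahead inc flu hosp", "wk inc flu hosp"))

# Derive a target end date from a reference date and a horizon ...
forecast_date <- as.Date("2020-12-14")
horizon <- 2
target_end_date <- forecast_date + horizon * days_per_unit[["wk"]]

# ... and recover the reference date by inverting the same arithmetic.
recovered_date <- target_end_date - horizon * days_per_unit[["wk"]]
```

Both target forms yield the same pair of values here, which is why a single target column can stand in for a separate temporal resolution column.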
a data.frame of reformatted model outputs that may be fed into any of the
covidHubUtils functions with 10 total columns: model, forecast_date,
location, horizon, temporal_resolution, target_variable, target_end_date,
type, quantile, value. Other columns are removed.
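To make the shape of the return value concrete, here is a small base R sketch of the renaming and column selection it implies. The old-to-new name mapping is an assumption inferred from the default argument names and the 10 output columns listed above, and the toy table's values are made up for illustration; this does not call or reproduce as_covid_hub_forecasts() itself.

```r
# Toy hub-style table; all values are invented for illustration.
hub_style <- data.frame(
  model_id = "example-model",
  forecast_date = as.Date("2020-12-14"),
  location = "US",
  horizon = 1,
  temporal_resolution = "wk",
  target = "inc death",
  target_end_date = as.Date("2020-12-21"),
  output_type = "quantile",
  output_type_id = 0.5,
  value = 100,
  extra_note = "not part of the output format",
  stringsAsFactors = FALSE
)

# Assumed mapping from hub-style names to covidHubUtils-style names.
name_map <- c(
  model_id = "model",
  target = "target_variable",
  output_type = "type",
  output_type_id = "quantile"
)

# Rename only the columns that appear in the mapping.
renamed <- hub_style
to_rename <- names(renamed) %in% names(name_map)
names(renamed)[to_rename] <- unname(name_map[names(renamed)[to_rename]])

# Keep the 10 documented columns, in the documented order; drop the rest.
covid_cols <- c(
  "model", "forecast_date", "location", "horizon", "temporal_resolution",
  "target_variable", "target_end_date", "type", "quantile", "value"
)
covid_style <- renamed[, covid_cols]
```

The extra_note column disappears in the last step, mirroring the statement that other columns are removed.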
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
forecasts <- load_forecasts(
  models = c("COVIDhub-ensemble", "UMass-MechBayes"),
  dates = "2020-12-14",
  date_window_size = 7,
  locations = c("US"),
  targets = paste(1:4, "wk ahead inc death"),
  source = "zoltar"
)
#> get_token(): POST: https://zoltardata.com/api-token-auth/
#> get_resource(): GET: https://zoltardata.com/api/projects/
#> get_resource(): GET: https://zoltardata.com/api/project/44/models/
#> get_resource(): GET: https://zoltardata.com/api/project/44/timezeros/
altered_forecasts <- forecasts |> # Alter forecasts to not be CovidHub format
  dplyr::rename(model_id = model, output_type = type, output_type_id = quantile) |>
  dplyr::mutate(target_variable = "wk ahead inc death", horizon = as.numeric(horizon)) |>
  dplyr::select(-temporal_resolution)
formatted_forecasts <- as_covid_hub_forecasts(
  altered_forecasts,
  target_col = "target_variable",
  temp_res_col = NULL
) |>
  dplyr::mutate(horizon = as.character(horizon))
testthat::expect_equal(formatted_forecasts, dplyr::select(forecasts, model:value))