
Loads the AvonCap data from a set of csv files whose names may optionally be qualified by site ('BRI' or 'NBT') and database year ('y1', 'y2', 'y3'). The function selects the most recent files earlier than the reproduce_at date and detects whether they belong to a set of files.

Usage

load_data(
  type,
  subtype = NULL,
  reproduce_at = as.Date(getOption("reproduce.at", default = Sys.Date())),
  merge = TRUE,
  ...
)
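
The default for reproduce_at is read from the reproduce.at option and falls back to today's date. A minimal base R sketch of how that default expression resolves; the date below is an arbitrary illustration, not a value taken from the package:

```r
# Remember any existing setting so it can be restored afterwards
old <- options(reproduce.at = "2022-06-01")  # arbitrary example date

# The expression used as the default for `reproduce_at`
cutoff <- as.Date(getOption("reproduce.at", default = Sys.Date()))
print(cutoff)

# With the option unset, the default falls back to today's date
options(reproduce.at = NULL)
fallback <- as.Date(getOption("reproduce.at", default = Sys.Date()))
print(fallback)

options(old)  # restore the previous setting
```

Setting the option once per session is one way to keep every call on the same cut-off date without passing reproduce_at explicitly.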

Arguments

type

the file category; see valid_inputs() for the current list in the input directory

subtype

the subtype from valid_inputs()

reproduce_at

the date at which to cut off newer data files

merge

setting to TRUE forces multiple files to be merged into a single data frame by dropping mismatching columns

...

passed to cached; you may specifically want to use `.nocache=TRUE`

Value

either a list of dataframes or a single merged dataframe
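
Because the return value can take either shape, calling code may need to handle both. A minimal base R sketch, in which `count_rows` is a hypothetical helper (not part of the package) and the data frames are toy stand-ins for loaded files:

```r
# Hypothetical helper: total row count for either return shape.
# The data frame test comes first because a data frame is also a list.
count_rows <- function(x) {
  if (is.data.frame(x)) {
    nrow(x)                           # single merged data frame
  } else {
    sum(vapply(x, nrow, integer(1)))  # list of data frames
  }
}

# Toy stand-ins for the two possible return values
merged   <- data.frame(id = 1:3)
unmerged <- list(
  a = data.frame(id = 1:2),
  b = data.frame(id = 3L, extra = "x")
)

print(count_rows(merged))
print(count_rows(unmerged))
```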

Details

The files are loaded as csv and checked for whether they have (A) the same columns, (B) the same types (or are empty), and (C) any major parse issues. The files are then merged into a single dataframe if possible; otherwise the individually loaded files are returned as a list of dataframes.
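
The column check and merge described above can be illustrated with a short base R sketch. This is not the package's implementation: the helper name `merge_shared` and the toy data are assumptions for illustration only, and the type and parse checks are omitted. The sketch reports columns that are not present in every file, then stacks the rows using only the shared columns.

```r
# Hypothetical helper, not the package's code
merge_shared <- function(dfs) {
  all_cols    <- unique(unlist(lapply(dfs, names)))
  shared_cols <- Reduce(intersect, lapply(dfs, names))
  mismatched  <- setdiff(all_cols, shared_cols)
  if (length(mismatched) > 0) {
    message("Dropping mismatching column(s): ",
            paste(mismatched, collapse = ", "))
  }
  # Keep only the shared columns, then stack the rows
  do.call(rbind, lapply(dfs, function(d) d[, shared_cols, drop = FALSE]))
}

# Toy stand-ins for two loaded files; only the first has a `weight` column
files <- list(
  data.frame(record_number = c("B1", "B2"), age = c(83.4, 32.5),
             weight = c(70, 82)),
  data.frame(record_number = "B3", age = 77.4)
)

merged <- merge_shared(files)
print(dim(merged))
print(names(merged))
```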

Examples

try(load_data("nhs-extract","deltave"))
#> caching item: ~/.cache/avoncap/data-6c0a3f301ee14020e3907a7472c55225-efc310106a2b36fab3e67f93fe2c9461.rda
#> INCONSISTENT COLUMN(S) IN FILES: weight
#> Loaded 18009 rows from 6 files, (777+5347+1706+893+7105+2181=18009)
#> # A tibble: 18,009 × 243
#>    record_number nhs_number admission_date gender age_at_admission   imd
#>  * <chr>              <dbl> <chr>           <dbl>            <dbl> <dbl>
#>  1 B02976        1829824764 17/05/2021          2             83.4     3
#>  2 B02977         572676760 17/05/2021          2             32.5     2
#>  3 B02978         427783973 17/05/2021          2             77.4     3
#>  4 B02980        1808110131 17/05/2021          1             90.6     3
#>  5 B02984         485842687 17/05/2021          2             60.2     1
#>  6 B02993         290021094 17/05/2021          2             65.6     8
#>  7 B02994         571224641 17/05/2021          1             64.1     2
#>  8 B03018         427913753 18/05/2021          1             74.7     4
#>  9 B03020        1988976150 18/05/2021          2             65.4     4
#> 10 B03024         430812585 18/05/2021          2             75.7     9
#> # ℹ 17,999 more rows
#> # ℹ 237 more variables: rockwood <dbl>, symptom_days_preadmit <dbl>,
#> #   fever2 <dbl>, pleurtic_cp <dbl>, cough2 <dbl>, sput_prod <dbl>,
#> #   dyspnoea <dbl>, tachypnoea2 <dbl>, anosmia <dbl>, ageusia <dbl>,
#> #   dysgeusia <dbl>, fever <dbl>, hypothermia <dbl>, chills <dbl>,
#> #   headache <dbl>, malaise <dbl>, wheeze <dbl>, myalgia <dbl>,
#> #   worse_confusion <dbl>, general_det <dbl>, hr <dbl>, systolic_bp <dbl>, …