Cross compare subgroups of data to each other — var_group

This function helps construct group-wise cross-correlation matrices and other between-column comparisons from a dataframe. We assume the data has a major grouping and a set of data columns that we wish to compare to each other. The columns to compare are specified, either as a formula or as a tidyselect expression, when the var_grp_df is created with var_group; this function then compares those columns pairwise within each major group.

Usage

var_group_compare(var_grp_df, ..., .diagonal = FALSE)

Arguments

var_grp_df: a data frame with major and data groupings, as created by var_group
...: a set of named functions. Each function must take 2 vectors of the type of the columns being compared and generate a single result (which may be a complex S3 object such as an lm). Such functions might be, for example, chisq.test for factor columns or cor for numeric columns.
.diagonal: should a column be compared with itself? The default is FALSE, which is usually what is wanted.
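
The contract for the functions passed in `...` can be checked outside this package. The following is a minimal sketch in base R only, using the built-in iris and mtcars datasets; the variable names are illustrative and not part of this package. It shows one function returning a plain number and one returning an S3 object from which a scalar can be extracted.

```r
# Base R only; iris and mtcars ship with R.

# Numeric columns: cor() takes two numeric vectors and returns one number.
r <- cor(iris$Sepal.Length, iris$Sepal.Width)

# Factor columns: chisq.test() takes two factors and returns an S3 object
# of class "htest". suppressWarnings() hides the small-expected-count
# warning that this small table triggers.
cyl <- factor(mtcars$cyl)
gear <- factor(mtcars$gear)
test <- suppressWarnings(chisq.test(cyl, gear))

# Extracting a single element from the S3 object gives a plain numeric value.
p <- test$p.value
```

A function that returns a single number, like `cor`, should give an ordinary column in the result, whereas one that returns a whole object, like `chisq.test`, would be expected to give a list column, as described under Value.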

Value

a dataframe containing the major (z) groupings and the unique pairwise combinations of the data columns, given by name in the y and x columns. The named comparisons provided in ... form the other columns. If a comparison does not return a primitive type, its column will be a list column.
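
As a rough illustration of this shape, the following base R sketch builds a comparable table by hand for iris grouped by Species. It is not this package's implementation, and `num_cols`, `col_pairs` and `result` are names made up for the sketch. It assumes every pair of distinct columns appears in both orders, which is what the first example below shows.

```r
# Base R sketch of the output shape; not the package's implementation.
num_cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# All ordered (y, x) pairs of data columns.
col_pairs <- expand.grid(y = num_cols, x = num_cols, stringsAsFactors = FALSE)

# Drop self-comparisons, mirroring .diagonal = FALSE.
col_pairs <- col_pairs[col_pairs$y != col_pairs$x, ]

# For each major group, apply the comparison to every remaining pair.
per_group <- lapply(split(iris, iris$Species), function(g) {
  data.frame(
    Species = g$Species[1],
    y = col_pairs$y,
    x = col_pairs$x,
    correlation = mapply(
      function(y, x) cor(g[[y]], g[[x]]),
      col_pairs$y, col_pairs$x,
      USE.NAMES = FALSE
    ),
    row.names = NULL
  )
})
result <- do.call(rbind, per_group)

# 3 groups x 12 ordered pairs of distinct columns
nrow(result)
```

Each row holds one group, one (y, x) pair, and one value per named comparison, so adding a second comparison would add a column rather than rows.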

Details

Although the examples here work as written, we generally expect calls like these to be wrapped in a function inside a package, where the comparisons are pre-defined and the var_group framework is hidden from the user.

Examples

# Pairwise correlations between the data columns, within each Species
iris %>% dplyr::group_by(Species) %>% var_group(~ .) %>%
  var_group_compare(
    correlation = cor
  )
#> 3 group(s): Species.
#> (subgroup) y ~ x + correlation (data)
#> # A tibble: 36 × 4
#>    Species y            x            correlation
#>  * <fct>   <chr>        <chr>              <dbl>
#>  1 setosa  Petal.Length Petal.Width        0.332
#>  2 setosa  Petal.Length Sepal.Length       0.267
#>  3 setosa  Petal.Length Sepal.Width        0.178
#>  4 setosa  Petal.Width  Petal.Length       0.332
#>  5 setosa  Petal.Width  Sepal.Length       0.278
#>  6 setosa  Petal.Width  Sepal.Width        0.233
#>  7 setosa  Sepal.Length Petal.Length       0.267
#>  8 setosa  Sepal.Length Petal.Width        0.278
#>  9 setosa  Sepal.Length Sepal.Width        0.743
#> 10 setosa  Sepal.Width  Petal.Length       0.178
#> # ℹ 26 more rows

# Chi-squared p-values between every pair of factor columns (no major grouping)
ggplot2::diamonds %>% var_group(tidyselect::where(is.factor)) %>%
  var_group_compare(
    chi.p.value = ~ stats::chisq.test(.x, .y)$p.value
  )
#> 1 group(s): .
#> (subgroup) y ~ x + chi.p.value (data)
#> # A tibble: 6 × 3
#>   y       x       chi.p.value
#> * <chr>   <chr>         <dbl>
#> 1 clarity color      0       
#> 2 clarity cut        0       
#> 3 color   clarity    0       
#> 4 color   cut        1.39e-51
#> 5 cut     clarity    0       
#> 6 cut     color      1.39e-51