
This is a supporting utility for functions with a signature of function(df, ...) that operate on different groups of columns and need the user to supply those column groups in a simple way. Two or three levels of column grouping can be specified in this style of function. They are generally referred to as z (i.e. group, or cohort), y (i.e. subgroup, or response) and x (i.e. data). In some configurations, only z and x are available.

Usage

var_group(df, ..., .infer_y = FALSE)

Arguments

df

a data frame which may be grouped

...

a specification for the groupings which may be one of:

  • A formula or list of formulae, e.g. y1 + y2 ~ x1 + x2 (z: from df grouping). A . can be used to specify the rest of the columns, e.g. y1 + y2 ~ .

  • A list of symbols (x1, x2, ..., z:from df grouping, y:empty)

  • A list of quosures (e.g. dplyr::vars(x1,x2)) (x, z:from df grouping, y:empty)

  • One tidyselect specification (x, z:from df grouping, y:empty)

  • Two tidyselect specifications (x, y, z:from df grouping)

  • Three tidyselect specifications (x, y, z, N.B. df must be ungrouped for this to work)

  • Column names as strings (x, z:from df grouping, y:empty)
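As a rough illustration of how a formula specification might map onto the z, y and x groups, the following base R sketch resolves a two-sided formula into column names. The helper resolve_formula_groups is hypothetical and is not part of the package, and the way it expands . (to every column not already used in z or on the other side of the formula) is an assumption rather than the documented behaviour of var_group.

```r
# Hypothetical helper, NOT part of the package: resolves a two-sided formula
# into y (left-hand side) and x (right-hand side) column names. z is passed in
# explicitly here, whereas var_group takes it from the data frame grouping.
resolve_formula_groups = function(df, f, z = character()) {
  stopifnot(inherits(f, "formula"), length(f) == 3)
  y = all.vars(f[[2]])  # variables on the left of ~
  x = all.vars(f[[3]])  # variables on the right of ~
  # Assumption: "." expands to every column not already in z or on the other side
  if (identical(y, ".")) y = setdiff(names(df), c(x, z))
  if (identical(x, ".")) x = setdiff(names(df), c(y, z))
  list(z = z, y = y, x = x)
}

groups = resolve_formula_groups(iris, . ~ Petal.Width + Sepal.Width, z = "Species")
str(groups)
```

Under these assumptions, the . on the left-hand side picks up the two iris measurement columns that are neither named on the right-hand side nor used for grouping.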

.infer_y

if only z and x are defined, make y the remaining columns of the data frame
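A minimal base R sketch of what this inference amounts to, assuming that "the rest" means every column not already assigned to z or x (the variable names below are purely illustrative):

```r
# Assumption: with .infer_y = TRUE, y becomes every column not already in z or x.
z = "Species"                          # e.g. taken from the df grouping
x = c("Sepal.Length", "Sepal.Width")   # e.g. supplied by the user
inferred_y = setdiff(names(iris), c(z, x))
print(inferred_y)
```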

Value

a var_grp_df with defined z, y and x column groups, for use within the var_group_* framework.

Examples

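# Formula specification: x is Petal.Width and Sepal.Width, y is the rest of
# the columns (via .), and z comes from the grouping (Species)
tmp = iris %>% dplyr::group_by(Species) %>% var_group(. ~ Petal.Width + Sepal.Width)

# Two tidyselect specifications: x is the Sepal columns, y is the Petal
# columns, and z comes from the grouping (Species)
tmp = iris %>% dplyr::group_by(Species) %>%
  var_group(tidyselect::starts_with("Sepal"), tidyselect::starts_with("Petal"))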