family of helper functions to identify columns to drop from adsl data set
Usage
adsl_identify(
data,
dict = NULL,
type = c("dttm", "constant", "combined", "flag", "factor", "redundant"),
dict_label = "label",
dict_param = "param",
id = "SUBJID",
trt = "TRT01A",
black_list = c("RANDNO", "SITEID", "SITENAM", "INVID", "INVNAM")
)
adsl_identify_dttm(data)
adsl_identify_constant(data)
adsl_identify_combined(
data,
dict = NULL,
dict_label = "label",
dict_param = "param"
)
adsl_identify_redundant(data, id, trt, clmn_flag)
adsl_identify_flag(data, dict, dict_param = "param", dict_label = "label")
adsl_identify_factor(
data,
id,
clmn_flag = NULL,
dict,
dict_param = "param",
dict_label = "label"
)
adsl_identify_factor_data(data)
Arguments
- data
adsl-like data set in which to identify particular columns of interest
- dict, dict_param, dict_label
dict is
tibble
as created byadsl_dict()
wheredict_param
anddict_label
indicate the columns indict
containing for parameter names (column names ofdata
) and labels, resp.- type
character vector determining the categories of column types to identify. defaults all possible categories:
dttm
,constant
,combined
,flag
,factor
,redundant
- id, trt
user-selected column names in
data
for ID and treatment column, defaulting toSUBJID
andTRT01A
, resp.- black_list
character vector of columns that should be dropped for most analyses, see details.
- clmn_flag
(factor and redundants only) character vector of names identified as flags
Value
list with two top level entries, where to_remove
is a list of
column names from data
that were identified as candidates for a given category
and lev_list
a list
required to set factor level orders.
Details
Columns meeting the following criteria are returned
adsl_identify_dttm()
: methods::is(.x, "Date")
is TRUE,
the label contains strings 'year', 'month', 'day', 'date' or 'time'
(not case sensitive), class is one of 'difftime',
'hms', 'Period', 'POSIXct', 'POSIXt', 'Date'
adsl_identify_constant()
: identification via
janitor::remove_empty(which = 'cols')
, janitor::remove_constant(na.rm = TRUE)
adsl_identify_redundant()
: redundant columns to selected trt and id columns
adsl_identify_combined()
: if labels (from dict) contain '/'
and all parts are column names themselves
By default, black_list
contains RANDNO
,
SITEID
, SITENAM
, INVID
, INVNAM
.