Creating a specification for building a wide format data set from ADaM data
Source: R/adam_spec.R
adam_spec.Rd
adam_spec() is a wrapper for the adam_spec_*() functions.
It creates a list of specifications on how to extract and process data from ADaM data sets in a given location.
The resulting list can be passed to build(), where the created specs are applied and the generated data sets are combined into a single wide format data set.
Usage
adam_spec(
  path,
  filter = NULL,
  keep = NULL,
  drop = NULL,
  pre_study = FALSE,
  attach_data = TRUE,
  id = "SUBJID",
  trt = "TRT01A",
  add_bds = NULL,
  file_ext = c("rds", "sas7bdat"),
  fct_levels = NULL,
  catalog_file = NULL
)
Arguments
- path
path to a directory containing ADaM data sets (ads files) in .rds or .sas7bdat format (see file_ext)
- filter
a character vector of conditions to be passed to dplyr::filter(), e.g. regarding visits, treatment arms or parameters. Defaults to NULL.
- keep, drop
character vectors controlling the subset of data sets in the given path to create the specification for (e.g. c('adsl', 'advs')). If both keep and drop are specified, only keep will be used. Both default to NULL, which means that all (known) domains are included.
- pre_study
boolean. Include only pre-study events from occurrence data sets (see adam_spec_occds() for details). Defaults to FALSE.
- attach_data
boolean indicating whether the imported raw data is included in the output. Defaults to TRUE.
- id, trt
id and treatment column names (see e.g. adam_spec_adsl() for details).
- add_bds
character vector of domain names of type bds that are not (yet) included in the package library of ADaM types but should be processed as usual, e.g. 'adfapr'.
- file_ext
character vector of file extensions to consider. Only rds and sas7bdat data sets are allowed (e.g. file_ext = 'rds'). The user may select only sas7bdat, only rds, or set a prioritization rule (file_ext = c('rds', 'sas7bdat'), see Details). Defaults to c('rds', 'sas7bdat'), i.e. rds if available, sas7bdat otherwise.
- fct_levels
optional list of named vectors providing code-decode pairs and/or setting the level order for factors in an adsl data set (see the Details section of adam_spec_adsl() for the structure).
- catalog_file
path to the catalog file to be passed to haven::read_sas() for adsl. Defaults to NULL. Ignored if the adsl file is not a sas7bdat file.
Value
adam_spec() returns a named list of specifications that can be passed to the build() function.
Each element contains the specification for a single data set and is named with the domain abbreviation (e.g. adsl, adqskccq).
The list can be manually adjusted if required, e.g. by adding further specifications or altering existing ones. See the documentation of the adam_spec_*() functions for a detailed description of the output object.
Details
adam_spec() matches file names in the given path against an internal library to decide which adam_spec_*() function to use for which data set.
Only files in the library will be processed; the rest will be ignored. Names of unprocessed files will be printed to the console.
For those, specifications may be created manually using the appropriate adam_spec_*() function and appended to the specification list created by adam_spec().
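The matching step can be sketched in base R as a lookup of file names in a named vector. The lookup table below is a made-up stand-in for the internal library (its real contents are not shown in this documentation), and match_domains() is a hypothetical helper; the handling of add_bds follows the argument description above.

```r
# Hypothetical stand-in for the internal library: domain name -> ADaM type.
adam_types <- c(adsl = "adsl", advs = "bds", adae = "occds")

# Hypothetical helper: map files to ADaM types and report unmatched files.
match_domains <- function(files, lib, add_bds = NULL) {
  # Domain name = lower-case file name without directory and extension
  domains <- tolower(tools::file_path_sans_ext(basename(files)))
  # Domains passed via add_bds are treated as type "bds"
  if (length(add_bds) > 0) {
    lib[tolower(add_bds)] <- "bds"
  }
  known <- domains %in% names(lib)
  if (any(!known)) {
    message("Not processed: ", paste(files[!known], collapse = ", "))
  }
  # Named vector: domain -> type, for the files found in the library
  lib[domains[known]]
}

files <- c("adsl.sas7bdat", "advs.sas7bdat", "adfapr.sas7bdat", "adxx.sas7bdat")
matched <- match_domains(files, adam_types, add_bds = "adfapr")
```

In this sketch adxx.sas7bdat is reported and skipped, while adfapr is picked up as a bds domain because it was named in add_bds.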
By specifying e.g. file_ext = 'rds', only rds data will be considered for building the specification. To use only sas7bdat, analogously specify file_ext = 'sas7bdat'.
Preferred file types can be specified using a character vector, file_ext = c('rds', 'sas7bdat'): if the same file name is found in path with both extensions, the file with the former extension is used and the one with the latter is ignored. A file name that exists with only one extension (either only sas7bdat or only rds) is used regardless of which extension it has.
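The preference rule can be illustrated with a small base R sketch. It is not the package's code: pick_files() is a hypothetical helper that ranks each file by the position of its extension in file_ext and keeps the best-ranked file per name.

```r
# Hypothetical helper: resolve duplicate file names by extension preference.
pick_files <- function(files, file_ext = c("rds", "sas7bdat")) {
  ext  <- tolower(tools::file_ext(files))
  stem <- tools::file_path_sans_ext(files)
  # Keep only files with an allowed extension
  ok <- ext %in% file_ext
  files <- files[ok]
  ext   <- ext[ok]
  stem  <- stem[ok]
  # Sort by name, then by position of the extension in file_ext
  ord   <- order(stem, match(ext, file_ext))
  files <- files[ord]
  stem  <- stem[ord]
  # The first file per name is the preferred one
  files[!duplicated(stem)]
}

files <- c("adsl.rds", "adsl.sas7bdat", "advs.sas7bdat", "adae.rds", "notes.txt")

preferred <- pick_files(files)                          # rds wins for adsl
sas_only  <- pick_files(files, file_ext = "sas7bdat")   # rds files are skipped
```

Here adsl exists with both extensions, so only adsl.rds is kept under the default, whereas advs.sas7bdat and adae.rds are both used because their names are unambiguous.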
An individual filter is applied only if the resulting data set has at least one row; filters that cause an error or yield a 0-row data set are ignored.
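The filter behaviour can be sketched as follows. Several things here are assumptions made to keep the example dependency-free: the conditions are evaluated with base R instead of dplyr::filter(), they are applied one after another, and apply_filters() as well as the example data are hypothetical.

```r
# Hypothetical helper: apply each condition only if it runs without error
# and leaves at least one row; otherwise skip it.
apply_filters <- function(data, filter = NULL) {
  for (cond in filter) {
    result <- tryCatch(
      {
        # Evaluate the condition string with the data columns in scope
        rows <- which(eval(parse(text = cond), data, baseenv()))
        data[rows, , drop = FALSE]
      },
      error = function(e) NULL  # conditions causing errors are ignored
    )
    if (!is.null(result) && nrow(result) > 0) {
      data <- result
    }
  }
  data
}

# Made-up example data
advs <- data.frame(
  PARAMCD = c("SYSBP", "SYSBP", "DIABP"),
  AVISITN = c(0, 1, 0)
)

out <- apply_filters(
  advs,
  c("PARAMCD == 'SYSBP'",  # applied
    "AVISITN == 99",       # would yield 0 rows, so it is skipped
    "NOVAR > 1")           # refers to a missing column, so it is skipped
)
```

Only the first condition takes effect in this sketch, so a condition that does not fit a particular data set leaves that data set unchanged instead of emptying it.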
Please refer to the documentation of the adam_spec_*() functions for full details.