[Maturing]

build() builds a machine learning data set from a specification object as provided by adam_spec() (with or without data already attached).

Usage

build(spec, join = dplyr::inner_join, rm = FALSE)

Arguments

spec

a specification object as provided by adam_spec() (either spec or path has to be provided)

join

either a function to join data sets (e.g. dplyr::full_join()) or a character vector giving the names of the data sets containing the .ids to keep (e.g. join = c('adxb', 'adlb')). Defaults to dplyr::inner_join.

rm

logical. Defaults to FALSE. If TRUE, a repeated measurement feature matrix with an additional .rmtime column is prepared (experimental).
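The effect of the join function on which subjects end up in the result can be illustrated with plain dplyr calls. The following is a minimal sketch on two made-up tables (the table names, columns, and values are assumptions for illustration only); it does not call build() itself and only shows how inner_join and full_join differ in the .ids they keep.

```r
library(dplyr)

# Hypothetical toy tables; names and values are made up for illustration.
adsl <- data.frame(.id = c("01", "02", "03"), age = c(54, 61, 47))
adlb <- data.frame(.id = c("02", "03", "04"), alt = c(31, 28, 40))

# inner_join keeps only subjects present in both tables.
inner <- inner_join(adsl, adlb, by = ".id")

# full_join keeps every subject; values missing from one table become NA.
full <- full_join(adsl, adlb, by = ".id")

nrow(inner)
nrow(full)
```

With an inner join, subjects missing from any one data set are dropped, so a full join may be preferable when the downstream model can handle missing values.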

Value

build() returns a wide data set with one row per subject and standardized column names for the subject id (.id) and the treatment variable (.trt), if it is provided in the spec object. Objects with additional information on the data are provided in the attributes of the returned object.

dict

param

original parameter name in the source data

column

column name of the variable in the returned data. column is derived from param by transforming it into a valid file name and possibly adding a time extension, if multiple time points are considered for a particular parameter.

label

parameter label

source

source id provided by the specification object. If created with adam_spec(), this is the name of the domain.

type

ADaM data type of the source data (adsl, bds or occds)

unit

parameter unit (if applicable)

time

measurement time point (if applicable)

spec_id

name of the corresponding spec entry (if applicable)

source

file path and md5 checksums of the source data sets
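Attributes of an R object can be read with attr(). The sketch below uses a mock object in place of a real build() result, and assumes that the dictionary is stored as a data frame in an attribute named dict with the columns listed above; both the mock data and that storage layout are assumptions, not confirmed behaviour of the package.

```r
# Mock stand-in for a build() result; data and attribute layout are assumed.
ml_data <- structure(
  data.frame(.id = c("01", "02"), age = c(54, 61)),
  dict = data.frame(
    param = "AGE",
    column = "age",
    label = "Age",
    source = "adsl",
    type = "adsl",
    stringsAsFactors = FALSE
  )
)

# Retrieve the dictionary and look up the label of a column.
dict <- attr(ml_data, "dict")
age_label <- dict$label[dict$column == "age"]

# List all attributes attached to the object.
attr_names <- names(attributes(ml_data))
```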

Details

Missing values in variables from occurrence data sets are interpreted as 'absence of event', whereas NAs in adsl and bds data are considered to be true missing values. After joining with other data sets, missing values in occds data are replaced by 0 for numerics, and an additional level 'none' is introduced for factors.
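The replacement rule for occds variables can be sketched in base R. This is an illustration of the rule as described above, not the package's internal code; the helper fill_occds_na() and the example columns are hypothetical.

```r
# Hypothetical occds-derived columns after a join; NA means "no event recorded".
occ <- data.frame(
  .id = c("01", "02", "03"),
  n_events = c(2, NA, 1),
  severity = factor(c("mild", NA, "severe"))
)

# Hypothetical helper: 0 for numerics, extra level 'none' for factors.
fill_occds_na <- function(x) {
  if (is.numeric(x)) {
    x[is.na(x)] <- 0
  } else if (is.factor(x)) {
    levels(x) <- c(levels(x), "none")
    x[is.na(x)] <- "none"
  }
  x
}

# Apply column-wise; other column types (such as the character .id) pass through unchanged.
occ[] <- lapply(occ, fill_occds_na)
```

NAs in adsl and bds variables would be left untouched, since they are treated as true missing values.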

Authors

Maike Ahrens (ahrensmaike), Sebastian Voss (svoss09)

See also