prepare_ml_split() splits a prepare_ml() object by a factor variable, e.g. treatment.
This approach is preferable to preparing each data part independently if
the resulting models need to be comparable (e.g. between treatment groups or studies).
Note that the data preparation recipe is trained on the complete data set
(instead of independent preparation) and the split happens after preparation is completed.
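The difference between the two strategies can be illustrated with a minimal base-R sketch. It does not use prepare_ml() or prepare_ml_split(); the toy data and the helpers fit_recipe() and apply_recipe() are hypothetical stand-ins for a preparation recipe that centers and scales one predictor.

```r
# Toy data: one numeric predictor and a two-level factor (hypothetical names).
raw <- data.frame(
  treatment = factor(c("A", "A", "A", "B", "B", "B")),
  x = c(1, 2, 3, 7, 8, 9)
)

# Stand-in for a preparation recipe: learn centering and scaling parameters.
fit_recipe <- function(d) list(center = mean(d$x), scale = sd(d$x))

# Apply previously learned parameters to a data set.
apply_recipe <- function(d, rec) {
  d$x <- (d$x - rec$center) / rec$scale
  d
}

# Shared preparation: train once on the complete data, then split.
rec_all <- fit_recipe(raw)
shared <- split(apply_recipe(raw, rec_all), raw$treatment)

# Independent preparation: train a separate recipe for each group.
independent <- lapply(
  split(raw, raw$treatment),
  function(d) apply_recipe(d, fit_recipe(d))
)

shared$A$x       # all negative: group A lies below the overall mean
shared$B$x       # all positive: group B lies above the overall mean
independent$A$x  # -1 0 1
independent$B$x  # -1 0 1, the group difference is no longer visible
```

With independent preparation, the same prepared value refers to different raw values in each group, so models fitted on the parts are not directly comparable. With shared preparation, both parts are on one common scale.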
Arguments
- ml_obj
Result of prepare_ml().
- by
character. Name of the variable to split the ml object by. Must be a factor in ml_obj$data_raw$train.
Value
A named list with one entry per level of the by variable, where each entry contains the parts
of the ml_obj that correspond to the respective factor level.
Each entry has the same structure as the original ml_obj and can thus be used in subsequent MARTINI modules.
As the by variable is constant within each data part by definition,
it is removed from the prepared data but kept in the raw versions.
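The shape of the return value can be sketched with a mock object in base R. This is an illustration under assumptions, not the package's implementation: only ml_obj$data_raw$train is taken from the documentation above, while the data_prepared slot, the helper split_by_factor(), and the toy data are hypothetical. The sketch also assumes that the prepared data is row-aligned with the raw data.

```r
# Mock of an ml object. Only data_raw$train is documented above;
# the data_prepared slot name is an assumption made for illustration.
ml_obj <- list(
  data_raw = list(train = data.frame(
    treatment = factor(c("A", "B", "A", "B")),
    x = c(1, 2, 3, 4)
  ))
)
ml_obj$data_prepared <- list(train = ml_obj$data_raw$train)

# Hypothetical helper mimicking the described behaviour of the split.
split_by_factor <- function(obj, by) {
  f <- obj$data_raw$train[[by]]
  stopifnot(is.factor(f))  # the by variable must be a factor
  out <- lapply(levels(f), function(lv) {
    keep <- f == lv
    part <- obj  # each entry keeps the structure of the original object
    part$data_raw$train <- obj$data_raw$train[keep, , drop = FALSE]
    prepared <- obj$data_prepared$train[keep, , drop = FALSE]
    # The by variable is constant within a part, so drop it from prepared data.
    part$data_prepared$train <-
      prepared[, setdiff(names(prepared), by), drop = FALSE]
    part
  })
  names(out) <- levels(f)  # named list, one entry per factor level
  out
}

parts <- split_by_factor(ml_obj, by = "treatment")
names(parts)                        # one entry per factor level
names(parts$A$data_raw$train)       # by variable kept in the raw data
names(parts$A$data_prepared$train)  # by variable dropped from the prepared data
```

Because every entry mirrors the structure of the original object, each one can be passed on individually, for example by iterating over the list with lapply().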