4.2 Internal Data

Internal data sources are datasets that come bundled with R packages and data you create yourself.

4.2.1 Package Data

The datasets package comes with the default installation of R and contains many example datasets you can use.

Run the following to see which datasets it includes.

library(help = "datasets")

Once you’ve made a selection, running the following will make that data available in your R session.

# ToothGrowth looks interesting
data(ToothGrowth)

# Preview the first five rows
head(ToothGrowth, 5)

 len supp dose
 4.2   VC  0.5
11.5   VC  0.5
 7.3   VC  0.5
 5.8   VC  0.5
 6.4   VC  0.5
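Before working with a dataset, it helps to check its structure. The following is a minimal sketch using base R functions only. The choice of checks is illustrative, not a required workflow.

```r
# Load the dataset into the current session
data(ToothGrowth)

# Column names, types, and a preview of the values
str(ToothGrowth)

# Per-column summaries (quartiles for numeric columns, counts for factors)
summary(ToothGrowth)

# Cross-tabulate supplement type against dose to see the design
table(ToothGrowth$supp, ToothGrowth$dose)

# In an interactive session, ?ToothGrowth opens the dataset's help page
```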

Package data is not limited to the datasets package. Many specialized packages include data to demonstrate how their functions work. For example, the admiral.test package has many of the SDTM datasets listed in the above repo.
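To see what data a given package ships, you can query it with data(package = ...). The sketch below wraps that call in a helper named list_package_data, which is a hypothetical name introduced here for illustration. It assumes the package is already installed and checks for that first.

```r
# Hypothetical helper: list the datasets bundled with an installed package
list_package_data <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is not installed.")
  }
  # data(package = ) returns an object whose $results matrix has
  # the columns Package, LibPath, Item, and Title
  results <- data(package = pkg)$results
  as.data.frame(results[, c("Item", "Title"), drop = FALSE],
                stringsAsFactors = FALSE)
}

# Works for any installed package; datasets is used here because
# it is always available
pkg_data <- list_package_data("datasets")
head(pkg_data)
```

The same call with "admiral.test" in place of "datasets" should list that package's datasets, provided it is installed.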

4.2.2 Simulating Your Own

In some cases you may wish to simulate your own data. You can do so quite easily by combining sample() with the r* random-generation functions built into R, such as rnorm() and runif().

Below is an example of simulating a few variables typically found in an ADSL dataset.

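# No seed is set, so the values differ from run to run
my_sim_data <- data.frame(
  subjidn = 1:10,
  sex = sample(c("M", "F"), 10, replace = TRUE),
  age = round(rnorm(10, mean = 30, sd = 5)),
  stringsAsFactors = FALSE
)

# Preview the first five rows
head(my_sim_data, 5)

subjidn sex age
      1   M  33
      2   M  28
      3   M  35
      4   M  29
      5   F  24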
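Because the simulation draws random values, each run produces different data. If you need the same data every time, for example in a test or a shared example, you can fix the seed first. The following is a minimal sketch. The helper name simulate_adsl, the extra columns arm and weight, and the default seed are assumptions made for illustration and are not part of any ADSL specification.

```r
# Hypothetical helper: simulate a small ADSL-like data frame reproducibly
simulate_adsl <- function(n = 10, seed = 2024) {
  set.seed(seed)  # fix the random number stream so results are repeatable
  data.frame(
    subjidn = seq_len(n),
    sex     = sample(c("M", "F"), n, replace = TRUE),
    age     = round(rnorm(n, mean = 30, sd = 5)),
    # Illustrative extra variables drawn with sample() and runif()
    arm     = sample(c("Placebo", "Treatment"), n, replace = TRUE),
    weight  = round(runif(n, min = 50, max = 100), 1),
    stringsAsFactors = FALSE
  )
}

sim_a <- simulate_adsl()
sim_b <- simulate_adsl()

# Same seed, so the two data frames match exactly
identical(sim_a, sim_b)
```

Note that set.seed() changes the state of the global random number generator, so code that runs after the helper is affected as well.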