4.2 Internal Data
Internal data sources include datasets that come bundled with R packages and data you create yourself.
4.2.1 Package Data
The datasets package comes with your default installation of R and houses many example datasets you can use.
Simply run the following to learn about the different ones included.
library(help = "datasets")
Once you’ve made a selection, running the following will make that data available in your R session.
# ToothGrowth looks interesting
data(ToothGrowth)
len | supp | dose |
---|---|---|
4.2 | VC | 0.5 |
11.5 | VC | 0.5 |
7.3 | VC | 0.5 |
5.8 | VC | 0.5 |
6.4 | VC | 0.5 |
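As a minimal sketch of how you might inspect the data once it is loaded, the base R functions str and head give a quick overview. Calling head(ToothGrowth, 5) prints the same first five rows shown above.

```r
# Load the example dataset from the datasets package
data(ToothGrowth)

# Show the structure: number of rows, column names, and column types
str(ToothGrowth)

# Print the first five rows
head(ToothGrowth, 5)

# Summarise each column (ranges for numeric columns, counts for factors)
summary(ToothGrowth)
```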
Package data is not limited to the datasets package. Many specialized packages include some data to demonstrate how their functions work. For example, the admiral.test package has many of the SDTM datasets listed in the above repo.
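The same approach works for any installed package. The sketch below wraps data(package = ...) in a small helper to list what a package ships. The helper name list_package_data is hypothetical, and the datasets package is used only so the example runs on a default installation. Substitute the name of any package you have installed.

```r
# Hypothetical helper: list the datasets bundled with an installed package
list_package_data <- function(pkg) {
  # data(package = ...) returns an object whose $results matrix has
  # the columns "Package", "LibPath", "Item", and "Title"
  res <- data(package = pkg)$results
  data.frame(
    item  = res[, "Item"],
    title = res[, "Title"],
    stringsAsFactors = FALSE
  )
}

# List what the datasets package provides
pkg_data <- list_package_data("datasets")
head(pkg_data)

# A dataset can also be accessed without attaching its package
tg <- datasets::ToothGrowth
```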
4.2.2 Simulating Your Own
In some cases you may wish to simulate your own data. You can do so quite easily using a combination of sample and the r* functions built into R, such as rnorm.
Below is an example of simulating a few variables typically found in an ADSL dataset.
my_sim_data <- data.frame(subjidn = 1:10,
                          sex = sample(c('M','F'), 10, replace = TRUE),
                          age = round(rnorm(10, mean = 30, sd = 5)),
                          stringsAsFactors = FALSE
)
subjidn | sex | age |
---|---|---|
1 | M | 33 |
2 | M | 28 |
3 | M | 35 |
4 | M | 29 |
5 | F | 24 |
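Because sample and the r* functions draw random values, each run produces different data unless you fix the random seed. The following is a minimal sketch of a reproducible version of the simulation. The seed value, the trt01p and weight columns, and their value ranges are illustrative assumptions, not part of the original example.

```r
# Fix the random seed so the simulated data is the same on every run
# (123 is an arbitrary choice)
set.seed(123)

n <- 10  # number of simulated subjects

my_sim_data <- data.frame(
  subjidn = seq_len(n),
  sex     = sample(c("M", "F"), n, replace = TRUE),
  age     = round(rnorm(n, mean = 30, sd = 5)),
  # Assumed for illustration: a planned treatment arm drawn at random
  trt01p  = sample(c("Placebo", "Active"), n, replace = TRUE),
  # Assumed for illustration: weight in kg, uniform between 50 and 100
  weight  = round(runif(n, min = 50, max = 100), 1),
  stringsAsFactors = FALSE
)

head(my_sim_data, 5)
```

Calling set.seed with the same value before rerunning the block reproduces the same data frame, which is useful when simulated data feeds into examples or tests.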