These exercises all use the same data frame we used in the earlier demo (data/inflammation-01.csv). First we need to load it and save it as a variable, dat:
dat <- read.csv(file = "data/inflammation-01.csv", header = FALSE)
Use the slice function to select from the data frame dat:
Let’s calculate some summary statistics for specific individuals (rows) and days (columns).
Using the inflammation data frame data/inflammation-01.csv: let’s pretend there was something wrong with the instrument on the first five days for every second patient (#2, 4, 6, etc.), which resulted in the measurements being twice as large as they should be. (Hint: see ?seq.)
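As a hint for this kind of correction, here is a minimal sketch of one possible approach. It runs on a small made-up data frame called toy (6 patients by 8 days) rather than on the real file, so the names toy, affected and fixed are illustrative assumptions, not part of the lesson.

```r
# Toy stand-in for the inflammation data: 6 patients (rows) x 8 days (columns).
# The values are made up for illustration only.
toy <- as.data.frame(matrix(1:48, nrow = 6, ncol = 8))

# Row numbers of every second patient: 2, 4, 6
affected <- seq(from = 2, to = nrow(toy), by = 2)

# Work on a copy so the original values stay available for comparison
fixed <- toy

# Halve the first five days for the affected patients only
fixed[affected, 1:5] <- toy[affected, 1:5] / 2

# Print the corrected data frame to check the result
fixed
```

For the real data, the same idea would apply with dat in place of toy, which also adapts the upper limit of seq to the number of patients.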
The apply function can be used to summarize datasets and subsets of data across rows and columns using the MARGIN argument. Suppose you want to calculate the mean inflammation for specific days and patients in the patient dataset (i.e. 60 patients across 40 days).
Use a combination of the apply function and indexing to:
Think about the number of rows and columns you would expect as the result before each apply call and check your intuition by applying the mean function.
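To make the role of MARGIN concrete, the following sketch applies mean to a small made-up matrix (6 patients by 8 days) instead of the full dataset. The object names are assumptions chosen for illustration; the point is how the length of each result follows from the margin and from the indexing.

```r
# Toy stand-in: 6 patients (rows) x 8 days (columns), made-up values
toy <- matrix(1:48, nrow = 6, ncol = 8)

# MARGIN = 1 works along rows: one mean per patient
patient_means <- apply(toy, MARGIN = 1, FUN = mean)

# MARGIN = 2 works along columns: one mean per day
day_means <- apply(toy, MARGIN = 2, FUN = mean)

# Indexing first restricts the calculation to a subset:
# patients 1-3 on days 1-4, summarized per day
subset_day_means <- apply(toy[1:3, 1:4], MARGIN = 2, FUN = mean)

# Compare the lengths with what you expected
length(patient_means)
length(day_means)
length(subset_day_means)
```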
Create a plot showing the standard deviation of the inflammation data for each day across all patients. Add the argument xlab = "Day" to the plot function call to change the x axis label. Add the argument ylab = "SD" to the plot function call to change the y axis label.
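A minimal sketch of such a plot, using a small made-up matrix in place of the real data (the names toy and sd_per_day are illustrative assumptions):

```r
# Toy stand-in: 6 patients (rows) x 8 days (columns).
# outer() makes the spread grow from day to day, so the plot is not flat.
toy <- outer(1:6, 1:8)

# Standard deviation of each column, i.e. per day across all patients
sd_per_day <- apply(toy, MARGIN = 2, FUN = sd)

# xlab and ylab set the axis labels
plot(sd_per_day, xlab = "Day", ylab = "SD")
```

With the real data frame, dat would take the place of toy in the apply call.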
Write a for loop that calculates the median and standard deviation for each day for the first 5 files.
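The general shape of such a loop can be sketched as follows. So that it runs on its own, the sketch first writes three small made-up CSV files to a temporary directory and then loops over the first two; the directory name, file contents and variable names are assumptions for illustration. For the exercise itself, the files would come from the data directory and the subset would be the first five.

```r
# Write three small made-up CSV files so the sketch is self-contained
tmp <- file.path(tempdir(), "inflammation-demo")
dir.create(tmp, showWarnings = FALSE)
for (i in 1:3) {
  toy <- matrix(i * (1:12), nrow = 4, ncol = 3)
  write.table(toy, file = file.path(tmp, sprintf("inflammation-%02d.csv", i)),
              sep = ",", row.names = FALSE, col.names = FALSE)
}

# Collect the matching file names (sorted alphabetically) and keep the first two
filenames <- list.files(path = tmp, pattern = "inflammation-[0-9]{2}\\.csv",
                        full.names = TRUE)
filenames <- filenames[1:2]

# For each file, compute the per-day (per-column) median and standard deviation
for (f in filenames) {
  dat <- read.csv(file = f, header = FALSE)
  day_median <- apply(dat, MARGIN = 2, FUN = median)
  day_sd <- apply(dat, MARGIN = 2, FUN = sd)
  print(basename(f))
  print(day_median)
  print(day_sd)
}
```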
Write a for loop that takes the numbers 1 to 10 and multiplies numbers less than 5 by 3 and numbers greater than 5 by -3.
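One way to structure this loop is sketched below. The exercise does not say what should happen to 5 itself, so the sketch assumes it is left unchanged; the vector name result is also just an illustrative choice.

```r
# Pre-allocate a vector to hold one result per input number
result <- numeric(10)

for (n in 1:10) {
  if (n < 5) {
    result[n] <- n * 3        # numbers below 5 are multiplied by 3
  } else if (n > 5) {
    result[n] <- n * -3       # numbers above 5 are multiplied by -3
  } else {
    result[n] <- n            # assumption: 5 is kept as it is
  }
}

result
```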