
These exercises all use the same data frame we used in the earlier demo (data/inflammation-01.csv). First, load it and store it in a variable named dat:

dat <- read.csv(file = "data/inflammation-01.csv", header = FALSE)

Exercise 1

Use square-bracket indexing (slicing) to select from the data frame dat:

  1. The element on the 4th row and 2nd column.
  2. The element on the 7th row and 5th column.
  3. The 10th row.
  4. The 3rd column.
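
As a hint, the indexing pattern can be sketched on a small made-up data frame. The name toy and its values are hypothetical and only stand in for dat; the row and column numbers are deliberately different from the ones asked for above.

```r
# Hypothetical toy data frame standing in for dat (3 rows, 4 columns)
toy <- data.frame(V1 = c(0, 1, 2),
                  V2 = c(3, 4, 5),
                  V3 = c(6, 7, 8),
                  V4 = c(9, 10, 11))

toy[2, 3]   # a single element: row 2, column 3
toy[3, ]    # an entire row (the result is still a data frame)
toy[, 2]    # an entire column (the result is a numeric vector)
```

Leaving the position before or after the comma empty selects every row or every column, respectively.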

Exercise 2

Let’s calculate some summary statistics for specific individuals (rows) and days (columns).

  1. Calculate the mean inflammation score for the first individual.
  2. Calculate the median inflammation score for the 5th day.
  3. Use min() and max() to calculate the range of values for the 10th individual.
  4. Check your answer with the range() function.
  5. Use quantile() to calculate the [interquartile range](https://en.wikipedia.org/wiki/Interquartile_range) of the 6th individual.
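
One detail that may help: a single row of a data frame is itself a data frame, and mean() returns NA with a warning when given one. The sketch below converts the row to a plain numeric vector first. It uses a hypothetical toy data frame rather than dat, and the variable names are illustrative only.

```r
# Hypothetical toy data: 2 individuals (rows) measured on 4 days (columns)
toy <- data.frame(V1 = c(1, 4), V2 = c(3, 8), V3 = c(5, 0), V4 = c(7, 4))

# Convert the first row to a plain numeric vector: 1 3 5 7
row1 <- as.numeric(toy[1, ])

mean(row1)               # mean of one row
median(toy[, 2])         # median of one column (already a vector)
c(min(row1), max(row1))  # smallest and largest value
range(row1)              # the same pair in a single call

# Interquartile range: 75th percentile minus 25th percentile
q <- quantile(row1, probs = c(0.25, 0.75))
iqr_row1 <- unname(q[2] - q[1])
iqr_row1
```

Base R also provides IQR(), which can be used to cross-check the value computed from quantile().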

Exercise 3

This exercise uses the inflammation data frame from data/inflammation-01.csv. Let’s pretend there was something wrong with the instrument on the first five days for every second patient (#2, 4, 6, etc.), which made those measurements twice as large as they should be.

  1. Write a vector containing each affected patient (hint: ?seq).
  2. Create a new data frame in which you halve the first five days’ values in only those patients.
  3. Print out the corrected data frame to check that your code has fixed the problem.
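
The general pattern (build an index vector with seq(), copy the data frame, then overwrite only the selected cells) might look like the sketch below. It assumes a hypothetical 4-row toy data frame in which only the first two columns need correcting, so the numbers differ from the exercise.

```r
# Hypothetical toy data: 4 patients (rows), 3 days (columns)
toy <- data.frame(V1 = c(2, 4, 6, 8),
                  V2 = c(10, 12, 14, 16),
                  V3 = c(1, 1, 1, 1))

# Every second row, starting from row 2
affected <- seq(2, nrow(toy), by = 2)

# Work on a copy so the original data stays untouched
fixed <- toy
fixed[affected, 1:2] <- toy[affected, 1:2] / 2

fixed
```

Rows 1 and 3 and column V3 are unchanged in fixed; only the cells selected by both the row index and the column index are halved.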

Exercise 4

The apply function can be used to summarize datasets and subsets of data across rows and columns using the MARGIN argument. Suppose you want to calculate the mean inflammation for specific days and patients in the patient dataset (i.e. 60 patients across 40 days).

Use a combination of the apply function and indexing to:

  1. Calculate the mean inflammation for patients 1 to 5 over the whole 40 days.
  2. Calculate the mean inflammation for days 1 to 10 (across all patients).
  3. Calculate the mean inflammation for every second day (across all patients).

Think about the number of rows and columns you would expect as the result before each apply call and check your intuition by applying the mean function.
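
To illustrate how MARGIN determines the shape of the result, here is a minimal sketch on a hypothetical 3-by-4 toy data frame (not the inflammation data). MARGIN = 1 applies the function to each row and MARGIN = 2 to each column.

```r
# Hypothetical toy data: 3 rows, 4 columns
toy <- data.frame(V1 = c(1, 2, 3),
                  V2 = c(3, 4, 5),
                  V3 = c(5, 6, 7),
                  V4 = c(7, 8, 9))

# Subset rows first, then summarise each selected row: one value per row
row_means <- apply(toy[1:2, ], MARGIN = 1, FUN = mean)

# Subset columns first, then summarise each selected column: one value per column
col_means <- apply(toy[, c(1, 3)], MARGIN = 2, FUN = mean)

length(row_means)  # 2, because two rows were selected
length(col_means)  # 2, because two columns were selected
```

The length of the result always matches the number of rows (MARGIN = 1) or columns (MARGIN = 2) in the subset passed to apply.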

Exercise 5

Create a plot showing the standard deviation of the inflammation data for each day across all patients. Add the argument xlab = "Day" to the plot function call to change the x axis label. Add the argument ylab = "SD" to the plot function call to change the y axis label.
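
As a hint on how axis labels are passed, the sketch below plots a per-column summary of a hypothetical toy data frame. The data and the label text are made up for illustration.

```r
# Hypothetical toy data: 3 rows, 3 columns
toy <- data.frame(V1 = c(1, 2, 3),
                  V2 = c(2, 4, 6),
                  V3 = c(0, 0, 0))

# One standard deviation per column
sd_per_col <- apply(toy, MARGIN = 2, FUN = sd)

# xlab and ylab are ordinary named arguments to plot()
plot(sd_per_col, xlab = "Column", ylab = "Spread")
```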

Exercise 6

Write a for loop that calculates the median and standard deviation for each day for the first 5 files.
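
The loop-over-files pattern can be sketched as follows. To stay self-contained, the example first writes three small hypothetical CSV files to a temporary directory; the file names, their contents, and the choice of mean as the summary are all assumptions made for illustration, not the course data.

```r
# Create three small hypothetical CSV files in a temporary directory
tmp_dir <- tempfile("toy-csv-")
dir.create(tmp_dir)
for (i in 1:3) {
  toy <- data.frame(a = c(1, 2, 3) * i, b = c(2, 4, 6) * i)
  write.table(toy, file.path(tmp_dir, paste0("toy-0", i, ".csv")),
              sep = ",", row.names = FALSE, col.names = FALSE)
}

# Collect the file paths (list.files returns them in alphabetical order)
files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Loop over the first two files and summarise every column of each one
col_means_by_file <- list()
for (f in files[1:2]) {
  d <- read.csv(file = f, header = FALSE)
  col_means_by_file[[basename(f)]] <- apply(d, MARGIN = 2, FUN = mean)
}

col_means_by_file
```

Indexing the vector of file names (here files[1:2]) is what restricts the loop to the first few files.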

Exercise 7

Write a for loop that takes the numbers 1 to 10 and multiplies numbers less than 5 by 3 and numbers greater than 5 by -3.
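
The structure of a for loop with an if / else if / else chain can be sketched on a different, hypothetical task: for the numbers 1 to 6, add 10 to values below 3 and subtract 10 from values above 3. Note that the boundary value matches neither condition and needs its own branch.

```r
# Pre-allocate a vector to hold one result per number
result <- numeric(6)

for (n in 1:6) {
  if (n < 3) {
    result[n] <- n + 10
  } else if (n > 3) {
    result[n] <- n - 10
  } else {
    result[n] <- n  # n == 3 matches neither condition, so it is kept as-is
  }
}

result
```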

Solutions