
Previously we learnt how to calculate summary statistics to get an idea of the location and spread of the data, but what if we wish to visualize the distribution? One option for a continuous variable, such as age, is to produce a histogram. We will add a few arguments to adjust the way the graph looks: main adds a title to the plot, xlab changes the label on the x-axis and col changes the colour of the bars.

hist(dat$Age, main = "Histogram of Age", xlab = "Age", col = "red")

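We can adjust the way the histogram looks by specifying the number of bins we want (R treats this number as a suggestion and may choose slightly more or fewer bins to get tidy breakpoints)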

hist(dat$Age, main = "Histogram of Age", xlab = "Age", col = "red", breaks = 8)

or by providing the specific breakpoints of the bins.

hist(dat$Age, main = "Histogram of Age", xlab = "Age", col = "red", breaks = seq(20,36,1))
## Error in hist.default(dat$Age, main = "Histogram of Age", xlab = "Age", : some 'x' not counted; maybe 'breaks' do not span range of 'x'
range(dat$Age)
## [1] 19 33
hist(dat$Age, main = "Histogram of Age", xlab = "Age", col = "red", breaks = seq(min(dat$Age),max(dat$Age),1))

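The first attempt produced an error because not all the data points fall into the bins we specified: the breakpoints started at 20, but range() shows that the smallest value of Age is 19. Building the breakpoints from min() and max() ensures they span the whole range of the data.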
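As a minimal sketch of the same idea, the example below builds breakpoints from the data and checks that every observation is counted. The vector age is a hypothetical stand-in for dat$Age (the values are made up for illustration), and plot = FALSE asks hist() to return the histogram object without drawing it.

```r
# Hypothetical ages standing in for dat$Age (illustrative values only)
age <- c(19, 21, 22, 22, 24, 25, 25, 26, 28, 30, 33)

# Breakpoints built from the data always span its range
brks <- seq(floor(min(age)), ceiling(max(age)), by = 1)

# plot = FALSE returns the histogram object instead of drawing it
h <- hist(age, breaks = brks, plot = FALSE)

h$breaks                      # the breakpoints that were used
sum(h$counts) == length(age)  # TRUE when every observation falls in a bin
```

Using floor() and ceiling() is one way to keep the breakpoints on whole numbers even if the variable contains non-integer values.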

Plots have lots of arguments that can be used to edit the way the graph looks, including adding titles, changing the colours and adjusting the axes. The help pages for plot are a good starting point for finding the argument you need. Take a look at help(hist) to see what other arguments you could add to change the way the plot looks.

Rather than a histogram we could use a density plot, which is a smoothed line graph of the density function.

plot(density(dat$Age),  main = "Density plot of Age", xlab = "Age", col = "red")
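To see how the density curve relates to the histogram, one option is to draw both on the same plot. The sketch below assumes a hypothetical vector age in place of dat$Age; freq = FALSE puts the histogram on the density scale so the two are comparable, and lines() adds the curve to the existing plot.

```r
# Hypothetical ages standing in for dat$Age (illustrative values only)
age <- c(19, 21, 22, 22, 24, 25, 25, 26, 28, 30, 33)

# freq = FALSE plots densities rather than counts, so the bar areas sum to 1
h <- hist(age, freq = FALSE, main = "Histogram and density of Age",
          xlab = "Age", col = "grey")

# lines() draws the smoothed density estimate on top of the histogram
lines(density(age), col = "red")
```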

Occasionally we may wish to add extra statistics or comments to our plot. We can use mtext() to add text to the margins of the plot. An example usage is

plot(density(dat$Age),  main = "Density plot of Age", xlab = "Age", col = "red")
mtext("Text goes here", side = 3, line = 0.2, adj = 1)

where the argument side takes a number from 1 to 4 indicating whether to place the text in the bottom, left, top or right-hand margin, respectively; the argument line specifies how many lines out from the edge of the plot the text should be placed; and adj specifies the justification of the text: 0 for left, 0.5 for centre and 1 for right.
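The effect of adj and side is easiest to see by trying several values on one plot. The following is a small sketch using a hypothetical vector age in place of dat$Age; it writes three labels in the top margin with different justifications and one label in the right-hand margin.

```r
# Hypothetical ages standing in for dat$Age (illustrative values only)
age <- c(19, 21, 22, 22, 24, 25, 25, 26, 28, 30, 33)

plot(density(age), main = "Density plot of Age", xlab = "Age", col = "red")

# Top margin (side = 3): left, centre and right justification
adj_values <- c(0, 0.5, 1)
labels <- paste("adj =", adj_values)
for (i in seq_along(adj_values)) {
  mtext(labels[i], side = 3, line = 0.2, adj = adj_values[i])
}

# Right-hand margin (side = 4)
mtext("side = 4", side = 4, line = 0.2)
```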

We can use paste() to create a short sentence using statistics generated from our data. It takes any number of values to merge together, and the argument sep specifies what to join them with.

plot(density(dat$Age),  main = "Density plot of Age", xlab = "Age", col = "red")
comment <- paste("Mean = ", round(mean(dat$Age)), "; SD = ", round(sd(dat$Age)), sep = "")
mtext(comment, side = 3, line = 0.2, adj = 1)
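The example above rounds to whole numbers and sets sep = "" so that no extra spaces are inserted. As a hedged variation, the sketch below uses a hypothetical vector age in place of dat$Age, keeps one decimal place via the digits argument of round(), and compares the default separator with paste0(), which behaves like paste() with sep = "".

```r
# Hypothetical ages standing in for dat$Age (illustrative values only)
age <- c(19, 21, 22, 22, 24, 25, 25, 26, 28, 30, 33)

# By default paste() joins its arguments with a single space
with_space <- paste("Mean =", round(mean(age), 1))

# paste0() joins with nothing in between, like paste(..., sep = "")
comment <- paste0("Mean = ", round(mean(age), 1), "; SD = ", round(sd(age), 1))

plot(density(age), main = "Density plot of Age", xlab = "Age", col = "red")
mtext(comment, side = 3, line = 0.2, adj = 1)
```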

If you want to place text inside the plot window take a look at the companion function text().
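Unlike mtext(), text() positions its label using the x and y coordinates of the plot itself. A minimal sketch, assuming a hypothetical vector age in place of dat$Age, labels the highest point of the density curve; pos = 1 places the label just below the given coordinates.

```r
# Hypothetical ages standing in for dat$Age (illustrative values only)
age <- c(19, 21, 22, 22, 24, 25, 25, 26, 28, 30, 33)

d <- density(age)
plot(d, main = "Density plot of Age", xlab = "Age", col = "red")

# Find the highest point of the curve; d$x and d$y hold its coordinates
peak <- which.max(d$y)

# x and y are given in the units of the plot axes
text(x = d$x[peak], y = d$y[peak], labels = "Peak", pos = 1)
```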

Now that we know a few functions to visualize our data, let’s start performing some statistical tests.
