4.3 Customising plots
All of the plots we’ve created so far in this Chapter are more than suitable for exploring your data. If, however, you’d like to make them a little prettier (for your thesis, publication or even your own amusement) you’ll need to invest some time learning how to customise your plots. The good news is that the base R graphics system allows you to change almost any aspect of your plot. There are, however, a few things to bear in mind. Firstly, although many of the approaches we introduce in this section will work with most base R plotting functions, there’s no true consistency between functions. What works with the plot() function isn’t guaranteed to work with the boxplot() function. This can be a little frustrating to begin with but gets easier the more experience you gain. If you crave a little more consistency take a look at Chapter 5 where we introduce the excellent ggplot2 package. Secondly, when you start customising plots you’re confronted with a huge number of options and arguments to try and remember. This isn’t necessarily a bad thing, as this is what makes base R graphics so flexible, but it’s a lot to take in. Often a quick Google or a peek at the relevant help pages will jog your memory. Thirdly, learning how to customise plots in base R isn’t just about what code you need to use, it’s also about learning the process of building a plot. We often start with a basic layout of our plot and then add layers of complexity until we achieve the desired result. This requires a little experience (and trial and error), but again becomes easier with practice. Lastly, this section covers the basics of how to customise base R graphics and most (if not all) of these approaches will not work for plots created with the lattice graphics system.
4.3.1 Customising with arguments
Let’s return to the basic plot we made previously in this Chapter. This was a simple scatterplot to examine the relationship between the shootarea and weight variables in the flowers data frame.
Whilst this plot is adequate for data exploration it’s not going to cut the mustard if we want to share it with others. At the very least it could do with a better set of axes labels, more informative axes scales and some nicer plotting symbols.
Let’s start with the axis labels. To add labels to the x and y axes we use the corresponding xlab = and ylab = arguments in the plot() function. Both of these arguments need character strings as values.
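As a minimal sketch of the labelling step, the code below passes two character strings to xlab = and ylab =. The flowers data frame here is only a small stand-in with made-up values so that the example runs on its own; the real data frame used in this Chapter has many more rows.

```r
# Hypothetical stand-in for the flowers data frame (values are made up)
flowers <- data.frame(weight    = c(5.2, 8.1, 11.4, 14.9, 19.3, 23.6),
                      shootarea = c(22, 41, 63, 88, 121, 158))

# Axis labels are supplied as plain character strings
x_label <- "weight (g)"
y_label <- "shoot area (cm2)"

plot(flowers$weight, flowers$shootarea, xlab = x_label, ylab = y_label)
```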
OK, that looks a little better but the units (cm2) look a little ugly as the 2 should be formatted as a superscript. To do this we need to use a combination of the expression() and paste() functions. The expression() function allows us to format the superscript (and other mathematical expressions - see ?plotmath for more details) with the ^ symbol and the paste() function pastes together the elements "shoot area (cm"^"2" and ")" to create our axis label.
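The superscript label described above might be built and used like this. Again, the flowers data frame is a made-up stand-in so the sketch is self-contained, and the object name y_label is just an illustrative choice.

```r
# Hypothetical stand-in for the flowers data frame (values are made up)
flowers <- data.frame(weight    = c(5.2, 8.1, 11.4, 14.9, 19.3, 23.6),
                      shootarea = c(22, 41, 63, 88, 121, 158))

# expression() delays evaluation so that ^ is drawn as a superscript;
# paste() joins the superscripted piece to the closing bracket
y_label <- expression(paste("shoot area (cm"^"2", ")"))

plot(flowers$weight, flowers$shootarea,
     xlab = "weight (g)",
     ylab = y_label)
```

Storing the expression in an object first is optional; it simply keeps the plot() call a little easier to read.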
But now we have a new problem: the very top of the y axis label gets cut off. To remedy this we need to adjust the plot margins using the par() function and the mar = argument before we plot the graph. The par() function is the main function for setting graphical parameters in base R and the mar = argument sets the size of the margins that surround the plot. You can adjust the size of the margins using the notation par(mar = c(bottom, left, top, right)) where the arguments bottom, left, top and right are the sizes of the corresponding margins. By default R sets these margins as mar = c(5.1, 4.1, 4.1, 2.1) with these numbers specifying the number of lines in each margin. Let’s increase the size of the left margin a little bit and decrease the size of the right margin by a smidge.
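A short sketch of the margin adjustment follows. The margin values are the ones used in the full code later in this section; the flowers data frame is a made-up stand-in.

```r
# Hypothetical stand-in for the flowers data frame (values are made up)
flowers <- data.frame(weight    = c(5.2, 8.1, 11.4, 14.9, 19.3, 23.6),
                      shootarea = c(22, 41, 63, 88, 121, 158))

# Margins are given in lines of text, in the order bottom, left, top, right.
# par() must be called before plot() for the new margins to take effect.
par(mar = c(4.1, 4.4, 4.1, 1.9))

plot(flowers$weight, flowers$shootarea,
     xlab = "weight (g)",
     ylab = expression(paste("shoot area (cm"^"2", ")")))
```

Note that the setting stays in force for later plots on the same graphics device until you change it again.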
That looks better. Now let’s increase the range of our axes scales so we have a bit of space above and to the right of the data points. To do this we need to supply a minimum and maximum value using the c() function to the xlim = and ylim = arguments. We’ll set the x axis scale to run from 0 to 30 and the y axis scale to run from 0 to 200.
And while we’re at it let’s remove the annoying box all the way around the plot to just leave the x and y axes using the bty = "l" argument.
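The axis limits and box type can be sketched together as below, using a made-up stand-in for the flowers data frame. The last line queries par("usr"), which reports the actual extent of the plotting region, to show that by default R pads the requested limits slightly.

```r
# Hypothetical stand-in for the flowers data frame (values are made up)
flowers <- data.frame(weight    = c(5.2, 8.1, 11.4, 14.9, 19.3, 23.6),
                      shootarea = c(22, 41, 63, 88, 121, 158))

plot(flowers$weight, flowers$shootarea,
     xlab = "weight (g)",
     ylab = expression(paste("shoot area (cm"^"2", ")")),
     xlim = c(0, 30),    # x axis runs from 0 to 30
     ylim = c(0, 200),   # y axis runs from 0 to 200
     bty = "l")          # draw only the left and bottom sides of the box

# Extent of the plotting region as c(x_min, x_max, y_min, y_max);
# with the default axis style this is a little wider than the limits above
usr <- par("usr")
usr
```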
OK, that’s looking a lot better already after only a few adjustments. One of the things that we still don’t like is that by default the x and y axes do not intersect at the origin (0, 0) and both axes extend beyond the maximum value of the scale by a little bit. We can change this by setting the xaxs = "i" and yaxs = "i" arguments when we use the par() function. While we’re about it let’s also rotate the y axis tick mark labels so they read horizontally by setting the las = 1 argument in the plot() function and make them a tad smaller with the cex.axis = argument. The cex.axis = argument requires a number giving the amount by which the text will be magnified (or shrunk) relative to the default value of 1. We’ll choose 0.8, making our text 20% smaller. We can also make the tick marks just a little shorter by setting tcl = -0.2. This value needs to be negative as we want the tick marks to be outside the plotting region (see what happens if you set it to tcl = 0.2).
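Here is a sketch of these axis tweaks in one place, once more with a made-up stand-in for the flowers data frame. With xaxs = "i" and yaxs = "i" the plotting region matches the requested limits exactly, which the final par("usr") query illustrates.

```r
# Hypothetical stand-in for the flowers data frame (values are made up)
flowers <- data.frame(weight    = c(5.2, 8.1, 11.4, 14.9, 19.3, 23.6),
                      shootarea = c(22, 41, 63, 88, 121, 158))

# "i" (internal) axis style: no padding beyond xlim and ylim
par(mar = c(4.1, 4.4, 4.1, 1.9), xaxs = "i", yaxs = "i")

plot(flowers$weight, flowers$shootarea,
     xlab = "weight (g)",
     ylab = expression(paste("shoot area (cm"^"2", ")")),
     xlim = c(0, 30), ylim = c(0, 200), bty = "l",
     las = 1,          # tick mark labels always horizontal
     cex.axis = 0.8,   # tick mark labels 20% smaller than the default
     tcl = -0.2)       # short tick marks pointing out of the plot

# The axes now meet at the origin and stop at the upper limits
par("usr")
```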
We can also change the type of plotting symbol, the colour of the symbol and the size of the symbol using the pch =, col = and cex = arguments respectively. The pch = argument takes an integer value between 0 and 25 to define the type of plotting symbol. Symbols 0 to 14 are open symbols, 15 to 20 are filled symbols and 21 to 25 are symbols where you can specify a different fill colour and outside line colour. Here’s a summary table displaying the value and corresponding symbol type.
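If you would like to draw such a summary of the plotting symbols yourself, one possible sketch is shown below. The grid layout (seven symbols per row) and the object names are arbitrary choices made for this example.

```r
# The 26 standard plotting symbols
pch_values <- 0:25

# Arrange the symbols on a grid, seven per row, reading left to right
grid_x <- (pch_values %% 7) + 1
grid_y <- 4 - (pch_values %/% 7)

# Draw each symbol; bg = only affects symbols 21 to 25, which have
# a separate fill colour and outline colour
plot(grid_x, grid_y, pch = pch_values, cex = 2,
     col = "black", bg = "grey70",
     xlim = c(0.5, 7.5), ylim = c(0.5, 4.8),
     axes = FALSE, xlab = "", ylab = "")

# Label each symbol with its pch value just above it
text(grid_x, grid_y + 0.35, labels = pch_values, cex = 0.8)
```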
The col = argument changes the colour of the plotting symbols. This argument can either take an integer value to specify the colour or a character string giving the colour name. For example, col = "red" changes the plotting symbol to red. To see a list of all 657 preset colours available in base R use the colours() function (you can also use colors()) or perhaps even easier see this link. More colour options are available with other packages (see the excellent RColorBrewer package) or you can even ‘mix’ your own colours using the colorRamp() function (see ?colorRamp for more details).
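A quick way to explore the preset colours from the console is sketched below. The search term and the two colours being mixed are arbitrary choices for illustration.

```r
# All preset colour names known to base R
all_colours <- colours()
length(all_colours)

# Search the names for a particular shade
grep("dodgerblue", all_colours, value = TRUE)

# colorRamp() returns a function that interpolates between colours;
# it takes values between 0 and 1 and returns red, green and blue
# components on a 0 to 255 scale
mix <- colorRamp(c("yellow", "blue"))
halfway <- mix(0.5)

# Convert the RGB components to a colour string usable with col =
my_colour <- rgb(halfway, maxColorValue = 255)
my_colour
```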
The cex = argument allows you to change the size of the plotting symbol. This argument works in the same way as the other cex arguments we’ve already seen (i.e. cex.axis) and requires a numeric value to indicate the proportional increase or decrease in size relative to the default value of 1.
Let’s change the plotting symbol to a filled circle (16), the colour of the symbol to “dodgerblue1” and decrease the size of the symbol by 10%.
The last thing we’ll do is add a text label to the plot so we can identify it. Perhaps this plot will be one of a series of plots we want to include in the same figure (see the section on plotting multiple graphs to see how to do this) so it would be nice to be able to refer to it in our figure title. To do this we’ll use the text() function to add a capital ‘A’ to the top right of the plot. The text() function needs an x = and a y = coordinate to position the text, a label = for the text and we can use the cex = argument again to change the size of the text.
par(mar = c(4.1, 4.4, 4.1, 1.9), xaxs = "i", yaxs = "i")
plot(flowers$weight, flowers$shootarea,
     xlab = "weight (g)",
     ylab = expression(paste("shoot area (cm"^"2",")")),
     xlim = c(0, 30), ylim = c(0, 200), bty = "l",
     las = 1, cex.axis = 0.8, tcl = -0.2,
     pch = 16, col = "dodgerblue1", cex = 0.9)
text(x = 28, y = 190, label = "A", cex = 2)
We think our plot now looks pretty good so we’ll stop here! There are, however, a multitude of other arguments which you can play around with to change the look of your plots. The best place to quickly look for more information is the help page associated with the par() function (?par) or just do a quick Google search. Here’s a table of the more commonly used arguments.
|Argument|Description|
|adj|controls justification of the text (0 left justified, 0.5 centered, 1 right justified)|
|bg|specifies the background colour of the plot (e.g. bg = "red")|
|bty|controls the type of box drawn around the plot, values include: "o", "l", "7", "c", "u", "]" and "n" (no box)|
|cex|controls the size of text and symbols in the plotting area with respect to the default value of 1. Similar commands include: cex.axis, cex.lab, cex.main, cex.sub|
|col|controls the colour of symbols; additional arguments include: col.axis, col.lab, col.main, col.sub|
|font|an integer controlling the style of text (1: normal, 2: bold, 3: italics, 4: bold italics); other arguments include font.axis, font.lab, font.main, font.sub|
|las|an integer which controls the orientation of the axis labels (0: parallel to the axes, 1: horizontal, 2: perpendicular to the axes, 3: vertical)|
|lty|controls the line style, can be an integer (1: solid, 2: dashed, 3: dotted, 4: dotdash, 5: longdash, 6: twodash)|
|lwd|a numeric which controls the width of lines relative to the default value of 1|
|pch|controls the type of symbol, either an integer between 0 and 25, or any single character within quotes|
|ps|an integer which controls the point size of text (but not symbols)|
|pty|a character which specifies the type of the plotting region, “s”: square, “m”: maximal|
|tck|a value which specifies the length of tick marks on the axes as a fraction of the width or height of the plot; if tck = 1 a grid is drawn|
|tcl|a value which specifies the length of tick marks on the axes as a fraction of the height of a line of text (by default tcl = -0.5)|
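Because settings made with par() persist for later plots on the same graphics device, it can be handy to save the previous values and restore them when you’re done. A minimal sketch of this pattern is shown below; the object name old_par is just an illustrative choice.

```r
# par() invisibly returns the previous values of the parameters it changes
old_par <- par(mar = c(4.1, 4.4, 4.1, 1.9), las = 1)

# Query the current value of a single parameter
par("mar")

# ... create your customised plots here ...

# Restore the settings that were in place beforehand
par(old_par)
par("mar")
```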
4.3.2 Building plots
For even more control over how your plot looks we can build our plot up in layers, customising each step as we go along. For example, perhaps we want to create a plot of shootarea against weight as we did before but this time we want to change the symbol colours of our data points depending on what level of nitrogen the plants were exposed to. The general approach is to use the high level plotting function plot() to create the general plot (axes, axis labels etc.) but without the data points by including the type = "n" argument. We then use the low level function points() to add the plotting symbols for each nitrogen level separately, choosing a different colour for each set of points. Let’s go through this approach a step at a time. First we’ll make the plot but suppress plotting the data using the type = "n" argument in the plot() function.
We can now use the points() function in combination with our square bracket [ ] skills to select only those data from the low level of nitrogen. Whilst using the points() function we can also set the symbol type and the symbol colour using the pch = and col = arguments.
par(mar = c(4.1, 4.4, 4.1, 1.9), xaxs = "i", yaxs = "i")
plot(flowers$weight, flowers$shootarea, type = "n",
     xlab = "weight (g)",
     ylab = expression(paste("shoot area (cm"^"2",")")),
     xlim = c(0, 30), ylim = c(0, 200), bty = "l",
     las = 1, cex.axis = 0.8, tcl = -0.2)
points(x = flowers$weight[flowers$nitrogen == "low"],
       y = flowers$shootarea[flowers$nitrogen == "low"],
       pch = 16, col = "deepskyblue")
We can now use the points() function again to plot data for the medium level of nitrogen and change the symbol colour to something different. Notice that we do not reuse the plot() function here as we are just using the low level function points() to add data points to the existing plot.
And finally we add the high level of nitrogen data points to the plot and add our text label (‘A’) to the plot as before.
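The three separate calls to points() described above could also be written as a loop over the nitrogen levels. This is only an alternative sketch, not the approach used in this section: the flowers data frame is a made-up stand-in and the named vector point_cols is a helper invented for the example.

```r
# Hypothetical stand-in for the flowers data frame (values are made up)
flowers <- data.frame(
  weight    = c(5.2, 8.1, 11.4, 14.9, 19.3, 23.6),
  shootarea = c(22, 41, 63, 88, 121, 158),
  nitrogen  = c("low", "medium", "high", "low", "medium", "high")
)

# One colour per nitrogen level, named by level
point_cols <- c(low = "deepskyblue", medium = "yellowgreen", high = "deeppink3")

# Set up the axes without drawing any data
plot(flowers$weight, flowers$shootarea, type = "n",
     xlab = "weight (g)",
     ylab = expression(paste("shoot area (cm"^"2", ")")))

# Add the points for each level in turn with its own colour
for (lev in names(point_cols)) {
  in_level <- flowers$nitrogen == lev
  points(x = flowers$weight[in_level],
         y = flowers$shootarea[in_level],
         pch = 16, col = point_cols[[lev]])
}
```

The explicit one-call-per-level version is easier to follow when you are starting out; the loop mainly saves typing when there are many levels.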
The only thing left to do is to add a legend to the plot to let your reader know what nitrogen level each colour corresponds to. We’ll use another low level function, legend(), to do this. The legend() function requires us to provide the x and y coordinates to specify the position of the top left of the legend in the plot, a vector of colours, symbol types and labels to use in the legend. The bty = "n" argument stops a border being drawn around the legend and the title = argument gives the legend a title.
Here is all the code together.
par(mar = c(4.1, 4.4, 4.1, 1.9), xaxs = "i", yaxs = "i")
plot(flowers$weight, flowers$shootarea, type = "n",
     xlab = "weight (g)",
     ylab = expression(paste("shoot area (cm"^"2",")")),
     xlim = c(0, 30), ylim = c(0, 200), bty = "l",
     las = 1, cex.axis = 0.8, tcl = -0.2)
points(x = flowers$weight[flowers$nitrogen == "low"],
       y = flowers$shootarea[flowers$nitrogen == "low"],
       pch = 16, col = "deepskyblue")
points(x = flowers$weight[flowers$nitrogen == "medium"],
       y = flowers$shootarea[flowers$nitrogen == "medium"],
       pch = 16, col = "yellowgreen")
points(x = flowers$weight[flowers$nitrogen == "high"],
       y = flowers$shootarea[flowers$nitrogen == "high"],
       pch = 16, col = "deeppink3")
text(x = 28, y = 190, label = "A", cex = 2)
leg_cols <- c("deepskyblue", "yellowgreen", "deeppink3")
leg_sym <- c(16, 16, 16)
leg_lab <- c("low", "medium", "high")
legend(x = 1, y = 200, col = leg_cols, pch = leg_sym,
       legend = leg_lab, bty = "n", title = "Nitrogen level")
The table below highlights some of the low level plotting functions you might find useful.
|Function|Description|
|lines()|adds connected lines to a plot|
|curve()|draws a curve corresponding to a function|
|arrows()|draws arrows between 2 points|
|text()|adds text to a plot|
|mtext()|adds text to one of the 4 plot margins|
|axis()|adds an axis to the current plot|
|rect()|draws a rectangle|
|legend()|adds a legend to the plot|
|points()|adds points to the plot|
|abline()|adds a straight line to a plot|
|grid()|adds a rectangular grid to the current plot|
|polygon()|draws a polygon|
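To give a flavour of how low level functions are layered onto an existing plot, here is a small sketch using grid(), abline(), arrows(), text() and mtext(). The flowers data frame is a made-up stand-in and the positions of the annotations are arbitrary.

```r
# Hypothetical stand-in for the flowers data frame (values are made up)
flowers <- data.frame(weight    = c(5.2, 8.1, 11.4, 14.9, 19.3, 23.6),
                      shootarea = c(22, 41, 63, 88, 121, 158))

# High level function creates the plot ...
plot(flowers$weight, flowers$shootarea, pch = 16,
     xlab = "weight (g)",
     ylab = expression(paste("shoot area (cm"^"2", ")")),
     xlim = c(0, 30), ylim = c(0, 200))

# ... and low level functions add to it, one layer at a time
grid()                                        # background grid
mean_area <- mean(flowers$shootarea)
abline(h = mean_area, lty = 2)                # dashed horizontal line at the mean
arrows(x0 = 5, y0 = 150, x1 = 5, y1 = mean_area + 5,
       length = 0.1)                          # arrow pointing at the line
text(x = 5, y = 160, label = "mean")          # annotation inside the plot
mtext("made-up example data", side = 3, line = 0.5)  # text in the top margin
```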