2.3 Using functions in R
Up until now we’ve been creating simple objects by directly assigning a single value to an object. As your R experience grows and the complexity of your tasks increases, it’s very likely that you’ll soon want to create more complicated objects. Happily, R has a multitude of functions to help you do this. You can think of a function as an object which contains a series of instructions to perform a specific task. The base installation of R comes with many functions already defined, and you can increase the power of R by installing some of the thousands of packages now available. Once you get a bit more experience with using R you may want to define your own functions to perform tasks that are specific to your goals (more about this in Chapter 7).
The first function we will learn about is the c() function. The c() function is short for concatenate and we use it to join together a series of values and store them in a data structure called a vector (more on vectors in Chapter 3).
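A minimal example is sketched below. The values are taken from the output printed later in this section, so treat this as an illustration consistent with that output.

```r
# join eight numbers together and store them in a vector called my_vec
my_vec <- c(2, 3, 1, 6, 4, 3, 3, 7)
```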
In the code above we’ve created an object called my_vec and assigned it a value using the function c(). There are a couple of really important points to note here. Firstly, when you use a function in R, the function name is always followed by a pair of round brackets even if there’s nothing contained between the brackets. Secondly, the argument(s) of a function are placed inside the round brackets and are separated by commas. You can think of an argument as a way of customising the use or behaviour of a function. In the example above, the arguments are the numbers we want to concatenate. Finally, one of the tricky things when you first start using R is to know which function to use for a particular task and how to use it. Thankfully each function will always have a help document associated with it which will explain how to use the function (more on this later) and a quick Google search will also usually help you out.
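As a small illustration of the first two points, the sketch below calls one function with nothing between the brackets and others with comma-separated arguments. The object names today and my_nums are illustrative only and are not used elsewhere in this section.

```r
# a function call with no arguments still needs the round brackets
today <- Sys.Date()

# arguments go inside the brackets, separated by commas
my_nums <- c(10, 20, 30)

# a named argument customises how the function behaves
round(3.14159, digits = 2)
## [1] 3.14
```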
To examine the value of our new object we can simply type out the name of the object as we did before.
## [1] 2 3 1 6 4 3 3 7
Now that we’ve created a vector we can use other functions to do useful stuff with this object. For example, we can calculate the mean, variance, standard deviation and number of elements in our vector by using the mean(), var(), sd() and length() functions.
mean(my_vec) # returns the mean of my_vec
## [1] 3.625
var(my_vec) # returns the variance of my_vec
## [1] 3.982143
sd(my_vec) # returns the standard deviation of my_vec
## [1] 1.995531
length(my_vec) # returns the number of elements in my_vec
## [1] 8
If we want to use any of these values later on in our analysis we can just assign the resulting value to another object.
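For instance, the sketch below stores the mean in a new object. It assumes my_vec holds the values shown in the output earlier; the name vec_mean is illustrative.

```r
my_vec <- c(2, 3, 1, 6, 4, 3, 3, 7)   # same values as in the output shown earlier

vec_mean <- mean(my_vec)   # assign the mean to a new object
vec_mean
## [1] 3.625
```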
Sometimes it can be useful to create a vector that contains a regular sequence of values in steps of one. Here we can make use of a shortcut using the : symbol.
my_seq <- 1:10 # create regular sequence
my_seq
## [1] 1 2 3 4 5 6 7 8 9 10
my_seq2 <- 10:1 # in descending order
my_seq2
## [1] 10 9 8 7 6 5 4 3 2 1
Other useful functions for generating vectors of sequences include the seq() and rep() functions. For example, seq() can generate a sequence from 1 to 5 in steps of 0.5.
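One way to write this is sketched below, using the from =, to = and by = arguments of seq(). The object name seq_half is illustrative.

```r
# sequence from 1 to 5 in steps of 0.5
seq_half <- seq(from = 1, to = 5, by = 0.5)
seq_half
## [1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
```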
Here we’ve used the arguments from = and to = to define the limits of the sequence and the by = argument to specify the increment of the sequence. Play around with other values for these arguments to see their effect.
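As a starting point for experimenting, the sketch below tries a negative increment and also the length.out = argument of seq(), which asks for a given number of evenly spaced values instead of a fixed step. The object names are illustrative.

```r
# count down from 10 to 0 in steps of 2.5
seq_down <- seq(from = 10, to = 0, by = -2.5)
seq_down
## [1] 10.0  7.5  5.0  2.5  0.0

# five evenly spaced values between 0 and 1
seq_even <- seq(from = 0, to = 1, length.out = 5)
seq_even
## [1] 0.00 0.25 0.50 0.75 1.00
```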
The rep() function allows you to replicate (repeat) values a specified number of times, for example to repeat the value 2 ten times.
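A minimal sketch of this, using the times = argument of rep() (the object name rep_twos is illustrative):

```r
# repeat the value 2, 10 times
rep_twos <- rep(2, times = 10)
rep_twos
## [1] 2 2 2 2 2 2 2 2 2 2
```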
You can also repeat non-numeric values.
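For example, a character string can be repeated in the same way. The string "abc" and the object name rep_abc below are chosen purely for illustration.

```r
# repeat the string "abc" 3 times
rep_abc <- rep("abc", times = 3)
rep_abc
## [1] "abc" "abc" "abc"
```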
You can also repeat a whole series a number of times.
my_seq5 <- rep(1:5, times = 3) # repeats the series 1 to 5, 3 times
my_seq5
## [1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
Alternatively, we can repeat each element of a series.
my_seq6 <- rep(1:5, each = 3) # repeats each element of the series 3 times
my_seq6
## [1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5
We can also repeat a non-sequential series.
my_seq7 <- rep(c(3, 1, 10, 7), each = 3) # repeats each element of the series 3 times
my_seq7
## [1] 3 3 3 1 1 1 10 10 10 7 7 7
Note in the code above how we’ve used the c() function inside the rep() function. Nesting functions allows us to build quite complex commands within a single line of code and is a very common practice when using R. However, care needs to be taken as too many nested functions can make your code quite difficult for others to understand (or for yourself some time in the future!). We could rewrite the code above to explicitly separate the two different steps to generate our vector. Either approach will give the same result; you just need to use your own judgement as to which is more readable.
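A sketch of the two-step version is shown below, where in_vec is a hypothetical name for the intermediate object.

```r
in_vec <- c(3, 1, 10, 7)           # step 1: create the vector
my_seq7 <- rep(in_vec, each = 3)   # step 2: repeat each element 3 times
my_seq7
## [1]  3  3  3  1  1  1 10 10 10  7  7  7
```

The intermediate object costs an extra line, but it gives the input values a name that can be inspected or reused later.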