3.1 Data types
Understanding the different types of data and how R deals with them is important. The temptation is to glaze over and skip these technical details, but beware: this can come back to bite you somewhere unpleasant if you don’t pay attention. We’ve already seen an example of this when we tried (and failed) to add two character objects together using the + operator.
R has six basic types of data: numeric, integer, logical, complex and character. The keen-eyed among you will notice we’ve only listed five data types here; the final data type is raw, which we won’t cover as it’s not useful 99.99% of the time. We also won’t cover complex numbers as we don’t have the imagination!
Numeric data are numbers that contain a decimal. Actually they can also be whole numbers but we’ll gloss over that.
Integers are whole numbers (those numbers without a decimal point).
Logical data take on the value of either TRUE or FALSE. There’s also another special type of logical called NA to represent missing values.
Character data are used to represent string values. You can think of character strings as something like a word (or multiple words). A special type of character string is a factor, which is a string but with additional attributes (like levels or an order). We’ll cover factors later.
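As a minimal sketch of the distinctions above (the object names `whole`, `int` and `miss` are made up for illustration), the `class()` function reports how R has stored each value. Note that a whole number typed without a suffix is stored as numeric, and adding the `L` suffix asks R for an integer instead.

```r
# a whole number typed without a suffix is stored as numeric
whole <- 2
class(whole)
## [1] "numeric"

# the L suffix asks R to store the value as an integer
int <- 2L
class(int)
## [1] "integer"

# NA on its own is a logical value
miss <- NA
class(miss)
## [1] "logical"
```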
R is (usually) able to automatically distinguish between different classes of data by their nature and the context in which they’re used, although you should bear in mind that R can’t actually read your mind and you may have to explicitly tell R how you want to treat a data type. You can find out the type (or class) of any object using the class() function.
num <- 2.2
class(num)
## [1] "numeric"
char <- "hello"
class(char)
## [1] "character"
logi <- TRUE
class(logi)
## [1] "logical"
Alternatively, you can ask if an object is a specific class using a logical test. The is.[classOfData]() family of functions will return either a TRUE or a FALSE.
is.numeric(num)
## [1] TRUE
is.character(num)
## [1] FALSE
is.character(char)
## [1] TRUE
is.logical(logi)
## [1] TRUE
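One place these tests tend to earn their keep is in checking an input before doing arithmetic on it. The sketch below is only an illustration: `count` and the helper `double_it()` are hypothetical names, not part of R. It also shows that `is.numeric()` returns `TRUE` for integers as well as for numbers with a decimal.

```r
# integers pass the is.numeric() test too
count <- 10L
class(count)
## [1] "integer"
is.integer(count)
## [1] TRUE
is.numeric(count)
## [1] TRUE

# a hypothetical helper that refuses anything that isn't numeric
double_it <- function(x) {
  if (!is.numeric(x)) {
    stop("x must be numeric")
  }
  x * 2
}
double_it(count)
## [1] 20
```

Calling `double_it("hello")` would stop with the error message rather than attempting the multiplication.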
It can sometimes be useful to be able to change the class of a variable using the as.[className]() family of coercion functions, although you need to be careful when doing this as you might receive some unexpected results (see what happens below when we try to convert a character string to a numeric).
# coerce numeric to character
class(num)
## [1] "numeric"
num_char <- as.character(num)
num_char
## [1] "2.2"
class(num_char)
## [1] "character"
# coerce character to numeric!
class(char)
## [1] "character"
char_num <- as.numeric(char)
## Warning: NAs introduced by coercion
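To see what that warning actually left behind, the short sketch below repeats the conversion and inspects the result. The failed coercion still creates an object, but it holds a missing value (`NA`). For contrast, it also converts a character string that does look like a number, and shows that `as.integer()` drops the decimal part rather than rounding.

```r
char <- "hello"

# the failed conversion still returns a value: a missing value (NA)
char_num <- as.numeric(char)
## Warning: NAs introduced by coercion
char_num
## [1] NA
class(char_num)
## [1] "numeric"

# a character string that looks like a number converts cleanly
as.numeric("2.2")
## [1] 2.2

# as.integer() drops the decimal part rather than rounding
as.integer(2.9)
## [1] 2
```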
Here’s a summary table of some of the logical test and coercion functions available to you.
Type | Logical test | Coercing |
---|---|---|
Character | is.character | as.character |
Numeric | is.numeric | as.numeric |
Logical | is.logical | as.logical |
Factor | is.factor | as.factor |
Complex | is.complex | as.complex |
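As one more illustration of the "unexpected results" that coercion can produce, here is a small sketch using the factor functions from the table (the object name `sizes` is made up for this example). Applying `as.numeric()` directly to a factor returns R's internal level codes rather than the values you typed, so going via `as.character()` first is a common workaround.

```r
# build a factor from character strings
sizes <- as.factor(c("10", "20", "10"))
is.factor(sizes)
## [1] TRUE

# as.numeric() on a factor returns the internal level codes, not the labels
as.numeric(sizes)
## [1] 1 2 1

# converting to character first recovers the original numbers
as.numeric(as.character(sizes))
## [1] 10 20 10
```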