1.8 Directory structure

In addition to using RStudio Projects, it’s also really good practice to structure your working directory in a consistent and logical way to help both you and your collaborators. We frequently use the following directory structure in our R based projects

 

 

In our working directory we have the following directories:

  • Root - This is your project directory containing your .Rproj file.

  • data - We store all our data in this directory. The subdirectory called raw_data contains raw data files and only raw data files. These files should be treated as read only and should not be changed in any way. If you need to process/clean/modify your data do this in R (not MS Excel) as you can document (and justify) any changes made. Any processed data should be saved to a separate file and stored in the processed_data subdirectory. Information about data collection methods, details of data download and any other useful metadata should be saved in a text document (see README text files below) in the metadata subdirectory.

  • R - This is an optional directory where we save all of the custom R functions we’ve written for the current analysis. These can then be sourced into R using the source() function.

  • Rmd - An optional directory where we save our R markdown documents.

  • scripts - All of the main R scripts we have written for the current project are saved here.

  • output - Outputs from our R scripts such as plots, HTML files and data summaries are saved in this directory. This helps us and our collaborators distinguish what files are outputs and which are source files.

Of course, the structure described above is just what works for us most of the time and should be viewed as a starting point for your own needs. We tend to have a fairly consistent directory structure across our projects as this allows us to quickly orientate ourselves when we return to a project after a while. Having said that, different projects will have different requirements so we happily add and remove directories as required.

You can create your directory structure using Windows Explorer (or Finder on a Mac) or within RStudio by clicking on the ‘New folder’ button in the ‘Files’ pane.

 

 

An alternative approach is to use the dir.create() and list.files() functions in the R Console.

# create directory called 'data'
dir.create('data')

# create subdirectory raw_data in the data directory
dir.create('data/raw_data')

# list the files and directories
list.files(recursive = TRUE, include.dirs = TRUE)

# [1] "data"  "data/raw_data"   "first_project.Rproj"