3 Data Management in R
3.1 Packages and libraries
In order to access specialised data analysis tools in R, we will need to install some R packages.
“An R package is a collection of functions, data, and documentation that extends the capabilities of base R. Using packages is key to the successful use of R.” (Wickham, Cetinkaya-Rundel, and Grolemund, n.d.)
We will start by installing the tidyverse package:
install.packages("tidyverse")
To install tidyverse, type the above line of code in the console, and then press Enter to run it. R will download the packages from CRAN and install them on your computer.
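Because a package only needs to be installed once per computer, it can be convenient to install it only when it is missing. The following is a minimal sketch of that idea; `install_if_missing()` is a hypothetical helper written for this illustration, not a function from R or tidyverse.

```r
# Hypothetical helper (an assumption for illustration): installs a package
# only when R cannot already find it.
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package is now available
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so this call finds it and installs nothing.
install_if_missing("stats")

# For tidyverse, the call would be:
# install_if_missing("tidyverse")
```

The check relies on `requireNamespace()`, which returns `TRUE` or `FALSE` without attaching the package.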
Once installed, you may use this package after loading it with the library() function.
library(tidyverse)
Running this line prints a list of the core packages that tidyverse attaches.
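One way to check that `library()` has loaded a package is to look for it on R's search path. The sketch below uses the `tools` package only because it ships with every R installation, so the example runs even before tidyverse is installed; the same check should work for any installed package.

```r
# Load a package that comes with R (used here purely as a stand-in)
library(tools)

# search() lists the attached packages; a loaded package shows up
# with the prefix "package:"
is_attached <- "package:tools" %in% search()
is_attached
## [1] TRUE
```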
You may update tidyverse by running:
tidyverse_update()
3.2 Functions
You can recognise a function by the parentheses () after its name, for example ls(), which we used above.
Functions may also take arguments. The data that we pass into the function is called the function’s argument. The argument can be raw data, an R object, or even the results of another R function.
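The three kinds of argument described above can be sketched as follows. The object name `values` is made up for this illustration.

```r
# 1. Raw data passed directly as the argument
sqrt(16)
## [1] 4

# 2. An R object passed as the argument
values <- c(1, 4, 9)
total <- sum(values)
total
## [1] 14

# 3. The result of another function passed as the argument
root_of_total <- sqrt(sum(values))
root_of_total
## [1] 3.741657
```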
# round a number
round(4.5218)
## [1] 5
# calculate the factorial
factorial(3)
## [1] 6
# calculate the mean of values from 1 to 6:
mean(1:6)
## [1] 3.5
# functions can be nested: the result of mean() is passed to round()
round(mean(1:6))
## [1] 4
Many R functions take multiple arguments that help them do their job. You can pass a function as many arguments as it accepts, as long as you separate each argument with a comma.
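As a small sketch of passing several arguments, the calls below separate each argument with a comma. Arguments can be matched by their position or by their name; the object names are made up for this illustration.

```r
# Arguments matched by position: the number first, then the digits
by_position <- round(3.14159, 3)

# The same call with the second argument given by name
by_name <- round(3.14159, digits = 3)
by_name
## [1] 3.142

# Named arguments may be written in any order
evens <- seq(by = 2, to = 10, from = 2)
evens
## [1]  2  4  6  8 10
```

Naming the arguments tends to make a call easier to read, especially when a function takes more than two of them.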
To see which arguments a function can take, pass the function's name to args():
args(round)
## function (x, digits = 0, ...)
## NULL
round(3.1415, digits = 2)
## [1] 3.14
3.2.1 Basic Functions
| Function | Description |
|---|---|
| ?() or help() | Access the documentation and help file for a particular function |
| install.packages() | Download and install an R package |
| library() | Loads an R package into the working environment |
| setwd() | Set the working directory |
| getwd() | Get the working directory |
| c() | Create a vector |
| as.numeric() | Converts an object to a numeric vector |
| as.logical() | Converts an object to a logical vector |
| as.character() | Converts an object to a character vector |
| mode() | Returns the type of the object |
| sum() | Returns the sum of all input values |
| length() | Returns the length of the object |
| mean() | Returns the arithmetic mean of the vector |
| median() | Returns the median of the vector |
| sample() | Returns a random sample of a specified size from the object |
| replicate() | Repeats an expression a specific number of times |
| hist() | Creates a histogram of given data values |
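Several of the functions in the table can be combined in a short sketch. The values in `scores` are made up for this illustration, and `set.seed()` is added only so that the random draws can be repeated.

```r
# A vector of made-up values, stored as text on purpose
scores <- c("4", "8", "15", "16", "23", "42")
mode(scores)
## [1] "character"

# Convert the text to numbers so that we can do arithmetic
scores <- as.numeric(scores)
mode(scores)
## [1] "numeric"

length(scores)
## [1] 6
sum(scores)
## [1] 108
mean(scores)
## [1] 18
median(scores)
## [1] 15.5

# Draw 3 values at random, then repeat "mean of a random draw" 5 times
set.seed(1)
draw <- sample(scores, size = 3)
draw_means <- replicate(5, mean(sample(scores, size = 3)))
```

Passing `draw_means` to `hist()` would then plot how the repeated means are distributed.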
3.3 Scripts
You can create a draft of your code as you go by using an R script. An R script is just a plain text file in which you save R code. You can open a new R script in RStudio using the menu bar:
File –> New File –> R Script
We will write and edit R code in a script. This will help create a reproducible record of your work. When you’re finished for the day, you can save your script and then use it to rerun your entire analysis the next day.
To save a script, click the scripts pane, and then go to File –> Save As in the menu bar.
You can automatically execute a line of code in a script by clicking the Run button on the top right of the pane. R will run whichever line of code your cursor is on.
If you have a whole section highlighted, R will run the highlighted code.
You can run the entire script by clicking the Source button.
You can use Control + Return on your keyboard as a shortcut for the Run button. On a Mac, the shortcut is Command + Return.
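Running a whole script can also be done from code with the base R function `source()`, which is a rough equivalent of the Source button. The sketch below writes a tiny two-line script to a temporary file and then runs it; the file contents and the object names `x` and `avg` are assumptions made for this illustration.

```r
# Create a small script in a temporary location (illustrative content only)
script_path <- tempfile(fileext = ".R")
writeLines(
  c("x <- 1:6",
    "avg <- mean(x)"),
  script_path
)

# Run every line of the script, top to bottom
source(script_path)

# Objects created by the script are now available
avg
## [1] 3.5
```

In practice you would pass the path of your own saved script, for example a file in your working directory, instead of a temporary file.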