---

# Workshop aims

Introduce the main components of the Tidyverse:

- readr (read files)
- dplyr (manipulate data)
- ggplot2 (make awesome graphs)

I have to assume you have a basic knowledge of R.

We don't really have time to cover all of the tidyverse (it is a huge universe!).

---
class: inverse, center, middle

# Part 0: Basic R programming

---

# R is a calculator

R can be thought of as a calculator, where you can perform operations such as addition, subtraction, multiplication, division, and exponentiation. These are just some examples:

.pull-left[

```r
4 + 4 # addition
```

```
# [1] 8
```

```r
4 - 4 # subtraction
```

```
# [1] 0
```

```r
4 * 4 # multiplication
```

```
# [1] 16
```

]

.pull-right[

```r
4 / 4 # division
```

```
# [1] 1
```

```r
4 ^ 4 # exponentiation
```

```
# [1] 256
```

```r
sqrt(4) # square root
```

```
# [1] 2
```

]

---

# Objects in R

In R, you can assign any value, such as a number or a string, to a named object (a variable). For example, you can store 4 in an object called `a` using `a = 4` or `a <- 4`. Whenever you need that value again, type `a` to retrieve it.

.pull-left[

```r
a <- 4 # numerics
a
```

```
# [1] 4
```

You can also perform calculations with the objects you have created!

```r
a + a     # Addition
a - a * 2 # Subtraction & Multiplication
```

```
# [1] 8
# [1] -4
```

]

.pull-right[

```r
b <- "Hello world!" # characters
b
```

```
# [1] "Hello world!"
```

However, you cannot perform mathematical calculations on characters!

```r
b - b # Gives an error!
```

```
# Error in b - b: non-numeric argument to binary operator
```

]

---

# Naming your objects

You can name your objects anything you want (up to your imagination!), but there are a few rules.
Names cannot start with a number, and they cannot use special symbols such as `^`, `!`, `$`, `@`, `+`, `-`, `/`, or `*`:

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> GoodNames </th>
   <th style="text-align:left;"> BadNames </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> a </td>
   <td style="text-align:left;"> 1a </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Boo </td>
   <td style="text-align:left;"> !Boo </td>
  </tr>
  <tr>
   <td style="text-align:left;"> FOO </td>
   <td style="text-align:left;"> $ </td>
  </tr>
  <tr>
   <td style="text-align:left;"> my_variables </td>
   <td style="text-align:left;"> ^my_variables </td>
  </tr>
  <tr>
   <td style="text-align:left;"> .variables </td>
   <td style="text-align:left;"> /variables </td>
  </tr>
  <tr>
   <td style="text-align:left;"> mod.1 </td>
   <td style="text-align:left;"> *mod.1 </td>
  </tr>
</tbody>
</table>

Some best practices for object naming: an object name should be **`concise`**, **`meaningful`**, **`consistent`**, and **`specific`**.

---

# Vectors

Vectors contain **elements of the same type**. The data types can be integer, double, character, logical, complex or raw (we won't cover complex or raw data here). Vectors are generally created using the `c()` function.

--

.pull-left[

- integer (numbers without decimals)

```r
v1 <- c(1L, 2L, 3L, 4L, 5L) # the L suffix makes each number an integer
v1
```

```
# [1] 1 2 3 4 5
```

- double (numbers with decimals)

```r
v2 <- c(1.234, 0.632, -0.234, 7/42)
v2
```

```
# [1]  1.2340000  0.6320000 -0.2340000  0.1666667
```

]

--

.pull-right[

- character/strings (text)

```r
c("Hello", "World", "!", "Are you", "555?")
```

```
# [1] "Hello"   "World"   "!"       "Are you" "555?"
```

- logical

```r
c(TRUE, TRUE, FALSE, F, T)
```

```
# [1]  TRUE  TRUE FALSE FALSE  TRUE
```

```r
# Avoid F and T as shorthand for FALSE and TRUE: they are ordinary
# variables that can be reassigned, which gets confusing
```

]

---

# Vectors with different types of elements

What happens when you have a vector that contains different types of elements?
They will be coerced into a single type.

--

.pull-left[

- integers and doubles

```r
v1 <- c(1L, 2L, 3L) # putting L after a number makes it an integer
v2 <- c(1, 2.86, 3)
v1
v2
```

```
# [1] 1 2 3
# [1] 1.00 2.86 3.00
```

```r
typeof(v1) # typeof() determines the type of an object
typeof(v2)
```

```
# [1] "integer"
# [1] "double"
```

]

--

.pull-right[

- characters, logicals and doubles

```r
v3 <- c("Hello", TRUE, 555)
v3
```

```
# [1] "Hello" "TRUE"  "555"
```

```r
typeof(v3)
```

```
# [1] "character"
```

]

---

# Data Frames

While vectors are one-dimensional, a data frame is a table-like data structure with two dimensions (rows and columns).

--

Data frames can store multiple types of elements, such as doubles, logicals, and characters, with one type per column.

--

```r
df <- data.frame(c(-9.1, 0.2, 3.4, 4, 5),                   # Doubles
                 c(FALSE, TRUE, TRUE, FALSE, FALSE),         # Logicals
                 c("Var1", "Var2", "Var3", "Var4", "Var5"))  # Characters
names(df) <- c("Double", "Logical", "Character") # names() sets (or views) the column names
df
```

```
#   Double Logical Character
# 1   -9.1   FALSE      Var1
# 2    0.2    TRUE      Var2
# 3    3.4    TRUE      Var3
# 4    4.0   FALSE      Var4
# 5    5.0   FALSE      Var5
```

---

# Data Frames

We can use `str()` to check the structure of the data frame, including the type of each column.

```r
str(df) # str() stands for structure
```

```
# 'data.frame':	5 obs. of  3 variables:
#  $ Double   : num  -9.1 0.2 3.4 4 5
#  $ Logical  : logi  FALSE TRUE TRUE FALSE FALSE
#  $ Character: Factor w/ 5 levels "Var1","Var2",..: 1 2 3 4 5
```

We see that R automatically converted the character column into a factor! This is the default in R versions before 4.0.0. From R 4.0.0 onward, `data.frame()` keeps character columns as characters unless you set `stringsAsFactors = TRUE`.

---

# Loading and saving data

In any programming language, we must know how to load and save our data. In R, we can load our data with various functions:

.pull-left[

- text files: `read.table()`
- csv files: `read.csv()`

]

.pull-right[

- xlsx files: `read.xlsx()` using the `xlsx` package
- SPSS data: `read_sav()` using the `haven` package

]

--

## We also must know how to save our data!
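
A minimal sketch of a save-and-reload round trip, using base R's `write.csv()` and `read.csv()`. The data frame `df_demo` and the temporary file are made up for illustration; in practice you would use your own data and folder.

```r
# Hypothetical example data, not part of the workshop files
df_demo <- data.frame(id = 1:3, label = c("a", "b", "c"))

# tempfile() gives a throwaway path, so nothing in your project is overwritten
csv_path <- tempfile(fileext = ".csv")

# row.names = FALSE stops R from writing the row numbers as an extra column
write.csv(df_demo, csv_path, row.names = FALSE)

# Read the file back; stringsAsFactors = FALSE keeps text columns as characters
df_back <- read.csv(csv_path, stringsAsFactors = FALSE)

# Check that the reloaded table has the same shape as the original
identical(dim(df_demo), dim(df_back))
```

Without `row.names = FALSE`, the saved file gains an extra unnamed column of row numbers, which then shows up as a column called `X` when the file is read back.
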
--

I recommend saving your data in **`.csv`** format, as it is quick to load and save.

Additionally, whether loading or saving data, we can point to a specific folder by building the path with `file.path(yourpath, filename)`.

In summary, this is what saving/loading data looks like in R:

--

```r
rootdir <- "D:/Winson/Github/R-Workshop"
datadir <- file.path(rootdir, "data") # Construct the path to the data directory
csvfile <- "df_example.csv"
file.path(datadir, csvfile) # Gives the full path
```

```
# [1] "D:/Winson/Github/R-Workshop/data/df_example.csv"
```

--

You will then type `write.csv(df, file.path(datadir, csvfile))` to save the file to the path shown above. Simple, isn't it?

---
class: inverse, center, middle

# Thanks!