Introduction to R: Basics
Table of contents
1. Off we go!
1.1. Credit and license
1.2. Goal and scope
1.3. Format
1.4. Caveats
2. Getting started with R
2.1. Typing commands
2.1.1. Be very careful to avoid typos
2.1.2. R is (a bit) flexible with spacing
2.2. Doing simple calculations
2.2.1. Adding, subtracting, multiplying and dividing
2.2.2. Taking powers
2.2.3. Doing calculations in the right order
2.3. Storing a number as a variable
2.4. Doing calculations using variables
2.5. Storing many numbers as a vector
2.5.1. Creating a vector
2.5.2. Getting information out of a vector
2.5.3. Altering the elements of a vector
2.5.4. Creating a vector using a shorthand
2.6. Doing calculations using vectors
2.6.1. Using a single number
2.6.2. Using another vector
2.7. Doing calculations using functions
3. More fun with R
3.1. Getting help
3.2. Using comments
3.3. Text data
3.3.1. Storing text data as a variable
3.3.2. Storing text data as a vector
3.3.3. Working with text data
3.4. Logical data aka “true” or “false” data
3.4.1. Assessing mathematical truths
3.4.2. Storing logical data as a variable
3.4.3. Storing logical data as a vector
3.4.4. Working with logical data
3.4.5. More logical operations
3.4.6. Applying logical operation to text
3.5. Variable classes
4. More on functions in R
4.1. Function arguments
4.1.1. Argument names
4.1.2. Argument defaults
4.2. A few more mathematical functions
4.2.1. Rounding a number
4.2.2. Logarithms and exponentials
4.2.3. The sum, the mean, and the cumsum
4.2.4. sum() and mean() with logical data
4.3. A few more general functions
4.3.1. rep() and seq()
4.3.2. head() and tail()
4.3.3. max() and min()
4.3.4. which()
4.3.5. Tabulating and cross-tabulating data
4.3.6. print()
4.3.7. Pasting strings together
4.3.8. The all.equal() function aka the problem with floating-point arithmetic
5. Working in RStudio
6. More on variables
6.1. Useful things to know about variables
6.1.1. Rules and conventions for naming variables
6.1.2. Special values
6.2. Matrices
6.2.1. Creating a matrix using rbind()
6.2.2. Indexing a matrix
6.2.3. A matrix as one big variable, really
6.2.4. Other ways of creating matrices
6.2.5. Transposing
6.3. Factors
6.3.1. Introducing factors
6.3.2. Labelling the factor levels
6.3.3. Moving on…
6.4. Data frames
6.4.1. Introducing data frames
6.4.2. Pulling out the contents of the data frame using $
6.4.3. Getting information about a data frame
6.4.4. Data frames vs matrices
6.4.5. Looking for more on data frames?
6.5. Lists
6.5.1. Data frames as lists
6.6. Formulas
6.7. More useful things to know when dealing with different kinds of variables
6.7.1. Coercing data from one class to another
6.7.2. Generic functions
7. Drawing graphs
7.1. An introduction to plotting
7.1.1. Customizing the title, the axis labels and the limits
7.1.2. Customizing fonts
7.1.3. Changing the plot type
7.1.4. Changing other features of the plot
7.1.5. Changing the appearance of the axes
7.1.6. Don’t panic
7.2. Histograms
7.2.1. Visual style of your histogram
7.3. Stem and leaf plots
7.4. Boxplots
7.4.1. Visual style of your boxplot
7.5. Scatterplots
7.5.1. Adding stuff to your plot
7.5.2. Using more specialized functions
7.5.3. More elaborate options
7.6. Bar graphs
7.7. Pie charts
7.8. Moving on
8. Data handling
8.1. The naming game
8.2. Extracting a subset of a vector
8.2.1. Extracting elements from a vector using numeric indexing
8.2.2. Extracting elements from a vector using names
8.2.3. Dropping elements from a vector using negative indices
8.2.4. Extracting elements from a vector using logical indexing
8.3. Transforming a variable
8.4. Splitting a vector by group
8.5. Cutting a numeric variable into categories
8.5.1. Letting R take the lead
8.6. Sorting data
8.6.1. Sorting a numeric or character vector
8.6.2. Sorting a factor
8.6.3. Sorting a data frame
8.7. Tabulating and cross-tabulating a data frame
8.7.1. Creating tables from data frames
8.7.2. Converting a table of counts to a table of proportions
8.8. Extracting a subset of a data frame
8.8.1. Using the subset() function
8.8.2. Using square brackets: Double index
8.8.3. A bit more on the double index approach
8.8.4. More than you wish to know on the double index approach
8.8.5. Using square brackets: Single index
8.8.6. Trying to make sense of it all
9. Basic programming
9.1. Loops
9.1.1. The while loop
9.1.2. The for loop
9.1.3. A more realistic example of a loop
9.2. Conditional statements
9.3. Implicit loops
9.4. Writing your own functions
9.4.1. Function arguments revisited
9.4.2. There’s more to functions than this
10. Hey, R! Let me do some statistics
10.1. Fooled by easiness
10.2. Keeping it real
10.3. There’s more than meets the eye
10.4. Handling missing values
10.5. Doing Everything Everywhere All at Once
10.5.1. “Summarizing” a data frame
10.6. Descriptive statistics separately for each group
1. Off we go!
1.1. Credit and license
1.2. Goal and scope
1.3. Format
1.4. Caveats
Not important
2. Getting started with R
2.1. Typing commands
Typing a command: press Ctrl + Enter OR Command + Enter and R will
execute the command
[1] = marks the first element of the answer (here: the answer to the first
thing we asked)
2.1.1. Be very careful to avoid typos
There is no autocorrect in R, so type exactly what you mean
2.1.2. R is (a bit) flexible with spacing
R ignores redundant spacing (exception: do not insert spaces in the middle
of a word)
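A small sketch of the spacing rule, using made-up numbers. Both lines below are the same command as far as R is concerned, and each answer is printed with the [1] label in front of it.

```r
# Extra spaces around numbers and operators are ignored.
10+20          # [1] 30
10   +   20    # [1] 30
```

Splitting a word or a number with a space (for example writing 1 0 instead of 10) is a different matter and gives an error.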
2.2. Doing simple calculations
Addition is an operation and + is an (arithmetic) operator
2.2.1. Adding, subtracting, multiplying and dividing
Arithmetic operations:
Operation        Operator
Addition         +
Subtraction      -
Multiplication   *
Division         /
Power            ^
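As an illustration, the four basic operators from the table can be tried directly at the console. The numbers are arbitrary and only serve as an example.

```r
10 + 20   # addition:       [1] 30
20 - 10   # subtraction:    [1] 10
10 * 20   # multiplication: [1] 200
20 / 10   # division:       [1] 2
```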
2.2.2. Taking powers
Raising x to the nth power: the act of multiplying n copies of a number x
together
o x-squared: x ^ 2 OR x ** 2
o x-cubed: x ^ 3 OR x ** 3
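A quick check with the arbitrary number 5 shows that both ways of writing a power give the same answer:

```r
5 ^ 2    # 5 squared:                 [1] 25
5 ** 2   # same thing written with **: [1] 25
5 ^ 3    # 5 cubed (5 * 5 * 5):       [1] 125
```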
2.2.3. Doing calculations in the right order
Order of operations: brackets, then exponents, then division &
multiplication, then addition & subtraction
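The effect of this ordering can be seen in a few one-line calculations. The numbers are made up, and the last line relies on the fact that R works from left to right when operators have the same priority.

```r
1 + 2 * 4      # multiplication before addition: [1] 9
(1 + 2) * 4    # brackets first:                 [1] 12
2 * 3 ^ 2      # exponent before multiplication: [1] 18
10 - 4 + 3     # same priority, left to right:   [1] 9
```

When in doubt, adding brackets makes the intended order explicit.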
2.3. Storing a number as a variable
Variable: label for (a) certain piece(s) of information, for example: ‘sales’
Value, for example: ‘350’
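Assignment operator, for example: ‘<-’ (left form: assigns to the name on
its left) OR ‘->’ (right form: assigns to the name on its right) OR ‘=’
(assigns to the name on its left, like ‘<-’)
o Do not insert spaces in the middle of an operator
o Assigning a new value to an existing variable overwrites the old value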
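A minimal sketch of the three assignment forms, using the variable name and value from the notes. The second value, 460, is made up to show overwriting.

```r
sales <- 350     # left form: store 350 under the name 'sales'
350 -> sales     # right form: same effect
sales = 350      # '=' also assigns to the name on its left
sales            # typing the name prints the value: [1] 350

sales <- 460     # assigning again overwrites the old value
sales            # [1] 460
```

Writing `< -` with a space in the middle would not be read as an assignment, which is why the operator has to stay in one piece.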
2.4. Doing calculations using variables
R can calculate with variables and overwrite variables:
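For instance, with some made-up values (the names royalty and revenue are hypothetical and only serve this example):

```r
sales <- 350                 # number of copies sold (made-up value)
royalty <- 7                 # income per copy (made-up value)

revenue <- sales * royalty   # calculate with the variables
revenue                      # [1] 2450

revenue <- revenue + 550     # overwrite revenue using its own old value
revenue                      # [1] 3000
```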
2.5. Storing many numbers as a vector
Vector: variable that can store multiple values
2.5.1. Creating a vector
c(): the combine function, which takes a comma-separated list of values
For example, sales.by.month is a vector of 12 elements
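A sketch of how such a vector could be created. The twelve numbers are hypothetical monthly figures, and length() is a base R function that is not mentioned in the notes but is handy for checking the number of elements.

```r
# Hypothetical monthly sales figures, January through December.
sales.by.month <- c(0, 100, 200, 50, 0, 0, 0, 0, 0, 0, 0, 0)

sales.by.month           # prints all 12 values
length(sales.by.month)   # number of elements: [1] 12
```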
2.5.2. Getting information out of a vector
By using [ ] we get information out of a vector, for example:
sales.by.month[2] gives us the value for February. The output is still
labelled [1], because that value is the first (and only) element of the
answer we asked for
We can use this to create new variables
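As a small illustration, assuming the same hypothetical monthly figures as before (february.sales is a made-up variable name):

```r
sales.by.month <- c(0, 100, 200, 50, 0, 0, 0, 0, 0, 0, 0, 0)  # made-up values

sales.by.month[2]                      # second element, February: [1] 100
february.sales <- sales.by.month[2]    # store it as a new variable
february.sales                         # [1] 100
```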
2.5.3. Altering the elements of a vector
To change the elements of a vector you can assign the whole vector again
with the combine function OR
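The sketch below shows the whole-vector route with made-up numbers. As an assumption about the second option, which is not spelled out here, it also shows a common alternative in R: assigning a new value to a single position with square brackets.

```r
sales.by.month <- c(0, 100, 200, 50, 0, 0, 0, 0, 0, 0, 0, 0)  # made-up values

# Route 1: assign the whole vector again with c()
sales.by.month <- c(0, 100, 200, 50, 25, 0, 0, 0, 0, 0, 0, 0)
sales.by.month[5]    # [1] 25

# Route 2 (assumed alternative): overwrite one element by its index
sales.by.month[6] <- 10
sales.by.month[6]    # [1] 10
```

The second route leaves all other elements untouched, which is convenient when only one value has changed.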
9. Basic programming...................................................................................61
9.1. Loops..................................................................................................................... 61
9.1.1. The while loop................................................................................................. 61
9.1.2. The for loop..................................................................................................... 62
9.1.3. A more realistic example of a loop..................................................................63
9.2. Conditional statements.......................................................................................... 63
9.3. Implicit loops......................................................................................................... 65
9.4. Writing your own functions....................................................................................66
9.4.1. Function arguments revisited..........................................................................66
9.4.2. There’s more to functions than this.................................................................67
10. Hey, R! Let me do some statistics............................................................67
10.1. Fooled by easiness............................................................................................... 67
10.2. Keeping it real...................................................................................................... 68
10.3. There’s more than meets the eye........................................................................68
10.4. Handling missing values......................................................................................68
10.5. Doing Everything Everywhere All at Once............................................................69
10.5.1. “Summarizing” a data frame.........................................................................70
10.6. Descriptive statistics separately for each group..................................................71
1. Off we go!
1.1. Credit and license
1.2. Goal and scope
1.3. Format
1.4. Caveats
Not important
2. Getting started with R
2.1. Typing commands
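To run a command, press Ctrl + Enter (or Command + Enter on a Mac); R then
executes the command
The [1] at the start of the output labels the first element of the answer
(informally: "the answer to the first thing you asked")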
2.1.1. Be very careful to avoid typos
There is no autocorrect in R, so type exactly what you mean
2.1.2. R is (a bit) flexible with spacing
R ignores redundant spacing (exception: do not insert spaces in the middle
of a word)
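A minimal sketch of the spacing rule, using made-up numbers and hypothetical variable names (tight, spaced, wide). All three lines should give the same result, because R ignores extra spaces around operators:

```r
# Illustrative values; the variable names are hypothetical.
tight  <- 10+20               # no spaces around the operator
spaced <- 10 + 20             # the usual, readable style
wide   <-   10    +    20     # redundant spaces are ignored

# All three hold the same value.
print(c(tight, spaced, wide))

# The exception: a space inside a name (for example "spa ced")
# splits it into two words, and R reports an error.
```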
2.2. Doing simple calculations
Addition is an operation and + is an (arithmetic) operator
2.2.1. Adding, subtracting, multiplying and dividing
Arithmetic operations:
Operation        Operator
Addition         +
Subtraction      -
Multiplication   *
Division         /
Power            ^
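As a small illustration of the first four operators in the table, the sketch below stores one result per operator. The numbers and variable names are made up for this example:

```r
# Illustrative values only; one line per operator.
total      <- 10 + 20   # addition
difference <- 10 - 3    # subtraction
product    <- 6 * 7     # multiplication
quotient   <- 20 / 8    # division (the result need not be a whole number)

print(c(total, difference, product, quotient))
```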
2.2.2. Taking powers
Raising x to the nth power: multiplying n copies of the number x together
o x-squared: x ^ 2 OR x ** 2
o x-cubed: x ^ 3 OR x ** 3
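A short sketch of both ways of writing a power, with x set to an arbitrary example value. R treats ** as another spelling of ^, so the two forms should agree:

```r
x <- 5                  # arbitrary example value

x_squared <- x ^ 2      # 5 * 5
x_cubed   <- x ** 3     # 5 * 5 * 5, written with the alternative operator

print(x_squared)
print(x_cubed)
```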
2.2.3. Doing calculations in the right order
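Order of operations: brackets first, then exponents, then multiplication and
division, then addition and subtraction; multiplication/division and
addition/subtraction of equal rank are evaluated from left to right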
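The order of operations can be checked with a few made-up expressions. The variable names are hypothetical and only label which rule each line illustrates:

```r
without_brackets <- 1 + 2 * 4      # multiplication first: 1 + 8
with_brackets    <- (1 + 2) * 4    # brackets first: 3 * 4
power_first      <- 2 * 3 ^ 2      # exponent first: 2 * 9
left_to_right    <- 10 - 4 + 3     # equal rank, left to right: 6 + 3

print(c(without_brackets, with_brackets, power_first, left_to_right))
```

When in doubt, adding brackets makes the intended order explicit.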
2.3. Storing a number as a variable
Variable: a label for one or more pieces of information, for example ‘sales’
Value: the information stored under that label, for example ‘350’
Assignment operator: ‘<-’ (assigns to the name on its left), ‘->’ (assigns
to the name on its right) or ‘=’ (assigns to the name on its left)
o Do not insert spaces in the middle of an operator (write ‘<-’, not ‘< -’)
o Assigning a new value to an existing variable overwrites the old value
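A minimal sketch of the three assignment forms, using the ‘sales’ and ‘350’ example from the notes. The names sales_right and sales_eq, and the value 400, are hypothetical and only there to keep the three forms apart:

```r
sales <- 350          # left form: the value goes into the name on the left
350 -> sales_right    # right form: the value goes into the name on the right
sales_eq = 350        # equals sign: behaves like the left form here

sales <- 400          # assigning again overwrites the old value of sales

print(sales)
```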
2.4. Doing calculations using variables
R can calculate with variables, and the result can be stored in a new variable
or used to overwrite an existing one.
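As a sketch of calculating with variables: sales and 350 come from the notes, while price, revenue and the other numbers are assumptions made up for this example:

```r
sales <- 350                  # value from the example in 2.3
price <- 7                    # hypothetical price per unit

revenue <- sales * price      # calculate with variables, store the result
revenue <- revenue + 550      # overwrite revenue using its own old value

print(revenue)
```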
2.5. Storing many numbers as a vector
Vector: variable that can store multiple values
2.5.1. Creating a vector
c( ): the combine function; it takes a comma-separated list of values and
returns them as one vector
For example, sales.by.month is a vector of 12 elements
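A minimal sketch of building such a vector with c( ). The twelve monthly values are placeholders chosen for illustration, not data from the notes:

```r
# One placeholder value per month, January to December.
sales.by.month <- c(0, 100, 200, 50, 0, 0, 0, 0, 0, 0, 0, 0)

print(sales.by.month)
print(length(sales.by.month))   # number of elements in the vector
```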
2.5.2. Getting information out of a vector
Square brackets [ ] extract elements from a vector, for example:
sales.by.month[2] returns the value for February. The output is still labelled
[1], because the result is itself a vector with a single (first) element
We can use this to create new variables
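The indexing step can be sketched as follows. The vector holds the same kind of placeholder values as before, and february.sales is a hypothetical name for the new variable:

```r
# Placeholder monthly values; the second element stands for February.
sales.by.month <- c(0, 100, 200, 50, 0, 0, 0, 0, 0, 0, 0, 0)

february.sales <- sales.by.month[2]   # extract element 2 into a new variable

print(february.sales)                 # [1] 100
```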
2.5.3. Altering the elements of a vector
To change the elements of a vector you can assign the whole vector again
with the combine function OR