Practical 1
Introduction
1) Basic Syntax of R
In general: don’t type directly at the prompt, but save/edit your commands in an editor (text file)
R can calculate → when you have a certain number, you can assign it to an object
• e.g.: 1+1 → assign it to an object: a<-1+1 OR a<-2
• then you can calculate with a → a+1=3
• Or you can make it more difficult and calculate with the logarithm of a → log(a) = 0.693 (log is the
natural logarithm; the base-10 logarithm is log10(a) = 0.301)
In R, the logarithm is a function → functions are the objects that calculate, plot, and perform statistical analyses. ‘a’
is here the input for the log function, and in that context, the input of an R function is referred to as an
argument → a is the argument of the function log
• Functions in R are always written with brackets
• Here, log is given only 1 argument, but functions can take many more arguments
• seq generates a sequence of numbers → in the example myX<-seq(from=-3,to=+3,by=0.5), we have passed
3 arguments to the seq function
– from: it tells R where to start
– to: it tells R where to stop
– by: how large the steps should be
• If a function has multiple arguments, these need to be separated by a comma.
Decimal numbers = point !!
• The output (myX) is again an object → special object = vector = one-dimensional matrix = a
column or a row of numbers
→ you can use this vector as an input for the next function (to calculate for instance the mean
or the sum)
• If you want to generate a plot → you need to specify an x-axis and a y-axis.
– plot(x=myX,y=myY)
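The steps of this section can be tried out with a short script. This is a minimal sketch: the object names a and myX come from the notes, while myY (here simply the square of myX) is an assumption chosen only to have something to plot.

```r
# Assign the result of a calculation to an object
a <- 1 + 1
a + 1                      # calculate with the object

# log() is a function and a is its argument
log(a)                     # natural logarithm
log10(a)                   # base-10 logarithm

# seq() takes several arguments, separated by commas
myX <- seq(from = -3, to = 3, by = 0.5)

# The resulting vector can be the input of the next function
mean(myX)
sum(myX)

# myY is a hypothetical second vector, used only for the plot
myY <- myX^2
plot(x = myX, y = myY)
```

Running the lines one by one shows each result at the prompt; the plot appears in the graphs pane.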
2) Getting help in R
• Google = main source of help
• ? Help function = built in
• Tutorials on the internet
• R-mailing list
• Several blogs (R-bloggers…)
• Twitter/X (#rstats, @RLangTip, @insideR…)
• ChatGPT
Command: myX<-seq(from=-3,to=+3,by=0.5)
Suppose someone asks you to write this command, but you have never worked with R before →
go to Google: “create a sequence of numbers in R” → you will find a typical help page (there is one for
each function in R)
• https://www.math.ucla.edu/~anderson/rw1001/library/base/html/seq.html
You can also ask for help directly in R → type ? followed by the name of the function, e.g. ?seq
→ you can see that there are also other arguments which aren’t mandatory
→ an example that the help page gives: seq(0,1,length.out=11) → what does this mean?
• This generates a sequence from 0 to 1, with a total of 11 numbers/entries
• 0,1 are no longer named, but they are the same as from=0,to=1 → there is a default order in
the arguments, so if you don’t tell R what the arguments mean, R assumes that the order is
the same as the default (so the first number will be ‘from’, and the second will be ‘to’)
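The difference between named and positional arguments can be checked directly. This is a small sketch; the object names s1, s2 and s3 are hypothetical and only serve the comparison.

```r
# Named arguments: the order in which they are written does not matter
s1 <- seq(from = 0, to = 1, length.out = 11)
s2 <- seq(to = 1, from = 0, length.out = 11)

# Unnamed arguments are matched by position: first 'from', then 'to'
s3 <- seq(0, 1, length.out = 11)

# All three calls describe the same sequence of 11 numbers
all.equal(s1, s2)
all.equal(s1, s3)
length(s3)

# args() shows the arguments of a function in their default order
args(seq.default)
```

Naming the arguments is more typing, but it makes a script easier to read back later.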
3) RStudio
• In R, you rarely
– Work at the prompt
o For more serious analysis, you write down the R commands in a script
– Save the output of an analysis
• Generate an R-script file with commands
– Add comments
– Save for later use (the file/script with the commands)
• RStudio = dedicated R editor with
– Syntax highlighting
– Auto-completion
– See workspace, graphs, help…
– 4 parts
o Script = text file where you write down the commands
o Prompt = enter commands and they are executed
o Workspace
o Various = including graphs and help
When you open RStudio for the first time, you will not have the script screen (you will have 1 large
screen on the left)
→ go to: File → New File → R Script
= you open a blank file → here, you will write your script
The result will appear at the prompt (below the script screen)
You can add comments (they will have a different color) → preceded by ‘#’
When you assign a value to an object in your script, the object will appear in the workspace = list of all the objects
that are currently stored in R’s memory
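What the workspace pane shows can also be inspected from a script. A minimal sketch, in which myMean is a hypothetical object name:

```r
# Lines starting with '#' are comments; R does not execute them
myX <- seq(from = -3, to = 3, by = 0.5)   # creates the object myX
myMean <- mean(myX)                       # hypothetical second object

# ls() lists the objects currently stored in R's memory,
# i.e. what RStudio displays in the workspace pane
ls()

# rm() removes an object from the workspace again
rm(myMean)
ls()
```

After rm(myMean), the object also disappears from the workspace pane.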
4) The working directory
→ whenever you are doing a complicated analysis with several input files, more than 1 output file,
and some text files that contain your R-script, it is a good idea to put them into the same directory
(same folder)
• One folder for each analysis
– Containing input and output files
• Set working directory at start of your analysis
• Advantages
– All files from 1 project grouped
– No need to specify full path when reading in/writing out files
• Specify the working directory
– Via menu
– Via script
• Functions
– getwd() # current working dir
– setwd("c:/myDir/...") # change working dir
→ PC CLASS → DON’T PUT THE WORKING DIRECTORY ON THE DESKTOP
→ getwd = NO arguments → at the prompt, you get a result within quotes = text string (all text strings in R
are quoted)
→ if you are not satisfied with your working directory, you can change it → setwd("…") → you have to
specify where the working directory lies
• !!!!!! In Windows, folders are separated by a backslash = \ → in R/RStudio, this needs to be a
forward slash = /
– You can manually change this, OR → Ctrl + F = find and replace
• You still need to run the script to change the working directory
→ there is another way to change the working directory
• Go to the upper row (menu bar) → File, Edit, Code, View …
• Go to Session → Set Working Directory → Choose Directory → you can navigate to the folder as in
Windows Explorer
• Working with the working directory
– list.files(getwd()) # files in current dir
– read.table("myInput.txt") # no need to specify the entire path!
– write.table(myData, "myOutput.txt") # myData = placeholder for the object you want to write
• list.files: lists all the files in the folder given within the brackets → if you combine this with getwd(), you will
get a list of all the files currently present in the working directory
• read.table: if you have set a working directory, and the input file is in that working
directory, the only thing that you need to supply is the name of that input file
→ if you don’t set the working directory, you need to supply the entire path within the
brackets
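The working-directory functions above can be combined as follows. This is a sketch under assumptions: a temporary folder stands in for the analysis folder, and myDir, oldDir, myData and myInput are hypothetical names.

```r
# A temporary folder stands in for the analysis folder (assumption)
myDir <- file.path(tempdir(), "myAnalysis")
dir.create(myDir, showWarnings = FALSE)

oldDir <- getwd()   # getwd() takes no arguments and returns a text string
setwd(myDir)        # on Windows, write paths with forward slashes

# Write a small hypothetical data frame to the working directory
myData <- data.frame(id = 1:3, value = c(0.5, 1.5, 2.5))
write.table(myData, "myOutput.txt", row.names = FALSE)

# The file name alone is enough, because the file is in the working directory
list.files(getwd())
myInput <- read.table("myOutput.txt", header = TRUE)

setwd(oldDir)       # restore the previous working directory
```

In a real analysis, setwd() would point to the project folder at the top of the script, and the input files would already be there.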
Online material data reading, handling and plotting
Importing data part 1
Scanning your file
• Header line
– The line with the variable names should not contain special characters (# - & % ? @ ;)
o Each variable name should start with a letter and should only contain letters,
numbers and underscores (_)
– Each column should have a header
• Cells with formula
– Replace them by their values: Copy, Paste Special, Values
• Empty cells → is this missing data? What do you do with that?
• Mixing numeric and character data
Other things you need to be aware of → they are not problematic
• Missing value indicator → which symbol or character has been used as missing value indicator
(e.g. NA, ?)
• How are decimal numbers shown: with a comma or with a point?
• Delimiters → which character is separating the columns → whitespace?
– When there is already whitespace inside the cells, you must separate the columns by
a tab
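The points of this checklist correspond to arguments of read.table. The sketch below builds a small hypothetical tab-delimited file (the column names, values and the "?" missing value indicator are invented for illustration) and reads it back in.

```r
# Hypothetical input file: tab-delimited, decimal commas, "?" for missing values
myFile <- tempfile(fileext = ".txt")
writeLines(c("sample_id\tbody_weight\tgroup",
             "s1\t70,5\tcontrol group",
             "s2\t?\ttreated group",
             "s3\t65,2\tcontrol group"),
           myFile)

myData <- read.table(myFile,
                     header = TRUE,      # first line holds the variable names
                     sep = "\t",         # tab as delimiter (cells contain spaces)
                     dec = ",",          # decimal numbers are written with a comma
                     na.strings = "?")   # symbol used as missing value indicator

# Check that each column was read with the expected type
str(myData)
```

If body_weight shows up as character instead of numeric in str(), the decimal separator or the missing value indicator was probably not specified correctly.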