
Summary Practicals Data Mining

Rating: -
Sold: 8
Pages: 118
Uploaded on: 03-05-2025
Written in: 2024/2025

All practicals of the data mining course fully detailed (achieved 10/10 on the practical test)

Document information

File last updated on: 25 May 2025
Type: Summary

Practical 1
Introduction
1) Basic Syntax of R
In general: don't type directly at the prompt, but save/edit your commands in an editor (text file)

R can calculate → when you have a certain number, you can assign it to an object
• e.g.: 1+1 → assign it to an object: a<-1+1 OR a<-2
• then you can calculate with a → a+1 = 3
• Or you can make it more difficult and calculate with the logarithm of a → log(a) = 0.693 (log is the natural logarithm in R; the base-10 logarithm is log10(a) = 0.301)
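The assignment and calculation steps above can be tried out as a short script. This is only a minimal sketch; the object name b is an assumption added for illustration. Note that log() in R is the natural logarithm, while log10() gives the base-10 logarithm.

```r
# Assign the result of a calculation to an object called a
a <- 1 + 1

# Use the object in a further calculation (b is a hypothetical name)
b <- a + 1
print(b)

# log() is the natural logarithm; log10() is the base-10 logarithm
print(log(a))
print(log10(a))
```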

In R, the logarithm is a function → functions are the objects that calculate, plot, and perform statistical operations. 'a'
is here the input for the log function, and in that context, the input of an R function is referred to as an
argument → a is the argument of the function log




• Functions in R are always written with brackets
• Here, log has only 1 argument, but sometimes functions take many more arguments




• seq generates a sequence of numbers → in this example, we have supplied 3 arguments to the
seq function
– from: it tells R where to start
– to: it tells R where to stop
– by: how large the steps should be
• If a function has multiple arguments, these need to be separated by a comma.
Decimal numbers are written with a point, not a comma!
• The output (myX) is again an object → a special object = vector = one-dimensional series of
numbers (think of a single column or row of numbers)
→ you can use this vector as input for the next function (to calculate for instance the mean
or the sum)
• If you want to generate a plot → you always need to supply an X-axis and a Y-axis.
– plot(x=myX,y=myY)
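The seq, vector and plot steps above can be sketched as follows. The notes do not show how myY was created, so the definition of myY here (the square of myX) is purely an assumption for illustration.

```r
# Generate a sequence from -3 to 3 in steps of 0.5; the result is a vector
myX <- seq(from = -3, to = 3, by = 0.5)
print(myX)
print(length(myX))

# A vector can be the input (argument) of another function
print(mean(myX))
print(sum(myX))

# Assumption: myY is not defined in the notes, so we use the square of myX
myY <- myX^2

# A plot needs values for both the X-axis and the Y-axis
plot(x = myX, y = myY)
```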

2) Getting help in R
• Google = main source of help
• ? help function = built in
• Tutorials on the internet
• R-mailing list
• Several blogs (R-bloggers…)
• Twitter/X (#rstats, @RLangTip, @insideR…)
• ChatGPT

Formula: myX<-seq(from=-3,to=+3,by=0.5)
Suppose someone asks you to generate this formula, but you have never worked with R before →
go to Google: "create a sequence of numbers in R" → you will find a typical help page (there is one for
each function in R)
• https://www.math.ucla.edu/~anderson/rw1001/library/base/html/seq.html




You can also ask for help directly in R → you need to type: ? + name of the function, e.g. ?seq
→ you can see that there are also other arguments which aren't mandatory
→ an example that the help page gives: seq(0,1,length.out=11) → what does this mean?
• This generates a sequence from 0 to 1, with a total of 11 numbers/entries
• 0 and 1 are no longer named, but they mean the same as from=0,to=1 → there is a default order in
the arguments, so if you don't tell R what the arguments mean, R assumes that the order is
the same as the default (so the first number will be 'from', and the second will be 'to')
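A minimal sketch of the difference between positional and named arguments, using the seq example from the help page. The object names s1, s2 and s3 are hypothetical.

```r
# Unnamed arguments are matched by their default position: from, then to
s1 <- seq(0, 1, length.out = 11)

# The same call with every argument named
s2 <- seq(from = 0, to = 1, length.out = 11)

# Named arguments may be given in any order
s3 <- seq(length.out = 11, to = 1, from = 0)

print(s1)
print(identical(s1, s2))
print(identical(s1, s3))

# Typing ?seq (or help("seq")) at the prompt opens the help page
```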

3) RStudio
• In R, you rarely
– Work at the prompt
o For a more serious analysis, you write down the R commands in a script
– Save the output of an analysis
• Generate an R-script file with commands
– Add comments
– Save for later use (the file/script with the commands)
• RStudio = dedicated R editor with
– Syntax highlighting
– Auto completion
– See workspace, graphs, help…
– 4 parts
o Script = text file where you write down the commands
o Prompt = enter commands and they are executed
o Workspace
o Various = including graphs and help

When you open RStudio for the first time, you will
not have the script screen (you will have 1 large
screen on the left)
→ go to: File → New File → R Script

= you open a blank file → here, you will write your script
The result will appear at the prompt (below the script screen)

You can add comments (they will have a different color) → preceded by '#'




When you assign a value to an object in your script and run it, the object will appear in the workspace = list of all the objects
that are currently stored in R's memory
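A small sketch of a script with comments, showing how assigned objects end up in the workspace. The object names myValue and myVector are hypothetical; ls() lists the objects currently in memory, which is what the workspace pane in RStudio displays.

```r
# This is a comment: R ignores everything after the # sign
myValue <- 42              # hypothetical object, for illustration only
myVector <- c(1, 2, 3)     # a second hypothetical object

# List the names of all objects currently stored in R's memory
print(ls())

# Check whether a specific object exists in the workspace
print(exists("myValue"))
```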

4) The working directory
→ whenever you are doing a complicated analysis with several input files, more than 1 output file,
and some text files that contain your R script, it is a good idea to put them into the same directory
(same folder)
• One folder for each analysis
– Containing input and output files
• Set working directory at start of your analysis
• Advantages
– All files from 1 project grouped
– No need to specify full path when reading in/writing out files
• Specify the working directory
– Via menu
– Via script
• Functions
– getwd() # current working dir
– setwd("c:/myDir/...") # change working dir
→ PC class: DON'T PUT THE WORKING DIRECTORY ON THE DESKTOP

→ getwd takes NO arguments → at the prompt you get an object within quotes = text string (all text strings in R
are quoted)
→ if you are not satisfied with your working directory, you can change it → setwd("…") → you have to
specify where the working directory is located
• !!!!!! In Windows, folders are separated by a backslash = \ → in RStudio, this needs to be a
forward slash = /
– You can change this manually, OR → Ctrl + F = find and replace
• You still need to run the script to change the working directory
→ there is another way to change the working directory
• Go to the menu bar at the top → File, Edit, Code, View …
• Go to Session → Set Working Directory → Choose Directory → you can then navigate to the folder in
Windows Explorer




• Working with the working directory
– list.files(getwd()) # files in current dir
– read.table("myInput.txt") # no need to specify the entire path!
– write.table(x, "myOutput.txt") # x = the object you want to write out

• list.files: lists all the files in the directory given within the brackets → if you combine this with getwd(), you will
get a list of all the files currently present in the working directory
• read.table: if you have set a working directory, and the input file is in that working
directory, the only thing that you need to supply is the name of that input file
→ if you don't set the working directory, you need to supply the entire path within the
brackets
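The three functions above can be combined in a small round trip. This is a sketch under stated assumptions: the data frame myData and its contents are invented for illustration, and tempdir() stands in for a real project folder.

```r
# Use a temporary folder as working directory; setwd() returns the previous one
oldDir <- setwd(tempdir())

# Hypothetical data, for illustration only
myData <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))

# Write the object to a file: only the file name is needed, not the full path
write.table(myData, file = "myOutput.txt", row.names = FALSE)

# List the files in the current working directory
print(list.files(getwd()))

# Read the file back in, again with just the file name
myInput <- read.table("myOutput.txt", header = TRUE)
print(myInput)

# Restore the original working directory
setwd(oldDir)
```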

Online material data reading, handling and plotting
Importing data part 1
Scanning your file
• Header line
– The line with the variable names should not contain special characters
(# - & % ? @ ;)
o Each variable name should start with a letter and should only contain letters,
numbers and underscores (_)
– Each column should have a header
• Cells with a formula
– Copy, Paste Special, Values
• Empty cells → is this missing data? What do you do with that?
• Mixing numeric and character data

Other things you need to be aware of → they are not problematic
• Missing value indicator → which symbol or character has been used as missing value indicator
(e.g. NA, ?)
• How are decimal numbers shown = comma? point?
• Delimiters → which character separates the columns → whitespace?
– When you already have whitespace inside the cells, you must separate the columns by
a tab
Seller: goormansamber1, Universiteit Antwerpen