
Summary Practicals Data Mining

Pages: 118
Published: 03-05-2025
Written in: 2024/2025

All practicals of the data mining course, fully detailed (achieved 10/10 on the practical test)
File updated on: 25 May 2025

Content preview

Practical 1
Introduction
1) Basic Syntax of R
In general: don’t type directly at the prompt, but save/edit your commands in an editor (text file)

R works as a calculator → the result of a calculation can be assigned to an object
• e.g. 1+1 → assign it to an object: a<-1+1 OR a<-2
• Then you can calculate with a → a+1 = 3
• Or you can make it more difficult and calculate with the logarithm of a → log(a) = 0.693 (log() is the natural logarithm; the base-10 logarithm is log10(a) = 0.301)

In R, the logarithm is a function. Functions are the objects that calculate, plot, and perform statistical operations. Here 'a' is the input of the log function. The input of an R function is called an
argument → a is the argument of the function log
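
The assignment and function-call steps above can be tried with a minimal sketch like the one below. Note that log() in R is the natural logarithm, so the value 0.301 corresponds to log10(); the approximate values in the comments are rounded.

```r
# Assign the result of a calculation to an object called a
a <- 1 + 1

# Use the object in a further calculation
a + 1      # 3

# log() is a function and a is its argument
log(a)     # natural logarithm of 2, about 0.693
log10(a)   # base-10 logarithm of 2, about 0.301
```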




• Functions in R are always written with brackets
• Here, log has only 1 argument, but sometimes functions take many more arguments




• seq generates a sequence of numbers → in this example, we have passed 3 arguments to the
seq function
– from: tells R where to start
– to: tells R where to stop
– by: how large the steps should be
• If a function has multiple arguments, these need to be separated by a comma.
Decimal numbers are written with a point, not a comma!
• The output (myX) is again an object → a special object = vector = one-dimensional matrix = a
column or a row of numbers
→ you can use this vector as input for the next function (for instance, to calculate the mean
or the sum)
• If you want to generate a plot → you always need to specify an X-axis and a Y-axis.
– plot(x=myX,y=myY)
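
As a rough illustration of the points above, the sketch below builds the vector myX with seq, feeds it to mean and sum, and plots it. The vector myY is not defined in these notes, so the squared values used here are only an assumed example.

```r
# Sequence from -3 to 3 in steps of 0.5; arguments are separated by commas
myX <- seq(from = -3, to = 3, by = 0.5)

# The resulting vector can be the input of the next function
length(myX)   # 13 numbers
mean(myX)     # 0
sum(myX)      # 0

# myY is a hypothetical second vector, chosen only for illustration
myY <- myX^2

# A plot needs both an x and a y
plot(x = myX, y = myY)
```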

2) Getting help in R
• Google = main source of help
• ? help function = built in
• Tutorials on the internet
• R mailing list
• Several blogs (R-bloggers…)
• Twitter/X (#rstats, @RLangTip, @insideR…)
• ChatGPT

Formula: myX<-seq(from=-3,to=+3,by=0.5)
Suppose someone asks you to write this command, but you have never worked with R before →
go to Google: "create a sequence of numbers in R" → you will find a typical help page (there is one for
each function in R)
• https://www.math.ucla.edu/~anderson/rw1001/library/base/html/seq.html




You can also ask for help directly in R → type ? followed by the name of the function, e.g. ?seq
→ you can see that there are also other arguments which aren't mandatory
→ an example that the help page gives: seq(0,1,length.out=11) → what does this mean?
• This generates a sequence from 0 to 1, with a total of 11 numbers/entries
• 0 and 1 are no longer named, but they mean the same as from=0,to=1 → the arguments have a default
order, so if you don't tell R what the arguments mean, R assumes that they are given in
that default order (so the first number will be 'from', and the second will be 'to')
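
The default argument order described above can be checked with a small sketch; the object names by_position, by_name and swapped are introduced here only for illustration.

```r
# Unnamed arguments are matched by position: first 'from', then 'to'
by_position <- seq(0, 1, length.out = 11)

# The same call with every argument named
by_name <- seq(from = 0, to = 1, length.out = 11)

# Both calls give the same 11 numbers
identical(by_position, by_name)
length(by_position)

# Named arguments may be given in any order
swapped <- seq(to = 1, from = 0, length.out = 11)
identical(by_position, swapped)
```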

3) RStudio
• In R, you rarely
– Work at the prompt
o For a more serious analysis, you write the R commands in a script
– Save the output of an analysis
• Generate an R script file with commands
– Add comments
– Save for later use (the file/script with the commands)
• RStudio = dedicated R editor with
– Syntax highlighting
– Auto completion
– See workspace, graphs, help…

– 4 parts
o Script = text file where you write down the commands
o Prompt = enter commands and they are executed
o Workspace
o Various = including graphs and help

When you open RStudio for the first time, you will
not have the script screen (you will have 1 large
screen on the left)
→ go to: File → New File → R Script

This opens a blank file → here, you will write your script
The result will appear at the prompt (below the script screen)

You can add comments (shown in a different color) → preceded by '#'




When your script assigns a value to an object, the object will appear in the workspace = list of all the objects
that are currently stored in R's memory
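
A minimal sketch of a commented script line and the workspace, assuming a hypothetical object b. The function ls() lists, as text, the objects that the workspace panel shows.

```r
# This whole line is a comment: R ignores everything after '#'
b <- 5   # hypothetical object; a comment can also follow a command

# ls() returns the names of the objects currently stored in memory
in_workspace <- "b" %in% ls()
in_workspace
```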

4) The working directory
→ Whenever you are doing a complicated analysis with several input files, more than 1 output file,
and some text files that contain your R script, it is a good idea to put them all in the same directory
(same folder)
• One folder for each analysis
– Containing input and output files
• Set the working directory at the start of your analysis
• Advantages
– All files from 1 project are grouped
– No need to specify the full path when reading in/writing out files
• Specify the working directory
– Via the menu
– Via a script
• Functions
– getwd() # current working dir
– setwd("c:/myDir/...") # change working dir
→ PC class → don't put the working directory on the desktop

→ getwd() takes no arguments → at the prompt, you get the path within quotes = a text string (all text strings in R
are quoted)
→ If you are not satisfied with your working directory, you can change it → setwd("…") → you have to
specify where the working directory lies
• !!! In Windows, folders are separated by a backslash = \ → in R, this needs to be a
forward slash = / (or a doubled backslash = \\)
– You can change this manually, OR → Ctrl + F = find and replace
• You still need to run the script to change the working directory
→ There is another way to change the working directory
• Go to the menu bar at the top → File, Edit, Code, View …
• Go to Session → Set Working Directory → Choose Directory → you can then navigate to the folder in Windows
Explorer
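
The script route for changing the working directory can be sketched as follows. The Windows path is a made-up example, and the temporary folder from tempdir() stands in for a real project folder so that the sketch runs anywhere.

```r
# Remember the current working directory (a quoted text string)
old_dir <- getwd()

# Hypothetical Windows path; inside an R string a backslash is typed as \\
win_path <- "c:\\myDir\\analysis1"

# Replace every backslash by a forward slash
r_path <- gsub("\\", "/", win_path, fixed = TRUE)
r_path   # "c:/myDir/analysis1"

# Change the working directory; tempdir() is used only for this demo
setwd(tempdir())
getwd()

# Go back to where we started
setwd(old_dir)
```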




• Working with the working directory
– list.files(getwd()) # files in current dir
– read.table("myInput.txt") # no need to specify the entire path!
– write.table(myData, "myOutput.txt") # myData = placeholder for the object you want to write

• list.files: lists all the files in the directory given within the brackets → if you combine this with getwd(), you will
get a list of all the files currently present in the working directory
• read.table: if you have set a working directory, and the input file is in that working
directory, the only thing that you need to supply is the name of that input file
→ if the file is not in the working directory, you need to supply the entire path within the
brackets
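
A small round trip shows how file names alone are enough once the working directory is set. The data frame myData and its columns are invented for this sketch, and tempdir() stands in for a project folder.

```r
# Use a temporary folder as working directory; setwd() returns the old one
old_dir <- setwd(tempdir())

# Hypothetical data to write out
myData <- data.frame(id = 1:3, score = c(4.5, 3.2, 5.0))

# Only the file name is needed, not the entire path
write.table(myData, "myOutput.txt", row.names = FALSE)

# The new file shows up in the listing of the working directory
found <- "myOutput.txt" %in% list.files(getwd())
found

# Read it back, again by file name only
readBack <- read.table("myOutput.txt", header = TRUE)
readBack

# Restore the original working directory
setwd(old_dir)
```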

Online material: data reading, handling and plotting
Importing data part 1
Scanning your file
• Header line
– The line with the variable names should not contain special characters
(# - & % ? @ ;)
o Each variable name should start with a letter and should contain only letters,
numbers and underscores (_)
– Each column should have a header
• Cells with formulas
– Copy, Paste Special, Values
• Empty cells → is this missing data? What do you do with it?
• Mixing numeric and character data
Other things you need to be aware of → they are not problematic
• Missing value indicator → which symbol or character has been used as the missing value indicator
(e.g. NA, ?)
• How are decimal numbers shown → comma or point?
• Delimiters → which character separates the columns → whitespace?
– When cells already contain whitespace, you must separate the columns by
a tab
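
These three properties map onto arguments of read.table. The sketch below assumes a small made-up file that is tab-separated, uses a decimal comma, and marks missing values with "?".

```r
# Write a small hypothetical input file to a temporary location
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "name\tweight\tgroup",
  "Ann Lee\t61,5\tA",
  "Bob Ray\t?\tB",
  "Cy Dunn\t70,2\tA"
), tmp)

# sep: column delimiter, dec: decimal symbol, na.strings: missing value indicator
myInput <- read.table(tmp, header = TRUE, sep = "\t",
                      dec = ",", na.strings = "?")

# weight is read as a number, and "?" becomes NA
str(myInput)
```

Because the names contain a space, a tab is used as the delimiter here. With whitespace as the separator, "Ann Lee" would be split over two columns.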
goormansamber1 Universiteit Antwerpen