Summary Practicals Data Mining

Pages: 118
Uploaded on: 03-05-2025
Written in: 2024/2025

All practicals of the data mining course fully detailed (achieved 10/10 on the practical test)

File last updated on: May 25, 2025

Practical 1
Introduction
1) Basic Syntax of R
In general: don’t type directly at the prompt, but save/edit your commands in an editor (text file)

R can calculate → when you have a certain number, you can assign it to an object
• e.g.: 1+1 → assign it to an object: a<-1+1 OR a<-2
• then you can calculate with a → a+1 = 3
• Or you can make it more difficult and calculate with the logarithm of a → log10(a) = 0.301 (note that log(a) is the natural logarithm in R: log(a) = 0.693)

In R, the logarithm is a function → functions are the objects that calculate, plot, and perform statistical operations. ‘a’ is here the input for the log function, and in that context, the input of an R function is referred to as an argument → a is the argument of the function log




• Functions in R are always written with brackets
• Here, log has only 1 argument, but sometimes functions take many more arguments
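The points above on objects, functions and arguments can be tried out with a few lines. This is a minimal sketch; the `base` argument of `log()` is not mentioned in the notes and is shown only to illustrate a function that takes more than one argument.

```r
# Assign the result of a calculation to an object called a
a <- 1 + 1
a + 1              # 3

# log() is a function and a is its argument
log(a)             # natural logarithm, about 0.693
log10(a)           # base-10 logarithm, about 0.301

# A second, named argument (separated by a comma) changes the base
log(a, base = 2)   # 1
```

Running the lines one by one from a script shows each result at the prompt.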




• seq generates a sequence of numbers → in this example, we have supplied 3 arguments to the seq function
– from: it tells R where to start
– to: it tells R where to stop
– by: how large the steps should be
• If a function has multiple arguments, these need to be separated by a comma.
Decimal numbers are written with a point!
• The output (myX) is again an object → a special object = vector = one-dimensional matrix = a column or a row of numbers
→ you can use this vector as an input for the next function (to calculate for instance the mean or the sum)
• If you want to generate a plot → you always need to supply an x-axis and a y-axis.
– plot(x=myX,y=myY)
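A short sketch of the workflow described above: build a vector with `seq()`, pass it to other functions, and plot it. The notes do not say how `myY` was created, so squaring `myX` here is purely an assumption for illustration.

```r
# Three named arguments, separated by commas; decimals use a point
myX <- seq(from = -3, to = 3, by = 0.5)

length(myX)   # 13 numbers in the vector
mean(myX)     # 0
sum(myX)      # 0

# Assumption: myY is not defined in the notes, so we square myX as an example
myY <- myX^2

# A plot needs values for both the x-axis and the y-axis
plot(x = myX, y = myY)
```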

2) Getting help in R
• Google = main source of help
• ? help function = built in
• Tutorials on the internet
• R mailing list
• Several blogs (R-bloggers…)
• Twitter/X (#rstats, @RLangTip, @insideR…)
• ChatGPT

Formula: myX<-seq(from=-3,to=+3,by=0.5)
Suppose someone asks you to generate this sequence, but you have never worked with R before → go to Google: “create a sequence of numbers in R” → you will find a typical help page (there is one for each function in R)
• https://www.math.ucla.edu/~anderson/rw1001/library/base/html/seq.html




You can also ask for help directly in R → type ? followed by the name of the function, e.g. ?seq
→ you can see that there are also other arguments which aren’t mandatory
→ an example that the help page gives: seq(0,1,length.out=11) → what does this mean?
• This generates a sequence from 0 to 1, with a total of 11 numbers/entries
• 0 and 1 are no longer named, but they are the same as from=0,to=1 → there is a default order in the arguments, so if you don’t name the arguments, R assumes that they are given in the default order (so the first number will be ‘from’, and the second will be ‘to’)
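The default argument order can be checked directly. The sketch below compares the unnamed call from the help page with the fully named version; the object names `s1`, `s2` and `s3` are made up for this example.

```r
# Unnamed arguments are matched by position: first = from, second = to
s1 <- seq(0, 1, length.out = 11)

# The same call with every argument named
s2 <- seq(from = 0, to = 1, length.out = 11)

identical(s1, s2)   # TRUE
length(s1)          # 11

# Named arguments may be given in any order
s3 <- seq(to = 1, length.out = 11, from = 0)
identical(s1, s3)   # TRUE
```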

3) RStudio
• In R, you rarely
– Work at the prompt
o For more serious analysis, you write down the R commands in a script
– Save the output of an analysis
• Generate an R script file with commands
– Add comments
– Save for later use (the file/script with the commands)
• RStudio = dedicated R editor with
– Syntax highlighting
– Auto-completion
– See workspace, graphs, help…
– 4 parts
o Script = text file where you write down the commands
o Prompt = enter commands and they are executed
o Workspace
o Various = including graphs and help

When you open RStudio for the first time, you will not have the script screen (you will have 1 large screen on the left)
→ go to: File → New File → R Script

= you open a blank file → here, you will write your script
The result will appear at the prompt (below the script screen)

You can add comments (they will have a different color) → preceded by ‘#’




When you assign a value to an object in your script and run that line, the object will appear in the workspace = list of all the objects that are currently stored in R’s memory
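The workspace can also be inspected from the script itself. A minimal sketch, using the hypothetical object names `myNumber` and `myText`:

```r
# Lines starting with # are comments; R ignores them when the script is run
myNumber <- 42        # hypothetical object, appears in the workspace
myText <- "hello"     # text strings are written within quotes

ls()                  # names of all objects currently in the workspace

exists("myNumber")    # TRUE
rm(myNumber)          # remove the object from the workspace again
exists("myNumber")    # FALSE
```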

4) The working directory
→ whenever you are doing a complicated analysis with several input files, more than 1 output file, and some text files that contain your R script, it is a good idea to put them into the same directory (same folder)
• One folder for each analysis
– Containing input and output files
• Set working directory at start of your analysis
• Advantages
– All files from 1 project grouped
– No need to specify full path when reading in/writing out files
• Specify the working directory
– Via menu
– Via script
• Functions
– getwd() # current working dir
– setwd("c:/myDir/...") # change working dir
→ PC CLASS → DON’T PUT THE WORKING DIRECTORY ON THE DESKTOP

→ getwd = NO arguments → at the prompt, you get the result within quotes = text string (all text strings in R are quoted)
→ if you are not satisfied with your working directory, you can change it → setwd(“…”) → you have to specify where the working directory lies
• !!!!!! In Windows, folders are separated by a backslash = \ → in R, this needs to be a forward slash = / (or a doubled backslash = \\)
– You can change this manually, OR → Ctrl + F = find and replace
• You still need to run the script to change the working directory

→ there is another way to change the working directory
• Go to the menu bar at the top → File, Edit, Code, View …
• Go to Session → Set Working Directory → Choose Directory → you can navigate to the folder in a file-explorer window
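Setting the working directory via a script could look like the sketch below. It uses `tempdir()` as a stand-in for a real project folder so that it runs on any machine, and the Windows path `c:\myDir\data` is a hypothetical example, not a path from the course.

```r
# getwd() takes no arguments and returns a text string
oldDir <- getwd()
is.character(oldDir)     # TRUE

# Assumption: tempdir() stands in for your own project folder
projectDir <- tempdir()
setwd(projectDir)
getwd()                  # now points to the temporary folder

# A path copied from Windows uses backslashes; inside an R string a single
# backslash has to be typed as \\, so converting to / is usually easier
winPath <- "c:\\myDir\\data"                       # hypothetical path
fixedPath <- gsub("\\", "/", winPath, fixed = TRUE)
fixedPath                # "c:/myDir/data"

# Go back to where we started
setwd(oldDir)
```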




• Working with the working directory
– list.files(getwd()) # files in current dir
– read.table("myInput.txt") # no need to specify the entire path!
– write.table(myData, "myOutput.txt") # myData = placeholder name for the object you want to write

• list.files: lists all the files in the directory given within the brackets → if you combine this with getwd(), you will get a list of all the files currently present in the working directory
• read.table: if you have set a working directory, and the input file is in that working directory, the only thing that you need to supply is the name of that input file
→ if you don’t set the working directory (or the file is somewhere else), you need to supply the entire path within the brackets
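The three functions above can be combined into one round trip: write a file into the working directory, list it, and read it back using only its name. This is a sketch under assumptions: the data frame `myData` and its columns are invented, and `tempdir()` again replaces a real project folder.

```r
# setwd() returns the previous working directory, so we can restore it later
oldDir <- setwd(tempdir())

# Hypothetical data to write out
myData <- data.frame(id = 1:3, score = c(7.5, 8, 9.5))

# write.table() needs the object first, then the file name
write.table(myData, file = "myOutput.txt", row.names = FALSE)

# The new file shows up in the working directory
fileWritten <- "myOutput.txt" %in% list.files(getwd())
fileWritten              # TRUE

# Only the file name is needed, not the entire path
myInput <- read.table("myOutput.txt", header = TRUE)
myInput

# Clean up and restore the original working directory
unlink("myOutput.txt")
setwd(oldDir)
```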

Online material data reading, handling and plotting
Importing data part 1
Scanning your file
• Header line
– Line with the variable names should not have special characters → no special characters in headers (# - & % ? @ ;)
o Each variable name should start with a letter and it should only contain letters, numbers and underscores (_)
– Each column should have a header
• Cells with formula
– Copy, Paste Special, Values
• Empty cells → is this missing data? What do you do with that?
• Mixing numeric and character data

Other things you need to be aware of → they are not problematic
• Missing value indicator → which symbol or character has been used as missing value indicator (e.g. NA, ?)
• How are decimal numbers shown = comma? point?
• Delimiters → which character is separating the columns → whitespace?
– When you already have whitespace inside the cells, you must separate the columns by a tab
Seller: goormansamber1, Universiteit Antwerpen