Class notes

Data Analytics (ITNPBD6) Class Notes

Pages: 74
Published: 12-07-2021
Written in: 2020/2021

Handwritten notes for the Data Analytics (ITNPBD6) course at the University of Stirling. I obtained a high 1st in the module. The notes cover all the basic Machine Learning principles and techniques. The outline is as follows:
Chapter 1: Introduction to Data Analytics
Chapter 2: Model Accuracy
Chapter 3: Classification
Chapter 4: Regression
Chapter 5: Clustering
Chapter 6: Neural Networks and Deep Learning
Chapter 7: Metaheuristics, hyper-parameters and dimensionality reduction
Chapter 8: Natural Language Processing
Chapter 9: Visualization, ethics, trust and explainability
The notes are 74 pages in total and contain graphs and figures.


Document information

Type: Class notes
Professor(s): Multiple
Contains: All classes

Content preview

ITNPBD6




DATA ANALYTICS

CHAPTER 1

INTRODUCTION TO DATA ANALYTICS

Objectives:

• Describe CRISP-DM and how it can be applied to real-world problems
• Recognise the differences between variable types
• Discuss the differences between continuous and discrete distributions
• Identify the need for data cleaning
• Load a dataset and use visualisations to clean the data in both Orange and Python


1.1 Data Analysis
1.1.1 Model
In data analysis, the approach is driven by learning something about data that would have been hard or even impossible to write computer code for by hand.




The knowledge learnt is then embedded in what is called a model, a general framework capable of performing a particular task. Typically, the model will take in data points and output predictions or estimates. It has a number of parameters that are determined as part of the learning process and is a representation of what has been learned about a data set.




The functionality of the model is determined by the data and not by pre-programmed rules.




• Data mining: process of learning patterns, making predictions and building the model.
• Hyper-parameters: settings that control how the model learns and operates.
• Learning / training: process by which a model's parameters are determined.
• Inference: process of providing previously unseen data to a trained model and making predictions or estimates about them.
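The four terms above can be made concrete with a small sketch. It is an illustration only and is not taken from the notes: the linear model, the gradient-descent training loop, the toy data and the names `train`, `predict`, `learning_rate` and `epochs` are all assumptions chosen for brevity.

```python
# Illustrative sketch: a one-input linear model y = w * x + b.
# Hyper-parameters (chosen by us, not learned): learning_rate, epochs.
# Parameters (determined by training): w, b.

def train(xs, ys, learning_rate=0.05, epochs=2000):
    """Learning/training: determine w and b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def predict(params, x):
    """Inference: apply the trained model to a previously unseen input."""
    w, b = params
    return w * x + b

# Toy training data generated from y = 2x + 1 (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

params = train(xs, ys)
print(predict(params, 10.0))  # close to 21 for this toy data
```

Changing `learning_rate` or `epochs` changes how the model learns, while `w` and `b` are only ever set by the training data.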

1.1.2 Data
Data is the raw material used for machine learning, consisting of a set of variables. Each variable can take a range of values known as its domain.




Water volume = (?) minutes

Here "Water volume" and "minutes" are variables, and "(?)" is a parameter.
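As a hedged illustration of how such a parameter might be determined from data, the sketch below assumes that the unknown parameter multiplies the minutes (a constant flow rate), which the notes do not state explicitly. The measurements are made up and the name `estimate_rate` is hypothetical.

```python
def estimate_rate(minutes, volumes):
    """Least-squares estimate of `rate` in volume = rate * minutes (line through the origin)."""
    numerator = sum(m * v for m, v in zip(minutes, volumes))
    denominator = sum(m * m for m in minutes)
    return numerator / denominator

# Made-up observations: minutes the tap was open, litres collected.
minutes = [1.0, 2.0, 3.0, 4.0]
volumes = [2.1, 3.9, 6.2, 7.8]

rate = estimate_rate(minutes, volumes)
print(f"estimated parameter: {rate:.2f} litres per minute")
```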



The data in question is a snapshot of the real world, and data mining assumes that whatever produced the data will continue to produce it in the same way in the future.




We might encounter problems with this approach, as the data we're provided with might:

• have errors
• be incorrect
• be missing parts
• be insufficient in quantity
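A minimal sketch of how some of these problems could be flagged before modelling. The records, the variable names, the valid domain for `age` and the minimum row count are all invented for illustration; real checks depend on the data set.

```python
# Made-up records; None marks a missing value.
rows = [
    {"age": 34, "height_cm": 172.0},
    {"age": None, "height_cm": 165.5},   # missing part
    {"age": 290, "height_cm": 180.2},    # value outside the plausible domain
    {"age": 51, "height_cm": 158.9},
]

AGE_DOMAIN = (0, 120)   # assumed valid range for this variable
MIN_ROWS = 100          # assumed minimum quantity for this task

def find_missing(rows):
    """Return (row index, variable) pairs where a value is missing."""
    return [(i, k) for i, row in enumerate(rows) for k, v in row.items() if v is None]

def find_out_of_domain(rows, variable, domain):
    """Return indices of rows whose value for `variable` lies outside `domain`."""
    low, high = domain
    return [i for i, row in enumerate(rows)
            if row[variable] is not None and not (low <= row[variable] <= high)]

print("missing:", find_missing(rows))
print("out of domain:", find_out_of_domain(rows, "age", AGE_DOMAIN))
print("too few rows:", len(rows) < MIN_ROWS)
```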



A collection of data, known as a data set, contains a set of values for a number of variables. It is often represented in tabular format, in which one row is a single data point (or instance) and is made up of a value for each of the variables in the table. A column of the table corresponds to a single variable.
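The tabular layout can be sketched in plain Python as follows. The variables and values are made up for illustration.

```python
# Column names are the variables; each inner list is one data point (instance).
variables = ["minutes", "water_volume", "tap"]
dataset = [
    [1.0, 2.1, "kitchen"],
    [2.0, 3.9, "kitchen"],
    [3.0, 6.2, "garden"],
]

# A row: one value for each variable.
first_instance = dict(zip(variables, dataset[0]))

# A column: every value observed for a single variable.
def column(name):
    index = variables.index(name)
    return [row[index] for row in dataset]

print(first_instance)
print(column("tap"))
```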




1.1.3 Supervised vs. Unsupervised Learning
In supervised learning, the data the model is trying to learn from is marked with the correct values, and it can be used to test the quality of a model.




It involves data that describes both the inputs and outputs to the system and requires a mapping to be learned from the inputs to the outputs.




In unsupervised learning there is no existing set of clusters to
compare against .




It involves only the inputs and requires the algorithm to
organize and characterize the data in some way .

1.1.4 Tasks performed with Data Mining
SUPERVISED LEARNING



Classification:

An input pattern is classified as belonging to one of a number of possible classes. The output variable is nominal and the inputs can be a mix of numerical and nominal.
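A minimal sketch of a classifier, assuming a 1-nearest-neighbour rule with Euclidean distance on numeric inputs. The notes do not prescribe this technique; it is used here only because it is short. The training data and the name `classify` are made up.

```python
import math

def classify(training_data, pattern):
    """Return the class label of the training point nearest to `pattern`."""
    _, nearest_label = min(
        training_data, key=lambda pair: math.dist(pair[0], pattern)
    )
    return nearest_label

# Made-up training data: (numeric inputs, nominal class label).
training_data = [
    ((1.0, 1.2), "small"),
    ((0.8, 1.0), "small"),
    ((4.0, 4.5), "large"),
    ((4.2, 3.9), "large"),
]

print(classify(training_data, (1.1, 0.9)))  # nearest training points are labelled "small"
print(classify(training_data, (3.8, 4.1)))  # nearest training points are labelled "large"
```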





Prediction / Regression:

A continuous output value is calculated from an input pattern. The learning task is to find the relationship between the input variables and one or more output variables. The inputs can be a mixture of numeric and nominal variables, but the output of a regression task is always numeric.
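As a sketch of a regression task, the example below uses k-nearest-neighbour averaging on a single numeric input. This technique is an assumption made for brevity, not the method given in the notes, and the data, the name `knn_regress` and the default `k` are invented.

```python
def knn_regress(training_data, x, k=2):
    """Predict a numeric output as the mean output of the k nearest training inputs."""
    nearest = sorted(training_data, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

# Made-up training data: (numeric input, numeric output).
training_data = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0)]

print(knn_regress(training_data, 2.4))  # mean of the outputs at x = 2.0 and x = 3.0
```

Unlike the classifier, the value returned here is a number on a continuous scale rather than a class label.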




UNSUPERVISED LEARNING



Clustering:

Data points that are close to each other, by some distance metric, are assigned to one of a number of clusters so that members of different clusters are far apart.

The input variables can be numeric or nominal.

Clustering is similar to classification, except that the class labels are not given by the training data but are inferred from the distribution of points in the input data.
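One way to sketch clustering is a small k-means loop on one-dimensional data. The choice of k-means, the starting centres, the data and the name `k_means_1d` are assumptions for illustration only.

```python
def k_means_1d(points, centres, iterations=10):
    """Assign each point to its nearest centre, then move each centre to the mean of its cluster."""
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        # Keep the old centre if a cluster ends up empty.
        centres = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
    return centres, clusters

# Made-up inputs with no class labels.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]

centres, clusters = k_means_1d(points, centres=[0.0, 5.0])
print(centres)
print(clusters)
```

No labels are supplied: the two groups emerge only from how the points are distributed.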





Novelty detection:

Requires the system to spot patterns of data that have not been seen before. There is no output variable in the training data, but the resulting system will have a binary output that classifies each input pattern as novel or not.
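A hedged sketch of novelty detection, assuming a simple rule: a pattern is novel if it lies farther than a chosen threshold from every training input. The rule, the threshold value, the data and the name `is_novel` are all assumptions for illustration.

```python
import math

def is_novel(training_inputs, pattern, threshold):
    """Binary output: True if `pattern` is farther than `threshold` from every training input."""
    nearest_distance = min(math.dist(seen, pattern) for seen in training_inputs)
    return nearest_distance > threshold

# Made-up training inputs; note there is no output variable.
training_inputs = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2)]

print(is_novel(training_inputs, (1.0, 1.1), threshold=0.5))  # False: close to data already seen
print(is_novel(training_inputs, (5.0, 5.0), threshold=0.5))  # True: far from all training inputs
```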





Probability distribution estimation:

Build a model that takes a single data point as input and produces an estimate of the density of the population data at that point.

A model is built from the data in the form of a function from the inputs X to a probability estimate, p(X), which is not known and must be inferred.
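As one possible illustration, p(X) can be estimated with a Gaussian kernel density estimate. The choice of kernel, the bandwidth, the sample and the name `density_estimate` are assumptions and do not come from the notes.

```python
import math

def density_estimate(sample, x, bandwidth=0.5):
    """Gaussian kernel density estimate of p(x) built from `sample`."""
    total = 0.0
    for s in sample:
        z = (x - s) / bandwidth
        total += math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    return total / (len(sample) * bandwidth)

# Made-up one-dimensional sample from the population.
sample = [1.0, 1.2, 0.8, 1.1, 3.0]

print(density_estimate(sample, 1.0))  # relatively high: most of the sample lies near 1
print(density_estimate(sample, 5.0))  # close to zero: no data nearby
```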
Seller: clacc (The University of Stirling)