Class notes

Data Analytics (ITNPBD6) Class Notes

Rating: -
Sold: 1
Pages: 74
Uploaded on: 12-07-2021
Written in: 2020/2021

Handwritten notes for the Data Analytics (ITNPBD6) course at the University of Stirling. I obtained a high 1st in the module. The notes cover all the basic Machine Learning principles and techniques. The outline is as follows:

Chapter 1: Introduction to Data Analytics
Chapter 2: Model Accuracy
Chapter 3: Classification
Chapter 4: Regression
Chapter 5: Clustering
Chapter 6: Neural Networks and Deep Learning
Chapter 7: Metaheuristics, hyper-parameters and dimensionality reduction
Chapter 8: Natural Language Processing
Chapter 9: Visualization, ethics, trust and explainability

The notes are 74 pages in total and contain graphs and figures.



Document information

Type: Class notes
Professor(s): Multiple
Contains: All classes

Content preview

ITNPBD6

DATA ANALYTICS

CHAPTER 1

INTRODUCTION TO DATA ANALYTICS

Objectives:

• Describe CRISP-DM and how it can be applied to real-world problems
• Recognise the differences between variable types
• Discuss the differences between continuous and discrete distributions
• Identify the need for data cleaning
• Load a dataset and use visualisations to clean the data in both Orange and Python


1.1 Data Analysis
1.1.1 Model
In data analysis, the approach is driven by learning something about data that would have been hard or even impossible to write computer code for by hand.

The knowledge learnt is then embedded in what is called a model, a general framework capable of performing a particular task. Typically, the model will take in data points and output predictions or estimates. It has a number of parameters that are determined as part of the learning process, and it is a representation of what has been learned about a data set.

The functionality of the model is determined by the data and not by pre-programmed rules.




• Data mining: process of learning patterns, making predictions and building the model.
• Hyper-parameters: settings that control how the model learns and operates.
• Learning / training: process by which a model's parameters are determined.
• Inference: process of providing previously unseen data to a trained model and making predictions or estimates about them.
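The terms above can be tied together in a minimal sketch. It assumes a one-parameter model `y = k * x` trained by gradient descent; the data values, the function names `train` and `predict`, and the choice of `learning_rate` and `n_epochs` as hyper-parameters are illustrative assumptions, not something taken from the notes.

```python
# Toy data: outputs are roughly 3 times the inputs (made-up numbers).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 5.9, 9.2, 11.8]

def train(xs, ys, learning_rate=0.01, n_epochs=500):
    """Learning/training: determine the model's parameter k from data.

    learning_rate and n_epochs are hyper-parameters: they control how
    the model learns, but are not themselves learned.
    """
    k = 0.0  # the single parameter of the model y = k * x
    n = len(xs)
    for _ in range(n_epochs):
        # Gradient of the mean squared error with respect to k.
        grad = sum(2 * (k * x - y) * x for x, y in zip(xs, ys)) / n
        k -= learning_rate * grad
    return k

def predict(k, x):
    """Inference: apply the trained model to a previously unseen input."""
    return k * x

k = train(xs, ys)
prediction = predict(k, 10.0)
print(f"learned parameter k = {k:.3f}")
print(f"prediction for x = 10: {prediction:.2f}")
```

Nothing in the code states that the output is about three times the input; that behaviour comes from the data, which is the point made above about models not relying on pre-programmed rules.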

1.1.2 Data
Data is the raw material used for machine learning, consisting of a set of variables. Each variable can take a range of values known as its domain.




Water volume = (?) minutes

Here, water volume and minutes are variables, and (?) is a parameter.
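As a rough illustration of the water-volume relation, the sketch below assumes the parameter multiplies the minutes (volume = rate × minutes) and estimates it by least squares. The measurements, the unit (litres) and the name `rate` are invented for the example.

```python
# Hypothetical measurements: minutes elapsed and the water volume
# observed (in litres). The numbers are invented for illustration.
minutes = [1.0, 2.0, 4.0, 5.0]
volume = [2.0, 4.1, 7.9, 10.0]

# Least-squares estimate of the parameter in volume = rate * minutes.
rate = sum(m * v for m, v in zip(minutes, volume)) / sum(m * m for m in minutes)
print(f"estimated parameter: {rate:.3f} litres per minute")
```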



The data in question is a snapshot of the real world, and data mining assumes that whatever produced the data will, in some way, continue to produce it in the same way in the future.




We might encounter problems with this approach, as the data we're provided with might:

• have errors
• be incorrect
• be missing parts
• be insufficient in quantity



A collection of data, known as a data set, contains a set of values for a number of variables. It is often represented in tabular format, in which one row is a single data point (or instance) and is made up of a value for each of the variables in the table. A column of the table corresponds to a single variable.
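The tabular view, and two of the data problems listed earlier, can be sketched in plain Python. The data set, the variable names and the helper functions below are all hypothetical; note that `observed_domain` only returns the values actually seen, which may be a subset of a variable's true domain.

```python
# A tiny, invented data set: each dict is one row (a data point or
# instance); each key is a column (a variable).
data_set = [
    {"age": 34, "colour": "red", "height_cm": 170.0},
    {"age": 29, "colour": "blue", "height_cm": None},   # missing part
    {"age": -5, "colour": "red", "height_cm": 182.5},   # likely an error
]

def column(rows, name):
    """Return all values of one variable (one column of the table)."""
    return [row[name] for row in rows]

def observed_domain(rows, name):
    """Distinct non-missing values seen for a variable."""
    return {v for v in column(rows, name) if v is not None}

def rows_with_missing(rows):
    """Indices of rows in which at least one variable has no value."""
    return [i for i, row in enumerate(rows)
            if any(v is None for v in row.values())]

def rows_out_of_range(rows, name, low, high):
    """Indices of rows whose value for `name` falls outside [low, high]."""
    return [i for i, row in enumerate(rows)
            if row[name] is not None and not (low <= row[name] <= high)]

print(observed_domain(data_set, "colour"))
print(rows_with_missing(data_set))
print(rows_out_of_range(data_set, "age", 0, 120))
```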




1.1.3 Supervised vs. Unsupervised Learning
In supervised learning, the data the model is trying to learn from is marked with the correct values, and it can be used to test the quality of a model.

It involves data that describes both the inputs and outputs to the system and requires a mapping to be learned from the inputs to the outputs.

In unsupervised learning, there is no existing set of clusters to compare against.

It involves only the inputs and requires the algorithm to organize and characterize the data in some way.

1.1.4 Tasks performed with Data Mining
SUPERVISED LEARNING

Classification:

An input pattern is classified as belonging to one of a number of possible classes. The output variable is nominal, and the inputs can be a mix of numerical and nominal.
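One simple way to illustrate classification is a nearest-neighbour rule: a new input pattern receives the class of the closest labelled training point. This is only one possible classifier, chosen here for brevity; the data, the class names and the function `classify` are assumptions. Because the data is labelled, the same labels can also be used to check the quality of the model on a small test set.

```python
import math

# Invented training data: (inputs, class label). The label is nominal.
training = [
    ((1.0, 1.2), "small"),
    ((0.8, 1.0), "small"),
    ((4.0, 4.2), "large"),
    ((4.5, 3.9), "large"),
]

def classify(pattern, training):
    """Return the label of the training point nearest to `pattern`."""
    _, nearest_label = min(
        training, key=lambda pair: math.dist(pair[0], pattern)
    )
    return nearest_label

print(classify((1.1, 0.9), training))
print(classify((4.2, 4.0), training))

# Labelled test data lets us measure the quality of the model.
test_set = [((0.9, 1.1), "small"), ((4.1, 4.1), "large")]
accuracy = sum(classify(x, training) == label for x, label in test_set) / len(test_set)
print(f"accuracy on the test set: {accuracy:.2f}")
```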





Prediction / Regression:

A continuous output value is calculated from an input pattern. The learning task is to find the relationship between the input variables and one or more output variables. The inputs can be a mixture of numeric and nominal variables, but the output of a regression task is always numeric.




UNSUPERVISED LEARNING

Clustering:

Data points that are close to each other, by some distance metric, are assigned to one of a number of clusters so that members of different clusters are far apart.

The input variables can be numeric or nominal.

Clustering is similar to classification, except that the class labels are not given by the training data but are inferred from the distribution of points in the input data.
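A minimal sketch of clustering, assuming the k-means algorithm with Euclidean distance (the notes do not name a specific algorithm here). The points, the function name `k_means` and the choice of the first k points as initial centroids are illustrative assumptions.

```python
import math

def k_means(points, k, n_iterations=10):
    """Minimal k-means: returns (centroids, cluster index of each point).

    Assumes the first k points are acceptable initial centroids.
    """
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(n_iterations):
        # Assign each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, assignments

# Unlabelled, invented 2-D points forming two visible groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.1, 4.9), (0.9, 1.1), (4.8, 5.2)]
centroids, assignments = k_means(points, k=2)
print(assignments)
```

No labels are supplied at any point: the cluster index of each point is inferred purely from where the points lie.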





Novelty detection:

Requires the system to spot patterns of data that have not been seen before. There is no output variable in the training data, but the resulting system will have a binary output that classifies each input pattern as novel or not.
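A very simple novelty detector can be sketched by flagging values that lie far from the mean of the training data. The rule used here (more than `n_std` standard deviations from the mean), the data and the names are assumptions made for illustration; practical systems often use more elaborate methods.

```python
import statistics

def fit_novelty_detector(training_values, n_std=3.0):
    """Learn what 'normal' looks like from unlabelled training values.

    n_std is a hypothetical hyper-parameter: how many standard deviations
    from the mean a value may lie before it is flagged as novel.
    """
    mean = statistics.mean(training_values)
    std = statistics.stdev(training_values)

    def is_novel(value):
        # Binary output: True if the value is unlike the training data.
        return abs(value - mean) > n_std * std

    return is_novel

# Invented training data with no output variable.
training_values = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
is_novel = fit_novelty_detector(training_values)
print(is_novel(10.05))
print(is_novel(14.0))
```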





Probability distribution estimation:

Build a model that takes a single data point as input and produces an estimate of the density of the population data at that point.

A model is built from the data in the form of a function from the inputs X to a probability estimate, p(X), which is not known and must be inferred.
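One way to build such a function is a kernel density estimate, sketched below for one-dimensional data with a Gaussian kernel. The technique, the sample values, the `bandwidth` setting and the name `make_density_estimate` are assumptions for illustration rather than the method the notes prescribe.

```python
import math

def make_density_estimate(samples, bandwidth=0.5):
    """Return a function x -> estimated density p(x), built from samples.

    Uses a Gaussian kernel density estimate; bandwidth is a
    hyper-parameter chosen here arbitrarily.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def p(x):
        # Sum one Gaussian bump per training sample, centred on the sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return p

# Invented samples: most of the population lies near 1.0.
samples = [1.0, 1.2, 0.9, 1.1, 3.0]
p = make_density_estimate(samples)
print(p(1.0) > p(2.0))
```

The estimate is higher where the samples are concentrated, which is the behaviour the description above asks for.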