
Summary - Foundations of Data Analytics (2IAB0)

Pages: 82
Uploaded on: 21-04-2024
Written in: 2023/2024

A summary of the course 2IAB0 - Foundations of Data Analytics (2023/2024). Every lecture is summarized and organized.


SUMMARY FOUNDATIONS
OF DATA ANALYTICS
2IAB0 – 2023/2024




Biermans, Cas


Contents
Lecture 1: (Monday 13-11-2023)
Lecture 2: (Monday 20-11-2023)
Lecture 3: (Monday 27-11-2023)
Lecture 4: (Monday 4-12-2023)
Lecture 5: (Monday 11-12-2023)
Lecture 6: (Monday 18-12-2023)





Lecture 1: (Monday 13-11-2023)
Types of data analytics:
- Descriptive: insight into the past
- Predictive: looking into the future
- Prescriptive: data-driven advice on how to take action to influence or change the future

Data analytics life cycle:




Data: raw numbers, facts, etc.
Information: structured, meaningful, and useful numbers and facts

Data types:
- Categorical data: data that has no intrinsic numerical value:
o Nominal: two or more outcomes that have no natural order
e.g. movie genre or hair color
o Ordinal: two or more outcomes that have a natural order
e.g. movie ratings (bad, neutral or good), level of education
- Numerical (quantitative) data: data that has an intrinsic numerical value:
o Continuous: data that can attain any value on a given measurement scale
▪ Interval data: equal intervals represent equal differences, there is no fixed
“zero point”
e.g. clock time, birth year
▪ Ratio data: both differences and ratios make sense, there is a fixed "zero
point"
e.g. movie budget, distance, time duration
o Discrete: data that can only attain certain values (typically integers)
e.g. the number of days with sunshine in a certain year, the number of traffic
incidents
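
The distinctions above determine which operations make sense on a variable. The following is a minimal sketch, not part of the lecture material: all values and names (`genres`, `RATING_RANK`, `birth_years`, `durations_min`) are made up for illustration.

```python
from collections import Counter

# Nominal: categories without a natural order, so counting is meaningful
# but sorting is not.
genres = ["drama", "comedy", "drama", "horror"]  # hypothetical data
genre_counts = Counter(genres)

# Ordinal: categories with a natural order, made explicit here as a rank lookup.
RATING_RANK = {"bad": 0, "neutral": 1, "good": 2}  # assumed ordering
ratings = ["good", "bad", "neutral", "good"]
sorted_ratings = sorted(ratings, key=RATING_RANK.__getitem__)

# Interval: differences are meaningful, ratios are not
# (the year 2000 is not "1.005 times" the year 1990 in any useful sense).
birth_years = (1990, 2000)
age_gap = birth_years[1] - birth_years[0]

# Ratio: a fixed zero point makes ratios meaningful
# (120 minutes really is twice as long as 60 minutes).
durations_min = (60, 120)
duration_ratio = durations_min[1] / durations_min[0]

print(genre_counts.most_common(1))
print(sorted_ratings)
print(age_gap, duration_ratio)
```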

Reference table: to store “all” data in a table so that it can be looked up easily.
Demonstration table: to illustrate a point (with just enough data, or with a specific summary)




EDA: exploratory data analysis
Key features of EDA:
• getting to know the data before doing further analysis

• extensively using plots
• generating questions
• detecting errors in data

Plots are a useful tool to discover unexpected relations and insights. Plots help us to explore and give
clues. Numerical summaries like averages help us to document essential features of data sets.
One should use both plots and numerical summaries. They complement each other. Numerical
summaries are often called statistics or summary statistics (note the double meaning of the word:
both a scientific field and computed numbers).

Summary statistics:
- Level: location summary statistics (typical values)
- Spread: scale summary statistics (how much do values vary?)
- Relation: association summary statistics (how do values of different quantities vary
simultaneously)
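
As a rough illustration of the spread and relation categories, the sketch below computes a range, a sample standard deviation, and a Pearson correlation coefficient. The choice of these three measures is an assumption made here as common examples; the notes at this point only name the categories. The data (`budgets`, `revenues`) are invented.

```python
from math import sqrt
from statistics import mean, stdev


def value_range(values):
    """Spread: distance between the largest and smallest observation."""
    return max(values) - min(values)


def pearson_correlation(xs, ys):
    """Relation: linear association between two equally long series."""
    mx, my = mean(xs), mean(ys)
    co_variation = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)


# Hypothetical data: revenue rises exactly linearly with budget.
budgets = [10, 20, 30, 40]
revenues = [25, 45, 65, 85]

print(value_range(budgets))                      # spread as a range
print(stdev(budgets))                            # spread as sample standard deviation
print(pearson_correlation(budgets, revenues))    # relation, close to 1 here
```

Because the made-up revenues are an exact linear function of the budgets, the correlation comes out at (numerically almost exactly) 1; real data would rarely be this clean.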

Location summary statistics:
- Mean/average:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
- Median: the value separating the higher half from the lower half of a data set
o Median computation:
▪ Order series of observations from small to large.
▪ If the number of observations is odd, take the middle value.
▪ If the number of observations is even, take the average of the two middle
values.
- Mode: most frequently occurring value, may be non-unique
The mean is sensitive to “outliers”, whereas the median is not.
The mean can be misleading and difficult to interpret for non-symmetric data sets.
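
The location statistics above can be sketched in a few lines of Python. This is an illustrative example rather than course code: `median_by_hand` follows the three median steps listed above, `location_summary` is a hypothetical helper, and the data are invented to show how a single outlier moves the mean far more than the median.

```python
from statistics import mean, median, multimode


def median_by_hand(values):
    """Median via the steps above: sort, then take the middle value(s)."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of observations: the single middle value.
        return ordered[middle]
    # Even number of observations: average of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


def location_summary(values):
    """Return mean, median, and all modes (the mode may be non-unique)."""
    return {
        "mean": mean(values),
        "median": median_by_hand(values),
        "modes": multimode(values),
    }


typical = [2, 3, 3, 4, 5]          # hypothetical observations
with_outlier = typical + [100]     # same data plus one extreme value

print(location_summary(typical))
print(location_summary(with_outlier))
```

In this made-up data set the outlier raises the mean from 3.4 to 19.5, while the median only moves from 3 to 3.5 and the mode stays at 3.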



