
ALL lecture notes of the course Data Analytics (P. Snoeren) (Grade: 8,5)

Rating: 4.0 (1 review)
Sold: 9
Pages: 37
Uploaded on: 10-03-2020
Written in: 2019/2020

ALL lecture notes of P. Snoeren, INCLUDING:
- Extra slides that he didn't include on Canvas
- Notes on what he said during the lectures
- Exam info he gave us in the last lecture: which topics get how many questions

Content preview

Notes strategy analytics lectures

Lecture 1

2 phenomena that explain why data science is important
1. The possibility of data collection in every aspect of business
2. There is huge technological development

Big data = very large data set with 3 distinct characteristics
1. Volume = quantity of generated & stored data
2. Variety = type & nature of the data
3. Velocity = speed at which the data is generated & processed

You can own and recombine the data

Data science = involves principles, processes, and techniques for understanding phenomena
via the analysis of data
Business understanding→ data collection→ data storage→ data analysis→ implementation
➢ We focus on data analysis

Data mining = the extraction of knowledge from data, via technologies that incorporate
these principles
Data-driven decision making (DDD) = the practice of basing decisions on the
analysis of data, rather than purely on intuition
2 decisions of interest
1. Need discovery (find patterns in the data that help you understand the business)
a. E.g. Walmart analysed how demand changed after a hurricane, saw that
demand for water rose, and kept more water in stock.
2. Repetitive decisions (happen on a large scale)
a. E.g. when you have a contract with a telecom provider, at some point you may
want to switch to another provider for a better offer. If the first provider can
predict when you will switch, they can retain you with a better offer.

Marketing
- Online advertising (whenever you click on a link to a page with adverts, a bidding war
takes place while the page loads over how much advertisers want to pay for your click)
- Recommendations for cross-selling (Amazon does this: when you buy a camera, it
suggests an SD card as well, based on things that are bought together)
- Customer relationship management (easyJet gives you info about how much
you have travelled, to give you a warm feeling)
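
The "things that are bought together" idea behind cross-selling can be sketched as simple pair counting. This is only an illustration, not how Amazon actually does it: the baskets and the function name count_pairs are invented for the example.

```python
from collections import Counter
from itertools import combinations


def count_pairs(baskets):
    """Count how often each unordered pair of items appears in the same basket."""
    pair_counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicates and gives every pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical purchase baskets, invented for illustration
baskets = [
    ["camera", "sd_card"],
    ["camera", "sd_card", "tripod"],
    ["camera", "sd_card", "bag"],
    ["sd_card", "phone"],
]

pair_counts = count_pairs(baskets)
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```

The most frequent pair is the natural candidate for a "customers also bought" recommendation. A real system would also correct for how popular each item is on its own.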

Finance
- Credit scoring and trading
- Fraud detection
- Workforce management

Retail
- Marketing (AH bonus weeks are determined by customer behavior in the store)
- Supply chain management (predict which products are going to be back-ordered and
prevent this from happening)

Data analytics = the process of examining datasets in order to draw conclusions about the
useful info they may contain
3 types of data analytics
1. Descriptive analytics (BI): What has happened?
a. Simple descriptive statistics, dashboards, charts, diagrams
b. Simple correlational methods
2. Predictive analytics: What could happen?
a. Regression, classification
b. Advanced correlation methods
3. Prescriptive analytics: What should we do?
a. A-B testing, advanced econometric techniques
b. Causality
We focus on the first 2

Data science can help generate & sustain a competitive advantage (CA) if you align:
- Human capital
o Incentives
- Organization
o Center of excellence + local implementation (you need data scientists who can
do all the magic, plus local implementation by people who can speak to both the data
scientists and the TM team)
- Culture
o Data science at core of strategy making
- Infrastructure
o No data, no DDD

Challenges in data science
From a large mass of data you can always find something, but it is not always clear whether
it generalizes to the wider population
➢ Risk of over-fitting
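
A small sketch of what over-fitting looks like in practice, on invented data. A model that memorises the training set (here a 1-nearest-neighbour rule) scores perfectly on the data it has seen, including one noisy label, but does worse on new data than a simple threshold rule. The data, the threshold of 5 and all function names are assumptions made for the example.

```python
def nearest_neighbour_predict(train, x):
    """Return the label of the training point closest to x (memorises the data)."""
    _, closest_label = min(train, key=lambda point: abs(point[0] - x))
    return closest_label


def threshold_predict(x, threshold=5.0):
    """Simple rule: predict 1 when x is at or above the threshold, else 0."""
    return 1 if x >= threshold else 0


def accuracy(predict, data):
    """Fraction of (x, label) pairs for which predict(x) equals the label."""
    correct = sum(1 for x, label in data if predict(x) == label)
    return correct / len(data)


# Invented data: the true pattern is "label 1 when x >= 5".
# The training point (3, 1) is deliberate noise.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
test = [(0.5, 0), (2.6, 0), (3.2, 0), (5.5, 1), (7.5, 1), (9.5, 1)]

nn_train = accuracy(lambda x: nearest_neighbour_predict(train, x), train)
nn_test = accuracy(lambda x: nearest_neighbour_predict(train, x), test)
rule_train = accuracy(threshold_predict, train)
rule_test = accuracy(threshold_predict, test)

print("memoriser: train", nn_train, "test", nn_test)
print("threshold: train", rule_train, "test", rule_test)
```

The memoriser fits the noise as if it were a pattern, which is exactly the kind of finding that does not generalize.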

Data mining process
Cross-Industry Standard Process for Data Mining (CRISP-DM)
➢ Also the core of the course: make sure you structure your assignments according to
this model
Data analytic thinking
- Routinely transform business problems into data science problems
- Tacit skill that is only learned through trial & error

Supervised learning
Training data has one feature that is the target

Supervised = classification, regression, similarity matching
Unsupervised = clustering, profiling, co-occurrence grouping
Both = similarity matching, link prediction, data reduction




Boundaries
- Knowledge discovery and data mining (KDD) is a subfield of machine learning
- Data science (prediction) is not econometrics (correlation & causality), and is not a field of
statistics (which asks whether an observed distribution is likely to come from a random
distribution)
o Therefore, rely heavily on business understanding
o Always separate training, test and use data
o Also, this is why we are not interested in R² or p-values (though we will use
other tools to evaluate models)
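
Separating training and test data can be sketched as a simple holdout split. The 25% test fraction, the fixed seed and the function name split_train_test are assumptions for the example, not values from the lecture. "Use data" is whatever arrives after the model is deployed, so it is not part of the split.

```python
import random


def split_train_test(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and hold out a fraction of them for testing."""
    shuffled = rows[:]  # copy, so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset: 20 row identifiers
rows = list(range(20))
train_rows, test_rows = split_train_test(rows)
print(len(train_rows), "training rows,", len(test_rows), "test rows")
```

The model is fitted on train_rows only and evaluated on test_rows only, so the evaluation reflects data the model has not seen.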

Case 1: Capital One
Now a very data-driven company
Invest in high-quality data
- Give customers random terms for their credit cards
- This allowed data collection on customers who normally weren't given credit cards
- These turned out to be very profitable, i.e. those who pay off their debt just enough
that they do not default, while Capital One still earns a lot of interest

What can they do that other banks can’t?
- Customer acquisition
o Provide data-driven services to customers before even speaking to them
- Product customization
o Differentiate interest rates for credit cards (make custom made products for
each individual customer)
- Customer retention
o Invested heavily in both IT and data analysts

What is required for Capital One to translate the business problem of fraud detection into a
data science task?

Drawbacks of data-driven strategy
- Cost and risk in data acquisition
o Providing customers with random terms for their credit cards is risky and in
short term likely to lead to losses
o Signet Bank incurred losses for several years
- Capital One found out nobody recognized their brand
o Target variables generally short-term
o What is profitable in the short run does not necessarily help in the long run
- Might weed out certain customers
o Reciprocators vs. self-regulating stakeholders
o Customers who are likely to leave if someone else gives cheaper offer

Lecture 2

Datasets contain entities with certain attributes
Dataset = sample, population, data, set, work set
Entity = object, instance, observation, element, example, line, row, feature vector
Attribute = feature, characteristic, variable, column
- Predicted attribute = dependent, explained
- Predicting attribute = independent, explanatory
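
The vocabulary above maps onto a very small data structure. In this sketch the dataset is a list, each entity is one row (a dict), and each attribute is a key. The attribute names and values are invented, and "churned" is assumed to be the predicted attribute.

```python
# Hypothetical dataset: each dict is one entity (row), each key is one attribute
dataset = [
    {"age": 34, "monthly_fee": 25.0, "churned": False},
    {"age": 51, "monthly_fee": 40.0, "churned": True},
    {"age": 27, "monthly_fee": 15.0, "churned": False},
]


def split_features_target(dataset, target):
    """Separate the predicting attributes from the predicted attribute."""
    features = [
        {name: value for name, value in row.items() if name != target}
        for row in dataset
    ]
    targets = [row[target] for row in dataset]
    return features, targets


features, targets = split_features_target(dataset, target="churned")
print(features[0])  # the feature vector of the first entity
print(targets)      # the predicted (dependent) attribute for every entity
```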

Model = a simplified representation of reality created to serve a purpose (abstraction of
irrelevant details)
Purpose
- Unsupervised setting: to identify (classes, group, patterns) → descriptive
- Supervised setting: to predict (try to estimate an unknown value) → predictive
o What is the value of this house?
Induction = generalizing from specific cases to general rules
e.g. developing classification and regression models
Deduction = applying general rules and specific facts to create other specific facts
e.g. using classification and regression models
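
Induction and deduction can be seen as two separate steps in code. In this sketch, fit_line is the induction step (specific houses to a general rule) and predict is the deduction step (general rule plus one new fact to a new specific fact). The house sizes and prices are invented, and ordinary least squares on a single attribute is an assumption chosen for simplicity.

```python
def fit_line(xs, ys):
    """Induction: learn slope and intercept from specific cases (least squares)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Deduction: apply the general rule to one new specific case."""
    slope, intercept = model
    return slope * x + intercept


# Invented training cases: house size in square metres and sale price
sizes = [50, 80, 100, 120]
prices = [150_000, 210_000, 250_000, 290_000]

model = fit_line(sizes, prices)         # induction
estimated_value = predict(model, 90)    # deduction: what is the value of this house?
print(model, estimated_value)
```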

Supervised & unsupervised are not directly related to induction/deduction: both can be either

Supervised segmentation
Objective: How can we segment the population into groups that differ from each other with
respect to some quantity of interest?
Inputs: Informative attributes (have to be knowable beforehand: you can't use as input the
value of an acquisition that still has to happen)
Knowable attributes that correlate with the target of interest
Outputs: Segments that are pure/ less impure in the quantity of interest
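
A sketch of how "pure / less impure" could be quantified. The preview does not say which purity measure the course uses, so this example assumes entropy and information gain, one common choice. The customer rows and attribute names are invented.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Impurity of a segment: 0 when all labels agree, 1 for a 50/50 binary split."""
    total = len(labels)
    return -sum(
        (count / total) * log2(count / total) for count in Counter(labels).values()
    )


def information_gain(rows, attribute, target):
    """Reduction in entropy obtained by segmenting the rows on one attribute."""
    parent_entropy = entropy([row[target] for row in rows])
    segments = {}
    for row in rows:
        segments.setdefault(row[attribute], []).append(row[target])
    weighted_child_entropy = sum(
        len(labels) / len(rows) * entropy(labels) for labels in segments.values()
    )
    return parent_entropy - weighted_child_entropy


# Invented customers: "contract" relates to churn, "newsletter" does not
rows = [
    {"contract": "monthly", "newsletter": "yes", "churned": True},
    {"contract": "monthly", "newsletter": "no", "churned": True},
    {"contract": "monthly", "newsletter": "yes", "churned": True},
    {"contract": "monthly", "newsletter": "no", "churned": False},
    {"contract": "yearly", "newsletter": "yes", "churned": False},
    {"contract": "yearly", "newsletter": "no", "churned": False},
    {"contract": "yearly", "newsletter": "yes", "churned": False},
    {"contract": "yearly", "newsletter": "no", "churned": True},
]

gain_contract = information_gain(rows, "contract", "churned")
gain_newsletter = information_gain(rows, "newsletter", "churned")
print("contract:", gain_contract, "newsletter:", gain_newsletter)
```

Segmenting on the attribute with the higher gain gives segments that are less impure in the target, which matches the output described above.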
Seller: hannah2501, Universiteit van Amsterdam