Summary of Business Analytics

Pages: 34
Published: 12-01-2021
Written in: 2020/2021

This document is a summary of the lectures by dr. Bettina Siflinger. The course is a major course in the joint Data Science bachelor of Tilburg University and Technische Universiteit Eindhoven. It contains a summary of the lecture notes and remarks from the teacher.

Document information

Published: 12 January 2021
File updated: 1 February 2021
Number of pages: 34
Written in: 2020/2021
Type: Lecture notes
Professor(s): Dr. Bettina Siflinger
Contains: All classes

Content preview

JBM040: Business Analytics
Quartile 2: 2020 – 2021
Teacher: dr. B.M. Siflinger



Business analytics is the ability of firms and organizations to collect, analyse,
and act on data.




Introduction & Important concepts in probability and statistics
Part 1. Introduction
There are two estimation problems:
1. Prediction: Develop a formula for making predictions about the dependent
variable, based on the observed values of the independent variables.
General question you ask yourself: What happens?

2. Causal analysis: Independent variables are regarded as causes of the
dependent variable. The goal is to determine whether a particular independent
variable really affects the dependent variable, and to estimate the magnitude of
that effect, if any.
General question you ask yourself: Why does it happen?

Now, consider the data generation process for a linear model:
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$
with outcome $y$, regressors $x_1, \dots, x_k$, and “true” parameters
$\beta_0, \dots, \beta_k$. Its error term is $u \sim N(0, \sigma^2 I)$. You should
make an assumption about the relationship between $x = (x_1, \dots, x_k)$ and $u$:
$E(u \mid x) = 0$.
- $E(u \mid x)$ indicates whether $x$ and $u$ are dependent or not.
- $I$ is the identity matrix, with 1’s on the diagonal and 0’s elsewhere.

The main goal of OLS is to obtain the estimates
$\hat\beta_0, \hat\beta_1, \dots, \hat\beta_k$ that minimize the sum of squared
residuals.
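
The minimization above can be sketched in Python with NumPy. This is only an illustration on simulated data: the sample size, the "true" parameters (1, 2, -3) and the noise level are assumptions chosen for the example, not values from the lecture.

```python
import numpy as np

# Simulated data; all numbers below are assumptions made for this sketch.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(scale=0.5, size=n)        # error term, drawn independently of x1 and x2
y = 1.0 + 2.0 * x1 - 3.0 * x2 + u        # beta_0 = 1, beta_1 = 2, beta_2 = -3

# Design matrix with a column of ones for the intercept beta_0.
X = np.column_stack([np.ones(n), x1, x2])

# Least-squares solution: the coefficients that minimize ||y - X b||^2.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ beta_hat
ssr = float(residuals @ residuals)

# Any other coefficient vector gives a larger sum of squared residuals.
beta_other = beta_hat + np.array([0.1, 0.0, 0.0])
ssr_other = float(np.sum((y - X @ beta_other) ** 2))

print("beta_hat:", beta_hat)
print("SSR at beta_hat:", ssr, "| SSR at perturbed coefficients:", ssr_other)
```

Because the simulated error is independent of the regressors, the estimates should land close to the assumed parameters.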

OLS has two goals with respect to the two estimation problems. They have
different quantities of interest, but the same calculations are involved:

Predictive modelling: Estimate the conditional mean $E(y \mid x)$.
$$\hat{E}(y \mid x) = \hat\beta_0 + \hat\beta_1 x_1 + \dots + \hat\beta_k x_k$$

Causal estimation: Estimate the partial derivative (slope parameter) with respect
to some $x_j$.
$$\frac{\partial \hat{E}(y \mid x)}{\partial x_j} = \hat\beta_j$$

Both goals can be achieved simultaneously by OLS under the zero conditional mean
assumption $E(u \mid x) = 0$:
$$E(y \mid x) = E(x\beta + u \mid x) = x\beta + E(u \mid x)$$

Prediction is interested in the regression line that fits the data as closely as
possible. $E(u \mid x)$ does not play a role, because the prediction is based only
on what you observe, and $u$ is not observed. It is therefore still possible to
obtain the best fit to the data according to the least squares criterion.

Causal estimation is interested in a particular $\beta_j$. The causal
interpretation of $\beta_j$ fails if $E(u \mid x_j) \neq 0$: the partial derivative
of $E(u \mid x)$ with respect to $x_j$ must be zero, because otherwise the partial
derivative of $E(y \mid x)$ is not $\beta_j$. Instead, you get a biased estimate of
$\beta_j$.

All these methods can be used in econometrics.
Econometrics: “based upon the development of statistical methods for estimating
economic relationships, testing economic theories, and evaluating and implementing
government and business policy”. Its goal is to infer that one variable has a
causal effect on another variable. For this you can use ceteris paribus analysis:
investigate the effect of $x_j$ on $y$ when all the other factors are held fixed.
Problem: There is mostly observational data available.
Solution: Impose assumptions to simulate ceteris paribus analysis. Make sure
that $x_j$ and $u$ are independent.
In an exercise, it could be that a regression is found based on two
parameters. However, there can be other factors that influence the outcome.
Due to omitted variable bias, the estimated regression coefficient $\hat{b}$ is
then biased. This $\hat{b}$ is only unbiased if $cov(x, u) = 0$:

$$\hat{b} = \frac{cov(x, y)}{cov(x, x)} = \frac{cov(x, xb + u)}{cov(x, x)} = b + \frac{cov(x, u)}{cov(x, x)}$$
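
A small simulation can make the bias formula concrete. The sketch below is an assumed setup, not one from the lecture: a hypothetical omitted factor `z` drives both the regressor `x` and the error `u`, so that cov(x, u) is not zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
b = 2.0                                  # assumed "true" coefficient

z = rng.normal(size=n)                   # hypothetical omitted factor
x = 0.8 * z + rng.normal(size=n)         # regressor is correlated with z
u = 1.5 * z + rng.normal(size=n)         # error absorbs z, so cov(x, u) != 0
y = x * b + u

def cov(a, c):
    """Sample covariance of two 1-D arrays (divides by n)."""
    return float(np.mean((a - a.mean()) * (c - c.mean())))

b_hat = cov(x, y) / cov(x, x)            # regression slope of y on x
bias = cov(x, u) / cov(x, x)             # bias term from the formula

print("b_hat:", b_hat)
print("b + cov(x,u)/cov(x,x):", b + bias)
```

The two printed numbers agree up to floating-point error because the decomposition is an algebraic identity. In this setup the estimate overshoots the assumed `b`, since `x` and `u` move together through `z`.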

Part 2. Probability theory: Random variables
The probability distribution is a function that describes the probabilities of
the possible values that a random variable $X$ can take on. A discrete random
variable takes on a list of outcomes $x_1, \dots, x_k$ with probabilities
$p_1, \dots, p_k$. A continuous random variable takes values in a continuum.

These random variables have an expected value $E(X)$ or $\mu$, which is the
probability-weighted average of all possible values of $X$. The calculation of
the expected value differs by type of random variable:
- Discrete RV: $E(X) = \sum_{j=1}^{k} x_j p_j$
- Continuous RV: $E(X) = \int_{-\infty}^{\infty} x f(x)\, dx$


This calculation has some properties:
- A constant $c$: $E(c) = c$
- Constants $a$ and $b$: $E(aX + b) = aE(X) + b$
- $(a_1, \dots, a_n)$ are constants, $(X_1, \dots, X_n)$ are random variables:
  $E\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i E(X_i)$
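
As a quick check of the discrete formula and the first two properties, here is a minimal sketch. The fair six-sided die and the constants are assumptions chosen for illustration; `expectation` is a hypothetical helper, not a library function.

```python
import numpy as np

# Illustrative discrete random variable: a fair six-sided die (an assumption).
values = np.arange(1, 7, dtype=float)    # outcomes x_1, ..., x_6
probs = np.full(6, 1.0 / 6.0)            # probabilities p_1, ..., p_6

def expectation(vals, p):
    """E(X) = sum_j x_j * p_j for a discrete random variable."""
    return float(np.sum(vals * p))

e_x = expectation(values, probs)

# Property: E(c) = c for a constant c.
e_const = expectation(np.full(6, 4.0), probs)

# Property: E(aX + b) = a E(X) + b.
a, b = 2.0, 3.0
e_linear = expectation(a * values + b, probs)

print(e_x, e_const, e_linear)
```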


The variance measures the expected squared distance from $X$ to its mean $\mu$.
$$Var(X) = \sigma^2 = E[(X - \mu)^2] = E(X^2) - \mu^2$$
It has properties:
- Constant $c$: $Var(c) = 0$
- Constants $a$ and $b$: $Var(a + bX) = b^2 Var(X)$
- Standard deviation: $sd(X) = \sqrt{Var(X)}$
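
The two expressions for the variance and the scaling property can be verified numerically. This sketch again assumes a fair six-sided die as the example distribution; the constants `a` and `b` are arbitrary.

```python
import numpy as np

# Fair six-sided die, used purely as an illustrative assumption.
values = np.arange(1, 7, dtype=float)
probs = np.full(6, 1.0 / 6.0)

mu = float(np.sum(values * probs))

# Definition: Var(X) = E[(X - mu)^2].
var_definition = float(np.sum((values - mu) ** 2 * probs))

# Shortcut: Var(X) = E(X^2) - mu^2.
var_shortcut = float(np.sum(values ** 2 * probs)) - mu ** 2

# Property: Var(a + bX) = b^2 Var(X); the shift a drops out.
a, b = 1.0, 3.0
shifted = a + b * values
mu_shifted = float(np.sum(shifted * probs))
var_shifted = float(np.sum((shifted - mu_shifted) ** 2 * probs))

sd = float(np.sqrt(var_definition))

print(var_definition, var_shortcut, var_shifted, sd)
```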

The covariance measures the linear dependence between the random variables
$X$ and $Y$.
$$Cov(X, Y) = \sigma_{xy} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - E(X)E(Y)$$
It has properties:
- If $X$ and $Y$ are independent: $Cov(X, Y) = 0$
- Constants $a_1, b_1, a_2, b_2$: $Cov(a_1 X + b_1, a_2 Y + b_2) = a_1 a_2 Cov(X, Y)$

The correlation coefficient indicates how strongly two random variables are
linearly related. Its value always lies within the range $[-1, 1]$.
$$Corr(X, Y) = \frac{Cov(X, Y)}{sd(X)\, sd(Y)} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$
It has properties:
- $Cov(X, Y)$ and $Corr(X, Y)$ have the same sign
- $Cov(X, Y) = 0 \Rightarrow Corr(X, Y) = 0$
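
The covariance and correlation formulas can be tried on sample data. The sketch below uses simulated draws; the dependence `y = 0.5 x + noise` and the constants are assumptions for the example. Sample moments with divisor n stand in for the population quantities.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)      # assumed linear dependence plus noise

# Sample covariance with divisor n (ddof=0), matching x.std() and y.std().
cov_xy = float(np.cov(x, y, ddof=0)[0, 1])

# Corr(X, Y) = Cov(X, Y) / (sd(X) * sd(Y)).
corr_xy = cov_xy / float(x.std() * y.std())

# Property: Cov(a1 X + b1, a2 Y + b2) = a1 * a2 * Cov(X, Y).
a1, b1, a2, b2 = 2.0, 5.0, -3.0, 1.0
cov_scaled = float(np.cov(a1 * x + b1, a2 * y + b2, ddof=0)[0, 1])

print("cov:", cov_xy, "corr:", corr_xy, "scaled cov:", cov_scaled)
```

The hand-computed correlation should agree with `np.corrcoef(x, y)[0, 1]`, which normalizes in the same way.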

The properties of the variance of sums of random variables:
- Constants $a$ and $b$: $Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab\, Cov(X, Y)$
- $X$ and $Y$ uncorrelated: $Var(X + Y) = Var(X) + Var(Y) = Var(X - Y)$
- $X_1, \dots, X_n$ pairwise uncorrelated random variables and $a_i$, $i = 1, \dots, n$,
  constants: $Var(a_1 X_1 + \dots + a_n X_n) = \sum_{i=1}^{n} a_i^2 Var(X_i)$
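
The first property also holds exactly for sample moments, which gives a simple numerical check. The data-generating choices below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=2000)
y = 0.3 * x + rng.normal(size=2000)      # correlated with x by construction
a, b = 2.0, -1.0

# Left-hand side: Var(aX + bY), computed directly from the combined series.
lhs = float(np.var(a * x + b * y))

# Right-hand side: a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y), all with divisor n.
cov_xy = float(np.cov(x, y, ddof=0)[0, 1])
rhs = a ** 2 * float(np.var(x)) + b ** 2 * float(np.var(y)) + 2 * a * b * cov_xy

print(lhs, rhs)
```
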
The conditional expectation $E(Y \mid X)$ describes the relationship between $X$
and $Y$: it is the expected value of $Y$ given the value of $X$.

It has properties:
- Function $c(X)$: $E(c(X) \mid X) = c(X)$
- Functions $a(X)$ and $b(X)$: $E[a(X)Y + b(X) \mid X] = a(X)E(Y \mid X) + b(X)$
- $X$ and $Y$ are independent: $E(Y \mid X) = E(Y)$

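The Law of iterated expectations (LIE): $E(E(Y \mid X)) = E(Y)$. For a discrete
$X$ with outcomes $x_1, \dots, x_n$ and probabilities $p_1, \dots, p_n$, $E(Y)$ is a
weighted average of the $E(Y \mid X = x_k)$ with weights $p_k$:
$$E(Y) = \sum_{k=1}^{n} p_k E(Y \mid X = x_k)$$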
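
The law of iterated expectations can be checked on a small joint distribution. The probability table below is made up for this sketch: rows index the outcomes of $X$, columns the outcomes of $Y$.

```python
import numpy as np

# Hypothetical joint distribution P(X = x_k, Y = y_m); entries sum to 1.
y_vals = np.array([10.0, 20.0, 30.0])
joint = np.array([
    [0.1, 0.2, 0.1],   # X = x_1
    [0.3, 0.2, 0.1],   # X = x_2
])

# Marginal probabilities p_k = P(X = x_k).
p_x = joint.sum(axis=1)

# Conditional means E(Y | X = x_k) = sum_m y_m * P(Y = y_m | X = x_k).
cond_mean = (joint * y_vals).sum(axis=1) / p_x

# LIE: E(Y) as the p_k-weighted average of the conditional means.
e_y_lie = float(np.sum(p_x * cond_mean))

# Direct calculation from the marginal distribution of Y.
e_y_direct = float(np.sum(joint.sum(axis=0) * y_vals))

print(cond_mean, e_y_lie, e_y_direct)
```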


Part 3. Finite sample properties
From here on, random variables are also denoted by lower case letters. Finite
sample properties are the properties of an estimator that hold for any sample
size. Take a random sample $(y_1, y_2, \dots, y_n)$ from a population distribution
depending on an unknown parameter $\theta$. An estimator of $\theta$ is a rule that
assigns a value of $\theta$ to each possible outcome of the sample:
- Natural estimator for $\mu$ (mean): $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$
- Estimator $\hat\theta$ for $\theta$: $\hat\theta = h(y_1, y_2, \dots, y_n)$, where $h$
  is some function of the RVs
The estimator $\hat\theta$ is a RV because it depends on a random sample. It is an
unbiased estimator if $E(\hat\theta) = \theta$ for all possible $\theta$. Unbiasedness
therefore does not depend on the sample size. The bias of an estimator
$\hat\theta$ is $Bias(\hat\theta) = E(\hat\theta) - \theta$.

The sampling variance of the estimator $\bar{y}$ is $Var(\bar{y}) = \frac{\sigma^2}{n}$.
Among unbiased estimators, the one with the smallest variance is preferred.
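
Unbiasedness and the sampling variance of $\bar{y}$ can be illustrated with a Monte Carlo sketch. The normal population, its parameters, the sample size and the number of replications are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma = 5.0, 2.0      # assumed population mean and standard deviation
n, reps = 25, 20_000      # sample size and number of simulated samples

# Each row is one random sample (y_1, ..., y_n); y_bar holds one estimate per sample.
samples = rng.normal(loc=mu, scale=sigma, size=(reps, n))
y_bar = samples.mean(axis=1)

# Simulated bias: average estimate minus the true mean (should be near 0).
bias_hat = float(y_bar.mean() - mu)

# Simulated sampling variance versus the theoretical sigma^2 / n.
var_hat = float(y_bar.var())
var_theory = sigma ** 2 / n

print("bias:", bias_hat, "| var:", var_hat, "| sigma^2/n:", var_theory)
```

Rerunning with a larger `n` should shrink the sampling variance while the bias stays near zero, in line with unbiasedness holding for any sample size.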


