Summary

Summary Cheat sheet for Applied Statistics and R

Rating
-
Sold
-
Pages
2
Uploaded on
18-04-2024
Written in
2023/2024

This document serves as a quick reference guide for students and professionals involved in applied statistics and business forecasting using R. It compiles essential statistical concepts, R programming syntax, and functions relevant to data analysis, hypothesis testing, regression, and time series analysis. The cheat sheet includes references to R packages and functions for data manipulation, visualisation, and machine learning, as well as statistical formulas critical for accurate business cycle forecasting. Whether for academic purposes, interview preparation, or practical application in business settings, this cheat sheet is a valuable tool for efficient and effective statistical analysis and predictive modelling in R.

Institution
Course

Content preview

Data
Rows: elements / observations
Columns: variables / features
Categorical data: names or labels
Quantitative data: numeric values

Scale of measurement
nominal: names / labels
ordinal: has the property of nominal and can be ordered
interval: has the property of ordinal; the difference is meaningful and expressed in a fixed unit
ratio: has the properties of the rest; the ratio is meaningful and 0 has a specific meaning
absolute: measured in a NATURAL UNIT

Probability distributions
Discrete prob dist: $f(x)$ = prob. of each value the random var may assume
$E(x) = \mu = \sum x f(x)$
$Var(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$
Discrete uniform: $f(x) = 1/n$, with $n$ = # values the random var may assume
Continuous prob dist: prob is defined over an interval instead of a specific point; prob = area under the curve (AUC) of $f(x)$ within the interval
Uniform prob dist: $f(x) = \frac{1}{b - a}$ for $a \le x \le b$; 0 otherwise
$E(x) = (a + b)/2$
$Var(x) = (b - a)^2/12$
Normal prob dist: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$, defined by $\mu$ and $\sigma$; $e = 2.71828$, $\pi = 3.14159$
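The uniform and normal formulas can be checked numerically in R. The following is only a minimal sketch: the values of a, b, mu, sigma and x are made up for illustration and do not come from the cheat sheet.

```r
# Continuous uniform on [a, b]; a and b are illustrative values
a <- 2
b <- 10
unif_mean <- (a + b) / 2       # E(x) = (a + b) / 2
unif_var  <- (b - a)^2 / 12    # Var(x) = (b - a)^2 / 12

# P(4 <= x <= 6) is the area under f(x) = 1 / (b - a) over that interval
p_interval <- punif(6, min = a, max = b) - punif(4, min = a, max = b)

# Normal density written out from the formula, compared with dnorm()
mu <- 5
sigma <- 2
x <- 6
f_manual  <- 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
f_builtin <- dnorm(x, mean = mu, sd = sigma)
```

Here `p_interval` equals (6 - 4) / (b - a), which is the interval width times the constant density, and `f_manual` agrees with `dnorm()`.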
Descriptive statistics
Frequency distribution: categorical data in non-overlapping classes
class width = (max - min) / # classes
rela freq = freq / n
pct freq = rela freq x 100
Bar chart: categorical data; one axis = category, other axis = freq / rela freq / pct freq
Pie chart: displays rela freq / pct freq as a share of the circle (360 degrees = 100%)
Histogram: used for quantitative data; no natural separation between bars
Skewness: L-skewed, symmetric (skewness = 0), R-skewed

Sampling
Point est: $E(\bar{x}) = \mu$; if E(point est) = pop param, the point est is UNBIASED
Sampling dist of $\bar{p}$: $E(\bar{p}) = p$
Standard error of the mean: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$ (infinite pop)
Finite pop: multiply by the finite pop correction factor $\sqrt{(N - n)/(N - 1)}$; if $n/N \le 0.05$, treat as INFINITE
Convert to standard normal: $z = (\bar{x} - \mu) / \sigma_{\bar{x}}$; total AUC = 1
Central limit theorem: the sampling dist of $\bar{x}$ can be approximated by a normal dist even if the pop is not normally dist., as n becomes large
Prob of the sample mean within A from the mean: AUC between $E(\bar{x}) - A$ and $E(\bar{x}) + A$

Sample size (for both t and z test)
1. In most cases n > 30; if highly skewed or with outliers, use n > 50
2. If pop is not normal but roughly symmetric, n = 15 can suffice
3. If pop is believed to be approx NORMAL, n < 15 can be used
Histogram symmetric or moderately L-skewed / R-skewed: OK to assume normal

Interval est
Interval est = point est +/- margin of error (MoE)
$\sigma$ known: z-test
$\sigma$ unknown: t-test (uses $s$ to est $\sigma$)
Higher degree of confidence gives a higher MoE
t-dist depends on d.f. = n - 1; the higher the d.f., the lower the dispersion and the closer the t-dist is to the standard normal
Ex: 95% confidence gives $\alpha = 0.05$
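A worked example may help with the interval estimate for an unknown population standard deviation. This is a sketch under assumptions: the sample values are invented, and the 95% level mirrors the example in the notes.

```r
# Made-up sample; sigma is unknown, so s and the t-dist are used
x <- c(12.1, 11.8, 12.6, 12.3, 11.9, 12.4, 12.0, 12.2)
n <- length(x)
x_bar <- mean(x)                 # point estimate of mu
se <- sd(x) / sqrt(n)            # estimated standard error of the mean

alpha <- 0.05                    # 95% confidence
t_crit <- qt(1 - alpha / 2, df = n - 1)
moe <- t_crit * se               # margin of error
ci <- c(lower = x_bar - moe, upper = x_bar + moe)

# The same interval from the built-in one-sample t-test
ci_builtin <- t.test(x, conf.level = 1 - alpha)$conf.int
```

Raising the confidence level increases `t_crit` and therefore the margin of error, which matches the note that a higher degree of confidence gives a wider interval.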
Numerical Measures

Measure of location
mean: $\bar{x} = \sum x_i / n$ (sample), $\mu = \sum x_i / N$ (pop); affected by outliers
weighted mean: $\bar{x} = \sum w_i x_i / \sum w_i$
geometric mean: $\bar{x}_g = \sqrt[n]{x_1 x_2 \cdots x_n}$; mean rate of change over periods
note: growth factor = return / 100 + 1
percentile: location of the pth percentile $L_p = \frac{p}{100}(n + 1)$
quartile: $Q_1$, $Q_2$, $Q_3$

Measure of variability
Range = max - min (sensitive to outliers)
IQR = $Q_3 - Q_1$ (not sensitive to outliers)
variance: $s^2 = \sum (x_i - \bar{x})^2 / (n - 1)$ (sample), $\sigma^2 = \sum (x_i - \mu)^2 / N$ (pop)
standard deviation: $s = \sqrt{s^2}$, $\sigma = \sqrt{\sigma^2}$; easier to interpret than variance (same unit as data)
coef of variation: $(s / \bar{x} \times 100)\%$; useful for comparing variability of 2+ data sets

Hypothesis Testing
FORM
lower-tailed (one-tailed): $H_0: \mu \ge \mu_0$, $H_a: \mu < \mu_0$
upper-tailed (one-tailed): $H_0: \mu \le \mu_0$, $H_a: \mu > \mu_0$
two-tailed: $H_0: \mu = \mu_0$, $H_a: \mu \ne \mu_0$
Error
Type I: REJECT $H_0$ when it's TRUE; can be controlled at $\alpha$
Type II: ACCEPT $H_0$ when it's FALSE; avoid by using "do not reject $H_0$" instead (not enough evidence to reject)
p-val approach: reject $H_0$ if p-val $\le \alpha$

Diff btw. two means ($\mu_1 - \mu_2$), $\sigma_1$ and $\sigma_2$ known
point est: $\bar{x}_1 - \bar{x}_2$
interval est: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
test stat for hyp test: $z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
lower-tailed: $H_0: \mu_1 - \mu_2 \ge D_0$, $H_a: \mu_1 - \mu_2 < D_0$
upper-tailed: $H_0: \mu_1 - \mu_2 \le D_0$, $H_a: \mu_1 - \mu_2 > D_0$
two-tailed: $H_0: \mu_1 - \mu_2 = D_0$, $H_a: \mu_1 - \mu_2 \ne D_0$
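The test for the difference between two means with known standard deviations can be sketched in R as follows. All numbers (sample means, sigmas, sample sizes, D0) are hypothetical and chosen only to show that a one-tailed and a two-tailed test can disagree at the same alpha.

```r
# Hypothetical summary statistics for two independent samples
x1_bar <- 82; x2_bar <- 78       # sample means
sigma1 <- 10; sigma2 <- 10       # population SDs, assumed known
n1 <- 30; n2 <- 40               # sample sizes
d0 <- 0                          # hypothesised difference mu1 - mu2
alpha <- 0.05

# Standard error of (x1_bar - x2_bar) and the z test statistic
se_diff <- sqrt(sigma1^2 / n1 + sigma2^2 / n2)
z <- ((x1_bar - x2_bar) - d0) / se_diff

# p-values: two-tailed uses both tails, upper-tailed only the upper tail
p_two   <- 2 * pnorm(-abs(z))
p_upper <- pnorm(z, lower.tail = FALSE)

reject_two   <- p_two <= alpha
reject_upper <- p_upper <= alpha

# Interval estimate for mu1 - mu2 at confidence level 1 - alpha
ci <- (x1_bar - x2_bar) + c(-1, 1) * qnorm(1 - alpha / 2) * se_diff
```

With these inputs the upper-tailed test rejects H0 while the two-tailed test does not, and the two-tailed result is consistent with the interval estimate containing 0.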
Measure of dist shape
z-score: $z_i = (x_i - \bar{x}) / s$
outliers: z-score < -3 or z-score > +3
outlier cases: incorrectly recorded; correctly recorded
* some cases can't be removed (ex: fraud detection)
Normal dist: $\mu \pm 1\sigma$ covers 68.26%, $\mu \pm 2\sigma$ covers 95.44%, $\mu \pm 3\sigma$ covers 99.72%

p-val
lower-tailed: p-val = AUC in the lower tail
upper-tailed: p-val = AUC in the upper tail
two-tailed: p-val = 2 x the smaller tail area
Reject $H_0$ if p-val $\le \alpha$

Critical value approach, rejection rule
lower-tailed: reject $H_0$ if $z \le -z_{\alpha}$
upper-tailed: reject $H_0$ if $z \ge z_{\alpha}$
two-tailed: reject $H_0$ if $z \le -z_{\alpha/2}$ or $z \ge z_{\alpha/2}$

Boxplot
min, $Q_1$, $Q_2$, $Q_3$, max
Lower limit: $Q_1 - 1.5(IQR)$
Upper limit: $Q_3 + 1.5(IQR)$
outliers: values below the lower limit or above the upper limit

Measure of assoc btw 2 vars
covariance (measure of linear assoc):
$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$ (sample)
$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$ (pop)
$s_{xy} > 0$: positive rela; $s_{xy} < 0$: negative rela
correlation coef (Pearson corr): $r_{xy} = \frac{s_{xy}}{s_x s_y}$
near -1: strong negative linear rela
near +1: strong positive linear rela
the closer to 0, the weaker the rela
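The outlier rules and the association measures can be tried out on a small data set. This is an illustrative sketch with made-up vectors; `quantile()` is used with R's default interpolation, which is an assumption about how the quartiles should be computed.

```r
# Made-up data with one planted extreme value (40)
x <- c(10, 12, 11, 13, 12, 11, 14, 12, 13, 40)

# z-score rule: flag values with |z| > 3
z <- (x - mean(x)) / sd(x)
z_outliers <- x[abs(z) > 3]

# Boxplot rule: flag values outside Q1 - 1.5 IQR and Q3 + 1.5 IQR
q <- unname(quantile(x, c(0.25, 0.75)))
iqr <- q[2] - q[1]
lower_limit <- q[1] - 1.5 * iqr
upper_limit <- q[2] + 1.5 * iqr
box_outliers <- x[x < lower_limit | x > upper_limit]

# Covariance and Pearson correlation from their definitions (made-up pairs)
hours <- c(2, 4, 5, 7, 9)
score <- c(50, 58, 65, 72, 85)
s_xy <- sum((hours - mean(hours)) * (score - mean(score))) / (length(hours) - 1)
r_xy <- s_xy / (sd(hours) * sd(score))
```

In this sample the boxplot limits flag 40, but the z-score rule does not: with only 10 observations the extreme value inflates the standard deviation enough that no z-score can exceed 3. The hand-computed `s_xy` and `r_xy` match `cov(hours, score)` and `cor(hours, score)`.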
Linear Regression
x = independent var (regressor), y = dependent var
simple linear reg.: 1 x & 1 y only; graph: straight line
multiple linear reg.: 2+ ind vars; graph: hyperplane
* normally adding more vars = better est. (err decreases, SSE decreases)

Model
simple: $y = \beta_0 + \beta_1 x + \varepsilon$
multiple: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$
$E(y) = \beta_0 + \beta_1 x$ (simple), $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$ (multiple)
est: $\hat{y} = b_0 + b_1 x$ (simple), $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$ (multiple)
$\beta_i$ = pop param, $b_i$ = est of $\beta_i$
$E(\varepsilon) = 0$, $Var(\varepsilon) = \sigma^2$
p = # ind vars, n = # obs

Least squares method
min $\sum (y_i - \hat{y}_i)^2$, to be minimized with respect to the coefs; aims to find the least diff btw $y_i$ and $\hat{y}_i$
$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$, $b_0 = \bar{y} - b_1 \bar{x}$
$\bar{x}, \bar{y}$ = mean of x, y; $x_i, y_i$ = val of x, y at the ith obs
matrix form: $b = (X'X)^{-1} X'y$, with rows of $X$ equal to $[1, x_1, x_2, \dots, x_k]$
$\hat{y} = Xb = X(X'X)^{-1}X'y = Hy$; hat matrix $H = X(X'X)^{-1}X'$

Testing for significance
$H_0: \beta_1 = 0$, $H_a: \beta_1 \ne 0$
t-test and F-test use the est of $\sigma^2$ ($Var(\varepsilon)$): $s^2 = MSE = \frac{SSE}{n - p - 1}$
* for simple LR: t-test & F-test are the same!
multiple LR: F-test for overall significance, t-test for individual significance; they may give diff. results
Mean square (MS) = sum of squares / corresponding d.f.
d.f.: SST: n - 1; SSR: p; SSE: n - p - 1 (* for simple: p = 1)
t-test: $H_0: \beta_i = 0$, $H_a: \beta_i \ne 0$
$t = \frac{b_i}{s_{b_i}}$; for simple LR $s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$
Reject $H_0$ if p-value $\le \alpha$, or $t \le -t_{\alpha/2}$, or $t \ge t_{\alpha/2}$; based on t-dist w/ d.f. = n - p - 1
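To see how the least squares formulas, the matrix form and the t-test fit together, here is a small R sketch on made-up data. Comparing against `lm()` is an addition for illustration, not something stated in the notes.

```r
# Made-up data for a simple linear regression (p = 1 independent variable)
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 11.9)
n <- length(x)
p <- 1

# Closed-form least squares estimates
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# Matrix form: b = (X'X)^-1 X'y and hat matrix H = X (X'X)^-1 X'
X <- cbind(1, x)
b_mat <- solve(t(X) %*% X) %*% t(X) %*% y
H <- X %*% solve(t(X) %*% X) %*% t(X)
y_hat <- as.numeric(H %*% y)

# t-test for H0: beta1 = 0
sse <- sum((y - y_hat)^2)
s <- sqrt(sse / (n - p - 1))              # s^2 = MSE estimates sigma^2
s_b1 <- s / sqrt(sum((x - mean(x))^2))
t_b1 <- b1 / s_b1
p_val <- 2 * pt(-abs(t_b1), df = n - p - 1)

# Reference fit with the built-in function
fit <- lm(y ~ x)
```

For this simple regression, `t_b1^2` equals the F statistic reported by `summary(fit)`, which is why the t-test and F-test lead to the same conclusion when p = 1.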




Coef of determination ($r^2$)
SST = SSR + SSE; SS: sum of squares
total: $SST = \sum (y_i - \bar{y})^2$
reg: $SSR = \sum (\hat{y}_i - \bar{y})^2$
err: $SSE = \sum (y_i - \hat{y}_i)^2$
$r^2 = \frac{SSR}{SST}$ (same for both simple and multiple linear reg); the higher, the better
$r^2 \times 100$ = % of var that can be explained by the model; shows how much the reg helps
Sample corr coef ($r_{xy}$), * simple reg: $r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$; linear assoc btw x and y

Adjusted MUL coef of det
adding a var: err decreases (SSE decreases, SSR increases, since SST = SSE + SSR is fixed), so $R^2$ increases
adjusted $R^2$ aims to compensate for the # added ind vars
$R_a^2 = 1 - (1 - R^2) \frac{n - 1}{n - p - 1}$

ANOVA result
source: regression, residual err, total

R-Syntax: LM assumption check
Normality: plot(lm_name, which = 2)
Heteroscedasticity: plot predicted y against the standardized residuals: plot(lm_name, which = 3)
if the var of the residuals depends on $\hat{y}$, there is hetero (* violation); check the data for any trends (ex: low $\hat{y}$, low var)
which: 1 = residuals vs fitted, 2 = QQ, 3 = fitted vs standardized residuals
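The sums of squares, r-squared, adjusted r-squared and the two diagnostic plots can be reproduced with a short R sketch. The built-in `mtcars` data set and the model `mpg ~ wt + hp` are assumptions chosen purely for illustration.

```r
# Illustrative multiple regression on a data set that ships with R
fit <- lm(mpg ~ wt + hp, data = mtcars)
y <- mtcars$mpg
n <- length(y)
p <- 2                                  # number of independent variables

# Sums of squares: SST = SSR + SSE
sst <- sum((y - mean(y))^2)
ssr <- sum((fitted(fit) - mean(y))^2)
sse <- sum(residuals(fit)^2)

# Coefficient of determination and its adjusted version
r2 <- ssr / sst
r2_adj <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Assumption checks
plot(fit, which = 2)   # normal QQ plot of the standardized residuals
plot(fit, which = 3)   # scale-location plot, used to look for heteroscedasticity
```

`r2` and `r2_adj` agree with `summary(fit)$r.squared` and `summary(fit)$adj.r.squared`. In the scale-location plot, a roughly flat band suggests constant variance, while a trend (for example, spread growing with the fitted values) points to heteroscedasticity.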

Written for

Institution
Study programme
Unknown
Course

Document information

Uploaded on
18 April 2024
Number of pages
2
Written in
2023/2024
Type
SUMMARY


Meet the seller

ajaychoudhari, The University of Manchester
Sold
-
Member for
1 year
Number of followers
0
Documents
1
Last sold
-
