Summary Cheat sheet for Applied Statistics and R

Pages: 2
Uploaded on: 18-04-2024
Written in: 2023/2024

This document serves as a quick reference guide for students and professionals involved in applied statistics and business forecasting using R. It compiles essential statistical concepts, R programming syntax, and functions relevant to data analysis, hypothesis testing, regression, and time series analysis. The cheat sheet includes references to R packages and functions for data manipulation, visualisation, and machine learning, as well as statistical formulas critical for accurate business cycle forecasting. Whether for academic purposes, interview preparation, or practical application in business settings, this cheat sheet is a valuable tool for efficient and effective statistical analysis and predictive modelling in R.

Content preview

Data
- Rows: elements / observations
- Columns: features (variables)
- Categorical data: names/labels. Quantitative data: numeric values.

Scale of measurement
- Nominal: names/labels
- Ordinal: has the property of nominal; can be ordered
- Interval: has the property of ordinal; the difference is important and is expressed in a fixed unit of measure
- Ratio: has the properties of the rest; the ratio is meaningful (0 has a specific meaning)

Probability distributions
- f(x): probability of each value the random variable x may assume
- Discrete uniform: $f(x) = 1/n$, where n = number of values the random variable may assume
- Discrete: $E(x) = \mu = \sum x f(x)$, $\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$
- Continuous prob dist: probability is defined within an interval instead of at a specific point; $P(a \le x \le b)$ = AUC of f(x) between a and b; total AUC = 1
- Continuous uniform: $f(x) = \frac{1}{b - a}$ for $a \le x \le b$; 0 otherwise
  $E(x) = (a + b)/2$, $\mathrm{Var}(x) = (b - a)^2/12$
- Normal prob dist: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$, with $e = 2.71828$, $\pi = 3.14159$; defined by $\mu$, $\sigma$
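The uniform and normal formulas can be checked numerically in R. The sketch below is only an illustration: the interval [a, b] and the values of mu and sigma are made-up assumptions, not taken from the cheat sheet. It uses base R's `punif()` and `pnorm()`.

```r
# Continuous uniform on [a, b]; a and b are illustrative values only.
a <- 2
b <- 8
unif_mean <- (a + b) / 2          # E(x) = (a + b) / 2
unif_var  <- (b - a)^2 / 12       # Var(x) = (b - a)^2 / 12

# P(3 <= x <= 5) is the area under f(x) = 1 / (b - a) between 3 and 5.
p_unif <- punif(5, min = a, max = b) - punif(3, min = a, max = b)

# Normal distribution defined by mu and sigma (illustrative values only).
mu <- 100
sigma <- 15
z <- (120 - mu) / sigma                      # convert x = 120 to a z-score
p_norm <- pnorm(120, mean = mu, sd = sigma)  # P(x <= 120)
p_std  <- pnorm(z)                           # same probability on the standard normal

# Area within one standard deviation of the mean (about 68%).
within_1sd <- pnorm(1) - pnorm(-1)
```

Working on the z scale and on the original scale gives the same probability, which is why converting to the standard normal is enough for table lookups.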
Descriptive statistics
Categorical data
- Frequency distribution: counts per non-overlapping category (class)
- rela freq = freq / n
- pct freq = rela freq x 100
- Bar chart: one axis = category, other axis = freq / rela freq / pct freq
- Pie chart: displays rela freq / pct freq; 360 degrees of the circle = 100% of freq
Quantitative data
- class width = (max - min) / # classes
- Histogram: used for quantitative data; no natural separation between bars
- Skewness (read from the histogram): L-skewed, symmetric, R-skewed; moderately vs highly skewed

Sampling distributions
- $E(\bar{x}) = \mu$ and $E(\bar{p}) = p$. If E(point est.) = pop param, the point est. is UNBIASED.
- Standard error of the mean:
  finite pop: $\sigma_{\bar{x}} = \sqrt{\frac{N - n}{N - 1}} \cdot \frac{\sigma}{\sqrt{n}}$ (the square root term is the finite pop correction factor)
  infinite pop: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
  If $n/N \le 0.05$, treat as INFINITE.
- Convert to st. normal: $z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$
- Prob of the sample mean being within A of the mean: use the AUC between $E(\bar{x}) - A$ and $E(\bar{x}) + A$

Sample size (for both T, Z test)
1. Most cases: n >= 30. Highly skewed or with outliers: n >= 50.
2. If pop is not normal but roughly symmetric: n = 15 suffices.
3. If pop is believed to be approx NORMAL: n < 15 can be used.

Central limit theorem
- The sampling dist of a sample w/ size n can be approximated as normal if n becomes large; use it if the pop is not normally dist.

Interval estimation
- Interval est = point est +/- margin of error
- Higher degree of confidence -> higher MoE
- $\sigma$ known: z-test. $\sigma$ unknown: t-test (uses s to est. $\sigma$)
- t-dist depends on d.f. = n - 1; higher d.f. -> lower dispersion, closer to normal
- Ex: 95% CI -> $\alpha = 0.05$ -> $z_{\alpha/2} = z_{0.025}$
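As a minimal sketch of interval estimation (point est +/- margin of error), the R code below builds a z interval for a known sigma and a t interval for an unknown sigma. The sample `x` and the "known" `sigma` are made-up assumptions for illustration; `qnorm()`, `qt()` and `t.test()` are base R functions.

```r
# Made-up sample, for illustration only.
x <- c(12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9)
n <- length(x)
x_bar <- mean(x)          # point estimate of the population mean
alpha <- 0.05             # 95% confidence -> alpha = 0.05

# Case 1: sigma known (assumed value) -> z interval.
sigma <- 0.3
moe_z <- qnorm(1 - alpha / 2) * sigma / sqrt(n)   # z_{alpha/2} * sigma / sqrt(n)
ci_z  <- c(x_bar - moe_z, x_bar + moe_z)

# A higher degree of confidence gives a wider margin of error.
moe_z_99 <- qnorm(1 - 0.01 / 2) * sigma / sqrt(n)

# Case 2: sigma unknown -> t interval with d.f. = n - 1, using s to estimate sigma.
s <- sd(x)
moe_t <- qt(1 - alpha / 2, df = n - 1) * s / sqrt(n)
ci_t  <- c(x_bar - moe_t, x_bar + moe_t)

# Cross-check against the built-in one-sample t interval.
ci_builtin <- as.numeric(t.test(x, conf.level = 1 - alpha)$conf.int)
```

The hand-computed t interval should agree with `t.test()`, which is usually the more convenient call in practice.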
Numerical Measures
Measures of location
- Mean: $\bar{x} = \frac{\sum x_i}{n}$ (sample), $\mu = \frac{\sum x_i}{N}$ (population); affected by outliers
- Weighted mean: $\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$
- Geometric mean: $\bar{x}_g = \sqrt[n]{x_1 x_2 \cdots x_n}$; note: growth factor = 1 + rate of return; gives the mean rate of change over the periods
- Percentile: location of the pth percentile $L_p = \frac{p}{100}(n + 1)$
- Quartile
Measures of variability
- Range = max - min (sensitive to outliers)
- IQR = Q3 - Q1 (not sensitive to outliers)
- Variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$, $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
- Standard deviation: $s = \sqrt{s^2}$, $\sigma = \sqrt{\sigma^2}$; easier to interpret than variance (same unit as data)
- Coef of variation: $\left(\frac{s}{\bar{x}} \times 100\right)\%$; useful for comparing variability of 2+ data sets

Hypothesis Testing
FORM
- One-tailed, lower: $H_0: \mu \ge \mu_0$, $H_a: \mu < \mu_0$
- One-tailed, upper: $H_0: \mu \le \mu_0$, $H_a: \mu > \mu_0$
- Two-tailed: $H_0: \mu = \mu_0$, $H_a: \mu \ne \mu_0$
Error
- Type I: REJECT H0 when it's TRUE; can be controlled at $\alpha$
- Type II: ACCEPT H0 when it's FALSE; avoid by using "do not reject H0" (not enough evidence to reject) instead
- At most 1 type of err can be controlled at a time
p-val approach
- Reject H0 if p-val <= $\alpha$

Diff btw two means ($\mu_1 - \mu_2$), $\sigma_1$, $\sigma_2$ known
- Interval est: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
- Test stat for hyp test: $z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
- If $\sigma$ is unknown, just replace $\sigma$ w/ s
- Lower-tailed: $H_0: \mu_1 - \mu_2 \ge D_0$, $H_a: \mu_1 - \mu_2 < D_0$
- Upper-tailed: $H_0: \mu_1 - \mu_2 \le D_0$, $H_a: \mu_1 - \mu_2 > D_0$
- Two-tailed: $H_0: \mu_1 - \mu_2 = D_0$, $H_a: \mu_1 - \mu_2 \ne D_0$
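The measures of location and variability can be computed directly in base R. In the sketch below the data, weights and growth factors are made-up assumptions for illustration. Note that R's default quantile method differs from the (p/100)(n + 1) location rule; to my understanding `type = 6` is the variant that matches that rule.

```r
# Made-up data and weights, for illustration only.
x <- c(4, 8, 15, 16, 23, 42)
w <- c(1, 2, 3, 3, 2, 1)

x_mean <- mean(x)                     # sum(x_i) / n
w_mean <- sum(w * x) / sum(w)         # weighted mean; same as weighted.mean(x, w)

# Geometric mean of growth factors (growth factor = 1 + rate of return).
growth <- c(1.10, 0.95, 1.20)
g_mean <- prod(growth)^(1 / length(growth))

rng <- max(x) - min(x)                # range: sensitive to outliers
iqr <- IQR(x, type = 6)               # Q3 - Q1 using the (n + 1) * p location rule
s2  <- var(x)                         # sample variance, divides by n - 1
s   <- sd(x)                          # sample standard deviation, sqrt(s2)
cv  <- s / x_mean * 100               # coefficient of variation, in percent
```

The geometric mean raised to the number of periods reproduces the total growth, which is why it is preferred over the arithmetic mean for rates of change.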




Measures of distribution shape
- z-score: $z_i = \frac{x_i - \bar{x}}{s}$
- Outliers: z-score < -3 or z-score > 3
  cases: incorrectly recorded, or correctly recorded
  * some cases can't be removed (ex. fraud detection)
- Normal dist: 68.26% of values within $\mu \pm 1\sigma$, 95.44% within $\mu \pm 2\sigma$, 99.72% within $\mu \pm 3\sigma$

Boxplot
- Shows min, Q1, Q2, Q3, max and outliers
- Lower limit: Q1 - 1.5(IQR)
- Upper limit: Q3 + 1.5(IQR)

Measures of assoc btw 2 vars
- Covariance (measure of linear assoc):
  sample: $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
  population: $\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$
  $s_{xy} > 0$: pos rela; $s_{xy} < 0$: neg rela
- Correlation coef (Pearson corr):
  near -1: strong negative linear rela
  near 1: strong positive linear rela
  the closer to 0, the weaker the rela

Hypothesis testing: rejection rules
p-val approach (reject H0 if p-val <= $\alpha$)
- Lower-tailed: p-val = lower AUC
- Upper-tailed: p-val = upper area
- Two-tailed: p-val = 2 x the smaller area
Critical value approach
- Lower-tailed: reject H0 if $z \le -z_\alpha$
- Upper-tailed: reject H0 if $z \ge z_\alpha$
- Two-tailed: reject H0 if $z \le -z_{\alpha/2}$ or $z \ge z_{\alpha/2}$
- One approach is enough for rejecting H0 (but both is better)
* Swapping btw upper and lower is possible if it still answers the question, but be careful!
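A short R sketch of the two outlier rules (z-score beyond +/-3, and the 1.5 x IQR boxplot limits) and of covariance and correlation follows. All data values are made-up assumptions, with 40 planted as an outlier; `quantile()`, `cov()` and `cor()` are base R, and `quantile()` is used with its default method.

```r
# Made-up data; the value 40 is planted as an outlier.
x <- c(10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 40)

# Rule 1: z-score below -3 or above 3.
z <- (x - mean(x)) / sd(x)
z_outliers <- x[abs(z) > 3]

# Rule 2: boxplot limits Q1 - 1.5 * IQR and Q3 + 1.5 * IQR.
q1 <- unname(quantile(x, 0.25))
q3 <- unname(quantile(x, 0.75))
iqr <- q3 - q1
lower_limit <- q1 - 1.5 * iqr
upper_limit <- q3 + 1.5 * iqr
fence_outliers <- x[x < lower_limit | x > upper_limit]

# Sample covariance and Pearson correlation for two made-up variables.
hours <- c(1, 2, 3, 4, 5)
score <- c(52, 58, 61, 70, 74)
s_xy <- sum((hours - mean(hours)) * (score - mean(score))) / (length(hours) - 1)
r_xy <- s_xy / (sd(hours) * sd(score))   # near 1 -> strong positive linear relation
```

Both rules flag the same point here, but they need not agree in general: the z-score rule is itself affected by the outlier because it uses the mean and standard deviation.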
Linear Regression
x = independent var (regressor), y = dependent var

Simple linear reg. vs multiple linear reg.
- Simple: 1 x and 1 y only. Multiple: 2+ ind vars.
- Normally adding more vars = better est. (err decreases, SSE decreases), but simple reg is not designed for too many vars.
- Graph: straight line (simple), hyperplane (multiple)
- Model:
  simple: $y = \beta_0 + \beta_1 x + \varepsilon$
  multiple: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$
- Regression equation:
  simple: $E(y) = \beta_0 + \beta_1 x$
  multiple: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$
- Estimated equation:
  simple: $\hat{y} = b_0 + b_1 x$
  multiple: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$
- $\beta_i$ = pop param, $b_i$ = est. of $\beta_i$
- $E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = \sigma^2$
- p = # ind vars, n = # obs

Least squares method
- Aims to find the est. that minimizes the diff btw $y_i$ and $\hat{y}_i$: $\min \sum (y_i - \hat{y}_i)^2$, minimized with respect to the coef.
- Simple: $b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$, $b_0 = \bar{y} - b_1 \bar{x}$
  $\bar{x}$, $\bar{y}$ = mean of x, y; $x_i$, $y_i$ = val of x, y at the ith obs
- Multiple: $b = (X'X)^{-1} X'y$, with $X = [1, x_1, x_2, \dots, x_k]$
- $\hat{y} = Xb = X(X'X)^{-1}X'y = Hy$
- Hat matrix: $H = X(X'X)^{-1}X'$

Testing for significance
- Hyp test: $H_0: \beta_1 = 0$, $H_a: \beta_1 \ne 0$
- Uses the est. of $\sigma^2$ (var of $\varepsilon$): $s^2 = \mathrm{MSE} = \frac{\mathrm{SSE}}{n - p - 1}$
- Mean square = sum of squares / corresponding d.f.
  SST: d.f. = n - 1
  SSR: d.f. = p
  SSE: d.f. = n - p - 1
  * for simple reg, p = 1
- * For simple LR: T-test and F-test are the same!
- Multiple LR: F-test for overall significance, T-test for individual significance (the two may give diff. results)
- T-test: $H_0: \beta_i = 0$, $H_a: \beta_i \ne 0$
  $t = \frac{b_i}{s_{b_i}}$; for simple reg, $s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$
  Reject H0 if p-value <= $\alpha$, or if $t \le -t_{\alpha/2}$ or $t \ge t_{\alpha/2}$, based on the t-dist w/ d.f. = n - p - 1
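The matrix form of least squares, b = (X'X)^{-1} X'y with hat matrix H = X(X'X)^{-1}X', can be sketched in R and compared with `lm()`. The data below are made-up assumptions for illustration. The standard errors use the usual matrix expression s * sqrt(diag((X'X)^{-1})), which is the multiple-regression counterpart of the simple-regression formula and is not spelled out in the notes.

```r
# Made-up data with p = 2 independent variables and n = 6 observations.
x1 <- c(1, 2, 3, 4, 5, 6)
x2 <- c(2, 1, 4, 3, 6, 5)
y  <- c(3.2, 3.5, 7.9, 7.1, 11.4, 12.6)
n <- length(y)
p <- 2

# Design matrix X = [1, x1, x2]; the column of ones is the intercept.
X <- cbind(1, x1, x2)
XtX_inv <- solve(t(X) %*% X)

b <- as.numeric(XtX_inv %*% t(X) %*% y)   # least squares estimates b0, b1, b2
H <- X %*% XtX_inv %*% t(X)               # hat matrix
y_hat <- as.numeric(H %*% y)              # fitted values: y_hat = H y

# t statistics: t = b_i / s_bi, with s^2 = MSE = SSE / (n - p - 1).
sse <- sum((y - y_hat)^2)
s <- sqrt(sse / (n - p - 1))
se_b <- s * sqrt(diag(XtX_inv))
t_stat <- b / se_b

# Reference fit with R's built-in linear model function.
fit <- lm(y ~ x1 + x2)
```

In practice `lm()` is preferable because it solves the problem with a QR decomposition rather than inverting X'X explicitly, which is numerically more stable.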




Coef. of determination ($r^2$)
- SS = sum of squares; SST = SSR + SSE (same for both simple and multiple reg.)
  total: $\mathrm{SST} = \sum (y_i - \bar{y})^2$
  reg.: $\mathrm{SSR} = \sum (\hat{y}_i - \bar{y})^2$
  err.: $\mathrm{SSE} = \sum (y_i - \hat{y}_i)^2$
- $r^2 = \frac{\mathrm{SSR}}{\mathrm{SST}}$; the higher, the better. It tells how much the reg. helps in explaining the data.
- $r^2 \times 100$ = % of variability that can be explained by the model
- Sample corr coef (simple reg. only): $r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$; measures the linear assoc btw x and y

Adjusted multiple coef. of determination
- Adding a var -> err decreases -> SSE decreases -> SSR increases -> $R^2$ increases
- Adjusted $R^2$ aims to compensate for the # of added ind vars:
  $R_a^2 = 1 - (1 - R^2)\frac{n - 1}{n - p - 1}$

ANOVA result
- Table with columns for source (regression, residual err., total), DF and the related sums of squares

R syntax: LM assumption check
- Normality: plot(lm_name, which = 2)
- Heteroscedasticity: plot the predicted y against the standardized residuals: plot(lm_name, which = 3)
  If the var depends on $\hat{y}$ -> hetero; look for any trends (ex. low $\hat{y}$, low var) = violation
- which: 1 = residual values vs fitted, 2 = QQ, 3 = fitted vs standardized residuals
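The sums of squares, r^2, adjusted R^2 and the assumption-check plots can be sketched on a fitted `lm` object. The example uses R's built-in `mtcars` data set and the model `mpg ~ wt + hp` purely as an illustrative assumption; the F statistic is computed as MSR / MSE, the standard form for the overall F-test.

```r
# Illustrative model on the built-in mtcars data (an assumption, not from the notes).
fit <- lm(mpg ~ wt + hp, data = mtcars)
y <- mtcars$mpg
n <- length(y)
p <- 2                                    # number of independent variables

sst <- sum((y - mean(y))^2)               # total sum of squares
ssr <- sum((fitted(fit) - mean(y))^2)     # regression sum of squares
sse <- sum(residuals(fit)^2)              # error sum of squares

r2     <- ssr / sst                                # coefficient of determination
r2_adj <- 1 - (1 - r2) * (n - 1) / (n - p - 1)     # adjusted for the number of vars

mse    <- sse / (n - p - 1)               # mean square error, d.f. = n - p - 1
f_stat <- (ssr / p) / mse                 # overall F statistic = MSR / MSE

# Assumption checks; plots are only drawn in an interactive session.
if (interactive()) {
  plot(fit, which = 2)   # normal QQ plot of residuals (normality)
  plot(fit, which = 3)   # scale-location plot (heteroscedasticity)
}
```

`summary(fit)` and `anova(fit)` report the same quantities, so the hand calculation is mainly useful for seeing where each number comes from.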

Seller: ajaychoudhari, The University of Manchester