
Summary: Comprehensive final exam review with everything you need to know, from a student who got 96% in Stats 2244. Includes notes from all Prep 101 sessions.

Pages: 42
Uploaded: August 12, 2023
Written in: 2023/2024


STATS 2244 FINAL STUDY REVIEW
December 10, 2022

Summarizing and Exploring Data
Data Stage: collect, monitor the quality of, and conduct a preliminary exploration of the data
Does the data collection method need “tweaking” to ensure quality (monitoring)?
Are there patterns, trends, or associations apparent in the data?
Are there any outliers or missing values? If so, how will you handle them?

Selecting a Summary
• How many variables do you have?
o Univariate: 1 variable
▪ Will describe the distribution of this one variable
o Bivariate: 2 variables
o Multivariate: 3 or more variables
▪ Can explore relationships between variables
• What types of variables do you have?
o Explanatory / response
o Quantitative / categorical
• What characteristic(s) or relationship do you want to emphasize?
o Parameter, measures of spread, relationship

Measures of Spread
Measures of Spread: characterize the variability in a distribution
Range
Range = maximum – minimum
• Inflated by outliers and skew, because it depends only on the two most extreme values
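As a small illustration of how a single extreme value inflates the range, here is a minimal Python sketch. The function name `value_range` and the exam scores are made up for this example.

```python
def value_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)

scores = [62, 65, 68, 70, 71]      # made-up exam scores
with_outlier = scores + [100]      # same scores plus one extreme value

print(value_range(scores))         # 71 - 62 = 9
print(value_range(with_outlier))   # 100 - 62 = 38
```

Only one value changed, yet the range roughly quadrupled, which is why the range alone can give a misleading picture of spread.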
5-Number Summary
• The 5-number summary splits a distribution into 4 quarters:
Minimum, Q1, x̃, Q3, maximum
▪ Q1 = first quartile = 25th percentile
▪ x̃ = median
o Centermost value: order the dataset smallest → largest, then take the middle value
▪ Q3 = third quartile = 75th percentile
Interquartile Range (IQR)
IQR = Q3 – Q1
• The IQR contains the middle 50% of the data surrounding the median (25% above, 25% below)
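As a rough sketch, the 5-number summary and the IQR can be computed with Python's standard `statistics` module. The function name `five_number_summary` and the data are made up for illustration. The choice of the "inclusive" quartile method is an assumption: software packages and textbooks use slightly different quartile conventions, so hand-calculated quartiles may differ a little.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # made-up data
mn, q1, med, q3, mx = five_number_summary(data)
iqr = q3 - q1                        # IQR = Q3 - Q1

# For this data: minimum 1, Q1 3, median 5, Q3 7, maximum 9, IQR 4
print(mn, q1, med, q3, mx)
print(iqr)
```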

Percentiles
Percentile: a value below which a particular percentage of the distribution lies
• Quartiles are percentiles which divide the distribution into 4 equal-size sections
o Q1 = first quartile = 25th percentile = 25% of distribution lies below this value
o Q2 = second quartile = 50th percentile = 50% of distribution lies below this value
o Q3 = third quartile = 75th percentile = 75% of distribution lies below this value
• If a value is in the 90th percentile, it is in the top 10% of the distribution
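To make the definition concrete, the sketch below counts what percentage of a data set lies below a given value. The helper name `percent_below` and the heights are made up for this example, and it uses the "strictly below" reading of the definition above.

```python
def percent_below(values, x):
    """Percentage of the values that lie strictly below x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

heights = [150, 155, 160, 162, 165, 168, 170, 172, 175, 180]  # made-up data (cm)

print(percent_below(heights, 180))  # 9 of 10 values are below 180 -> 90.0
print(percent_below(heights, 165))  # 4 of 10 values are below 165 -> 40.0
```

Here 180 sits at the 90th percentile of this small data set, so it is in the top 10%.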




Variance
• Takes into account all the data we have
Sample variance
• Sample variance is a statistic
• The larger the s², the more variable the data (the wider the spread)
• Calculates (approximately) the average of the squared differences from the sample mean x̄; the sum of squared differences is divided by n – 1 rather than n
• R's var() uses this equation to calculate variance (it assumes we're working with sample data, not population data)
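The description above corresponds to the usual sample variance formula, s² = Σ(xᵢ – x̄)² / (n – 1). A minimal Python sketch, with a made-up data set and an illustrative function name, compares a hand-written version against `statistics.variance`, which also divides by n – 1:

```python
import statistics

def sample_variance(values):
    """Sum of squared differences from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]    # made-up sample; mean is 5

# Squared differences sum to 32, so s^2 = 32 / 7 (about 4.57)
print(sample_variance(data))
print(statistics.variance(data))   # same value from the standard library
```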




Population variance
• Population variance is a parameter
• The larger the σ², the more variable the data (the wider the spread)
• Calculates the average of the squared differences from the population mean (µ)
o Takes every value in the distribution and subtracts the population mean from it
o Squares the differences (between values and mean) to get rid of the negatives
o Divides the sum of the squared differences by the total number of values in the distribution (N)
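The three steps above correspond to the usual population variance formula, σ² = Σ(xᵢ – µ)² / N. The sketch below follows them one at a time; the function name and data are made up, and the data are treated as a complete population only for the sake of the example.

```python
import statistics

def population_variance(values):
    """Average of the squared differences from the population mean."""
    big_n = len(values)
    mu = sum(values) / big_n
    differences = [x - mu for x in values]      # step 1: subtract the mean
    squared = [d ** 2 for d in differences]     # step 2: square the differences
    return sum(squared) / big_n                 # step 3: divide by N

population = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up values treated as a full population

# Squared differences sum to 32, so sigma^2 = 32 / 8 = 4
print(population_variance(population))
print(statistics.pvariance(population))  # standard-library version, divides by N
```

The same eight values give 32 / 7 when treated as a sample (dividing by n – 1), so the population variance is the slightly smaller of the two.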




Standard Deviation
• Square root of the sample variance (s = √s²)
o Undoes the squaring and returns the measure of spread to the original units
• Suitable for use with distributions without extreme outliers and/or skew
o Extreme outliers can make it seem like the data has wide variation when the large value is really just due to the outliers
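A short Python sketch of both points: the standard deviation is the square root of the variance, and one extreme value can inflate it sharply. The measurements and variable names are made up for illustration.

```python
import math
import statistics

typical = [10, 11, 12, 13, 14]      # made-up measurements, e.g. in cm
with_outlier = typical + [60]       # same data plus one extreme value

var_typical = statistics.variance(typical)   # sample variance, in squared units (cm^2)
sd_typical = math.sqrt(var_typical)          # square root brings it back to cm
sd_outlier = statistics.stdev(with_outlier)  # sample standard deviation directly

print(sd_typical, statistics.stdev(typical))  # the two values agree
print(sd_outlier)                             # much larger, driven by the single outlier
```

For these numbers the sample variance of `typical` is 2.5 and that of `with_outlier` is 386, so the standard deviation jumps from about 1.6 to about 19.6 even though five of the six values are unchanged.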




Measures of Center
Measures of center: tell us the “typical” value of a distribution
Mean
Mean (average): add up all the values and divide by the total number of values
• Affected by outliers
Median
Median: arrange values smallest → largest and take the centermost value
• 50th percentile: 50% of the distribution below, 50% of the distribution above
• Resistant to outliers / extreme values (it changes little, if at all, when one is added)
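The difference in sensitivity can be seen with a small Python sketch using the standard `statistics` module. The income figures are made up for this example.

```python
import statistics

incomes = [30, 32, 35, 38, 40]       # made-up incomes, in thousands
with_outlier = incomes + [500]       # add one extreme value

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before)    # both are 35
print(mean_after, median_after)      # mean jumps to 112.5, median only moves to 36.5
```

With an even number of values, `statistics.median` averages the two middle values, which is why the median becomes (35 + 38) / 2 = 36.5.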

Describing Shape of a Distribution
• Can describe the shape of a distribution when it is represented as a histogram
o Histogram: shows the frequency distribution for univariate quantitative data
▪ All values for the variable on the x-axis; frequency on the y-axis
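A histogram is essentially a table of how many values fall into each interval (bin). The sketch below builds such a table and prints a rough text histogram. The function name `frequency_table`, the bin width of 10, and the scores are all assumptions made for this example.

```python
from collections import Counter

def frequency_table(values, bin_width):
    """Count how many values fall in each bin [start, start + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

scores = [52, 58, 61, 64, 67, 68, 71, 73, 75, 77, 79, 82, 85, 91]  # made-up data
table = frequency_table(scores, 10)

# Values (bins) label each row; the number of # marks is the frequency.
for start, freq in table.items():
    print(f"{start}-{start + 9}: {'#' * freq}")
```

The printout is turned on its side compared with a usual histogram: the bins run down the page and the bar length shows the frequency.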
Symmetry
Symmetry: the degree to which the distribution looks like a mirror image when split down the center
• The opposite of symmetric is skewed
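One simple way to picture the mirror-image idea is to compare a list of histogram bar heights with its own reverse. This is only an illustrative check with a made-up function name and made-up frequencies; real data are rarely perfectly symmetric, so in practice symmetry is judged by eye as a matter of degree.

```python
def is_mirror_image(frequencies):
    """True if the bar heights read the same left-to-right and right-to-left."""
    return frequencies == frequencies[::-1]

symmetric_bars = [1, 3, 5, 3, 1]   # made-up frequencies, mirror image about the center
skewed_bars = [5, 3, 2, 1, 1]      # made-up frequencies with a long tail on one side

print(is_mirror_image(symmetric_bars))  # True
print(is_mirror_image(skewed_bars))     # False
```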



Seller: oawn18, University of Western Ontario