Summary Advanced Statistics

Pages: 10
Uploaded on: 21-03-2025
Written in: 2024/2025

Summary of the advanced statistics course.

Summary Advanced Statistics
Some definitions
P-value => the probability, assuming H0 is true, of observing a difference at least as extreme as the
one in your sample.
  - P-hacking: performing a large number of statistical tests and reporting only the ones that are
    statistically significant, thereby increasing the risk of false positive results.
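
As a rough illustration of why p-hacking inflates false positives, the sketch below computes the chance of at least one significant result when every null hypothesis is true. It assumes the tests are independent and all use the same significance level; the function name is only illustrative.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """P(at least one false positive) over n_tests independent tests of true nulls."""
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them avoid it with probability (1 - alpha) ** n_tests.
    return 1.0 - (1.0 - alpha) ** n_tests


for m in (1, 5, 20, 100):
    rate = familywise_error_rate(m)
    print(f"{m:>3} tests -> P(at least one false positive) = {rate:.3f}")
```

With a single test the risk equals alpha, but it grows quickly with the number of tests, which is why reporting only the significant ones is misleading.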

Standard Error (SE) => a measure of the uncertainty of an estimate: how much the estimate is
expected to vary from sample to sample around the true population value.
  - It helps us understand how reliable our sample estimate is as an estimate of the
    population value.
  - A smaller standard error suggests a more reliable estimate, while a larger one indicates more
    uncertainty.
Standard deviation (SD) => tells us how spread out or varied a set of data points is around the average
(mean).
  - It helps us understand the degree of variability or dispersion in a dataset.
  - A larger standard deviation means the data points are more spread out, while a smaller one
    indicates they are closer to the mean.
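
The difference between SD and SE can be made concrete for the simplest estimate, the sample mean, where SE = SD / sqrt(n). This is a minimal sketch; the measurements are hypothetical and only serve as an example.

```python
import math
import statistics

# Hypothetical measurements, made up for illustration
sample = [4.1, 5.0, 5.6, 4.7, 5.3, 4.9, 5.8, 4.4]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)   # sample SD, uses n - 1 degrees of freedom
se = sd / math.sqrt(n)          # standard error of the mean

print(f"n = {n}, mean = {mean:.3f}")
print(f"SD = {sd:.3f}  (spread of the data points)")
print(f"SE = {se:.3f}  (uncertainty of the mean)")
```

The SD describes the data and does not shrink with more observations, while the SE of the mean gets smaller as n grows.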
Degrees of freedom (DF) => the number of values in the final calculation of a statistic that
are free to vary.
  - It measures the flexibility or constraints in the data.
  - It is the number of data points minus the number of parameters estimated or restrictions
    imposed in a statistical analysis.

Tidy data => every row is one measurement in space and time; columns are variables with meaning
in the context of a hypothesis or model. Long format!
  o Minimal number of columns = degrees of freedom of the model
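
To show what "long format" means in practice, here is a small sketch that reshapes a wide table (one column per time point) into one row per measurement. The data, the column names and the `to_long` helper are all hypothetical.

```python
# Wide format: one row per plant, one column per measurement day (hypothetical data)
wide = [
    {"plant": "A", "day_1": 2.0, "day_2": 2.6},
    {"plant": "B", "day_1": 1.8, "day_2": 2.1},
]


def to_long(rows, id_col, var_name, value_name):
    """Turn every non-id column into its own row: one measurement per row."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_col:
                continue
            long_rows.append({id_col: row[id_col], var_name: key, value_name: value})
    return long_rows


long = to_long(wide, id_col="plant", var_name="day", value_name="height_cm")
for row in long:
    print(row)
```

After reshaping, each column is a single variable (plant, day, height), which is the layout model-fitting functions generally expect.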
Power => the probability that a statistical test or analysis will correctly detect a true effect or
difference when it exists. It measures the ability of a test to avoid a "false negative" or Type II error,
indicating the test's sensitivity to finding real effects.

Type I Error => incorrectly rejecting a true null hypothesis. In other words, a false positive:
concluding that there is an effect or difference when there isn't one. An underestimate of the SE
makes this more likely.
Type II Error => failing to reject a false null hypothesis. In other words, a false negative:
concluding that there is no effect or difference when there actually is one. An overestimate of the
SE makes this more likely.
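
Power and the two error types can be estimated by simulation. The sketch below assumes a two-sided z-test with known standard deviation 1, a sample size of 20 and a true effect of 0.5 in the "H0 false" scenario; all of these numbers are arbitrary choices for illustration.

```python
import math
import random
from statistics import NormalDist


def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value of a z-test with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def rejection_rate(true_mean, n=20, n_sim=4000, alpha=0.05, seed=1):
    """Fraction of simulated experiments in which H0: mean = 0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_p_value(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1
    return rejections / n_sim


type_1_rate = rejection_rate(true_mean=0.0)  # H0 true: every rejection is a false positive
power = rejection_rate(true_mean=0.5)        # H0 false: every rejection is a correct detection
type_2_rate = 1.0 - power                    # H0 false: failures to reject are false negatives

print(f"Type I error rate  ~ {type_1_rate:.3f}")
print(f"Power              ~ {power:.3f}")
print(f"Type II error rate ~ {type_2_rate:.3f}")
```

The Type I error rate should land near the chosen alpha, while the power depends on the effect size, the variance and the sample size.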

Null deviance => the deviance of the null (intercept-only) model, i.e. the maximal deviance a model can explain.
Residual deviance => the deviance of the residuals (the variation not explained by the model).
- Residual deviance equal or close to the residual degrees of freedom => the model fits well
Deviance explained: (Null deviance - residual deviance) / Null deviance
- Overdispersion => having more variation or "spread" in the data than the model predicts,
  which can lead to inaccurate model results and conclusions.
  o You can deal with this in different ways:
    - The dispersion parameter can be used to correct for the
      underestimate/overestimate of the SE.
    - Quasi-Poisson (Poisson, but with more variance)
    - Negative binomial (Poisson, but with more variance; more complex, with separate
      parameters for mean and variance)
    - Mixed models, but only if there is a random-effect factor.
- Underdispersion => having less variation or "spread" than the model predicts, which can
  also affect the accuracy of model results and conclusions.
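
The deviance checks above amount to simple arithmetic. In this sketch the deviances, degrees of freedom and standard error are hypothetical numbers, and residual deviance divided by residual degrees of freedom is used as a rough dispersion check (software may estimate the dispersion parameter differently, for example from Pearson residuals).

```python
import math


def deviance_explained(null_deviance, residual_deviance):
    """(Null deviance - residual deviance) / null deviance."""
    return (null_deviance - residual_deviance) / null_deviance


def dispersion_ratio(residual_deviance, residual_df):
    """Rough check: about 1 is fine, clearly above 1 suggests overdispersion."""
    return residual_deviance / residual_df


# Hypothetical values, as they might be read from a Poisson GLM summary
null_dev, resid_dev, resid_df = 250.0, 180.0, 95

explained = deviance_explained(null_dev, resid_dev)
phi = dispersion_ratio(resid_dev, resid_df)

# A quasi-Poisson style correction scales the standard errors by sqrt(phi)
naive_se = 0.10                      # hypothetical SE from the Poisson fit
corrected_se = naive_se * math.sqrt(phi)

print(f"Deviance explained: {explained:.2f}")
print(f"Dispersion ratio:   {phi:.2f}")
print(f"SE {naive_se:.3f} -> {corrected_se:.3f} after correction")
```

With a ratio above 1 the corrected SE is larger than the naive one, which is the sense in which overdispersion leads to underestimated SEs.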

Fisher scoring => the number of iterations it took to find the best fit (4-8 is good, above 15 is bad).

Studies where the data are not independent:
- Longitudinal studies: each subject is measured over time.
- Repeated measurements: each subject receives multiple treatments.
- Nested designs: each subject is nested in one treatment (not a factorial design).
- Split-plot design: a combination of a factorial and a nested design.

Statistical Considerations of Study Design
- Balance => equal sample size per category
  o Not always possible, but it increases power and simplifies the analysis
- Replication => true replication is absolutely essential
  o The required sample size depends on…
    - Variance (the stochastic part of the process)
    - Effect size (how large the true differences are)
    - Model complexity (more parameters require more samples)
  o Variance and effect size can be determined from a pilot study, previous research, or
    expert knowledge.
  o Model complexity depends on what kind of comparison you want to make, what
    distribution you think the outcome has conditional on the explanatory variables,
    whether you believe there to be potential confounders that have to be included, etc.
  o No pseudoreplication (repeated measurements on the same experimental unit, like several
    leaves on one tree instead of leaves from multiple trees).
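
How variance and effect size drive the required sample size can be sketched with the standard normal-approximation formula for comparing two group means, n per group = 2 * ((z_alpha + z_power) * SD / effect)^2. This is a simplified illustration that ignores model complexity; the function name and the example values are assumptions.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, sd, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided, two-group comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) * sd / effect_size) ** 2)


# Hypothetical pilot-study values: the SD stays the same, the expected effect shrinks
for effect in (1.0, 0.5, 0.25):
    print(f"effect = {effect:<4} sd = 1.0 -> n per group = {n_per_group(effect, 1.0)}")
```

Halving the effect size (or doubling the SD) roughly quadruples the number of replicates needed.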
- Randomization => random allocation of treatments, locations, or even the order in which you
  process samples.
  o Avoids confounding effects
  o Without randomization, systematic differences can be confounded with the treatment: for example,
    samples run first will have slightly different measurement error than samples run last.
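
A balanced random allocation and a random processing order can be generated in a few lines. This is only a sketch: the unit names, the treatment labels and the `randomize` helper are hypothetical, and the seed is fixed only so the allocation can be reproduced.

```python
import random
from collections import Counter


def randomize(units, treatments, seed=None):
    """Return a balanced random treatment allocation and a random processing order."""
    rng = random.Random(seed)
    # Repeat the treatment labels to cover all units, then shuffle them
    labels = [treatments[i % len(treatments)] for i in range(len(units))]
    rng.shuffle(labels)
    allocation = dict(zip(units, labels))
    # Shuffle the processing order independently of the allocation
    order = list(units)
    rng.shuffle(order)
    return allocation, order


units = [f"plant_{i}" for i in range(1, 9)]
allocation, run_order = randomize(units, ["control", "drought"], seed=42)

print(Counter(allocation.values()))  # group sizes stay balanced
print(run_order)                     # order in which to process the samples
```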
- Blocking => a way to group similar things or subjects together.
  o For estimating, and so accounting for, confounding effects
  o A block is a subset of the experimental material within which experimental units are
    expected to be homogeneous (e.g., a microarray is a block).
  o Blocking can make it easier to detect the true effects of the factors you're studying by
    reducing the influence of other variables that could muddy the results.
  o Nested mixed models can use blocking (blocks nested in blocks).
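
Blocking and randomization are often combined by randomizing treatments separately within each block, so every block contains every treatment. The sketch below assumes one unit per treatment in each block; block names, sample names and the helper function are hypothetical.

```python
import random


def randomize_within_blocks(blocks, treatments, seed=None):
    """Randomly assign treatments to units separately inside each block."""
    rng = random.Random(seed)
    allocation = {}
    for block, members in blocks.items():
        # Cover the units in this block with the treatment labels, then shuffle
        labels = [treatments[i % len(treatments)] for i in range(len(members))]
        rng.shuffle(labels)
        for unit, label in zip(members, labels):
            allocation[unit] = (block, label)
    return allocation


# Hypothetical example: each microarray is a block holding three samples
blocks = {
    "array_1": ["s1", "s2", "s3"],
    "array_2": ["s4", "s5", "s6"],
}
treatments = ["control", "low_dose", "high_dose"]
allocation = randomize_within_blocks(blocks, treatments, seed=7)

for unit, (block, treatment) in allocation.items():
    print(unit, block, treatment)
```

Because every treatment occurs in every block, differences between blocks can be estimated separately instead of being mixed up with the treatment effect.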




Required sample size (n) depends on the complexity of the study design:
- Groups
- Natural variability
- Experimental techniques

- Small n => significance testing
- Medium n => regularized linear models (also suitable for small n)
- Large n => predictive models
Seller: mayastelzer (Universiteit Leiden)