Summary: Introduction to the Practice of Statistics

Pages: 27
Uploaded on: 24 November 2020
Written in: 2020/2021

A summary of the parts of the book 'Introduction to the Practice of Statistics' by Moore, McCabe, and Craig that are needed for the exam.

Chapters summarized: 7, 10, 11, 12, 13, 15
Content preview

7.1 Inference for the mean of a population
Both confidence intervals and tests of significance for the mean μ of a Normal population are
based on the sample mean x̄, which estimates the unknown μ. The sampling distribution of x̄
depends on σ. This fact causes no difficulty when σ is known. When σ is unknown, however,
we must estimate σ even though we are primarily interested in μ.
The t distributions
Suppose that we have a simple random sample (SRS) of size n from a Normally distributed
population with mean μ and standard deviation σ. The sample mean x̄ is then Normally
distributed with mean μ and standard deviation σ/√n. When σ is not known, we estimate it
with the sample standard deviation s, and then we estimate the standard deviation of x̄ by
s/√n. This quantity is called the standard error of the sample mean x̄, and we denote it by
SEx̄. So, SEx̄ = s/√n.
The standardized sample mean, or one-sample z statistic, z = (x̄ − μ)/(σ/√n), is the basis for
inference about μ when σ is known. This statistic has the standard Normal distribution
N(0,1). However, when we substitute the standard error s/√n for the standard deviation of
x̄, the statistic does not have a Normal distribution. It has a distribution that is new to us,
called a t distribution. A particular t distribution is specified by giving the degrees of freedom.
Suppose that an SRS of size n is drawn from an N(μ, σ) population; then the one-sample t
statistic t = (x̄ − μ)/(s/√n) has the t distribution with n − 1 degrees of freedom.
We use t(k) to stand for the t distribution with k degrees of freedom. The degrees of freedom
for this t statistic come from the sample standard deviation s in the denominator of t. As s
has n-1 degrees of freedom, there is a different t distribution for each sample size.
The density curves of the t(k) distributions are similar in shape to the standard Normal curve.
That is, they are symmetric about 0 and are bell-shaped. However, the t distributions have
more probability in the tails and less in the center. In reference to the standardized sample
mean, this greater spread is due to the extra variability caused by substituting the random
variable s for the fixed parameter σ. As the degrees of freedom k increase, the t(k) density
gets closer to the N(0,1) curve. This reflects the fact that s will be closer to σ (more precise)
as the sample size increases.
With the t distributions to help us, we can now analyze a sample from a Normal population
with unknown σ or a large sample from a non-Normal population with unknown σ. Table D of
the book gives critical values t* for the t distributions.
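The same critical values t* can be obtained numerically. A minimal sketch using scipy.stats (the degrees of freedom below are illustrative choices, not from the text) shows the 95%-level t* shrinking toward the Normal z* as k grows:

```python
# Sketch: t(k) critical values approach the N(0,1) critical value as k grows.
# The degrees of freedom below are illustrative, not taken from the summary.
from scipy import stats

for k in (2, 10, 30, 100):
    t_star = stats.t.ppf(0.975, df=k)   # upper 2.5% point of t(k), as in Table D
    print(f"t*({k:>3}) = {t_star:.3f}")

z_star = stats.norm.ppf(0.975)          # upper 2.5% point of N(0,1)
print(f"z*      = {z_star:.3f}")        # the t* values converge toward this
```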
The one-sample t confidence interval
The one-sample t confidence interval is similar in both reasoning and computational detail to
the z confidence interval. There, the margin of error for the population mean was z*(σ/√n).
When σ is unknown, we replace it with its estimate s and switch from z* to t*. This means
that the margin of error for the population mean when we use the data to estimate σ is
t*(s/√n). So, a level C confidence interval for μ is x̄ ± t*(s/√n), where t* is the upper
(1 − C)/2 critical value of the t(n − 1) distribution.
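As a sketch with made-up data (the sample values are hypothetical, not from the book), the interval x̄ ± t*(s/√n) can be computed as:

```python
import numpy as np
from scipy import stats

data = np.array([18.0, 21.5, 19.2, 20.8, 22.1, 19.9])  # hypothetical sample
n = len(data)
xbar = data.mean()
s = data.std(ddof=1)          # sample standard deviation (n - 1 in denominator)
se = s / np.sqrt(n)           # standard error of the sample mean
t_star = stats.t.ppf(0.975, df=n - 1)   # level C = 95%, so upper 2.5% critical value
lo, hi = xbar - t_star * se, xbar + t_star * se
print(f"95% CI for mu: ({lo:.2f}, {hi:.2f})")
```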
The one-sample t test
Significance tests using the standard error are also very similar to the z test that we studied
earlier. We still carry out the four steps common to all significance tests, but because we use
s in place of σ, we use a t distribution to find the P-value.
- Ha: μ > μ0: the P-value is P(T ≥ t)
- Ha: μ < μ0: the P-value is P(T ≤ t)
- Ha: μ ≠ μ0: the P-value is 2P(T ≥ |t|)
When in doubt, always use a two-sided test.
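A sketch of the test in scipy (the sample values and the hypothesized mean μ0 = 20 are made up for illustration):

```python
import numpy as np
from scipy import stats

data = np.array([18.0, 21.5, 19.2, 20.8, 22.1, 19.9])  # hypothetical sample
mu0 = 20.0                                             # hypothesized mean

# Two-sided test Ha: mu != mu0 (scipy's default alternative)
t_stat, p_two = stats.ttest_1samp(data, popmean=mu0)

# One-sided P-values from the same t statistic, df = n - 1
df = len(data) - 1
p_upper = stats.t.sf(t_stat, df)   # Ha: mu > mu0, P(T >= t)
p_lower = stats.t.cdf(t_stat, df)  # Ha: mu < mu0, P(T <= t)
print(f"t = {t_stat:.3f}, two-sided P = {p_two:.3f}")
```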
Matched pairs t procedures

Inference about a parameter of a single distribution is less common than comparative
inference. One common comparative design, however, makes use of single-sample
procedures. In a matched pairs study, subjects are matched in pairs, and their outcomes are
compared within each matched pair. For example, an experiment to compare two
smartphone packages might use pairs of subjects who are the same age, sex, and income
level. The idea is that matched subjects are more similar than unmatched subjects, so
comparing outcomes within each pair is more efficient. Matched pairs are also common
when randomization is not possible. For example, one situation calling for matched pairs is
when observations are taken on the same subjects under two different conditions or before
and after some intervention.
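A matched pairs analysis reduces to a one-sample t procedure applied to the within-pair differences. A sketch with hypothetical before/after measurements on the same subjects:

```python
import numpy as np
from scipy import stats

before = np.array([12.1, 14.3, 11.8, 13.5, 12.9])  # hypothetical first condition
after = np.array([13.0, 14.9, 12.2, 14.4, 13.1])   # same subjects, second condition

diffs = after - before
# One-sample t test on the differences, H0: mean difference = 0
t1, p1 = stats.ttest_1samp(diffs, popmean=0.0)

# Equivalent paired test provided directly by scipy
t2, p2 = stats.ttest_rel(after, before)
print(f"t = {t1:.3f}, P = {p1:.4f}")  # both calls give identical results
```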
Robustness of the t procedure
All inference procedures are based on some conditions, such as Normality. Procedures that
are not strongly affected by violations of a condition are called robust. Robust procedures
are very useful in statistical practice because they can be used over a wide range of
conditions with good performance.
The assumption that the population is Normal rules out outliers, so the presence of outliers
shows that this assumption is not valid. The t procedures are not robust against outliers
because x and s are not resistant to outliers. Fortunately, the t procedures are quite robust
against non-Normality of the population except in the case of outliers or strong skewness.
Larger samples improve the accuracy of P-values and critical values from the t distributions
when the population is not Normal. This is true for two reasons:
1. The sampling distribution of the sample mean x from a large sample is close to
Normal (that’s the central limit theorem). Normality of the individual observations is of
little concern when the sample is large.
2. As the sample size n grows, the sample standard deviation s will be an accurate
estimate of σ whether or not the population has a Normal distribution. This fact is
closely related to the law of large numbers.
Except in the case of small samples, the assumption that the data are an SRS from the
population of interest is more crucial than the assumption that the population distribution is
Normal. Here are practical guidelines for inference on a single mean:
- Sample size less than 15: Use t procedures if the data are close to Normal. If the
data are clearly non-Normal or if outliers are present, do not use t.
- Sample size at least 15 and less than 40: The t procedures can be used except in the
presence of outliers or strong skewness.
- Large samples: The t procedures can be used even for clearly skewed distributions
when the sample is large.




7.2 Comparing two means
Two-sample problems are among the most common situations encountered in statistical
practice. With two-sample problems, the goal of inference is to compare the means of the

response variable in two groups. Each group is considered to be a sample from a distinct
population. The responses in each group are independent of those in the other group.
A two-sample problem can arise from a randomized comparative experiment that randomly
divides the subjects into two groups and exposes each group to a different treatment. It can
also arise when comparing random samples separately selected from two populations.
Unlike the matched pairs designs studied earlier, there is no matching of the units in the two
samples, and the two samples may be of different sizes. When both population distributions
are symmetric, and especially when they are at least approximately Normal, a comparison of
the mean responses in the two populations is most often the goal of inference.
The two-sample z statistic
The natural estimator of the difference μ1 - μ2 is the difference between the sample means,
x 1 - x 2. If we are to base inference on this statistic, we must know its sampling distribution.
Here are some facts from our study of probability:
- The mean of the difference x̄1 − x̄2 is μ1 − μ2, and its variance is the sum of the
variances of x̄1 and x̄2. The variance result follows from the addition rule for
variances: because the samples are independent, their sample means are
independent random variables.
- If the two population distributions are both Normal, then the distribution of x 1 - x 2 is
also Normal. This is true because each sample mean alone is Normally distributed
and because a difference between independent Normal random variables is also
Normal.
In the unlikely event that both population standard deviations σ1 and σ2 are known, the
two-sample z statistic

z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2)

is the basis for inference about μ1 − μ2, and it has the standard Normal N(0,1) distribution.
Exact z procedures are seldom used, however, because σ1 and σ2 are rarely known.

The two-sample t procedures
Suppose now that the population standard deviations σ1 and σ2 are not known. We estimate
them by the sample standard deviations s1 and s2 from our two samples. Following the
pattern of the one-sample case, we substitute the standard errors for the standard deviations
used in the two-sample z statistic. The result is the two-sample t statistic:
t = ((x̄1 − x̄2) − (μ1 − μ2)) / √(s1²/n1 + s2²/n2)

Unfortunately, this statistic does not have a t distribution. Nonetheless, we can approximate
its distribution by using the t(k) distribution with an approximation for the degrees of
freedom k. We use these approximations to find approximate values of t* for confidence
intervals and to find approximate P-values for significance tests. Here are two
approximations:
1. Use an approximation known as the Satterthwaite approximation for the value of k. It
is calculated from the data and, in general, will not be a whole number.
2. Use k equal to the smaller of n1 - 1 and n2 - 1.
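Both approximations can be sketched in code (the two samples below are hypothetical). scipy's `ttest_ind` with `equal_var=False` performs Welch's t test, which uses the Satterthwaite approximation for k:

```python
import numpy as np
from scipy import stats

x1 = np.array([22.1, 19.8, 24.3, 21.0, 23.5, 20.2])  # hypothetical sample 1
x2 = np.array([18.4, 17.9, 20.1, 19.3, 18.8])        # hypothetical sample 2

# Approximation 1: Satterthwaite df, as used by Welch's t test
t_stat, p = stats.ttest_ind(x1, x2, equal_var=False)

n1, n2 = len(x1), len(x2)
v1, v2 = x1.var(ddof=1) / n1, x2.var(ddof=1) / n2     # squared standard errors
k_satt = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Approximation 2: the conservative choice, smaller of n1 - 1 and n2 - 1
k_cons = min(n1 - 1, n2 - 1)
print(f"t = {t_stat:.3f}, Satterthwaite k = {k_satt:.2f}, conservative k = {k_cons}")
```

The Satterthwaite k is generally not a whole number and always lies between min(n1 − 1, n2 − 1) and n1 + n2 − 2, so the conservative choice gives a slightly wider interval or larger P-value.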
The two-sample t confidence interval
We now apply the basic ideas about t procedures to the problem of comparing two means
when the standard deviations are unknown. We start with confidence intervals.
Suppose that an SRS of size n1 is drawn from a Normal population with unknown mean μ1
and that an independent SRS of size n2 is drawn from another Normal population with
unknown mean μ2.