Statistics 7.3
7.3 Optional Topics in Comparing Distributions
Inference for population spread
The two most basic descriptive features of a distribution are its
center and spread
F test for comparing the spread of two Normal populations
Unlike the t procedures for means, the F test and other
procedures for standard deviations are extremely sensitive to
non-Normal distributions
The lack of robustness does not improve in large samples
The F test for equality of spread
The F statistic and F distributions: when $s_1^2$ and $s_2^2$ are
sample variances from independent SRSs of sizes $n_1$ and $n_2$
drawn from Normal populations, the F statistic
o $F = \dfrac{s_1^2}{s_2^2}$
o has the F distribution with $n_1 - 1$ and $n_2 - 1$ degrees of
freedom when $H_0: \sigma_1 = \sigma_2$ is true
The F distributions are a family of distributions with two
parameters: the degrees of freedom of the sample variances
in the numerator and denominator of the F statistic
We write F(j, k) for the F distribution with j degrees of freedom in the
numerator and k degrees of freedom in the denominator
The F distributions are not symmetric, but right-skewed
Because the sample variances cannot be negative, the F
statistic takes only positive values and the F distribution has no
probability below 0
The peak of the F distribution is near 1
o Values far from 1 in either direction provide evidence
against the hypothesis of equal standard deviations
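The right skew can be checked numerically. The sketch below assumes SciPy is available; the helper name `f_tail_points` and the degrees of freedom (9, 14) are illustrative choices, not taken from the notes. It prints the points of F(9, 14) with probability 0.05 below and above, and shows that the lower point equals the reciprocal of the upper point of F(14, 9).

```python
from scipy import stats


def f_tail_points(j, k, tail=0.05):
    """Return the points of F(j, k) with probability `tail` below and above."""
    lower = stats.f.ppf(tail, j, k)      # probability `tail` lies below this point
    upper = stats.f.ppf(1 - tail, j, k)  # probability `tail` lies above this point
    return lower, upper


# Illustrative degrees of freedom (an assumption, not from the notes).
lower, upper = f_tail_points(9, 14)

# The lower point is the reciprocal of the upper point of the F distribution
# with the degrees of freedom swapped, not a mirror image of `upper`.
upper_swapped = stats.f.ppf(0.95, 14, 9)
print(lower, upper, 1 / upper_swapped)
```

The two tail points sit at very different distances from 1, which is why a table of upper-tail critical values cannot simply be reflected to get lower-tail ones.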
Table E
In F distributions the point with probability 0.05 below it is not
just the negative of the point with probability 0.05 above it
1. Take the test statistic to be $F = \dfrac{\text{larger } s^2}{\text{smaller } s^2}$. This
amounts to naming the populations so that $s_1^2$ is the
larger of the observed sample variances. The resulting F
is always 1 or greater
2. Compare the value of F with the critical values from
Table E. Then double the probabilities obtained from the
table to get the significance level for the two-sided F test.
The idea is that we calculate the probability in the upper tail
and double it to obtain the probability of all ratios on either side
of 1 that are at least as improbable as that observed.
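The two steps above can be sketched in software, with the upper-tail probability of the F distribution standing in for a Table E lookup. This is a minimal sketch assuming SciPy is available; the function name `two_sided_f_test` and the sample data are hypothetical.

```python
from statistics import variance

from scipy import stats


def two_sided_f_test(x, y):
    """F test of H0: sigma_1 = sigma_2 with the larger sample variance on top."""
    vx, vy = variance(x), variance(y)  # sample variances (divisor n - 1)
    if vx >= vy:
        f_stat, df_num, df_den = vx / vy, len(x) - 1, len(y) - 1
    else:
        f_stat, df_num, df_den = vy / vx, len(y) - 1, len(x) - 1
    upper_tail = stats.f.sf(f_stat, df_num, df_den)  # P(F >= observed)
    p_value = min(1.0, 2 * upper_tail)               # double for the two-sided test
    return f_stat, df_num, df_den, p_value


# Made-up data, for illustration only.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
print(two_sided_f_test(group_a, group_b))
```

Because the procedure is so sensitive to non-Normality, the p-value is only trustworthy when both populations are close to Normal.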
Robustness of Normal inference procedures
The F test and other procedures about variances are so
lacking in robustness as to be of little use in practice
The robustness of the one-sample and two-sample t
procedures is remarkable
The t test and the corresponding CIs are among the most
reliable tools that statisticians use
Outliers can greatly disturb the t procedures
Two-sample procedures are less robust when the sample sizes
are not similar
The lack of robustness of the tests for variances is equally
remarkable
The power of the two-sample t test
Noncentral t distribution –
To find the power for the pooled two-sample t test, we
consider only $H_0: \mu_1 - \mu_2 = 0$
1. Specify
a. An alternative value for $\mu_1 - \mu_2$ that you consider
important to detect;
b. The sample sizes, $n_1$ and $n_2$;
c. A fixed significance level, $\alpha$;
d. A guess at the standard deviation, $\sigma$
2. Find the degrees of freedom $df = n_1 + n_2 - 2$ and the value
of t* that will lead to rejection of $H_0$
3. Calculate the noncentrality parameter
$\delta = \dfrac{|\mu_1 - \mu_2|}{\sigma \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
4. Find the power as the probability that a noncentral t
random variable with degrees of freedom df and
noncentrality parameter $\delta$ will be greater than t*. The
denominator in the noncentrality parameter,
$\sigma \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$,
is our guess at the standard error for the difference in
the sample means
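The four steps can be sketched with SciPy's noncentral t distribution. The function name `pooled_t_power`, its parameters, and the example inputs are hypothetical. The notes do not say whether t* is the one-sided or two-sided critical value, so the sketch assumes two-sided by default and, like step 4, counts only the probability above t*.

```python
from math import sqrt

from scipy import stats


def pooled_t_power(diff, sigma, n1, n2, alpha=0.05, two_sided=True):
    """Approximate power of the pooled two-sample t test, following steps 1-4."""
    df = n1 + n2 - 2                                     # step 2: degrees of freedom
    tail = alpha / 2 if two_sided else alpha
    t_star = stats.t.ppf(1 - tail, df)                   # step 2: critical value t*
    delta = abs(diff) / (sigma * sqrt(1 / n1 + 1 / n2))  # step 3: noncentrality
    # Step 4: P(noncentral t > t*); the small probability of rejecting in the
    # lower tail of a two-sided test is ignored, as in the notes.
    return stats.nct.sf(t_star, df, delta)


# Made-up planning values: detect a difference of 1 when sigma is guessed to be 2.
for n in (10, 50):
    print(n, pooled_t_power(diff=1.0, sigma=2.0, n1=n, n2=n))
```

Rerunning the function over a range of sample sizes is a simple way to find the smallest n that reaches a target power.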
Lecture 23
Independent groups t test
There are many kinds of independent groups tests:
o Wilcoxon Rank Sum Test
o Permutation Test
o t test (equal variance NOT assumed)
o t test (equal variance assumed)
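As a rough illustration, the four tests can be run side by side on the same two samples. The sketch assumes NumPy and SciPy are available; the data and the helper names `permutation_p_value` and `compare_groups` are made up. `scipy.stats.ranksums` uses a Normal approximation for the rank sum test, and the permutation test here is a simple hand-rolled version based on the difference in sample means.

```python
import numpy as np
from scipy import stats


def permutation_p_value(x, y, n_resamples=5000, seed=0):
    """Two-sided permutation test on the difference in sample means."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    observed = abs(np.mean(x) - np.mean(y))
    count = 0
    for _ in range(n_resamples):
        shuffled = rng.permutation(pooled)  # random relabeling of the groups
        diff = abs(shuffled[:len(x)].mean() - shuffled[len(x):].mean())
        if diff >= observed:
            count += 1
    return (count + 1) / (n_resamples + 1)  # add-one keeps the estimate above 0


def compare_groups(x, y):
    """Return the p-value of each independent-groups test for samples x and y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return {
        "wilcoxon_rank_sum": stats.ranksums(x, y).pvalue,
        "permutation": permutation_p_value(x, y),
        "t_unequal_var": stats.ttest_ind(x, y, equal_var=False).pvalue,
        "t_equal_var": stats.ttest_ind(x, y, equal_var=True).pvalue,
    }


# Made-up data, for illustration only.
group_1 = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1]
group_2 = [3.9, 4.2, 3.5, 4.8, 4.0, 3.7, 4.4, 4.1]
for name, p in compare_groups(group_1, group_2).items():
    print(name, p)
```

When the samples are roughly Normal with similar spread, the four p-values tend to agree; large disagreements are a hint to look at outliers or unequal variances.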