ISYE 6414 - MIDTERM 1 PREP EXAM
QUESTIONS WITH CORRECT
DETAILED ANSWERS
Prediction interval is wider than the confidence interval for the mean response. -
Answer-This is because we have additional uncertainty due to predicting a new response
under a new setting, whereas the confidence interval under estimation reflects an
average across all settings for that specific value of x
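A minimal sketch of why the prediction interval is wider, assuming a simple linear regression with one predictor. The function name `interval_standard_errors` and the toy data are illustrative assumptions, not course material. Both intervals use the same t critical value, so comparing the two standard errors is enough to compare widths.

```python
import math

def interval_standard_errors(x, y, x0):
    """Return (se_mean, se_pred) at x0 for a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope estimate
    b0 = y_bar - b1 * x_bar        # intercept estimate
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)             # estimate of the error variance
    core = 1 / n + (x0 - x_bar) ** 2 / sxx
    se_mean = math.sqrt(s2 * core)        # uncertainty in the estimated mean response
    se_pred = math.sqrt(s2 * (1 + core))  # adds the variance of one new observation
    return se_mean, se_pred

# Hypothetical toy data
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
se_mean, se_pred = interval_standard_errors(x, y, x0=3.5)
print(se_mean, se_pred)
```

The extra `1 +` term in `se_pred` is the variance of the new observation itself, which is the "additional uncertainty" the card refers to.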
If the scatter plot of the residuals is not random around the zero line - Answer-the
relationship between x and y may not be linear, the variances of the error terms may
not be equal, or the response data or the error terms may not be independent
residuals are clustered in two separated clusters - Answer-means that the residuals
may be correlated due to some clustering effect
independence - Answer-residual analysis cannot be used to check for the
___ assumption
For checking normality, - Answer-we can use the quantile plot, or normal probability plot
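A rough sketch of the points behind a normal probability (quantile) plot, using only the standard library. The function name `normal_qq_points`, the plotting positions (i - 0.5)/n, and the residual values are assumptions chosen for illustration; other plotting-position conventions exist.

```python
from statistics import NormalDist

def normal_qq_points(residuals):
    """Pair each ordered residual with the matching standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n are one common convention (an assumption here).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Hypothetical residuals
points = normal_qq_points([0.3, -1.2, 0.8, -0.1, 1.5, -0.7])
for q, r in points:
    print(f"{q:7.3f} {r:7.3f}")
```

If the residuals are roughly normal, these pairs fall close to a straight line when plotted.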
If some of the assumptions do not hold - Answer-then we interpret that the model fit is
inadequate, but it does not mean that the regression is not useful.
to model the nonlinear relationship - Answer-we can transform X by some nonlinear
function such as f(x) = x^a or f(x) = log(x)
If λ=0 - Answer-we actually use the natural logarithmic transformation
If λ=-1 - Answer-use the inverse of y, 1/y. The family of power transformations of y
indexed by λ is called the Box-Cox transformation.
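A small sketch of the power transformation described in the two cards above, in the simple form y^λ with the logarithm at λ=0. The function name `power_transform` is hypothetical. Note that many references define Box-Cox as (y^λ - 1)/λ instead; this sketch follows the cards' simpler form.

```python
import math

def power_transform(y, lam):
    """Transform a positive response value y by the power lam (log when lam == 0)."""
    if y <= 0:
        raise ValueError("y must be positive for a power/log transformation")
    if lam == 0:
        return math.log(y)  # lambda = 0: natural logarithm
    return y ** lam         # lambda = -1 gives the inverse 1/y

print(power_transform(4.0, -1))   # inverse of y
print(power_transform(4.0, 0.5))  # square root of y
print(power_transform(4.0, 0))    # natural log of y
```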
outliers, - Answer-are data points
far from the majority of the data in x and/or y
leverage points - Answer-Data points that are far from the mean of the x's
influential point - Answer-A data point that is far from the mean of
the x's and/or the y's and influences the regression model fit significantly
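One way to quantify "far from the mean of the x's" in simple linear regression is the leverage value h_i = 1/n + (x_i - x̄)² / Sxx. The sketch below assumes that formula and a 2p/n rule of thumb for flagging (with p = 2 coefficients); the threshold and the data are assumptions, not part of the cards.

```python
def leverages(x):
    """Leverage of each observation in a simple linear regression on x."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

# Hypothetical predictor values; the last one sits far from the rest
x = [1, 2, 3, 4, 20]
h = leverages(x)
threshold = 2 * 2 / len(x)  # rule of thumb 2p/n with p = 2 (an assumption)
high_leverage = [i for i, hi in enumerate(h) if hi > threshold]
print(high_leverage)
```

A high-leverage point is not automatically influential; it becomes influential only if it also changes the fitted line noticeably.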
When outliers belong in the data, - Answer-you will have to
perform the statistical analysis with and without the outliers and inform the reader
about how an outlier influences the regression fit
To check outliers, - Answer-a very simple approach is to use the standardized residuals
and compare the standardized residuals to the -2 and 2 band or even tighter, the -1 and
1 band.
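A minimal sketch of the standardized-residual check for a simple linear regression. The function name `standardized_residuals` and the data are illustrative assumptions; the residuals are scaled by s·sqrt(1 - h_i), one common definition of a standardized residual.

```python
import math

def standardized_residuals(x, y):
    """Standardized residuals from a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(e ** 2 for e in resid) / (n - 2))  # residual standard error
    lev = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]  # leverage of each point
    return [e / (s * math.sqrt(1 - h)) for e, h in zip(resid, lev)]

# Hypothetical data: an exact line with one shifted point at index 4
x = list(range(1, 11))
y = [2 * xi + 1 for xi in x]
y[4] += 10
r = standardized_residuals(x, y)
flagged = [i for i, ri in enumerate(r) if abs(ri) > 2]  # outside the -2 and 2 band
print(flagged)
```

Switching the cutoff from 2 to 1 gives the tighter band mentioned in the card, at the cost of flagging more points.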
coefficient of determination - Answer-R^2, the proportion of the variability in the
response explained by the linear model; it indicates whether the linear model is useful
for prediction.
correlation coefficient - Answer-approach to establish the linear relationship between
two variables.
the square of the correlation coefficient is actually - Answer-R squared (in simple
linear regression)
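A short numerical check, on made-up data, that R squared from a simple linear regression matches the squared correlation coefficient. The function name `r_squared_and_correlation` is hypothetical.

```python
import math

def r_squared_and_correlation(x, y):
    """Return (R^2 from the fitted line, Pearson correlation r) for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - sse / syy              # coefficient of determination
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return r2, r

# Hypothetical toy data
x = [1, 2, 3, 4, 5]
y = [1.2, 1.9, 3.2, 3.8, 5.1]
r2, r = r_squared_and_correlation(x, y)
print(r2, r ** 2)
```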
To evaluate the constant variance and the assumption of uncorrelated errors, - Answer-we
can use a scatter plot of the residuals vs fitted values, which is the second plot
Testing for β0 equal to zero means - Answer-testing for statistical significance
We do not see a grouping of the residuals, - Answer-meaning that the assumption of
uncorrelated errors possibly holds.
If there's no pattern in this plot, - Answer-we conclude the linearity assumption holds.
using the correlation - Answer-approach to identify a transformation that will improve the
linearity between two factors is ___
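A sketch of using the correlation to compare candidate transformations of x. The candidate set, the helper name `pearson`, and the data (constructed to be linear in log(x)) are all assumptions made for illustration.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Hypothetical data built so that y is linear in log(x)
x = [1, 2, 4, 8, 16, 32]
y = [0.5 + 2 * math.log(xi) for xi in x]

candidates = {
    "x": x,
    "sqrt(x)": [math.sqrt(xi) for xi in x],
    "log(x)": [math.log(xi) for xi in x],
}
# The transformation with the largest absolute correlation is the most linear with y
scores = {name: abs(pearson(vals, y)) for name, vals in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```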
In one-way ANOVA - Answer-we have k different
populations or groups, and for each population we observe a sample of data for the
response variable Y.
mu_k and sigma_k squared for kth population - Answer-true mean and variance for the
response variable are ___
Y bar in ANOVA - Answer-the mean
estimate is the sample mean
S squared in ANOVA - Answer-the variance estimate is the sample variance
The overarching objective in the ANOVA - Answer-is to compare the means across the k
populations
the within-variability to the between variability of the response data. - Answer-in
ANOVA, we compare
The between variability - Answer-is the variability between the means across the groups,
proxied by the middle lines in the boxplots
The within variability - Answer-is the variability within each group; it can be assessed
visually by the variability within each box
if the between-variability is larger than the within-variability - Answer-We will find
significant differences across the means
In ANOVA, the primary objectives are to: - Answer-1) Analyze the variability in the data
using the ANOVA table
2) Use this analysis of variance to test for equal means
3) Estimate confidence intervals for all the pairs of means
Analyze the variability in the data using the ANOVA table - Answer-That means we
compare the variability within each group to the variability between the means. We
represent all the information in a table, laying out all the components needed to make
the comparison.
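A minimal sketch of the quantities laid out in a one-way ANOVA table: between-group and within-group sums of squares, their degrees of freedom, mean squares, and the F statistic. The function name `one_way_anova` and the three small groups are made up for illustration.

```python
def one_way_anova(groups):
    """Compute the main entries of a one-way ANOVA table for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between variability: group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within variability: observations around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Hypothetical data for k = 3 groups
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

A large F means the between variability dominates the within variability, which is evidence against equal means.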
Testing for equal means - Answer-Specifically, we will test the null hypothesis that all
means are equal versus the alternative that at least two of the means are not equal.
Estimate confidence intervals - Answer-for all the pairs of means, in order to identify
which of the means are not equal, or which of the means are statistically significantly
different. Specifically, we will consider a hypothesis test for each pair of means with the
null hypothesis that the means in the pair are equal versus the alternative that they are
not equal. We will perform all the hypothesis tests across all pairs jointly
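A sketch of setting up the pairwise comparisons: list every pair of groups, compute the difference in sample means, and split the overall significance level across the pairs. The Bonferroni split used here is one simple way to account for testing jointly and is an assumption, not necessarily the method the cards intend; the group names and data are hypothetical.

```python
from itertools import combinations

def pairwise_mean_differences(groups, alpha=0.05):
    """Return mean differences for all pairs and a Bonferroni per-test level."""
    means = {name: sum(vals) / len(vals) for name, vals in groups.items()}
    pairs = list(combinations(means, 2))  # k(k-1)/2 pairs of group names
    per_test_alpha = alpha / len(pairs)   # Bonferroni split (an assumption)
    diffs = {(a, b): means[a] - means[b] for a, b in pairs}
    return diffs, per_test_alpha

# Hypothetical groups
groups = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 7]}
diffs, per_test_alpha = pairwise_mean_differences(groups)
for pair, d in diffs.items():
    print(pair, d)
print(per_test_alpha)
```

Each difference would then get its own interval at the adjusted level; a pair whose interval excludes zero is judged significantly different.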
j - Answer-is the index within group