Applied Healthcare Statistics Final
Exam with Complete Solutions
discrete data - ANSWER-distinct values, can be counted, unconnected points (ex.
number of students)
continuous data - ANSWER-values are within a range, measured not counted, no gaps
between data points (ex. time in a race)
grand total - ANSWER-the bottom right corner of a two way frequency table, total size of
the data set
conditional percentages - ANSWER-computed by dividing the joint frequency by the
corresponding marginal frequency in a two way frequency table
overall percentages - ANSWER-computed by dividing each frequency by the grand total
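A minimal sketch of the three table calculations above. The counts and the row and column labels are made up for illustration and are not course data.

```python
# Hypothetical two-way frequency table: rows = exposure group, columns = outcome.
table = {
    "smoker":    {"disease": 30, "no disease": 70},
    "nonsmoker": {"disease": 10, "no disease": 90},
}

# Marginal (row) totals, then the grand total (bottom-right corner of the table).
row_totals = {row: sum(cells.values()) for row, cells in table.items()}
grand_total = sum(row_totals.values())

# Conditional percentage: joint frequency / corresponding marginal frequency.
cond_pct = 100 * table["smoker"]["disease"] / row_totals["smoker"]

# Overall percentage: joint frequency / grand total.
overall_pct = 100 * table["smoker"]["disease"] / grand_total

print(grand_total, cond_pct, overall_pct)
```

Here the conditional percentage answers "what share of smokers have the disease?", while the overall percentage answers "what share of everyone is a smoker with the disease?".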
side by side box plots - ANSWER-C -> Q, a box plot is displayed for each category of
the explanatory variable on the same graph
scatterplot - ANSWER-Q -> Q
the data form ordered pairs that are graphed on the coordinate plane
positive correlation - ANSWER-scatterplot, Q -> Q
as the explanatory variable increases, the response variable increases
negative correlation - ANSWER-scatterplot, Q -> Q
as the explanatory variable increases, the response variable decreases
no correlation - ANSWER-scatterplot, Q -> Q
no trends between variables
non-linear relationship - ANSWER-scatterplot, Q -> Q
scatterplot reveals a trend that is NOT a straight line
second quartile - ANSWER-Q2
the median
first quartile - ANSWER-Q1
the median of the data below Q2
third quartile - ANSWER-Q3
the median of the data above Q2
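The quartile cards can be sketched with the median-of-halves method they describe. The function name `quartiles` and the sample values are hypothetical, and other software may use slightly different quartile conventions.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above the median position
    return median(lower), median(data), median(upper)

q1, q2, q3 = quartiles([1, 3, 5, 7, 9, 11, 13])
print(q1, q2, q3)
```

When the data set has an odd number of values, this sketch leaves the middle value out of both halves.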
standard deviation - ANSWER-a measure of spread: roughly the typical distance of each data point from the mean (the square root of the average squared deviation)
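As a rough illustration, the standard deviation can be computed by hand and checked against Python's `statistics` module. The data values are made up. Dividing by n gives the population version and dividing by n - 1 gives the sample version.

```python
import math
from statistics import mean, pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical measurements
m = mean(data)

# Squared distance of each data point from the mean.
squared_devs = [(x - m) ** 2 for x in data]

population_sd = math.sqrt(sum(squared_devs) / len(data))        # divide by n
sample_sd = math.sqrt(sum(squared_devs) / (len(data) - 1))      # divide by n - 1

print(m, population_sd, sample_sd)
```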
Empirical Rule - ANSWER-for normal distributions
about 68% of data is within 1 standard deviation of the mean
about 95% of data is within 2 standard deviations of the mean
about 99.7% of data is within 3 standard deviations of the mean
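The Empirical Rule percentages can be checked against a standard normal curve. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```

The exact shares are slightly different from the rounded 68, 95, and 99.7 figures, which is why the rule is an approximation.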
Simpson's Paradox - ANSWER--occurs when a trend that appears in separate groups of data
disappears or reverses when the groups are combined
-can only occur when the sizes of the groups are unequal
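A small sketch of how Simpson's Paradox can arise, using illustrative counts (not course data) for two treatments, A and B, across two subgroups of very different sizes. In this example the trend does not just vanish when the groups are combined, it flips.

```python
# (successes, patients) for each treatment within each subgroup; illustrative counts.
groups = {
    "small stones": {"A": (81, 87), "B": (234, 270)},
    "large stones": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, total):
    return successes / total

# Within every subgroup, treatment A has the higher success rate.
a_wins_each_group = all(rate(*g["A"]) > rate(*g["B"]) for g in groups.values())

# Combine the subgroups by summing successes and patients per treatment.
combined = {
    t: (sum(g[t][0] for g in groups.values()),
        sum(g[t][1] for g in groups.values()))
    for t in ("A", "B")
}

# After combining, treatment B has the higher success rate.
b_wins_combined = rate(*combined["B"]) > rate(*combined["A"])

print(a_wins_each_group, combined, b_wins_combined)
```

The flip happens because treatment A is applied mostly to the harder subgroup and treatment B mostly to the easier one, so the unequal group sizes weight the combined rates differently.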
lurking variables - ANSWER-a variable not included in the study that affects the variables
that are included in the study
regression equation - ANSWER-an equation modeling the relationship between two
quantitative variables
simple linear equation - ANSWER-AKA regression line or line of best fit
-x is the explanatory variable
-y is the response variable
-equation is y = mx + b, where m is the slope and b is the y-intercept
-used to predict data (plug in values for x and find corresponding values for y)
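A minimal sketch of finding m and b and making a prediction, assuming the line of best fit is the usual least-squares line (the card does not say which fitting method is used). The function name `fit_line` and the data are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx           # slope
    b = y_bar - m * x_bar   # y-intercept
    return m, b

xs = [1, 2, 3, 4, 5]        # explanatory variable (made-up values)
ys = [3, 5, 7, 9, 11]       # response variable (made-up values)
m, b = fit_line(xs, ys)

# Plug an x value into the equation to predict the corresponding y.
predicted_y = m * 6 + b
print(m, b, predicted_y)
```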
linear interpolation - ANSWER-predictions between known data points
linear extrapolation - ANSWER-predictions larger or smaller than the known data points (outside the range of the data)
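The difference between the two kinds of prediction comes down to whether the new x value falls inside the range of the observed x values. A small sketch, with a hypothetical helper name:

```python
def prediction_type(x_new, observed_xs):
    """Classify a prediction by where x_new falls relative to the observed data."""
    if min(observed_xs) <= x_new <= max(observed_xs):
        return "interpolation"
    return "extrapolation"

observed_xs = [1, 2, 3, 4, 5]   # made-up explanatory values
print(prediction_type(3.5, observed_xs))
print(prediction_type(8, observed_xs))
```

Extrapolated predictions are generally less trustworthy, because the linear trend may not continue outside the observed range.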
p-value - ANSWER-the probability of getting results at least as extreme as those observed if chance alone were at work
significance level - ANSWER-the probability threshold (commonly 0.05) below which we consider results
unlikely to have happened by chance
p-value < significance level - ANSWER-results are significant
p-value > significance level - ANSWER-results are not significant
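To make the comparison concrete, here is a sketch with a made-up scenario: getting 9 or more heads in 10 tosses of a fair coin. The function name, the scenario, and the 0.05 significance level are assumptions chosen for illustration.

```python
from math import comb

def p_value_at_least(heads, flips):
    """One-sided p-value: chance of `heads` or more heads in `flips` fair coin tosses."""
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2 ** flips

alpha = 0.05                     # assumed significance level
p = p_value_at_least(9, 10)      # chance of 9 or 10 heads by luck alone

# Results are significant when the p-value is below the significance level.
significant = p < alpha
print(p, significant)
```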
less than or greater than - ANSWER-marked with (parentheses), open circle on number
line
less than or equal to, greater than or equal to - ANSWER-marked with [brackets], closed/shaded circle on
number line
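A short sketch of writing an inequality in interval notation, assuming the standard convention that square brackets mark an included endpoint and parentheses mark an excluded one. The helper name `interval` is hypothetical.

```python
def interval(low, high, include_low, include_high):
    """Format an interval: brackets for included endpoints, parentheses otherwise."""
    left = "[" if include_low else "("
    right = "]" if include_high else ")"
    return f"{left}{low}, {high}{right}"

# 2 < x <= 5: open circle at 2, closed circle at 5.
print(interval(2, 5, include_low=False, include_high=True))
```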