One of the more challenging aspects of data analysis is deciding which statistical test
fits the circumstances and then carrying out the steps in the statistical software correctly.
There are several types of decision trees you can use to select a statistical test, but we will
look at just one type in this assignment.
At the most fundamental level, statistical tests are usually chosen according to:
- The level of measurement of the data you have collected to answer the research question
in your study (nominal, ordinal, or interval/ratio).
- The number of samples (groups) being compared on a given variable.
- What you wish the test to do (find differences between samples/groups, explore
relationships between variables, make predictions using different variables).
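The first two criteria can be sketched as a small lookup. This is only an illustration for the "find differences between groups" goal with independent groups. The table and the function name candidate_difference_test are hypothetical, not the decision tree used in this course, and the interval/ratio entries are the parametric candidates, which still depend on the test's assumptions being met.

```python
# Hypothetical lookup for the "find differences between independent groups" goal.
# Key: (level of measurement, number of groups), where 3 stands for "three or more".
DIFFERENCE_TESTS = {
    ("nominal", 2): "chi-square test of independence",
    ("nominal", 3): "chi-square test of independence",
    ("ordinal", 2): "Mann-Whitney U test",
    ("ordinal", 3): "Kruskal-Wallis H test",
    ("interval/ratio", 2): "independent-samples t-test",
    ("interval/ratio", 3): "one-way ANOVA",
}


def candidate_difference_test(level: str, n_groups: int) -> str:
    """Return a commonly used candidate test for the given data level and group count."""
    if n_groups < 2:
        raise ValueError("comparing groups needs at least two groups")
    key = (level, 2 if n_groups == 2 else 3)
    if key not in DIFFERENCE_TESTS:
        raise ValueError(f"unknown level of measurement: {level!r}")
    return DIFFERENCE_TESTS[key]


print(candidate_difference_test("interval/ratio", 2))
print(candidate_difference_test("interval/ratio", 4))
```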
Before choosing a test for interval/ratio data, one more characteristic of the data must be
checked: whether the data are approximately normally distributed. If the distribution
violates the assumption of normality, the nonparametric equivalent of the test should be
selected for the analysis instead.
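One common way to check normality is the Shapiro-Wilk test. The sketch below is a minimal example, assuming Python with NumPy and SciPy rather than the statistical software used in this course. The data are synthetic, and the helper name looks_normal and the 0.05 cutoff are illustrative choices.

```python
import numpy as np
from scipy import stats

# Synthetic data for illustration only (not Framingham values).
rng = np.random.default_rng(0)
bell_shaped = rng.normal(loc=200, scale=30, size=200)
right_skewed = rng.exponential(scale=30, size=200)


def looks_normal(values, alpha=0.05):
    """Return True when the Shapiro-Wilk test does not reject normality at alpha."""
    statistic, p_value = stats.shapiro(values)
    return bool(p_value > alpha)


for label, sample in [("bell_shaped", bell_shaped), ("right_skewed", right_skewed)]:
    print(label, "-> normality not rejected:", looks_normal(sample))
```

With very large samples, a formal test like this can flag deviations that are too small to matter, so it is usually read alongside a histogram or Q-Q plot.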
There are many other issues that can influence the analytical technique (sample size,
variability of the data, inter-relatedness of the variables, et cetera), but these challenges
are for another time, another course.
Instructions:
Use the Framingham study data set to perform and interpret statistical tests that answer
the following research questions:
Q1- Smoking and total cholesterol: Compare smokers and nonsmokers in the
Framingham study to determine whether there was a significant difference in baseline
cholesterol levels.
Q2- BMI categories and baseline glucose levels: Compare baseline glucose levels across
four BMI categories to determine whether there is a significant difference among them.
Q3- Smoking and heart rate: Determine whether there is a significant difference in
baseline heart rate between smokers and nonsmokers.
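The three questions follow two patterns: two independent groups (Q1 and Q3) and four independent groups (Q2). The sketch below shows how each pattern maps to a parametric test or its nonparametric equivalent, assuming Python with SciPy. The function compare_groups and all variable names are hypothetical, and the numbers are synthetic, so the output says nothing about the actual Framingham results.

```python
import numpy as np
from scipy import stats


def compare_groups(groups, normal):
    """Run a difference test for independent groups.

    groups: list of 1-D arrays, one per group.
    normal: True if the normality assumption is judged to be met.
    Returns (test name, test statistic, p-value).
    """
    if len(groups) < 2:
        raise ValueError("need at least two groups")
    if len(groups) == 2:
        if normal:
            name, result = "independent-samples t-test", stats.ttest_ind(*groups)
        else:
            name = "Mann-Whitney U test"
            result = stats.mannwhitneyu(*groups, alternative="two-sided")
    else:
        if normal:
            name, result = "one-way ANOVA", stats.f_oneway(*groups)
        else:
            name, result = "Kruskal-Wallis H test", stats.kruskal(*groups)
    return name, float(result.statistic), float(result.pvalue)


# Synthetic stand-ins for the study variables (illustration only).
rng = np.random.default_rng(1)
smokers = rng.normal(235, 45, size=150)
nonsmokers = rng.normal(238, 45, size=150)
bmi_groups = [rng.normal(80 + 2 * i, 20, size=100) for i in range(4)]

print(compare_groups([smokers, nonsmokers], normal=True))
print(compare_groups(bmi_groups, normal=False))
```

In each case, a p-value below the chosen significance level (commonly 0.05) indicates a statistically significant difference between the groups.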