Anatomy of a Statistical Study:
Defining Scales for Study Designs:
Study designs are the base blueprint for statistical projects. They comprise all the details about how you
plan to collect data, including:
- How you intend to select what to observe
- What you will measure
- How many observations to make
- How many times you will take measurements
This stage is meant to align your data with your question.
Five Hierarchical Scales of a Study:
These scales simplify the process of designing a study.
1. Observation Unit: The scale at which data are collected. May be the same as the sampling unit.
2. Sampling Unit: The unit that is selected at random.
3. Sample: The collection of your randomly selected sampling units.
4. Statistical Population: The collection of all sampling units that could have been in your sample.
Defined by the sampling design.
5. Population of Interest: The collection of sampling units you hope to draw a conclusion about.
Defined by the scope of the research question.
Other Terms:
Measurement Variable: The type of data being collected
Measurement Unit: The scale (unit) in which the measurement variable is recorded
Descriptive and Inferential Statistics:
Descriptive statistics are used to characterize data using averages, tables, and graphs.
Inferential statistics use information from your sample to make probabilistic statements about the
statistical population.
Takeaway:
Descriptive Stats: Make statements about the sample using data
Inferential Stats: Make statements about the statistical population using data from the sample
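To make the distinction concrete, here is a minimal Python sketch. The height values are hypothetical, and the interval uses a normal approximation chosen for simplicity (an assumption of this example, not something the notes prescribe). The mean and standard deviation describe the sample; the confidence interval is a probabilistic statement about the statistical population.

```python
import statistics
from statistics import NormalDist

# Hypothetical sample: heights (cm) of 10 randomly selected observation units.
sample = [162.0, 175.5, 168.2, 171.0, 159.8, 180.3, 166.4, 173.9, 169.1, 164.7]

# Descriptive statistics: statements about the sample itself.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: a probabilistic statement about the statistical population.
# Assumption: a normal approximation is used for simplicity; with only 10
# observations a t-distribution would give a somewhat wider interval.
n = len(sample)
standard_error = sample_sd / n ** 0.5
z = NormalDist().inv_cdf(0.975)  # critical value for a two-sided 95% interval
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"Sample mean: {sample_mean:.2f} cm, sample SD: {sample_sd:.2f} cm")
print(f"Approximate 95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f}) cm")
```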
Framework of Statistics:
Sampling:
The step of designing your study and selecting your sample
Measuring:
The step of taking measurements from your observation units, which gives you the data with which to
work. This can be a single variable from each observation unit or many variables.
Calculating Descriptive Statistics:
The step where you describe the data in your sample. This can include calculating the average value of a
variable, calculating the variation amongst measurements, or creating graphs.
Calculating Inferential Statistics:
The final step where you use the information contained in your data to draw a conclusion about the
statistical population.
Single Groups vs Multiple Groups:
The statistical population is often made up of subgroups that we may want to compare. In these cases,
the process is carried out separately for each subgroup. Each subgroup undergoes sampling, measuring,
and the calculation of descriptive and inferential statistics. This allows us to analyse trends within each
subgroup. Inferential statistics are then used to compare the subgroups to each other and to make
statements about the statistical population as a whole.
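The multiple-group workflow can be sketched in Python as follows. The two subgroups, their values, and the helper name describe are hypothetical, and the comparison uses a normal-approximation interval for the difference in means, which is only one of several possible inferential methods.

```python
import statistics
from statistics import NormalDist

# Hypothetical measurements from two subgroups of one statistical population.
subgroups = {
    "A": [12.1, 13.4, 11.8, 12.9, 13.0, 12.5],
    "B": [14.2, 13.9, 15.1, 14.8, 14.0, 14.6],
}

def describe(values):
    """Descriptive statistics for one subgroup's sample (hypothetical helper)."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "var": statistics.variance(values),  # sample variance (n - 1 denominator)
    }

# Step 1: describe each subgroup separately.
summaries = {name: describe(values) for name, values in subgroups.items()}

# Step 2: compare the subgroups with an inferential statement.
# Assumption: normal approximation for the difference in means; with samples
# this small a t-based method would usually be preferred.
a, b = summaries["A"], summaries["B"]
diff = b["mean"] - a["mean"]
se_diff = (a["var"] / a["n"] + b["var"] / b["n"]) ** 0.5
z = NormalDist().inv_cdf(0.975)
ci = (diff - z * se_diff, diff + z * se_diff)

for name, summary in summaries.items():
    print(f"Subgroup {name}: n={summary['n']}, mean={summary['mean']:.2f}")
print(f"Difference in means (B - A): {diff:.2f}, approx. 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```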
Study Designs and Sampling:
Sampling Design:
A good sampling design helps ensure that the sample is a fair and accurate representation of the
statistical population. There are 4 goals of an ideal sampling design:
1. All sampling units must have a nonzero probability of being included in the sample
2. The selection of sampling units must be unbiased
3. The selection of each sampling unit is independent of the others
4. Each possible sample has an equal chance of being selected
Ideal sampling designs are not always possible. This means that realistic study designs are often quite
different from the ideal.
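As an illustration of goals 1 and 4, the sketch below enumerates every possible sample of size 2 from a tiny, hypothetical statistical population of 5 plots. Under simple random sampling without replacement, each of these samples is equally likely, so each sampling unit ends up with the same nonzero inclusion probability. The plot names and sizes are assumptions made for the example.

```python
import itertools
import math

# Hypothetical statistical population of 5 sampling units.
sampling_units = ["plot_1", "plot_2", "plot_3", "plot_4", "plot_5"]
sample_size = 2

# Every possible sample of the chosen size (order does not matter).
possible_samples = list(itertools.combinations(sampling_units, sample_size))

# Goal 4: if each possible sample is equally likely, its probability is
# 1 divided by the number of possible samples.
prob_each_sample = 1 / len(possible_samples)

# Goal 1: each unit's inclusion probability is the fraction of possible
# samples that contain it.
inclusion_prob = {
    unit: sum(unit in s for s in possible_samples) / len(possible_samples)
    for unit in sampling_units
}

print(f"Number of possible samples: {len(possible_samples)}")
print(f"Probability of each sample: {prob_each_sample}")
print(f"Inclusion probabilities: {inclusion_prob}")
```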
Designing Observational Studies:
Observational studies are used to get “real world” information because they provide a snapshot of how
variables may be interrelated. They are based on observations of a statistical population and often
appear in news headlines. The primary goal of an observational study is to characterize something about
a statistical population. There are many common study methods, but the overarching goal is to collect
data from an existing population. Observational studies are a good choice when experimental studies
would be unethical (e.g. when one condition is thought to be detrimental). The main drawback of
observational studies is that, while they are great at finding correlation, they cannot prove causation.
Observational Study Designs:
Simple Random Survey:
The easiest design to envision. It is done by identifying every sampling unit in the statistical population
and selecting a random subset of them. The easiest way to do this is to create a list of all possible
sampling units and choose at random from the list.
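The list-based approach described above can be sketched with Python's standard library. The sampling frame of 200 households and the sample size of 20 are hypothetical, and the fixed seed is only there to make the example reproducible.

```python
import random

# Hypothetical sampling frame: a list of every sampling unit in the
# statistical population.
sampling_frame = [f"household_{i:03d}" for i in range(1, 201)]

# A seeded generator so the example is reproducible; drop the seed for a
# real survey.
rng = random.Random(42)

# Draw 20 units at random without replacement, so no unit is chosen twice.
sample = rng.sample(sampling_frame, k=20)

print(f"Selected {len(sample)} of {len(sampling_frame)} sampling units")
print(sample[:5])
```

Because random.sample draws without replacement, every subset of 20 households has the same chance of being the sample.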
Stratified Survey: