Stat 101 Exam 1
Dataset - ANSHow data is stored and presented, comprised of variables measured on cases
Cases/ units - ANSwhat we obtain information about, makes up a row in a dataset
Variable - ANSany characteristic that is recorded for each case, makes up a column in a dataset
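As a minimal sketch of these three terms, assume a made-up dataset of four students (every name and value below is hypothetical, for illustration only). Each dictionary is one case (a row) and each key is one variable (a column).

```python
# Hypothetical dataset: each dict is one case (a row),
# each key is one variable (a column).
dataset = [
    {"name": "A", "year": "freshman", "hours_studied": 5.0},
    {"name": "B", "year": "senior", "hours_studied": 2.5},
    {"name": "C", "year": "freshman", "hours_studied": 4.0},
    {"name": "D", "year": "junior", "hours_studied": 3.5},
]

n_cases = len(dataset)               # number of rows (cases)
variables = list(dataset[0].keys())  # column names (variables)

# Pull out one variable (a column) across all cases.
hours = [case["hours_studied"] for case in dataset]

print(n_cases, variables, hours)
```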
Categorical vs. Quantitative - ANSCategorical: divides cases into groups
Quantitative: measures a numerical quantity for each case
Population vs. Sample - ANSPopulation: includes all individuals or objects of interest
Sample: all the cases that we have collected data on (a subset of the population)
Statistical inference - ANSthe process of using data from a sample to gain information about the
population
Sampling Bias - ANSoccurs when the method of selecting a sample causes the sample to differ
from the population in some relevant way; when it is present, we cannot trust generalizations
from the sample to the population
Random samples vs. Non-random samples - ANSRandom samples: the only reliable way to avoid sampling bias; their
averages are centered around the correct (population) value
Non-random samples: may suffer from sampling bias, and their averages may not be centered around the correct value
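The "centered around the correct number" idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population (the values 1 to 100), the sample size of 10, the number of repetitions, and the helper `average_of_sample_means` are all hypothetical choices, and restricting the draw to the first 30 units is a stand-in for a non-random (convenience) sample.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: the values 1..100, so the true mean is 50.5.
population = list(range(1, 101))
true_mean = statistics.mean(population)

def average_of_sample_means(draw, repetitions=2000):
    """Average the sample mean over many repeated samples."""
    means = [statistics.mean(draw()) for _ in range(repetitions)]
    return statistics.mean(means)

# Random sample: 10 units drawn at random from the whole population.
random_avg = average_of_sample_means(lambda: random.sample(population, 10))

# Non-random sample: only the first 30 units can ever be chosen,
# so large values of the population are never observed.
biased_avg = average_of_sample_means(lambda: random.sample(population[:30], 10))

print(true_mean, random_avg, biased_avg)
```

Under these assumptions the random-sample averages should land close to the true mean of 50.5, while the restricted samples should center near 15.5, well below it.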
Simple random sample - ANSeach unit of the population has the same chance of being
selected, regardless of the other units chosen for the sample
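One way to draw a simple random sample in code is Python's standard `random.sample`, which selects without replacement. The population of 20 unit IDs and the sample size of 5 are assumptions made only for this sketch.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 20 unit IDs.
population = list(range(1, 21))

# random.sample draws without replacement, and every unit
# is equally likely to end up in the sample.
srs = random.sample(population, 5)

print(srs)
```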
Volunteer bias - ANSsampling bias that arises when people choose for themselves whether to participate
Explanatory variable - ANSindependent variable
the variable we use to help understand or predict values of another variable
shown on x-axis
Response variable - ANSdependent variable
the variable whose values we try to understand or predict using the explanatory variable
shown on y-axis
Associated vs. Causally associated - ANSAssociated: values of one variable tend to be related to values of
the other variable
Causally associated: changing the value of the explanatory variable influences the response variable
Confounding variable - ANSa third variable that is associated with both the explanatory variable
and the response variable
Observational study - ANSa study in which the researcher does not actively control the value of
any variable, but simply observes the values as they exist naturally
**cannot be used to establish causation**
Experiment - ANSa study in which the researcher actively controls one or more of the
explanatory variables
ASSOCIATION DOES NOT IMPLY.... - ANSCAUSATION
Randomized experiment - ANSthe value of the explanatory variable for each unit is determined randomly,
before the response variable is measured; random assignment balances confounding variables across groups on average
**allows you to infer causality**
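Random assignment can be sketched by shuffling the units and splitting the shuffled list in half. The unit names, the two group labels, and the equal-sized split are hypothetical choices for illustration, not a prescribed procedure.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical experimental units.
units = [f"unit_{i}" for i in range(1, 9)]

# Shuffle a copy so chance alone decides who gets which value
# of the explanatory variable.
shuffled = units[:]
random.shuffle(shuffled)

half = len(shuffled) // 2
assignment = {unit: "group A" for unit in shuffled[:half]}
assignment.update({unit: "group B" for unit in shuffled[half:]})

for unit in units:
    print(unit, assignment[unit])
```

The assignment happens before any response is measured, which matches the definition above.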
Treatments - ANSdifferent levels of the explanatory variable
treatment groups should all look similar apart from the treatment received
Control group - ANSa comparison group necessary to determine whether a treatment is
effective
Placebo - ANSa fake treatment that resembles the active treatment as much as possible
Double-blinded - ANSneither the participants nor the researchers involved know which
treatment the participants are actually getting
**randomized experiments should be this way**
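Random sample vs. Randomized experiment - ANSRandom sample: allows generalization from the sample to the population
Randomized experiment: must randomize the explanatory variable; allows us to infer causality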
Descriptive statistics - ANSexploratory data analysis; summarizing and visualizing variables and
relationships between two variables
Frequency table - ANSshows the number of cases that fall in each category
Proportion - ANS(number in category)/(total sample size)
sample proportion is written p̂ ("p-hat")
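A frequency table and the sample proportions can be computed with the standard-library `Counter`. The survey responses below are made up for illustration, and `p_hat` is just a variable name chosen here for the sample proportion.

```python
from collections import Counter

# Hypothetical categorical responses from 8 cases.
responses = ["yes", "no", "yes", "yes", "no", "unsure", "yes", "no"]

# Frequency table: number of cases in each category.
frequency_table = Counter(responses)

# Sample proportion for each category:
# (number in category) / (total sample size).
n = len(responses)
p_hat = {category: count / n for category, count in frequency_table.items()}

print(frequency_table)
print(p_hat)
```

The proportions across all categories add up to 1, which is a quick check on the arithmetic.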