Statistics 1
- Terms:
Statistics: knowledge acquisition on the basis of data
Population: the group that you want to describe, the entire set of elements
Sample: the group for which you have data, a subset of elements from the
population taken with the intention of making inferences/statement(s) about the
whole population
Parameter: numerical property of the population
Statistic: numerical property of a sample
Sampling distribution: describes how a statistic varies when sampling is repeated,
i.e. it describes sampling variability. This is the basis for inference
Inference: making statements about the population based on a sample
Z-score: standardized value of X, z = (X - mean) / standard deviation
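As an illustration of the z-score term, here is a minimal Python sketch. It assumes the usual definition z = (x - mean) / standard deviation and treats the values as a complete population (hence `pstdev`). The function name `z_scores` and the data `heights_cm` are made up for the example.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return (x - mean) / sd for each value, treating values as a whole population."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(x - mu) / sigma for x in values]

heights_cm = [160, 170, 180]  # hypothetical data
z = z_scores(heights_cm)
print(z)
```

After standardizing, the values have mean 0 and standard deviation 1, which makes variables measured on different scales comparable.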
- Central limit theorem:
Even if a variable X is not normally distributed in the population, we may assume
that, under certain conditions (such as a large number of cases [n > 30] and a finite
standard deviation s), the sampling distribution of the mean is approximately
normal with standard error s / √n
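The central limit theorem can be checked with a small simulation. The sketch below is only an illustration under assumed settings: the population is exponential with rate 1 (clearly not normal, with standard deviation 1), and the values of `n`, `reps` and the seed are arbitrary choices, not taken from the notes.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the run is repeatable

n = 50        # cases per sample (above the n > 30 rule of thumb)
reps = 5000   # number of repeated samples
sigma = 1.0   # standard deviation of an exponential population with rate 1

# Draw many samples from a skewed population and keep the mean of each sample.
sample_means = []
for _ in range(reps):
    sample = [rng.expovariate(1.0) for _ in range(n)]
    sample_means.append(sum(sample) / n)

theoretical_se = sigma / math.sqrt(n)   # s / sqrt(n)
empirical_se = stdev(sample_means)      # spread of the simulated sampling distribution

print(f"theoretical standard error:   {theoretical_se:.4f}")
print(f"sd of simulated sample means: {empirical_se:.4f}")
```

The two printed numbers should be close to each other, and a histogram of `sample_means` would look roughly bell-shaped even though the population itself is strongly skewed.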
- Sampling error (difference between the value of a parameter and the statistic
computed to estimate that parameter):
1. Variability: phenomenon in which repeated sampling gives different values
for the statistic; increasing the sample size decreases the variability
2. Sampling bias: a bias in which data are collected in such a way that some members
of the intended population are less likely to be included than others
3. Non-sampling error: measurement error, a combination of measurement
problems
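The difference between random variability and sampling bias can be sketched with a made-up population. Everything in the example is hypothetical: two equally large groups (`urban`, `rural`) with different average values, one fair sample, and one sample in which rural members are much less likely to be included.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: two groups of 5000 with different means.
urban = [rng.gauss(40, 5) for _ in range(5000)]
rural = [rng.gauss(20, 5) for _ in range(5000)]
population = urban + rural
parameter = mean(population)  # the population mean we want to estimate

# Fair sample: every member has the same chance of being included.
fair_sample = rng.sample(population, 500)

# Biased sample: rural members are under-included (50 of 500 instead of about 250).
biased_sample = rng.sample(urban, 450) + rng.sample(rural, 50)

fair_error = mean(fair_sample) - parameter
biased_error = mean(biased_sample) - parameter

print(f"error of fair sample:   {fair_error:+.2f}")
print(f"error of biased sample: {biased_error:+.2f}")
```

The error of the fair sample is due to variability alone and shrinks as the sample grows. The error of the biased sample stays large at any sample size, because the selection procedure itself is skewed.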
- Minimizing non-sampling error:
(measurement) validity: refers to the degree of correspondence between the
concept being addressed and the variable being used to measure that concept
(measurement) accuracy: refers to the absence of error, or degree of agreement
between measurement and true value, which does not imply validity
(measurement) precision: refers to the level of exactness or to the range of values
possible in the measurement process
Prevent coding errors
Prevent interpretation errors
Good labelling
- Sources of data:
Internal data: data available from existing records or files of the institution
undertaking the study (data from an internal source)
External data: data obtained from an organization external to the institution
undertaking the study (data from an external source)
Primary data: obtained from the organization or institution that originally
collected the information
Secondary data: obtained from a source other than the primary data source
Pg. 17 Elementary Statistics for Geographers
- Types of samples:
Pg. 261 Elementary Statistics for Geographers
Probability sample: sample in which the probability of any individual member of the
population being picked for the sample can be determined
- Probability samples:
Simple random sample: sample in which each possible sample of a given size n has
an equal probability of being selected. One way of executing this is to number all
individuals in the population and then randomly pick numbers until the sample is
full. This requires a source of random numbers, which nowadays are usually
generated by computer
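The numbering procedure for a simple random sample can be sketched in Python. This is a minimal illustration, not a prescribed method: the helper name `simple_random_sample`, the population size of 1000 and the sample size of 25 are assumptions for the example. It relies on the standard library's `random.sample`, which draws without replacement.

```python
import random

def simple_random_sample(population_size, n, seed=None):
    """Pick n distinct individual numbers from 1..population_size without replacement."""
    if n > population_size:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(range(1, population_size + 1), n)

# Individuals are numbered 1..1000; draw a sample of 25 of them.
chosen = simple_random_sample(population_size=1000, n=25, seed=1)
print(sorted(chosen))
```

Because the draw is without replacement, no individual can be picked twice, and every set of 25 numbers is equally likely to be the result.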