CHAPTER I RANDOM SAMPLING
In statistics we obtain a sample of values from a population.
For the results to be valid, the observations in the sample
must be chosen randomly so that they're independent of
each other.
If sampling is not random, any inferences drawn from the
sample may be unreliable or incorrect.
1. Random number generators
Generated by a computer (pseudo-random numbers).
Not absolutely random.
They eventually repeat.
Purpose: to select random samples from a population, and
to create hypothetical datasets with specific
characteristics when actual datasets are
hard to obtain.
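To see why computer-generated numbers are "pseudo" random and eventually repeat, here is a minimal sketch of a toy generator in R. It is a linear congruential generator with deliberately tiny, hypothetical constants (`a`, `inc`, `m` and the function name `toy_lcg` are made up for illustration); it is not the generator that Excel or R actually use.

```r
# Toy linear congruential generator: next = (a * state + inc) mod m.
# The constants are hypothetical and kept small so the cycle is easy to see.
toy_lcg <- function(n, seed, a = 5, inc = 3, m = 16) {
  x <- numeric(n)
  state <- seed
  for (i in seq_len(n)) {
    state <- (a * state + inc) %% m
    x[i] <- state
  }
  x
}

first_cycle  <- toy_lcg(16, seed = 7)          # first 16 values
second_cycle <- toy_lcg(32, seed = 7)[17:32]   # the next 16 values

print(first_cycle)
print(identical(first_cycle, second_cycle))    # compares the two blocks
```

With these constants the generator can only take 16 different states, so the sequence has to cycle. Real generators work on the same principle but with cycles so long that the repetition is not a practical concern.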
In Excel:
1. RANDBETWEEN function
   This generates random numbers, but they recalculate
   each time the sheet changes.
   In order to keep them fixed, we need to copy them and
   paste them as values.
2. Another method to generate the same random number
   sequence each time is to use a random seed.
   Steps: Data > Data Analysis (Analysis ToolPak) > Random Number
   Generation > enter data > seed > OK
3. When it comes to generating random integers we run into
   a problem:
   most random number generators in Excel generate real
   numbers between 0 and 1.
   Let's say we want to generate random integers between
   1 and 20; we need to take the following steps:
   1. Generate real numbers between 0 and 1.
   2. Multiply them by 20.
   3. Round them up to get integers.
4. The last method is simple random sampling (SRS). It's a
   method of selecting a sample from a population so that
   every member has an equal chance of being chosen.
   It is unbiased.
   E.g. Draw an SRS of size n = 30 from a finite set of N = 76.
   1. Number the population 1 to 76.
   2. Use the random number generator with a seed.
      NOTE: generate more than 30 random numbers so
      that duplicates can be deleted.
   3. Convert these random numbers to integers.
   4. Remove duplicates; generate more random numbers
      if needed to get to 30.
   5. The final 30 random numbers are your SRS.
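The five steps above can also be sketched in R rather than Excel. This is only an illustration under assumptions: the seed value, the choice of 50 initial draws, and the variable names are hypothetical and not taken from the notes.

```r
set.seed(2024)   # hypothetical seed so the sample can be reproduced

N <- 76          # population size (members numbered 1 to 76)
n <- 30          # sample size

# Steps 2-3: draw more than 30 uniform numbers and convert them
# to integers between 1 and N by scaling and rounding up.
draws <- ceiling(runif(50) * N)

# Step 4: remove duplicates, topping up with extra draws if needed.
srs <- unique(draws)
while (length(srs) < n) {
  extra <- ceiling(runif(10) * N)
  srs <- unique(c(srs, extra))
}

# Step 5: keep the first 30 distinct numbers as the SRS.
srs <- srs[seq_len(n)]
print(srs)
```

R also has a built-in `sample()` function; `sample(76, 30)` draws 30 distinct values from 1 to 76 directly, which is likely the more idiomatic route once the manual procedure is understood.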
In R:
1. Random numbers can be generated from a uniform distribution
   (all values in the range, 0 to 1 by default, are equally
   likely to be generated) using the runif(n, min, max) function:
   n:   how many random numbers you want
   min: the smallest possible value
   max: the largest possible value
   R doesn't recalculate random numbers automatically like Excel
   does, but if you close R and come back to your code later you'll
   get different numbers.
   To solve this we use the set.seed() function.
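A minimal sketch of how `runif()` and `set.seed()` work together. The seed value 123 and the variable names are arbitrary choices for illustration.

```r
# Setting the same seed before each call restarts the generator
# from the same point, so both calls draw the same numbers.
set.seed(123)                                   # hypothetical seed value
first_run <- runif(n = 5, min = 0, max = 1)

set.seed(123)
second_run <- runif(n = 5, min = 0, max = 1)

print(first_run)
print(identical(first_run, second_run))         # compares the two runs
```

Without the second `set.seed(123)` call, the second `runif()` would continue the sequence and return different values.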
2. By default runif produces random numbers between 0 and 1,
   but if we'd like to generate, for example, 30 random integers,
   the following steps need to be taken:
   1. Generate random numbers between 0 and 1
      (note: integer vs. real number).
   2. Multiply them by 30 (this makes 30 the largest
      integer that can come out).
   3. Use the ceiling function to round
      them up into whole numbers.
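The three steps can be sketched in R as follows. The seed and the variable names are assumptions for illustration; drawing 30 values is also a choice, since the multiplier sets the range (1 to 30) rather than the count.

```r
set.seed(42)              # hypothetical seed for reproducibility

u      <- runif(30)       # step 1: 30 real numbers between 0 and 1
scaled <- u * 30          # step 2: stretch them to the range 0 to 30
ints   <- ceiling(scaled) # step 3: round up to whole numbers 1 to 30

print(ints)

# ceiling() returns whole numbers stored as doubles; as.integer()
# converts them if a true integer type is needed.
ints_typed <- as.integer(ints)
print(is.integer(ints_typed))
```

Because the values are drawn independently, duplicates are possible; they only need to be removed when the goal is a sample without repeats, as in the SRS procedure.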