Describing the distribution of a variable in probabilistic terms
Random variable = a numerical measurement of the outcome of a random phenomenon
The outcome comes from random sampling or a randomized experiment, so the values that a
random variable takes on are determined by chance
Denoted by a capital letter: X
o A possible value of a random variable is denoted by a lowercase letter: x
Because of the randomness, we can specify probabilities for the possible values of a random
variable (as long-run proportions)
o Discrete = takes on values from a set of separate values (0, 1, 2, 3, etc.)
o Continuous = can take any of an infinite number of possible values in an interval
1. Probability distribution = specifies the possible values of a random variable and their
probabilities
Discrete random variable
o Assigning a probability to each possible value
Probability of a possible value x is denoted by P(x)
For each x, the probability P(x) falls between 0 and 1
The sum of the probabilities of all possible values equals 1
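The two conditions for a discrete probability distribution can be checked mechanically. A minimal sketch, assuming the distribution is stored as a dictionary from value to probability (the function name and the example distributions are illustrative, not from the notes):

```python
import math

def is_valid_discrete_distribution(dist):
    """Check the two conditions for a discrete probability distribution.

    `dist` maps each possible value x to its probability P(x).
    """
    probs = list(dist.values())
    # Condition 1: every P(x) falls between 0 and 1.
    each_in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Condition 2: the probabilities sum to 1 (up to floating-point error).
    sums_to_one = math.isclose(sum(probs), 1.0)
    return each_in_range and sums_to_one

# Illustrative example: a fair six-sided die.
fair_die = {x: 1 / 6 for x in range(1, 7)}
# Illustrative counter-example: the probabilities sum to 0.9, not 1.
broken = {0: 0.5, 1: 0.4}

print(is_valid_discrete_distribution(fair_die))  # True
print(is_valid_discrete_distribution(broken))    # False
```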
Continuous random variable
o Assigning a probability to any interval of possible values
Each probability falls between 0 and 1
The probability of the interval containing all possible values equals 1
As the number of intervals increases and their width narrows, the shape
of the histogram approaches a smooth curve
You need to round off measurements, as probabilities are given for intervals of
values instead of individual values
In practice, continuous variables are measured in a discrete manner
because of rounding
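To illustrate why probabilities of a continuous variable are assigned to intervals, here is a small sketch. It assumes, purely for illustration, a height variable modeled by a normal distribution with mean 170 cm and standard deviation 10 cm; these numbers and the helper name are not from the notes.

```python
from statistics import NormalDist

# Assumed model for illustration only: height in cm, normal with
# mean 170 and standard deviation 10.
height = NormalDist(mu=170, sigma=10)

def interval_probability(dist, low, high):
    """P(low < X < high): the area under the curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)

# A height recorded as "170 cm" after rounding to the nearest cm
# stands for the interval from 169.5 to 170.5.
p_rounded = interval_probability(height, 169.5, 170.5)
print(f"P(169.5 < X < 170.5) = {p_rounded:.4f}")

# Narrower intervals around 170 get smaller probabilities, shrinking toward 0,
# which is why a single exact value has probability 0.
for width in (10, 1, 0.1, 0.01):
    p = interval_probability(height, 170 - width / 2, 170 + width / 2)
    print(f"width {width:>5}: probability {p:.5f}")
```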
Characteristics of a probability distribution
o Parameter = a numerical summary of the population
Population distribution = a type of probability distribution that applies when selecting a
subject at random from a population
The distribution of the variable of interest in the population from which we
sample
μ = mean of a probability distribution
Weighted average = the x values need not be equally likely
If a particular x value is more likely to occur, it has a larger influence on
the mean
Balance point of the distribution
Equally likely outcomes (such as rolling a die)
ΣxP(x) = Σx(1/6) = (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5
Expected value of X = what we expect the average to be in a long run of
observations
μ = ΣxP(x)
Multiplying each possible value of the random variable by its
probability and then adding all these products
Generalizes the ordinary formula for the mean to allow for outcomes
that are not equally likely
E.g. the number of games played in a best of seven series
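The formula μ = ΣxP(x) can be sketched in a few lines. The die distribution follows the example above. The best-of-seven probabilities are an assumption made here for illustration: two evenly matched teams and independent games, so a series ends in exactly n games with probability 2 · C(n-1, 3) · (1/2)^n.

```python
from math import comb

def expected_value(dist):
    """mu = sum of x * P(x) over all possible values x."""
    return sum(x * p for x, p in dist.items())

# Equally likely outcomes: a fair six-sided die.
fair_die = {x: 1 / 6 for x in range(1, 7)}

# Assumption for illustration: evenly matched teams, independent games.
# The series ends in n games when the eventual winner takes game n and
# exactly 3 of the previous n - 1 games (either team can be the winner).
best_of_seven = {n: 2 * comb(n - 1, 3) * 0.5 ** n for n in range(4, 8)}

mu_die = expected_value(fair_die)
mu_series = expected_value(best_of_seven)

print(round(mu_die, 4))  # 3.5
print(best_of_seven)     # {4: 0.125, 5: 0.25, 6: 0.3125, 7: 0.3125}
print(mu_series)         # 5.8125
```

Under these assumptions the longer series (6 or 7 games) are more likely than a sweep, so they pull the mean above the midpoint of 5.5.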
σ = standard deviation of a probability distribution
Measures the variability around the mean (center)
Larger values of σ correspond to greater variability
Describes how far values of the random variable fall, on the average,
from the mean of the distribution
Variance of a probability distribution = the probability-weighted average of the
squared deviations from the mean
σ² = Σ(x - μ)²P(x)
Calculate the mean, μ, of the random variable
For each value x, subtract the mean and square the result: (x - μ)²
Multiply each squared deviation by its probability P(x)
Sum all of these products to get the variance
Standard deviation of a probability distribution = describes the typical
distance for the values of the random variable X from their mean
σ = √(Σ(x - μ)²P(x))
Take the square root of the variance to get the standard deviation
The smaller the standard deviation, the closer the values of the
random variable tend to fall to the mean
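The steps for the variance and standard deviation can be sketched as follows, using the fair die as the example distribution (the function names are illustrative):

```python
import math

def mean(dist):
    """mu = sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """sigma^2 = sum of (x - mu)^2 * P(x)."""
    mu = mean(dist)
    # Square each deviation from the mean, weight it by P(x), then add up.
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def standard_deviation(dist):
    """sigma = square root of the variance."""
    return math.sqrt(variance(dist))

fair_die = {x: 1 / 6 for x in range(1, 7)}

print(round(variance(fair_die), 4))            # 2.9167
print(round(standard_deviation(fair_die), 4))  # 1.7078
```

For the die, the variance is exactly 35/12, so a roll typically falls about 1.7 away from the mean of 3.5.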
o Normal distribution = a probability distribution that is used for continuous random variables
Plays a key role in statistical inference
Many variables have approximately normal distributions
Approximates many discrete distributions when there are a large number of possible
outcomes
Bell-shaped
Parameters
Mean, μ: expresses the center
Standard deviation, σ: expresses the variability
The probability of falling within 1, 2, or 3 standard deviations of the mean is approximately
0.68, 0.95, and 0.997, respectively (this is the basis of the empirical rule)
For any value of μ
For any value of σ > 0
Interval within z standard deviations of the mean: from μ - zσ to μ + zσ
The number of standard deviations from the mean is denoted by z
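The probabilities for 1, 2, and 3 standard deviations can be reproduced with the `NormalDist` class from Python's standard library. This is a minimal sketch; the helper name `prob_within` and the second set of parameters (μ = 100, σ = 15) are chosen here only for illustration.

```python
from statistics import NormalDist

def prob_within(z, mu=0.0, sigma=1.0):
    """Probability that a normal variable falls between mu - z*sigma and mu + z*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + z * sigma) - dist.cdf(mu - z * sigma)

for z in (1, 2, 3):
    print(z, f"{prob_within(z):.4f}")
# 1 0.6827
# 2 0.9545
# 3 0.9973

# The result does not depend on the particular mu or sigma > 0.
print(f"{prob_within(2, mu=100, sigma=15):.4f}")  # 0.9545
```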
Z-score = number of standard deviations that x falls from the mean
of the probability distribution
You are given a value x and need to find a probability
Convert x to a z-score: z = (x - μ) / σ
You are given a probability and need to find the value of x
Convert the probability to the related cumulative
probability
Find the z-score
Evaluate: x = μ + zσ
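Both directions of the conversion can be sketched with the standard normal distribution from Python's `statistics` module. The function names and the example parameters (μ = 100, σ = 15) are assumptions for illustration, not part of the notes.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def z_score(x, mu, sigma):
    """Number of standard deviations that x falls from the mean."""
    return (x - mu) / sigma

def probability_below(x, mu, sigma):
    """Given x, find the cumulative probability P(X <= x) via its z-score."""
    return STANDARD_NORMAL.cdf(z_score(x, mu, sigma))

def value_for_cumulative_probability(p, mu, sigma):
    """Given a cumulative probability p, find z and then evaluate x = mu + z*sigma."""
    z = STANDARD_NORMAL.inv_cdf(p)
    return mu + z * sigma

# Illustrative parameters.
mu, sigma = 100, 15

print(z_score(130, mu, sigma))                     # 2.0
print(f"{probability_below(130, mu, sigma):.4f}")  # 0.9772
# Going back from the probability recovers (approximately) the original x.
print(round(value_for_cumulative_probability(0.9772, mu, sigma), 1))
```

If the probability you are given is an upper-tail or central probability, convert it to a cumulative (left-tail) probability first, as the steps above describe.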
Z-scores are used when we are given the value of x for some normal random
variable and need to find a probability relating to that value
Z-scores can be used with any distribution
To express how far an observation in the sample is from the
sample mean
To express how far a value of a random variable is from the mean of the
probability distribution
To compare values from two different normal
distributions
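Comparing values from two different normal distributions comes down to comparing their z-scores. A short sketch with made-up numbers: two hypothetical tests, A with mean 500 and standard deviation 100, and B with mean 21 and standard deviation 5.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x falls from the mean."""
    return (x - mu) / sigma

# Hypothetical scores on two tests with different scales.
z_a = z_score(650, mu=500, sigma=100)  # score of 650 on test A
z_b = z_score(28, mu=21, sigma=5)      # score of 28 on test B

print(z_a)  # 1.5
print(z_b)  # 1.4

# The larger z-score is the more unusual result relative to its own distribution.
better = "A" if z_a > z_b else "B"
print(f"The score on test {better} is relatively higher")
```

The raw scores (650 and 28) are not comparable, but the z-scores put both on the same scale of standard deviations from the mean.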
E.g. the probability of falling between μ - 2σ and μ + 2σ, i.e. within 2 standard
deviations of the mean, is about 0.95