Introduction to Learning
Schedules of Reinforcement
RATIO SCHEDULES: the behaviour is rewarded after a certain number of
responses has occurred.
o Fixed Ratio: subject must perform a fixed number of responses to
achieve a reward.
o Variable Ratio: the number of responses a subject must perform to
achieve a reward is variable.
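The two ratio schedules can be sketched as a short simulation. This is only an illustration, not a standard implementation: the function names, the ratio of 5, and the choice to draw the variable requirement uniformly between 1 and 2n - 1 (so that it averages n) are all assumptions made for the example.

```python
import random

def fixed_ratio(n):
    """Return a function that reports True on every n-th response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean_n, rng):
    """Reward after a random number of responses that averages mean_n.

    Assumption: the requirement is uniform on 1 .. 2*mean_n - 1.
    """
    required = rng.randint(1, 2 * mean_n - 1)
    count = 0
    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)  # new, unpredictable target
            return True
        return False
    return respond

fr5 = fixed_ratio(5)
fr_outcomes = [fr5() for _ in range(10)]  # rewards on responses 5 and 10

vr5 = variable_ratio(5, random.Random(0))
vr_outcomes = [vr5() for _ in range(1000)]  # rewards at unpredictable points
```

On the fixed schedule the subject can predict exactly which response pays off; on the variable schedule any response might be the rewarded one.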
INTERVAL SCHEDULES: the behaviour is rewarded after a certain
time period has elapsed, irrespective of the number of responses in
that time period.
o Fixed Interval: a fixed amount of time passes before the
subject receives a reward for the next behaviour.
o Variable Interval: a variable amount of time passes before
the subject receives the reward.
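A minimal sketch of the interval rule, assuming responses are logged as times in seconds. The helper name interval_schedule, the 10-second interval and the uniform range used for the variable case are hypothetical choices for illustration.

```python
import random

def interval_schedule(next_interval):
    """Reward the first response made once the current interval has elapsed.

    next_interval is a zero-argument function giving the length of the next
    interval in seconds: a constant for fixed, a random draw for variable.
    """
    available_at = next_interval()
    def respond(t):
        nonlocal available_at
        if t >= available_at:
            available_at = t + next_interval()  # the clock restarts at the reward
            return True
        return False  # responses during the interval earn nothing
    return respond

# Fixed interval of 10 s: extra responses inside the interval make no difference.
fi10 = interval_schedule(lambda: 10)
response_times = [2, 5, 9, 11, 12, 20, 22, 30]
fi_rewarded = [t for t in response_times if fi10(t)]  # [11, 22]

# Variable interval averaging 10 s (assumed uniform between 1 and 19 s).
rng = random.Random(1)
vi10 = interval_schedule(lambda: rng.uniform(1, 19))
vi_rewards = sum(vi10(t) for t in range(1, 1001))  # one response per second
```

Unlike the ratio schedules, responding faster does not bring the reward sooner; only the passage of time does.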
Generally, unpredictable, intermittent reinforcement works best
with humans.
o Gambling on one-armed bandits follows a variable ratio schedule
and can be highly reinforcing.
o Fixed odds betting terminals have been called the ‘crack cocaine’
of gambling.
DURATION SCHEDULES: reinforcement is contingent on
performing a behaviour continuously throughout a period of time.
o On a fixed duration schedule the behaviour must be
performed continuously for a fixed, predictable period of time.
o On a variable duration schedule the behaviour must be
performed continuously for a varying unpredictable period of
time.
o May be good for some behaviours, e.g. studying, but otherwise
quite imprecise.
RESPONSE RATE SCHEDULES: reinforcement is directly
contingent upon the organism’s rate of response.
o Differential reinforcement of high rates: reinforcement is
contingent upon emitting at least a certain number of
responses in a certain period of time, i.e. responding at a fast rate.
o Differential reinforcement of low rates: a minimum
amount of time must pass between each response before the
reinforcer will be delivered, i.e. reinforcement is provided for
responding at a slow rate (e.g. a rat will only receive food if it
waits).
o Differential reinforcement of paced responding:
reinforcement is contingent upon emitting a series of
responses at a set rate, i.e. reinforcement is provided for
responding neither too fast nor too slow.
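The low-rate and high-rate rules above can be expressed as checks on a list of response times. This is a sketch under assumptions: the function names, the 10-second minimum gap and the "4 responses within 5 seconds" criterion are invented for the example.

```python
def drl_rewards(response_times, min_gap):
    """Low rates: a response is reinforced only if at least min_gap
    seconds have passed since the previous response."""
    rewarded = []
    previous = None
    for t in response_times:
        if previous is not None and t - previous >= min_gap:
            rewarded.append(t)
        previous = t  # every response, rewarded or not, restarts the wait
    return rewarded

def drh_rewarded(response_times, min_count, window):
    """High rates: True if at least min_count responses fall within
    some stretch of `window` seconds."""
    times = sorted(response_times)
    for i in range(len(times) - min_count + 1):
        if times[i + min_count - 1] - times[i] <= window:
            return True
    return False

slow = drl_rewards([0, 3, 13, 15, 26], min_gap=10)      # [13, 26]
fast = drh_rewarded([0, 1, 2, 3, 10], min_count=4, window=5)    # True
too_slow = drh_rewarded([0, 5, 10, 15], min_count=4, window=5)  # False
```

Paced responding would combine the two checks, requiring each gap between responses to fall between a lower and an upper bound.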
NONCONTINGENT SCHEDULES: the reinforcer is delivered following a
period of time, independently of any response; a response is not
required for the reinforcer to be obtained. This may account for
superstitious behaviour.
o Fixed time schedule: the reinforcer is delivered following a
fixed, predictable amount of time regardless of the organism’s
behaviour.
o Variable time schedule: the reinforcer is delivered following
a varying, unpredictable period of time, regardless of the
organism’s behaviour.
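A small sketch of a fixed time schedule, showing that delivery times are computed without looking at behaviour at all. The function names and the behaviour log are made up purely for illustration; the log is not data from any study.

```python
import bisect

def fixed_time_deliveries(period, session_length):
    """Delivery times for a fixed time schedule; no response is consulted."""
    return list(range(period, session_length + 1, period))

def behaviour_before(deliveries, behaviour_log):
    """For each delivery, return the most recent behaviour at or before it.

    behaviour_log is a time-sorted list of (time, label) pairs.
    """
    times = [t for t, _ in behaviour_log]
    paired = []
    for d in deliveries:
        i = bisect.bisect_right(times, d) - 1
        paired.append(behaviour_log[i][1] if i >= 0 else None)
    return paired

# Hypothetical log of what the animal happened to be doing (seconds, label).
log = [(3, "peck"), (14, "turn"), (29, "turn"), (40, "bob")]
deliveries = fixed_time_deliveries(15, 45)   # [15, 30, 45]
paired = behaviour_before(deliveries, log)   # ["turn", "turn", "bob"]
```

Whatever behaviour happens to precede a delivery is followed by the reinforcer by accident, which is the kind of chance pairing the notes link to superstitious behaviour.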
Superstitious behaviour
‘A response acquired as a result of its accidental contiguity with a
reinforcer’ (Lieberman, 2000).
Skinner (1948): pigeons developed highly stereotyped behaviours
when rewarded every 15 seconds irrespective of their behaviour.
Ono (1987): similar effects in humans.
Factors affecting operant conditioning
Contiguity: the closeness in time of stimulus and response, or
response and effect. For example, delaying a punishment until you
have left the supermarket is less effective than punishing immediately.
Effect: whether the outcome of the response is positive or negative.
For example, a child’s bad behaviour gains attention, which can be
a positive outcome for the child even if the behaviour is punished.
Practice: the opportunity to rehearse responses and modify them on
the basis of feedback. For example, a child quickly learns to perfect
supermarket tantrums due to lots of practice.
Motivation: as motivation and level of arousal increase, learning
becomes more effective. For example, when the rewards are
potentially very high, a child’s behaviour escalates.
Theories of reinforcement
DRIVE REDUCTION THEORY: Hull, 1943
An event is reinforcing to the extent that it is associated with a
reduction in a physiological drive.
However, some reinforcers do not appear to be linked to drive
reduction.