Introduction to Learning
Operant Conditioning
A learned association between a stimulus (situation), a response
(behaviour) and its consequence (reward or punishment).
When a behaviour is performed in a certain situation and elicits a
reward of some kind, the likelihood of the behaviour being
performed again in the same situation is increased.
However, if the behaviour leads to a punishment of some sort, the
likelihood of the behaviour occurring again in the same situation is
reduced.
For example, a baby smiles (accidental response), the father picks
up the baby (reinforcement), and the baby keeps smiling (deliberate
response).
Thorndike (1874-1949) and his cats
Thorndike (1911) placed hungry cats in a ‘puzzle box’ with food just
outside.
The cats learned that escaping from the box got them food.
They learned this through ‘trial and error’.
The ‘reward’ of the food made those behaviours more likely to occur
in the future.
Improvement was gradual, not a sudden “flash of insight”.
What will happen if the cat no longer receives food when it escapes?
Thorndike’s law of effect
Responses that are rewarded (reinforced) become stronger –
“satisfying” behaviours are “stamped in”.
Responses that are ignored or punished become weaker –
“annoying” behaviour is “stamped out”.
It is consequences that determine whether behaviour will be
repeated.
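The law of effect can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Thorndike's own formulation: the numeric "strength" scale from 0 to 1, the step size, and the names apply_law_of_effect, response_probabilities, pull_loop and scratch_wall are all hypothetical and chosen for illustration.

```python
def apply_law_of_effect(strengths, response, outcome, step=0.2):
    """Return a new dict of response strengths after one trial.

    outcome > 0  : satisfying consequence, the response is "stamped in"
    outcome <= 0 : ignored or punished, the response is "stamped out"
    The 0..1 strength scale and the step size are illustrative assumptions.
    """
    updated = dict(strengths)  # leave the caller's dict unchanged
    s = updated[response]
    if outcome > 0:
        s += step * (1.0 - s)  # move part of the way toward 1
    else:
        s -= step * s          # move part of the way toward 0
    updated[response] = s
    return updated


def response_probabilities(strengths):
    """Convert strengths to the relative likelihood of each response."""
    total = sum(strengths.values())
    return {r: s / total for r, s in strengths.items()}
```

For example, starting from equal strengths for pull_loop and scratch_wall, rewarding pull_loop and ignoring scratch_wall over repeated trials makes pull_loop steadily more likely. The change is gradual from trial to trial, with no sudden jump.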
Skinner (1904-1990) and his rats
Skinner, in the 1930s, developed Thorndike’s work on reinforcement of
voluntary ‘random’ behaviours. He developed the famous ‘Skinner box’.
Initially Skinner thought all behaviours were reflexive. Observations of
animals in the Skinner box led him to divide behaviour into two
categories:
Involuntary, reflexive behaviours (respondent or Pavlovian)
Operant behaviours
Skinner’s radical (and wrong) view was that all human behaviour could be
understood as combinations of operant behaviours.