Neural Circuits
Reinforcement Learning
Classical conditioning
A neutral stimulus becomes associated with a reflex
How does an animal maximise the total amount of reward it receives when
interacting with a complex, uncertain environment?
One way in which animals acquire complex behaviours is by learning to
obtain rewards and to avoid punishments.
Reinforcement learning theory is a formal model of this type of learning.
In many environments animals need to perform unrewarded or
unpleasant preparatory actions in order to obtain some later reward.
For example, a mouse may need to leave the warmth and safety of its
burrow to go on a cold and initially unrewarded search for food.
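The trade-off between an unpleasant start and a later reward can be sketched numerically. The snippet below is a minimal illustration, not part of the original material: it assumes the common formulation in which the animal maximises a discounted sum of rewards, and the discount factor `gamma` and all reward values are hypothetical numbers chosen for the mouse example.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards, weighting the reward at time step t by gamma**t."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total


# Hypothetical reward sequences (illustrative values only).
stay_in_burrow = [0.0, 0.0, 0.0, 0.0]        # warm and safe, but no food
search_for_food = [-1.0, -1.0, 0.0, 10.0]    # cold at first, food at the end

stay_value = discounted_return(stay_in_burrow)
search_value = discounted_return(search_for_food)

# Under these assumed numbers, leaving the burrow has the higher total value
# even though its first steps are unpleasant.
best_option = "search" if search_value > stay_value else "stay"
```

With a smaller `gamma` the later food counts for less, so the same comparison can tip towards staying in the burrow.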
Reinforcement learning: learning which actions lead to positive outcomes and
which lead to negative outcomes.
Proposed by Burrhus Frederic Skinner as operant conditioning.
An animal is rewarded for a correct outcome.
When the reward is given, the animal knows that something has been
done correctly.
However, it isn’t told which of its actions in the immediate past led to the
reward.
This is unproblematic when the set of possible actions is small, but it becomes
more difficult if the action set is large, or if a sequence of actions has to be learned.
When a mistake is made, no information is given about the nature of the
error.
All the animal knows is that the reinforcement is absent.
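Why a large action set or a long sequence makes this hard can be made concrete with a small sketch. It assumes a deliberately naive learner that tries whole action sequences in a fixed order and is only told whether the reward arrived; the function names and the example actions are hypothetical.

```python
from itertools import product


def count_candidate_sequences(n_actions, sequence_length):
    """Number of distinct action sequences of the given length."""
    return n_actions ** sequence_length


def trials_until_reward(actions, target):
    """Try every sequence in a fixed order until the rewarded one is found.

    The only feedback used is whether the reward was given; nothing indicates
    which action in a failed sequence was wrong.
    """
    for trial, candidate in enumerate(product(actions, repeat=len(target)), start=1):
        rewarded = candidate == tuple(target)  # reward present or absent
        if rewarded:
            return trial
    return None


small_problem = count_candidate_sequences(2, 1)    # two actions, one step
large_problem = count_candidate_sequences(10, 4)   # ten actions, four steps
```

The number of candidates grows exponentially with the sequence length, which is why reward-only feedback becomes a serious limitation for long behavioural sequences.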
Broadly, reinforcement learning is a process of exploration followed by
selection through reward.
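One simple way to sketch exploration and selection through reward is an epsilon-greedy learner on a set of independent actions. This is a standard textbook technique used here for illustration, not a method named in the source, and it interleaves exploration with selection rather than strictly separating them. The reward probabilities, `epsilon`, the trial count, and the function name are all assumptions.

```python
import random


def run_bandit(reward_probs, n_trials=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning over actions with fixed reward probabilities."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    estimates = [0.0] * n_actions   # running estimate of each action's reward
    counts = [0] * n_actions        # how often each action was chosen

    for _ in range(n_trials):
        if rng.random() < epsilon:
            # Exploration: try a random action.
            action = rng.randrange(n_actions)
        else:
            # Selection: choose the action with the highest estimated reward.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The environment returns only a scalar reward (1 or 0).
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental average of the rewards received for this action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


# Hypothetical environment with three actions of differing reward probability.
estimates, counts = run_bandit([0.2, 0.5, 0.8])
```

After enough trials, the choice counts typically concentrate on the action with the highest reward probability, while the occasional random choices keep the other estimates up to date.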
Operant conditioning