The Adolescent Brain
Reinforcement learning
● Pavlovian fear conditioning (learning a stimulus - outcome association)
● Instrumental conditioning (learning a stimulus - action - outcome
association)
● Probabilistic reinforcement learning; the same stimulus or action does
not always lead to the same outcome:
○ Reward (R) = experienced outcome (mPFC or OFC)
○ Value (V) = expectation of the outcome
● Use reinforcement learning to update expectations over the course of
multiple experiences
● Prediction error (δ) = the difference between the experienced reward and
expected value: δ = R - V; can be positive or negative
● Prediction error updates expectations of the outcome: V_{t+1} = V_t + δ_t
● Learning speed/learning rate (α): V_{t+1} = V_t + α± · δ_t, where α+ is
used for positive and α− for negative prediction errors
● Some people learn faster from bad experiences than from good experiences
● These five components are reflected in brain function during learning
● Striatum; necessary for learning from feedback. Parkinson’s patients show
a thinning of the striatum: they struggle to learn from feedback but not
from observation
● Ventral striatum; responds to reward magnitude and reward probability,
and reacts more strongly when a reward differs from what was expected
(prediction errors); important for learning from reward (implicated in
Pavlovian conditioning)
● Dorsal striatum; responds to reward, especially early in learning and
when there is a mapping between actions and outcomes (implicated in
instrumental conditioning)
● Orbitofrontal cortex (OFC); processes outcome value. After conditioning,
the ventral striatum responds to the stimulus in anticipation of reward,
but the OFC responds to the experienced outcome
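The update rule from the list above (δ = R - V, with separate learning rates for positive and negative prediction errors) can be sketched in Python. This is a minimal illustration and not code from the lecture: the function names, the starting value of 0 and the example learning rates are assumptions chosen for readability. The rule has the form of a standard delta-rule (Rescorla-Wagner-style) update.

```python
def update_value(value, reward, alpha_pos, alpha_neg):
    """Apply one value update and return (new_value, prediction_error).

    alpha_pos and alpha_neg are hypothetical names for the learning
    rates used after positive and negative prediction errors.
    """
    delta = reward - value  # prediction error: delta = R - V
    # Pick the learning rate that matches the sign of the prediction error.
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return value + alpha * delta, delta


def learn(rewards, alpha_pos=0.5, alpha_neg=0.5, initial_value=0.0):
    """Update the expected value over a sequence of experienced rewards.

    Returns a list of (value_after_update, prediction_error) per trial.
    """
    value = initial_value
    history = []
    for reward in rewards:
        value, delta = update_value(value, reward, alpha_pos, alpha_neg)
        history.append((value, delta))
    return history


if __name__ == "__main__":
    # Probabilistic outcome: the same choice is rewarded on most trials only.
    for trial, (value, delta) in enumerate(learn([1, 1, 0, 1]), start=1):
        print(f"trial {trial}: delta = {delta:+.4f}, V = {value:.4f}")
```

With both learning rates set to 0.5 and rewards 1, 1, 0, 1, the expected value moves through 0.5, 0.75, 0.375 and 0.6875. Lowering alpha_neg relative to alpha_pos makes the unrewarded third trial pull the value down less, which is the kind of asymmetry the learning-rate bullet describes.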
Cohen et al., 2010
● Striatal prediction error signals peaked in adolescence
● Prediction error signals were more dorsally located in adolescents
● Value signals (mPFC) were highest in children
● Value and prediction error signals show different developmental trajectories
● Larger striatal prediction errors might reflect a greater effect of positive
outcomes in adolescents
Van den Bos et al., 2012
● Adults learn faster than children or adolescents from positive prediction
errors, while children learn faster from negative prediction errors
● Ventral striatum (prediction errors) and medial prefrontal cortex
(outcome); people who learn faster from negative feedback also show
stronger functional connectivity between these two brain areas
● Different developmental patterns exist for different components of
reinforcement learning, including: value and positive prediction errors
(Cohen et al., 2010) and positive versus negative learning rates (Van den
Bos et al., 2012)