Dumb and dumber
What happens if the feature distribution does
not allow simple classifiers to work well?
Simple classifiers (few parameters, simple
structure …)
1) Are good: do not usually overfit
2) Are bad: can not solve hard problems
Exploiting weak classifiers
Instead of learning a single classifier
Learn many weak classifiers that are good at
different parts of the input space
Output class: vote of each classifier
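The idea above can be sketched as a simple unweighted majority vote. This is a minimal illustration, not a specific algorithm from the slides: the decision stumps, their feature indices, and thresholds are all made up for the example.

```python
# Hypothetical weak classifiers: decision stumps, each looking at one
# coordinate of the input (features and thresholds are illustrative).
def stump(feature, threshold):
    return lambda x: 1 if x[feature] > threshold else -1

weak_classifiers = [stump(0, 0.5), stump(1, 0.5), stump(0, -0.5)]

def majority_vote(classifiers, x):
    # Output class: the (unweighted) vote of each classifier.
    votes = sum(h(x) for h in classifiers)
    return 1 if votes >= 0 else -1

# Two of the three stumps vote +1 on this point, so the ensemble says +1.
print(majority_vote(weak_classifiers, (1.0, -1.0)))  # -> 1
```

Each stump alone is a very simple classifier, but the combined vote can carve out a more complex decision boundary than any single stump.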
Ensemble methods
We like that:
- Classifiers that are most ‘sure’ will vote with more
conviction
- Classifiers will be most ‘sure’ about a particular part of the
space
- On average, do better than single classifiers
How?
- Force a classifier h_t to learn a different part of the input space?
- Weight the votes of each classifier α_t?
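The second question can be sketched as a weighted vote, H(x) = sign(Σ_t α_t h_t(x)). This is a hedged toy example: the classifiers h1–h3 and the α values are invented for illustration (in boosting, each α_t would be derived from that classifier's training error).

```python
# Hypothetical weak classifiers h_t (illustrative, not from the slides).
def h1(x): return 1 if x[0] > 0 else -1
def h2(x): return 1 if x[1] > 0 else -1
def h3(x): return 1 if x[0] + x[1] > 0 else -1

classifiers = [h1, h2, h3]
alphas = [1.0, 0.4, 0.3]   # a 'surer' classifier votes with more conviction

def weighted_vote(classifiers, alphas, x):
    # H(x) = sign(sum_t alpha_t * h_t(x))
    score = sum(a * h(x) for a, h in zip(alphas, classifiers))
    return 1 if score >= 0 else -1

x = (-1.0, 2.0)
# h1 -> -1 (weight 1.0), h2 -> +1 (0.4), h3 -> +1 (0.3): score = -0.3
print(weighted_vote(classifiers, alphas, x))  # -> -1
```

Note that two of the three classifiers vote +1 here, yet the ensemble outputs -1: the highly weighted h1 overrides the simple majority, which is exactly the effect of weighting votes by α_t.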