Machine Learning
Introduction
1. What is machine learning?
↳ give the computer examples and let it figure out the code (i.e., the way of solving) itself
what makes a suitable ML problem?
- we can't solve the problem explicitly
- approximate solutions are fine
- limited reliability, predictability, interpretability is fine
- plenty of examples to learn from
where do we use ML?
- inside other software
- in analytics, data mining, data science
- in science / statistics
machine learning provides systems the ability to automatically learn and improve from experience without being explicitly programmed.
* reinforcement learning: taking actions in a world based on delayed feedback
* online learning: predicting & learning at the same time
* offline learning: separate learning, predicting & acting
1. take a fixed dataset of examples (= instances)
2. train a model to learn from examples
3. test the model to see if it works by checking its predictions
4. if the model works, put it into production (i.e., use its predictions to take actions)
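A minimal sketch of the four offline-learning steps above. The toy dataset, the majority-class "learner", and the names `train_majority` and `accuracy` are illustrative assumptions, not part of the lecture.

```python
import random


def train_majority(examples):
    """Hypothetical learner: remembers the most frequent label in the training data."""
    counts = {}
    for _, label in examples:
        counts[label] = counts.get(label, 0) + 1
    majority = max(counts, key=counts.get)
    # The "model" ignores the features and always predicts the majority label.
    return lambda features: majority


def accuracy(model, examples):
    """Fraction of examples for which the model's prediction equals the label."""
    correct = sum(1 for features, label in examples if model(features) == label)
    return correct / len(examples)


# 1. take a fixed dataset of (features, label) instances (toy data, assumed)
data = [((i,), "spam" if i % 4 == 0 else "ham") for i in range(40)]
rng = random.Random(0)
rng.shuffle(data)
train, test = data[:30], data[30:]

# 2. train a model on the training examples
model = train_majority(train)

# 3. test the model by checking its predictions on held-out examples
test_accuracy = accuracy(model, test)

# 4. only if test_accuracy is good enough would the model go into production
```

Holding out part of the fixed dataset for step 3 is one common way to check predictions; the notes do not say how the check is done.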
we do not want to find a solution for each problem individually (in isolation), we want generic solutions!
problem → abstract task → algorithm
abstract tasks
- supervised: explicit examples of input & output
  ↳ learn to predict the outcome for an unseen input
  * classification: assign a class to each example
  * regression: assign a number to each example
- unsupervised: only inputs provided
  ↳ find any pattern that explains something about the data
2. Classification
1. start with data (table): each row is an instance, the columns are the features plus a label (e.g., spam / ham)
the features are the things we measure about the instances
2. the dataset is fed through a learning algorithm (learner)
3. the learning algorithm outputs a model (classifier)
the model is constructed so that when it sees a new instance (with the same features as fed to the learner), it can produce a class for us (i.e., classify the data)
how to build a classifier?
- linear classifier: cut the feature space in two
  1D: dot, 2D: line, 3D: plane, 4D+: hyperplane
  every point in the model space is a line in the feature space
  ↳ using the definition of a line ($ax + by + c = 0$)
  $\mathrm{loss}_{\mathrm{data}}(\mathrm{model})$ = performance of model on the data
  → the lower, the better
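A small sketch of a 2D linear classifier and its loss. A model is a point $(a, b, c)$ in model space, which defines the line $ax + by + c = 0$ in feature space. Counting misclassified instances as the loss, the 0/1 class encoding, and the toy data are assumptions made for illustration.

```python
def classify(model, point):
    """Return class 1 on the positive side of the line ax + by + c = 0, else class 0."""
    a, b, c = model
    x, y = point
    return 1 if a * x + b * y + c > 0 else 0


def loss(model, data):
    """Assumed loss: number of misclassified instances (the lower, the better)."""
    return sum(1 for point, label in data if classify(model, point) != label)


# Toy dataset of (features, label) pairs (assumed, not from the notes)
data = [((1.0, 1.0), 0), ((2.0, 1.5), 0), ((4.0, 5.0), 1), ((5.0, 4.5), 1)]

good_model = (1.0, 1.0, -6.0)   # the line x + y - 6 = 0 separates the two classes
bad_model = (1.0, 0.0, -10.0)   # the line x = 10 puts every point in class 0

good_loss = loss(good_model, data)
bad_loss = loss(bad_model, data)
```

Comparing `good_loss` and `bad_loss` shows how the loss ranks two points in model space against the same data.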
- decision tree classifier: start at the top, and at every node in the tree we look at one feature & depending on whether it is higher or lower, we move to the left or right. The leaves are labelled with classes.
  the decision boundary is the shape that separates the classes in feature space
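A sketch of how a decision tree classifies an instance. The tree here is written by hand (no learning), and the dictionary layout, the thresholds, and the convention "lower or equal goes left, higher goes right" are assumptions for illustration.

```python
# Hand-built tree (hypothetical): internal nodes test one feature against a
# threshold, leaves carry a class label.
tree = {
    "feature": 0, "threshold": 2.5,
    "left": {"label": "ham"},
    "right": {
        "feature": 1, "threshold": 0.5,
        "left": {"label": "ham"},
        "right": {"label": "spam"},
    },
}


def predict(node, instance):
    """Walk from the top of the tree to a leaf and return the leaf's class."""
    while "label" not in node:
        if instance[node["feature"]] <= node["threshold"]:
            node = node["left"]    # feature value is lower (or equal): go left
        else:
            node = node["right"]   # feature value is higher: go right
    return node["label"]
```

For example, `predict(tree, (3, 1))` moves right twice and ends in the leaf labelled `"spam"`.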
- K Nearest Neighbours: doesn't do any learning, but remembers the whole dataset. When it gets a new point, it looks at the K nearest points that it knows and assigns to the new point the class that is most frequent in this set of neighbours.
  ↳ K is the hyperparameter of the algorithm
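A minimal sketch of K Nearest Neighbours classification as described above. Euclidean distance, the default `k=3`, and the toy data are assumptions; the notes do not fix a distance measure.

```python
import math
from collections import Counter


def knn_predict(train, point, k=3):
    """Assign to `point` the most frequent class among its k nearest known points."""
    # "Remembering the whole dataset": train is simply the list of (features, label).
    neighbours = sorted(train, key=lambda example: math.dist(example[0], point))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy dataset (assumed)
train = [
    ((1, 1), "ham"), ((1, 2), "ham"), ((2, 1), "ham"),
    ((6, 6), "spam"), ((7, 6), "spam"), ((6, 7), "spam"),
]
```

Calling `knn_predict(train, (1.5, 1.5))` looks only at the three nearest stored points, so no training step is needed beforehand.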
variations
- features: usually numerical or categorical
- binary classification: two classes
- multiclass classification: more than two classes
- multilabel classification: none, some or all classes may be true
- class probabilities / scores: the classifier reports a probability or score for each class
offline machine learning steps:
1. abstract the problem to a standard task (e.g., classification)
2. choose instances and their features
   ↳ for supervised learning, choose target.
3. choose model class (e.g., linear model, decision tree, KNN)
4. search for a good model
$x_i$: features of instance i
$y_i$: true label for $x_i$
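A sketch of step 4, searching for a good model within a chosen model class. The model class here is an assumed one-feature threshold classifier (predict 1 if $x_i$ is above the threshold), and the search simply tries each observed feature value as a candidate. Names and data are hypothetical.

```python
def error_rate(threshold, xs, ys):
    """Fraction of instances where the threshold model disagrees with the true label."""
    wrong = sum(1 for x, y in zip(xs, ys) if (1 if x > threshold else 0) != y)
    return wrong / len(xs)


def search_threshold(xs, ys):
    """Try every observed feature value as threshold and keep the one with lowest error."""
    candidates = sorted(xs)
    return min(candidates, key=lambda t: error_rate(t, xs, ys))


# Toy data (assumed): x_i is a single numerical feature, y_i the true label
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]

best = search_threshold(xs, ys)
best_error = error_rate(best, xs, ys)
```

Exhaustive search only works because this model class has a single parameter; larger model classes need a smarter search.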
3. Other abstract tasks
regression
where in classification the target is a class, in regression the target is a number. We have to predict the number, given the features.
data → learner → model
the loss function for regression:
$\mathrm{loss}(f) = \frac{1}{n} \sum_i (f(x_i) - y_i)^2$, the mean squared error (MSE) loss
regression trees and KNN regression can also be used
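A short sketch of the mean squared error loss for a regression model $f$. The two example models and the toy data are assumptions for illustration.

```python
def mse(f, xs, ys):
    """Mean squared error: average of (f(x_i) - y_i)^2 over all instances."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (assumed): the target is exactly twice the feature
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]


def perfect(x):
    """Hypothetical model that matches the data exactly."""
    return 2.0 * x


def offset(x):
    """Hypothetical model that is off by 1 everywhere."""
    return 2.0 * x + 1.0


perfect_loss = mse(perfect, xs, ys)
offset_loss = mse(offset, xs, ys)
```

A model that is wrong by 1 on every instance gets a loss of 1, and because the errors are squared, larger errors are penalised more than proportionally.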
clustering
we are given features, but no target values. The learner has to decide, based on patterns found, how to separate the dataset in clusters.
data → learner → model
- K-means: picks three random values (the means) and colours all points in the data according to which of the 3 means is closest. Recompute the location of each mean by averaging the locations of all the points with its colour. Then recolour the points. Iterate these two steps (recompute + recolour). In the end, the data / the feature space is separated into three natural clustering regions.
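A minimal sketch of the K-means loop described above (colour, recompute, recolour). Picking the initial means from the data points, the fixed number of iterations, and the function names are assumptions; the result can depend on the random initialisation.

```python
import random


def closest(point, means):
    """Index of the mean nearest to `point` (squared Euclidean distance)."""
    return min(
        range(len(means)),
        key=lambda j: sum((p - m) ** 2 for p, m in zip(point, means[j])),
    )


def k_means(points, k=3, iterations=20, seed=0):
    rng = random.Random(seed)
    means = rng.sample(points, k)  # assumed initialisation: k random data points
    for _ in range(iterations):
        # colour every point by its closest mean
        colours = [closest(p, means) for p in points]
        # recompute each mean as the average of the points with its colour
        for j in range(k):
            members = [p for p, c in zip(points, colours) if c == j]
            if members:  # keep the old mean if no point has this colour
                means[j] = tuple(sum(coord) / len(members) for coord in zip(*members))
    colours = [closest(p, means) for p in points]  # final recolouring
    return means, colours


# Toy 2D data (assumed)
points = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6),
          (5.0, 5.0), (5.3, 4.8), (4.9, 5.4),
          (10.0, 0.0), (10.2, 0.5), (9.7, 0.3)]
means, colours = k_means(points, k=3)
```

A practical version would stop when the colours no longer change instead of running a fixed number of iterations.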
density estimation
Density estimation is a lot like clustering, but the task of the learner is to produce a model that outputs a number that indicates whether that instance is likely according to the distribution of the data.
the output is a probability (categorical features) or a probability density (numerical features)
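A sketch of the two output types mentioned above. For a categorical feature the model returns a probability (here a relative frequency); for a numerical feature it returns a probability density (here a fitted Gaussian). Both model choices are assumptions for illustration; the notes do not name a particular density model.

```python
import math
from collections import Counter


def fit_categorical(values):
    """Model for a categorical feature: probability = relative frequency in the data."""
    counts = Counter(values)
    n = len(values)
    return lambda v: counts[v] / n  # unseen values get probability 0.0


def fit_gaussian(values):
    """Model for a numerical feature: density of a Gaussian fitted to the data."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n

    def density(v):
        return math.exp(-((v - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

    return density


probability = fit_categorical(["a", "a", "b", "c"])
density = fit_gaussian([1.0, 2.0, 3.0])
```

Instances close to the bulk of the data get a higher number than instances far away from it, which is what "likely according to the distribution of the data" means here.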
generative modeling
↳ building a model from which you can sample new examples (sampling)
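A very small sketch of generative modeling: fit a model to the data, then sample new examples from it. A one-dimensional Gaussian is assumed as the model purely for illustration; the function names and data are hypothetical.

```python
import math
import random


def fit_gaussian(values):
    """Fit a 1D Gaussian: return its mean and standard deviation."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return mu, sigma


def sample(model, n, seed=0):
    """Draw n new examples from the fitted model."""
    mu, sigma = model
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]


data = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy data (assumed)
model = fit_gaussian(data)
new_examples = sample(model, 1000)
```

The sampled values are not copies of the data, but they follow the same fitted distribution.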
semi-supervised learning
- $X_L$: small set of labelled data
- $X_U$: large set of unlabelled data
self-training:
  train classifier C on $X_L$
  loop:
    label $X_U$ with C
    retrain C on $X_U + X_L$
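A sketch of the self-training loop above. The classifier C is assumed to be a simple nearest-centroid classifier on one numerical feature, and the loop runs for a fixed number of rounds; both are illustrative choices, as are all names and data.

```python
def train_centroids(xs, ys):
    """Hypothetical classifier C: predict the class whose mean feature value is nearest."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: sums[y] / counts[y] for y in sums}
    return lambda x: min(centroids, key=lambda y: abs(x - centroids[y]))


def self_train(x_l, y_l, x_u, rounds=5):
    c = train_centroids(x_l, y_l)                    # train C on X_L
    for _ in range(rounds):
        y_u = [c(x) for x in x_u]                    # label X_U with C
        c = train_centroids(x_l + x_u, y_l + y_u)    # retrain C on X_U + X_L
    return c


# Toy data (assumed): two labelled points, six unlabelled points
x_l, y_l = [0.0, 10.0], ["a", "b"]
x_u = [1.0, 2.0, 3.0, 8.0, 9.0, 11.0]
classifier = self_train(x_l, y_l, x_u)
```

The labels assigned to $X_U$ are recomputed in every round, so early mistakes can be corrected, but they can also reinforce themselves.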