Summary of all in-class materials
Lecture 1: Intro
Learning from examples starts by collecting examples of the two classes we want to predict,
for example spam and non-spam. From these, a learning algorithm infers rules from the
existing examples. Eventually, we want these rules to be applied to new data to give the
classification outcome, in this case spam or non-spam.
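To make the idea of "inferring rules from examples" concrete, here is a minimal toy sketch in plain Python. It is not the method from the lecture: the messages, the labels, and the helper names `learn_spam_words` and `classify` are all made up for illustration, and the "rule" is simply "flag a message if it contains a word that was only ever seen in spam".

```python
# Toy illustration only: "learn" keyword rules from labelled examples.
# The messages, labels, and helper names are hypothetical.

def learn_spam_words(examples):
    """Return the words seen in spam examples but never in non-spam examples."""
    spam_words, other_words = set(), set()
    for text, label in examples:
        words = set(text.lower().split())
        if label == "spam":
            spam_words |= words
        else:
            other_words |= words
    return spam_words - other_words


def classify(text, spam_words):
    """Apply the learned rule to a new, unseen message."""
    words = set(text.lower().split())
    return "spam" if words & spam_words else "non-spam"


# Existing (labelled) examples the rules are inferred from.
training_examples = [
    ("win free money now", "spam"),
    ("free prize click now", "spam"),
    ("meeting moved to monday", "non-spam"),
    ("lunch now or later", "non-spam"),
]

rules = learn_spam_words(training_examples)

# New data: the learned rules give the classification outcome.
print(classify("claim your free prize", rules))
print(classify("see you on monday", rules))
```

Note that "now" does not become a rule, because it also occurs in a non-spam example. Real learning algorithms are far more robust than this, but the pattern is the same: fit on existing examples, then predict on new data.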
In this course, we will apply some basic ML techniques and familiarize ourselves with
scikit-learn, the leading Python package for ML (others are BigML and MLlib). Some examples
that use machine learning are:
Recognizing handwritten text
Recognizing faces in photos
Determining whether a text has a positive or negative sentiment
Flagging suspicious credit card activity
Recommending books/movies (Netflix)
Let’s look at the main types of learning.
Regression: We want to predict a real number (for example a house price, a grade,
or a stock price). We measure prediction error with MSE (mean squared error) or RMSE
(root mean squared error).
Binary classification: We want to classify into one of two possibilities (Yes/No, Spam/No-
spam, positive/negative sentiment). We measure error with the proportion of
mistakes, or with precision and recall, or even the F1-score.
o Precision: What fraction of the emails flagged as spam were actually spam?
o Recall: What fraction of the real spam was flagged?
o F1-score: Harmonic mean of precision and recall.
Multiclass classification: Classify into a finite set of multiple classes (for example,
newspaper categories or detecting animal species). For multiclass, it's hard to
measure using recall and precision. Therefore we simplify it a bit.
Ranking: We rank according to relevance (for example, this is what Google does with
webpages).
Autonomous behaviour: We can teach a car to react to input (for example sensory
data, cameras, microphones) and take actions (brake, steer, etc.).
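The error measures named above for regression and binary classification can be sketched in a few lines of plain Python. The numbers and labels below are invented for illustration, and the function names are our own; scikit-learn provides equivalents in `sklearn.metrics` (`mean_squared_error`, `precision_score`, `recall_score`, `f1_score`).

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE (same unit as the target)."""
    return math.sqrt(mse(y_true, y_pred))


def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Precision, recall and F1-score for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # flagged, really spam
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # flagged, not spam
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # spam that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0  # harmonic mean
    return precision, recall, f1


# Regression: made-up house prices (in thousands) and predictions.
prices_true = [200, 300, 250, 400]
prices_pred = [210, 290, 260, 390]
print(mse(prices_true, prices_pred), rmse(prices_true, prices_pred))

# Binary classification: made-up labels for eight emails.
labels_true = ["spam"] * 6 + ["non-spam"] * 2
labels_pred = ["spam"] * 3 + ["non-spam"] * 3 + ["spam", "non-spam"]
precision, recall, f1 = precision_recall_f1(labels_true, labels_pred)
print(precision, recall, f1)
```

In the classification example, 3 of the 4 flagged emails are really spam (precision 0.75), but only 3 of the 6 real spam emails were flagged (recall 0.5), which is why the two numbers are reported together or combined into the F1-score.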
How do we know that an algorithm is learning? Let's suppose we're studying for an exam.
First we train: based on the sheets, we infer knowledge (training set). Then, we potentially
take a practice exam, which we use to determine our progress (development set). Then,
we go to the final exam (which is obviously not accessible in advance), which is our test set.
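The exam analogy corresponds to splitting the data into three disjoint parts. Below is a minimal sketch in plain Python; the 60/20/20 proportions, the fixed seed, and the function name `train_dev_test_split` are assumptions for illustration, not values given in the lecture. In scikit-learn, `sklearn.model_selection.train_test_split` can be applied twice to get the same effect.

```python
import random


def train_dev_test_split(examples, dev_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the examples and cut them into train, development and test sets."""
    shuffled = list(examples)              # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    n_test = int(len(shuffled) * test_fraction)
    n_dev = int(len(shuffled) * dev_fraction)
    test = shuffled[:n_test]                # the "final exam": only used at the very end
    dev = shuffled[n_test:n_test + n_dev]   # the "practice exam": used to check progress
    train = shuffled[n_test + n_dev:]       # the "study sheets": used to learn
    return train, dev, test


examples = list(range(100))  # stand-in for 100 labelled examples
train, dev, test = train_dev_test_split(examples)
print(len(train), len(dev), len(test))
```

The important property is that the three sets do not overlap: performance on the test set only tells us something if the algorithm has never seen those examples during training or tuning.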
Machine learning studies algorithms that can learn to solve problems from examples.
There are several canonical problem types, including regression, (multi-)classification and
ranking. The first step is always to decide on our data split and how to evaluate
(Precision/Recall, or RMSE).