
Summary Machine Learning

Sold: 51
Pages: 61
Published: 1 February 2023
Written in: 2022/2023

English Summary of Machine Learning course of Master Data Science and Society at Tilburg University. A summary of lecture materials, readings, and notes.


Machine learning
Lecture 1

Machine learning is about automating problem solving. It is the study of computer
algorithms that improve automatically through experience: a program becomes better at a
task T, based on some experience E, with respect to some performance measure P.
Examples:
- Spam detection
- Movie recommendation
- Speech recognition
- Credit risk analysis
- Autonomous driving
- Medical diagnosis.
The result is a learned algorithm: the rules come from experience (data).

What does it involve?
- ML involves a notion of generalization. When the machine learns relationships
between the input and the output, we want these to hold on unseen data. The
question is whether it is safe to assume that current observations generalize to
future observations.
- Critical components are annotated data, an objective, an optimization algorithm,
features/representations, and assumptions.
- We assume the dataset represents the population. As a rule, more data leads to
better output.
- An optimization algorithm works incrementally towards the best outcome.
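The idea of an optimization algorithm that improves step by step can be sketched with gradient descent. This is only an illustration, not necessarily the optimizer used in the course: the function name `fit_slope`, the learning rate, the number of steps, and the toy data are all assumptions. It fits a single slope w so that w * x approximates y, by repeatedly moving w against the gradient of the mean squared error.

```python
def fit_slope(xs, ys, learning_rate=0.01, steps=500):
    """Fit y ≈ w * x by gradient descent on the mean squared error (toy sketch)."""
    w = 0.0  # start from an arbitrary guess
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x - y)^2) with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad  # small step towards a lower error
    return w


# Hypothetical data generated by y = 2 * x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = fit_slope(xs, ys)
print(w)  # close to 2.0
```

Each iteration lowers the error a little, which is what "incrementally" means here.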

Different types of learning (by starting point):
- Supervised learning: annotated/labelled dataset (ground truth)
o Classification: the target is a discrete variable
o Regression: the target is a continuous variable
- Unsupervised learning: unlabelled dataset
o Clustering

Example: spam vs non-spam

This is usually a text mining problem. The emails have to be pre-processed in such a way
that we can create features from the dataset. This is a binary classification problem. The
learning algorithm should come up with a function that maps the representation of an
email to its label (spam or non-spam).
- Find examples of spam and non-spam
- Come up with a learning algorithm
- A learning algorithm infers rules from examples: if (A or B or C) and not D, then spam
- These rules can then be applied to new data (emails)
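A rule of the form "if (A or B or C) and not D, then spam" can be applied to new emails as in the sketch below. The rule here is written by hand purely for illustration; a learning algorithm would infer it from labelled examples. The function name `is_spam` and the four keyword features are hypothetical.

```python
def is_spam(email_text: str) -> bool:
    """Apply one rule of the form (A or B or C) and not D to an email."""
    text = email_text.lower()
    a = "free money" in text   # hypothetical feature A
    b = "winner" in text       # hypothetical feature B
    c = "click here" in text   # hypothetical feature C
    d = "lecture" in text      # hypothetical feature D (hints at a legitimate email)
    return (a or b or c) and not d


# Applying the rule to new, unseen emails
new_emails = [
    "You are a WINNER, click here to claim your prize",
    "Slides for tomorrow's lecture: click here",
    "Lunch at noon?",
]
flags = [is_spam(email) for email in new_emails]
print(flags)  # [True, False, False]
```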

Learning algorithms:
- See several different learning algorithms
- Implement 2-3 simple ones from scratch in Python
- Learn about Python libraries for ML (scikit-learn)
- How to apply them to real-world problems

Machine learning examples:
- Recognize handwritten numbers and letters
- Recognize faces in photos
- Determine whether text expresses positive, negative or no opinion
- Guess a person’s age based on a sample of their writing
- Flag suspicious credit-card transactions
- Recommend books and movies to users based on their own and others’ purchase
history
- Recognize and label mentions of people’s or organizations’ names in text

Types of learning problems:
Regression:
- Response: a (real) number
- Predict a person’s age
- Predict the price of a stock
- Predict a student’s score on an exam
Binary classification:
- Response: Yes/No answer
- Detect spam
- Predict polarity of product review: positive vs negative
Multiclass classification:
- Response: one of a finite set of options
- Classify newspaper article as:
o Politics, sports, science, technology, health, finance
- Detect species based on photo
o Passer domesticus, Calidris alba, Streptopelia decaocto, Corvus corax
Multilabel classification:
- The output does not have to be a single label; it can be several labels at once
(this is the difference with multiclass classification)
- Assign songs to one or more genres (rock, pop, metal)
- The aim is not to get every label right, but to find as many correct labels as
possible during training.
Autonomous behavior (example of a car):
- Input: measurements from sensors – camera, microphone, radar, accelerometer.
- Response: instructions for actuators – steering, accelerator, brake.

Evaluation:
- Choose a baseline, choose a metric, compare!
- Different tasks, different metrics:
o Predicting age (regression)
o Flagging spam (binary classification)

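Two metrics that we often use in regression problems:
- Mean absolute error (MAE): the average absolute difference between the true value and
the predicted value (y_n is the true value (ground truth), ŷ_n the predicted value, N the
number of examples):
$$\text{MAE} = \frac{1}{N} \sum_{n=1}^{N} |y_n - \hat{y}_n|$$
- Mean squared error (MSE): the average squared difference between the true value and
the predicted value. It is more sensitive to outliers than MAE, but it is differentiable
everywhere (MAE is not differentiable where the error is zero):
$$\text{MSE} = \frac{1}{N} \sum_{n=1}^{N} (y_n - \hat{y}_n)^2$$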
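The two regression metrics can be computed in a few lines of plain Python. The sketch below uses made-up ages in which one prediction is far off, to show that the outlier affects MSE much more than MAE; the function and variable names are illustrative only.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical age predictions; the last one is an outlier (off by 20 years)
ages_true = [20, 30, 40, 50]
ages_pred = [22, 28, 40, 70]

mae = mean_absolute_error(ages_true, ages_pred)  # (2 + 2 + 0 + 20) / 4 = 6.0
mse = mean_squared_error(ages_true, ages_pred)   # (4 + 4 + 0 + 400) / 4 = 102.0
print(mae, mse)
```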



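For binary classification problems, the metrics often used are:
- Accuracy: the fraction of examples classified correctly
- Error rate: the fraction of examples classified incorrectly (1 - accuracy)
These are not very informative, especially if the dataset is not balanced: when 99% of
emails are non-spam, a classifier that never flags anything still reaches 99% accuracy.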
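A short sketch of why accuracy can mislead on unbalanced data. The dataset is invented (5 spam emails out of 100) and the "classifier" simply never flags anything; the names are illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for y, p in zip(y_true, y_pred) if y == p)
    return correct / len(y_true)


# Hypothetical unbalanced dataset: 1 = spam, 0 = non-spam
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100  # a trivial classifier that never flags spam

acc = accuracy(y_true, y_pred)  # 0.95
error_rate = 1 - acc            # about 0.05
print(acc, error_rate)
```

The accuracy looks high even though not a single spam email was caught, which is why metrics that focus on one kind of mistake are needed.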



Classification (spam example):
- False positive – flagged as spam, but not spam
- False negative – not flagged, but is spam
- False positives are a bigger issue for this problem!
- True positive – spam classified as spam
- True negative – non-spam classified as non-spam

Precision and recall:
- Metrics which focus on one kind of mistake
- Precision: what fraction of flagged emails were real spam?
$$\text{Precision} = \frac{TP}{TP + FP}$$
- Recall: what fraction of real spam emails were flagged?
$$\text{Recall} = \frac{TP}{TP + FN}$$
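Precision and recall follow directly from the counts of true positives (TP), false positives (FP) and false negatives (FN). The counts below are made up, and returning 0.0 when a denominator is zero is a convention assumed for this sketch.

```python
def precision(tp, fp):
    """Fraction of flagged items that were truly positive."""
    return tp / (tp + fp) if tp + fp else 0.0  # 0.0 on empty denominator: assumed convention


def recall(tp, fn):
    """Fraction of truly positive items that were flagged."""
    return tp / (tp + fn) if tp + fn else 0.0  # 0.0 on empty denominator: assumed convention


# Hypothetical spam filter: 30 spam caught, 10 good emails flagged, 20 spam missed
tp, fp, fn = 30, 10, 20
p = precision(tp, fp)  # 30 / 40 = 0.75
r = recall(tp, fn)     # 30 / 50 = 0.6
print(p, r)
```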


Example (confusion matrix):
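A binary confusion matrix is just the four counts TP, FP, FN and TN. The sketch below derives them from invented labels (True = spam); it is not the example from the original slides, and the function name is illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted and actual:
            tp += 1  # spam flagged as spam
        elif predicted and not actual:
            fp += 1  # non-spam flagged as spam
        elif actual:
            fn += 1  # spam that was not flagged
        else:
            tn += 1  # non-spam left alone
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical labels: True = spam, False = non-spam
y_true = [True, True, True, False, False, False, False, False]
y_pred = [True, True, False, True, False, False, False, False]
counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 4}
```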




F-score:
- Harmonic mean of precision (P) and recall (R), a kind of average:
$$F_1 = \frac{2 \cdot P \cdot R}{P + R}$$
- Also known as the F-measure

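Fβ:
- The parameter β quantifies how much more we care about recall than precision. When β is
greater than 1, recall is weighted more; when β is smaller than 1, precision is weighted
more; β = 1 gives the ordinary F-score. With P for precision and R for recall:
$$F_\beta = \frac{(1 + \beta^2) \cdot P \cdot R}{\beta^2 \cdot P + R}$$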
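The effect of β can be checked numerically. This sketch uses the standard Fβ formula with invented precision and recall values; returning 0.0 when both are zero is an assumed convention.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0  # assumed convention to avoid dividing by zero
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical classifier with low precision and perfect recall
p, r = 0.5, 1.0
f1 = f_beta(p, r)               # ordinary F-score, about 0.667
f2 = f_beta(p, r, beta=2.0)     # recall weighted more, about 0.833
f_half = f_beta(p, r, beta=0.5) # precision weighted more, about 0.556
print(f1, f2, f_half)
```

With β = 2 the score moves towards the recall of 1.0; with β = 0.5 it moves towards the precision of 0.5.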



Multiclass classification:
You can also make a confusion matrix for multiclass classification.




When there are more than two classes, you need alternatives for scoring the learning
outcomes. You can use the macro-average and the micro-average.

Macro-average:
Precision is true positives over labelled (predicted) positives; recall is true positives
over actual positives.
- Best suited to balanced data.
- Compute precision and recall per class, and average over the k classes:
$$P_{\text{macro}} = \frac{1}{k} \sum_{i=1}^{k} P_i, \qquad R_{\text{macro}} = \frac{1}{k} \sum_{i=1}^{k} R_i$$
- Rare classes have the same impact as frequent classes

Micro-average:
- Gives every data point equal importance (this is the difference from the macro-average).
- Micro-averaging treats the entire dataset as one aggregate result and calculates a single
metric, rather than k per-class metrics that are averaged together.
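The difference between the two averages shows up clearly on unbalanced data. The sketch below computes macro- and micro-averaged precision for an invented three-class problem; the helper names are illustrative, and scoring a class with no predictions as 0.0 is an assumed convention.

```python
def per_class_counts(y_true, y_pred, labels):
    """Return {label: (true positives, false positives)} for each class."""
    counts = {}
    for label in labels:
        tp = sum(1 for y, p in zip(y_true, y_pred) if p == label and y == label)
        fp = sum(1 for y, p in zip(y_true, y_pred) if p == label and y != label)
        counts[label] = (tp, fp)
    return counts


def macro_precision(counts):
    """Average the per-class precisions: every class counts equally."""
    per_class = [tp / (tp + fp) if tp + fp else 0.0 for tp, fp in counts.values()]
    return sum(per_class) / len(per_class)


def micro_precision(counts):
    """Pool all counts first: every data point counts equally."""
    total_tp = sum(tp for tp, _ in counts.values())
    total_fp = sum(fp for _, fp in counts.values())
    return total_tp / (total_tp + total_fp) if total_tp + total_fp else 0.0


# Hypothetical articles: one frequent class and two rare ones
y_true = ["politics"] * 8 + ["sports", "science"]
y_pred = ["politics"] * 8 + ["politics", "science"]
counts = per_class_counts(y_true, y_pred, ["politics", "sports", "science"])

macro = macro_precision(counts)  # (8/9 + 0 + 1) / 3, about 0.63
micro = micro_precision(counts)  # 9 / 10 = 0.9
print(macro, micro)
```

The missed rare class "sports" pulls the macro-average down sharply, while the micro-average is dominated by the frequent class.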
Seller: liekebuuron (Avans Hogeschool)