Complete Week 1 note: Machine Learning & Learning Algorithms (BM05BAM)

Pages: 10
Published on: 12-03-2024
Written in: 2023/2024

THIS IS A COMPLETE NOTE FROM ALL BOOKS + THE LECTURE! Save your time for internships and other courses by studying from this note! Are you a 1st/2nd-year Business Analytics Management student at RSM who wants to survive the Block 2 Machine Learning module? Are you overwhelmed by 30 pages of reading every week, full of brand-new terms and formulas? If you are lost about where to start, struggling to keep up because of your other courses, or simply willing to learn about Machine Learning, I have got you covered. I successfully passed the course Machine Learning & Learning Algorithms at RSM with a 7.6, WITHOUT A TECHNICAL BACKGROUND before this Master. So if you come from a non-technical bachelor programme, this note will point you to the knowledge you should focus on to pass the exam and complete the assignments; and if you already have some machine learning knowledge, this note will certainly make your life easier and boost your grade.


Document information

Published on: March 12, 2024
Number of pages: 10
Written in: 2023/2024
Type: Lecture notes
Professor(s): Jason Roos
Contains: All classes

Content preview

HLM, Chapter 1: The Machine Learning Landscape

Machine learning = field of study that gives computers the ability to learn without being
explicitly programmed.

Types of machine learning systems: covered together with ISLR Chapter 2.
Main challenges of machine learning:
Generalization
Generalization problems can be caused by sampling bias or overfitting.

It is crucial to use a training set that is representative of the cases you want to generalize
to. This is often harder than it sounds: if the sample is too small, you will have sampling
noise (i.e., nonrepresentative data as a result of chance, outliers, or data errors).

However, even very large samples can be nonrepresentative if the sampling method is
flawed. This is called sampling bias.

Regularization: constraining a model to make it simpler and reduce the risk of overfitting.
The amount of regularization to apply during learning can be controlled by a
hyperparameter. A hyperparameter is a parameter of the learning algorithm (not of the
model).
- It must be set prior to training and remains constant during training.
- If you set the regularization hyperparameter to a very large value, you will get an
almost flat model (a slope close to zero).
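As a minimal sketch of this effect (plain Python, a toy 1-D ridge model without an intercept; the data and the `ridge_slope` helper are illustrative, not from the course):

```python
# Toy 1-D ridge regression without intercept, closed form:
#   w = (sum of x*y) / (sum of x^2 + alpha)
# alpha is the regularization hyperparameter: it is fixed before
# "training" and is not learned from the data.
def ridge_slope(xs, ys, alpha):
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [3 * x for x in xs]            # true relationship: y = 3x

print(ridge_slope(xs, ys, 0.0))     # no regularization: slope 3.0
print(ridge_slope(xs, ys, 1e6))     # huge alpha: slope near 0, an almost flat model
```

The learned slope is a model parameter (estimated from the data), while alpha is a hyperparameter (chosen before training), which is exactly the distinction the next section draws.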

Hyperparameter vs parameter
- A model parameter is
  - estimated during model training;
  - internally optimized.
- A hyperparameter must be
  - specified before model training;
  - optimized externally.

Concept drift: the relationship that the model estimates changes after the model is
trained, due to an external conceptual change in the circumstances.

Testing and validating:
A better option than testing on new data directly is to split your data into two sets: the
training set and the test set, which allows you to test performance before moving on to
actual use. As these names imply, you train your model using the training set, and you
test it using the test set.
- It is common to use 80% of the data for training and hold out 20% for testing.
However, this depends on the size of the dataset.

The error rate on new cases is called the generalization error (or out-of-sample error), and
by evaluating your model on the test set, you get an estimate of this error.
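The 80/20 split described above can be sketched in plain Python (the `train_test_split` helper is a hypothetical name for illustration; real libraries offer equivalents):

```python
import random

# Hypothetical 80/20 holdout split: shuffle once with a fixed seed so the
# split is reproducible, then carve off the test portion. The error measured
# on the held-out test set estimates the generalization error.
def train_test_split(data, test_ratio=0.2, seed=42):
    shuffled = data[:]                       # copy; leave the original intact
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

data = list(range(100))
train, test = train_test_split(data)
print(len(train), len(test))                 # 80 20
```

Shuffling before splitting matters: if the data is ordered (e.g., by date or class), a plain slice would give a nonrepresentative test set, which is the sampling-bias problem from earlier.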

Hyperparameter tuning and model selection
Suppose you are hesitating between two types of models (say, a linear model and a
polynomial model): how can you decide between them?

When you want to compare just two different models: simply train both on the same
training data and compare their generalization performance on the test data.
When you want to find the best-performing hyperparameter among 100 options, you
cannot do the same.
- When you measure the generalization error multiple times on the test set, you
adapt the model and hyperparameters to produce the best model for that
particular set, so it won't perform as well on new data.

A common solution to this problem is called holdout validation: you simply hold out part
of the training set to evaluate several candidate models and select the best one. The new
held-out set is called the validation set (or sometimes the development set, or dev set).

Process
1. Train multiple models with various hyperparameters on the reduced training
set.
2. Select the model that performs best on the validation set (the holdout validation
process).
   a. If the model performs poorly on the train-dev set, then it must have overfit
   the training set, so you should try to simplify or regularize the model, get
   more training data, or clean up the training data.
3. Train the best model on the full training set, including the validation set.
4. Estimate the generalization error on the test set.
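The four steps above can be sketched end to end in plain Python (a toy 1-D ridge model on hypothetical noisy data; the candidate alpha values are arbitrary, and the split sizes are illustrative):

```python
import random

def ridge_slope(xs, ys, alpha):
    # Toy 1-D ridge model (no intercept): w = sum(x*y) / (sum(x^2) + alpha)
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def mse(w, xs, ys):
    # Mean squared error of the model y = w*x on a given set
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(300)]
ys = [3 * x + rng.gauss(0, 0.3) for x in xs]      # true slope 3, plus noise

# Split: 200 reduced-training, 50 validation, 50 test
train_x, train_y = xs[:200], ys[:200]
val_x, val_y = xs[200:250], ys[200:250]
test_x, test_y = xs[250:], ys[250:]

# 1. Train candidate models (one per hyperparameter) on the reduced training set
candidates = [0.0, 1.0, 10.0, 100.0, 1000.0]
# 2. Select the candidate that performs best on the validation set
best_alpha = min(candidates,
                 key=lambda a: mse(ridge_slope(train_x, train_y, a), val_x, val_y))
# 3. Retrain the winner on the full training set (training + validation)
w = ridge_slope(train_x + val_x, train_y + val_y, best_alpha)
# 4. Estimate the generalization error on the untouched test set
print(best_alpha, mse(w, test_x, test_y))
```

Note that the test set is used exactly once, at the very end; all hyperparameter choices were made against the validation set, so the test error stays an honest estimate of the generalization error.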

The validation set should not be too small: model evaluations will then be imprecise.
The validation set should not be too large: the remaining training set will be much
smaller, which would change the performance result after training on the full training set.

One way to solve this problem is cross-validation, which uses many small validation sets.
Each model is evaluated once per validation set after being trained on the rest of the
data. By averaging the evaluations of a model, you get a much more accurate measure of
its performance.
- It also means that training time is multiplied by the number of validation sets.
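A minimal k-fold cross-validation sketch in plain Python (same toy 1-D ridge model and hypothetical noisy data as before; `cross_val_score` is an illustrative name, not a course-provided function):

```python
import random

def ridge_slope(xs, ys, alpha):
    # Toy 1-D ridge model (no intercept): w = sum(x*y) / (sum(x^2) + alpha)
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def mse(w, xs, ys):
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cross_val_score(xs, ys, alpha, k=5):
    # k small validation sets: each model is trained on the other k-1 folds,
    # evaluated once on the held-out fold, and the k scores are averaged.
    # The model is trained k times, so training cost is multiplied by k.
    n, scores = len(xs), []
    for i in range(k):
        lo, hi = i * n // k, (i + 1) * n // k
        w = ridge_slope(xs[:lo] + xs[hi:], ys[:lo] + ys[hi:], alpha)
        scores.append(mse(w, xs[lo:hi], ys[lo:hi]))
    return sum(scores) / k

rng = random.Random(1)
xs = [rng.uniform(-1, 1) for _ in range(100)]
ys = [3 * x + rng.gauss(0, 0.3) for x in xs]
print(cross_val_score(xs, ys, alpha=1.0))
```

Because every point serves as validation data exactly once, the averaged score is far less sensitive to an unlucky single split than plain holdout validation.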

No Free Lunch theorem
David Wolpert demonstrated that if you make absolutely no assumptions about the data,
then there is no reason to prefer one model over any other. This is called the No Free
Lunch (NFL) theorem.
Seller: ArisMaya, Erasmus Universiteit Rotterdam