Lecture 1
Unsupervised machine learning:
- given a dataset {xi}, find some interesting properties (clustering, density
estimation, generative models).
- a type of machine learning that learns from data without human
supervision.
- Unsupervised machine learning models are given unlabeled data and
allowed to discover patterns and insights without any explicit guidance or
instruction.
Supervised machine learning (most common):
- given a training dataset {xi, yi}, predict ŷi for previously unseen
samples (regression -> yi is continuous, classification -> yi is categorical).
- a category of machine learning that uses labeled datasets to train
algorithms to predict outcomes and recognize patterns.
- supervised learning algorithms are given labeled training data to learn the
relationship between the inputs and the outputs.
Notations:
- Y = outcome measurement (dependent variable/response/target)
In regression Y is quantitative (e.g. price, blood pressure)
In classification Y takes values in a finite, unordered set
(e.g. survived/died, cancer class of a tissue sample).
For both we need a training dataset: pairs of observations (xi, yi).
- X = vector of p predictor measurements
(inputs/regressors/covariates/features/independent variables)
- Machine learning aims to ’learn’ a model f that predicts the outcome
Y given the input X:
Y = f (X) + ϵ
Epsilon (ϵ) captures measurement errors and other sources of noise.
The main aim of machine learning is that we want to create a formula f
that is applicable to different situations. On the basis of the training data
we would like to:
- Accurately predict unseen test cases
- Understand which inputs affect the outcome and how
- Assess the quality of our predictions
Classification:
- The feature space is the space in which you plot the data
points.
- k-Nearest neighbours (k-NN) classifier algorithm: a
supervised learning classifier, which classifies or
predicts the output for a given input based on
its closest neighbours in the feature space.
- The value of k can be changed to achieve a
better decision boundary.
How to pick the k-value: give more data to the model
(see 'Choosing the k value' below).
ŷ ('y hat') is the notation for a prediction.
Formalized k-NN algorithm:
- xnew = [x0, x1] are the features of a new sample for which you want
to predict the class ŷnew.
- Compute the distance between xnew and the existing points xi in your
training dataset, using the Euclidean distance:
d(xnew, xi) = √((xnew,0 − xi,0)² + (xnew,1 − xi,1)²)
- Sort the samples based on the distance and pick the k nearest ones
to the new example.
- Determine the class of the k nearest training samples.
- Assign to xnew the majority class of its nearest training samples
(neighbours).
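A minimal Python sketch of these steps (numpy only; the function name knn_predict and the toy data are illustrative, not from the lecture):

import numpy as np

def knn_predict(X_train, y_train, x_new, k=3):
    # 1. Compute the Euclidean distance from x_new to every training point.
    distances = np.sqrt(np.sum((X_train - x_new) ** 2, axis=1))
    # 2. Sort by distance and take the k nearest training samples.
    nearest = np.argsort(distances)[:k]
    # 3. Look up the classes of those k neighbours.
    neighbour_classes = y_train[nearest]
    # 4. Assign the majority class among the neighbours.
    values, counts = np.unique(neighbour_classes, return_counts=True)
    return values[np.argmax(counts)]

# Toy example: two features, two classes.
X_train = np.array([[1.0, 1.0], [1.5, 2.0], [5.0, 5.0], [6.0, 5.5]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.2, 1.5])))  # -> 0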
The k-NN algorithm can be extended:
- Using it for more than two classes
- Using k-NN for regression is also possible (instead of computing the
majority class of the nearest neighbours, we compute the average
target value y).
- Using a different distance metric is also common: for example the L1
distance instead of the Euclidean distance:
d(xnew, xi) = |xnew,0 − xi,0| + |xnew,1 − xi,1|
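A sketch of the last two extensions, assuming the same numpy setup as above (illustrative, not the lecture's code):

import numpy as np

def knn_regress(X_train, y_train, x_new, k=3, use_l1=False):
    if use_l1:
        # L1 (Manhattan) distance: sum of absolute coordinate differences.
        distances = np.sum(np.abs(X_train - x_new), axis=1)
    else:
        # Euclidean (L2) distance.
        distances = np.sqrt(np.sum((X_train - x_new) ** 2, axis=1))
    nearest = np.argsort(distances)[:k]
    # Regression variant: average the target values of the k nearest
    # neighbours instead of taking a majority vote.
    return y_train[nearest].mean()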
Choosing the k value:
We need to choose k based on the performance
on an independent test set -> no examples
should be related to the ones in the training set.
Compute the error rate for the test set and the training
set, and determine a good k by plotting both error curves
and looking for the k with the lowest test error rate.
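A sketch of this procedure using scikit-learn (assuming it is installed; the synthetic dataset is a placeholder for your own data):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Placeholder data: two features, two classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

for k in [1, 3, 5, 15, 51]:
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    train_error = 1 - model.score(X_train, y_train)  # error rate = 1 - accuracy
    test_error = 1 - model.score(X_test, y_test)
    print(f"k={k}: train error {train_error:.2f}, test error {test_error:.2f}")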
The error on the independent test dataset is called the generalization
error: it tells us how well we can expect our classifier to generalize its
performance on new, unseen examples.
- Classifiers that produce simple decision boundaries can have higher
training errors but usually generalize better to new samples.
- Classifiers that produce complex decision boundaries can have lower
training errors but usually generalize worse to new samples.
- Model complexity decreases as k gets bigger:
Small values of k -> noise can have a large influence ->
complex model -> overfitting -> low training error -> poor performance on
new data, because the model knows the training set very well.
Large values of k -> noise has little influence -> less complex
model -> underfitting -> high training error -> too much generalization,
so important patterns are missed.
Parametric models:
- The number of parameters is fixed.
- Once the model is trained (parameters are determined), we can
throw away the training dataset.
- Linear regression is an example.
Non-parametric models:
- The number of parameters is not fixed, it grows with the number of
training samples.
- k-NN is an example of a non-parametric machine learning model.
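A small sketch of this contrast (using scikit-learn's LinearRegression; the data is made up for illustration):

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + rng.normal(size=100)

# Parametric: after fitting, the model is fully described by a fixed
# number of parameters and the training data can be thrown away.
lin = LinearRegression().fit(X, y)
print(lin.intercept_, lin.coef_)  # two numbers, regardless of dataset size

# Non-parametric (k-NN): the 'model' is the stored training data itself,
# so its size grows with the number of training samples.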
Lecture 2
11-9-2024
Linear models for regression and
classification
In general:
- Fundamentals
- Model interpretation
- Estimation
- Model evaluation
Everything covered in the lectures is exam material; look at the book for
different interpretations.
Linear models = combination of inputs (predictors, features or
independent variables) to predict the output
Regression = output is quantitative (continuous variable)
Classification = output is categorical (binary variable, can be extended to
multiclass)
When to use a linear model? Look at the complexity of the relationship
between the variables and the output.
Least squares method:
[Figure: scatter plot of the data with Y on the y-axis and X on the x-axis;
least squares finds the best-fitting line (the red line in the plot).]
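A minimal sketch of fitting that line with ordinary least squares in numpy (the data here is synthetic, for illustration only):

import numpy as np

# Synthetic data: y depends linearly on x, plus noise (the epsilon term).
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=50)

# Least squares picks the intercept b0 and slope b1 that minimize
# the sum of squared residuals, sum((y - b0 - b1*x)^2).
b1, b0 = np.polyfit(x, y, deg=1)  # polyfit returns highest degree first
print(f"fitted line: y = {b0:.2f} + {b1:.2f} * x")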