Deep Learning Lecture Notes

Lecture 1: Linear Algebra Refresher

Linear algebra is essential in the field of deep learning, as it is used to represent and manipulate high-dimensional data, and to optimize the parameters of deep neural networks.

- A scalar is a single value, such as a number or a constant. It can be any real or complex
number.
- A vector is an array of numbers or scalars.
  o Geometrically, a vector can be drawn as an arrow; its magnitude is the length of that arrow, computed as the Euclidean norm of the vector.
- A matrix is a rectangular array of numbers or scalars. It can be used to represent a linear
transformation or a system of linear equations.
  o Matrix multiplication is not commutative, meaning that A∗B is not the same as B∗A in general.
o The determinant of a matrix is a scalar value that represents the scaling factor of the
matrix. It can be used to determine if a matrix is invertible and to find the inverse of a
matrix.
- A tensor is a multi-dimensional array of numbers or scalars. It can be used to represent high-dimensional data, such as images or videos.
  o Tensor contraction and the tensor product are the two most common operations used on tensors (see the NumPy sketch after this list):
    1. Tensor contraction is the process of summing over a set of indices to reduce the number of dimensions in a tensor.
    2. The tensor product is the operation of combining two or more tensors to form a new, higher-dimensional tensor.
- The dot product (inner product) is a way of multiplying two vectors together.
  o It produces a scalar value that can be used to measure the similarity between two vectors or the angle between them.
  o Given vectors $\vec{v} = [a_1, a_2, a_3]$ and $\vec{w} = [b_1, b_2, b_3]$, the dot product is $\vec{v} \cdot \vec{w} = a_1 b_1 + a_2 b_2 + a_3 b_3$.
  o The dot product of two vectors equals the magnitude of one vector multiplied by the magnitude of the other and by the cosine of the angle between them: $\vec{v} \cdot \vec{w} = |\vec{v}| \, |\vec{w}| \cos\theta$ (see the NumPy sketch after this list).
- A matrix-vector product is a way of multiplying a matrix and a vector together. The result is the vector obtained by applying the linear transformation represented by the matrix to the input vector.
  o The resulting vector has as many entries as the matrix has rows (the vector's length must match the number of columns of the matrix).
  o The elements of the resulting vector are obtained by taking the dot product of each row of the matrix with the vector.
  o Example (reproduced in the NumPy sketch after this list):
    1. $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $\vec{x} = [5, 6]$
    2. $A\vec{x} = \begin{bmatrix} 1 \times 5 + 2 \times 6 \\ 3 \times 5 + 4 \times 6 \end{bmatrix} = \begin{bmatrix} 17 \\ 39 \end{bmatrix} = [17, 39]$
- Matrix-matrix multiplication is a way of multiplying two matrices together. The resulting
matrix represents the composition of the two original matrices as linear transformations.

- A norm is a function that assigns a scalar value to a vector or a matrix. It can be used to measure the size of a vector or matrix, or the distance between two of them.
  o The most common norm used in linear algebra is the Euclidean (L2) norm.
  o Other norms include the L1 norm, which is the sum of the absolute values of the components, and the max norm, which is the largest absolute value of the components. These norms can be used to measure the sparsity or the largest entry of a vector or matrix (see the NumPy sketch after this list).
  o Norms are used in deep learning to measure the size of the parameters of a neural network, and to regularize the model to prevent overfitting.
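The notes include no code, so here is a minimal NumPy sketch of tensor contraction and the tensor product. NumPy, `np.tensordot`, and `np.einsum` are my choice of illustration, not something the notes prescribe; the arrays are arbitrary examples.

```python
import numpy as np

# Tensor contraction: sum over a pair of indices to reduce dimensionality.
# Contracting the last axis of T with the first axis of M generalises the
# matrix product to higher-rank tensors.
T = np.arange(24).reshape(2, 3, 4)        # a rank-3 tensor
M = np.arange(20).reshape(4, 5)           # a matrix (rank-2 tensor)
C = np.tensordot(T, M, axes=([2], [0]))
print(C.shape)                            # (2, 3, 5): the shared size-4 axis is summed away

# The trace is a full contraction of a matrix's two indices.
A = np.array([[1, 2], [3, 4]])
print(np.einsum('ii->', A))               # 1 + 4 = 5

# Tensor (outer) product: combine two tensors into a higher-rank tensor.
v = np.array([1, 2])
w = np.array([3, 4, 5])
P = np.tensordot(v, w, axes=0)            # shape (2, 3); same as np.outer(v, w)
print(P.shape)
```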
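The matrix-vector example above, along with the dot product and matrix-matrix multiplication, can be checked in a few lines. A sketch under the assumption that NumPy is the tool of choice; `A` and `x` come from the worked example, the other arrays are arbitrary.

```python
import numpy as np

# The matrix and vector from the worked example above.
A = np.array([[1, 2],
              [3, 4]])
x = np.array([5, 6])

# Matrix-vector product: the dot product of each row of A with x.
print(A @ x)                # [17 39]

# Dot product of two vectors, and the cosine of the angle between them
# via v . w = |v| |w| cos(theta).
v = np.array([1.0, 0.0])
w = np.array([1.0, 1.0])
print(np.dot(v, w))                                            # 1.0
print(np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w)))  # ~0.7071, a 45-degree angle

# Matrix-matrix multiplication: composition of linear transformations.
B = np.array([[0, 1],
              [1, 0]])
print(A @ B)                # (A @ B) @ x applies B first, then A
```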
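A short sketch of the three norms mentioned above, again assuming NumPy; `np.linalg.norm` computes all of them via its `ord` parameter.

```python
import numpy as np

v = np.array([3.0, -4.0])

print(np.linalg.norm(v))              # Euclidean (L2) norm: sqrt(9 + 16) = 5.0
print(np.linalg.norm(v, ord=1))       # L1 norm: |3| + |-4| = 7.0
print(np.linalg.norm(v, ord=np.inf))  # max norm: largest absolute component = 4.0
```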

Applications

- Linear algebra is used throughout neural networks and deep learning architectures.
- Linear algebra concepts such as matrix-vector products, matrix-matrix multiplication, and
norms are used in the computation of forward and backward propagation in neural networks.
- Tensor operations such as tensor contraction and tensor product are used in convolutional
neural networks and recurrent neural networks to extract features from images and
sequences.
- Linear algebra concepts and operations are also used in optimization algorithms such as
gradient descent and stochastic gradient descent to adjust the parameters of the neural
network.

Lecture 2: Calculus Refresher

Calculus is essential in the field of deep learning, as it is used to optimise the parameters of deep
neural networks and to study the properties of activation functions used in these networks.

- The derivative of a function is a measure of the rate of change of the function at a certain point.
  o $f'(a) = \dfrac{df(a)}{dx} = \lim_{x \to a} \dfrac{f(x) - f(a)}{x - a} = \lim_{h \to 0} \dfrac{f(a + h) - f(a)}{h}$
  o $f'(x)$ is called the prime notation, and $df(x)/dx$ is called the Leibniz notation.
  o There are several rules for computing the derivatives of basic functions and of combinations of them (e.g. the sum, product, quotient, and chain rules); a numerical check of the limit definition is sketched after this list.

  o A partial derivative is the derivative of a multivariable function with respect to one variable, while keeping the other variables constant. It measures the rate of change of the output of the function with respect to one of its inputs, while ignoring the effect of the other inputs.
    1. $f'_{x_i}(x_1, x_2, \ldots, x_n) = \dfrac{\partial f}{\partial x_i}(x_1, x_2, \ldots, x_n)$
  o A gradient is a vector of the partial derivatives of a multivariable function.
    1. It represents the direction of steepest ascent of the function, and can be used in optimisation algorithms like gradient descent to update the parameters of a model and improve its accuracy.
    2. Let $f(x_1, x_2, \ldots, x_n)$; then the gradient of $f$ is $\nabla f = \left[ \dfrac{\partial f}{\partial x_1}, \dfrac{\partial f}{\partial x_2}, \ldots, \dfrac{\partial f}{\partial x_n} \right]$.
    3. Example (checked numerically in the sketch after this list):
       - Let $f(x, y) = x^2 - y^2$.
       - Partial derivatives: $\dfrac{\partial f}{\partial x} = 2x$ and $\dfrac{\partial f}{\partial y} = -2y$.
       - Gradient: $\nabla f = \left[ \dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y} \right] = [2x, -2y]$.
- Chain rule
  o The derivative of the composition of two or more functions is equal to the derivative of the outer function evaluated at the inner function, multiplied by the derivative of the inner function.
  o $\dfrac{d}{dx} f(g(x)) = \dfrac{df(u)}{du} \cdot \dfrac{du}{dx}$, where $u = g(x)$.
  o Example (checked numerically in the sketch after this list):
    1. $f(x) = \sin(x)$, $g(x) = x^2$
    2. $\dfrac{d}{dx} f(g(x)) = \cos(x^2) \cdot 2x$
  o The chain rule is a crucial concept in deep learning because it allows us to compute the gradient of complex functions, which are often represented as the composition of multiple simpler functions.
  o The gradient is used in optimisation algorithms like gradient descent to update the weights of a deep learning model and improve its accuracy.
  o By applying the chain rule, we can find the gradient of the loss function with respect to the parameters of the model, which can be used to update the parameters in a direction that reduces the loss.
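The limit definition of the derivative can be checked numerically with a finite difference. A minimal pure-Python sketch; the helper name `derivative` and the step size `h` are my own choices, not from the notes.

```python
import math

def derivative(f, a, h=1e-6):
    # Finite-difference approximation of the limit definition:
    # f'(a) = lim_{h -> 0} (f(a + h) - f(a)) / h
    return (f(a + h) - f(a)) / h

print(derivative(lambda x: x ** 2, 3.0))  # ~6.0, since d/dx x^2 = 2x
print(derivative(math.sin, 0.0))          # ~1.0, since d/dx sin(x) = cos(x)
```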
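The gradient example above ($f(x, y) = x^2 - y^2$) can be verified the same way, one partial derivative at a time. A sketch assuming NumPy; the helper `gradient` is hypothetical.

```python
import numpy as np

def gradient(f, x, h=1e-6):
    # Approximate each partial derivative by perturbing one coordinate
    # at a time while holding the others constant.
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f(x)) / h
    return grad

# f(x, y) = x^2 - y^2 from the example; grad f = [2x, -2y].
f = lambda p: p[0] ** 2 - p[1] ** 2
print(gradient(f, np.array([3.0, 2.0])))  # ~[6.0, -4.0]
```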
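Finally, the chain-rule example ($f(x) = \sin(x)$, $g(x) = x^2$) can be confirmed by comparing the analytic derivative $\cos(x^2) \cdot 2x$ with a finite difference; a pure-Python sketch with an arbitrary evaluation point.

```python
import math

def composed(x):
    # f(g(x)) with f = sin and g(x) = x^2
    return math.sin(x ** 2)

x, h = 1.5, 1e-7
numeric = (composed(x + h) - composed(x)) / h  # finite-difference estimate
analytic = math.cos(x ** 2) * 2 * x            # chain rule result
print(numeric, analytic)                       # both ~ -1.8846
```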