Lecture notes

Deep learning - all material

Pages: 27
Uploaded on: 19-10-2024
Written in: 2023/2024

This is a summary of all the deep learning lectures; all exam material is covered.


Document information

Professor(s): Umut Güçlü
Contains: All lectures

Content preview

Deep learning Lecture Notes

Lecture 1: Linear Algebra Refresher

Linear algebra is essential in the field of deep learning, as it is used to represent and manipulate high-
dimensional data, and to optimize the parameters of deep neural networks.

- A scalar is a single value, such as a number or a constant. It can be any real or complex
number.
- A vector is an array of numbers or scalars.
o The magnitude of a vector is its length, represented by the absolute value of the
vector or the Euclidean norm.
- A matrix is a rectangular array of numbers or scalars. It can be used to represent a linear
transformation or a system of linear equations.
o Matrix multiplication is not commutative, meaning that A∗B is not the same as B∗A
o The determinant of a matrix is a scalar value that represents the scaling factor of the
matrix. It can be used to determine if a matrix is invertible and to find the inverse of a
matrix.
- A tensor is a multi-dimensional array of numbers or scalars. It can be used to represent high-
dimensional data, such as images or videos.
o Tensor contraction and tensor product are the two most common operations used on
tensors.
1. Tensor contraction is the process of summing over a set of indices to
reduce the number of dimensions in a tensor.
2. Tensor product is the operation of combining two or more tensors to form
a new tensor.
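The two tensor operations above can be sketched with numpy's einsum; the library choice and the tensor shapes are illustrative assumptions, not part of the notes:

```python
import numpy as np

A = np.arange(24).reshape(2, 3, 4)  # a rank-3 tensor (illustrative values)
B = np.arange(12).reshape(4, 3)

# Tensor contraction: sum over a shared index (the last axis of A and the
# first axis of B), reducing the total number of dimensions.
contracted = np.einsum('ijk,kl->ijl', A, B)   # shape (2, 3, 3)

# Tensor product (outer product): combine two tensors into a new,
# higher-dimensional tensor, with no summation involved.
v = np.array([1.0, 2.0])
w = np.array([3.0, 4.0, 5.0])
product = np.einsum('i,j->ij', v, w)          # shape (2, 3)
```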
- The dot product/inner product is a way of multiplying two vectors together.
o It is a scalar value that can be used to measure the similarity between two vectors or
the angle between them.
o Given vectors v = [a1, a2, a3] and w = [b1, b2, b3], the dot product
is v · w = a1*b1 + a2*b2 + a3*b3.
o The dot product of two vectors is equal to the magnitude of one vector multiplied by
the magnitude of the other vector multiplied by the cosine of the angle between
them.
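A minimal sketch of the dot product and the cosine relation just described, with illustrative vectors:

```python
import math

v = [1.0, 2.0, 3.0]
w = [4.0, 5.0, 6.0]

# a1*b1 + a2*b2 + a3*b3
dot = sum(a * b for a, b in zip(v, w))   # 4 + 10 + 18 = 32

# |v| * |w| * cos(theta) gives the same number
norm_v = math.sqrt(sum(a * a for a in v))
norm_w = math.sqrt(sum(b * b for b in w))
cos_theta = dot / (norm_v * norm_w)
```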
- A matrix-vector product is a way of multiplying a matrix and a vector together. It is a vector
that represents the result of the linear transformation of the input vector by the matrix.
o The result is a new vector with as many entries as the matrix has rows.
o The elements of the resulting vector are obtained by taking the dot product of each
row of the matrix with the vector.
o Example
1. A = [[1, 2], [3, 4]] and x = [5, 6]
2. Ax = [1*5 + 2*6, 3*5 + 4*6] = [17, 39]
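The worked example above can be reproduced directly, computing each entry of the result as the dot product of one matrix row with the vector:

```python
A = [[1, 2],
     [3, 4]]
x = [5, 6]

# Each entry is the dot product of a row of A with x:
# [1*5 + 2*6, 3*5 + 4*6]
Ax = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
```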
- Matrix-matrix multiplication is a way of multiplying two matrices together. The resulting
matrix represents the composition of the two original matrices as linear transformations.
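As a quick check of this composition view (with an illustrative matrix pair, using numpy as an assumed library): applying B and then A to a vector agrees with applying the single product A @ B.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])  # swaps the two coordinates
x = np.array([5.0, 6.0])

# Composing the two linear maps...
composed = A @ (B @ x)
# ...matches multiplying the matrices first.
via_product = (A @ B) @ x
```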

- A norm is a function that assigns a scalar value to a vector or a matrix. It can be used to
measure the size or distance of a vector or a matrix.
o The most common norm used in linear algebra is the Euclidean norm.
o Other norms include the L1 norm, which is the sum of the absolute values of the
components, and the max norm, which is the maximum value of the components.
These norms can be used to measure the sparsity or the maximum value of the
vector or matrix.
o Norms are used in deep learning to measure the size or distance of the parameters of
the neural network, and to regularize the model to prevent overfitting.
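The three norms mentioned above, evaluated on one illustrative vector:

```python
import math

v = [3.0, -4.0, 0.0]

l2 = math.sqrt(sum(x * x for x in v))   # Euclidean norm: sqrt(9 + 16) = 5.0
l1 = sum(abs(x) for x in v)             # L1 norm: 3 + 4 + 0 = 7.0
linf = max(abs(x) for x in v)           # max norm: 4.0
```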

Applications

- Linear algebra is particularly used in the areas of neural networks and deep learning
architectures.
- Linear algebra concepts such as matrix-vector products, matrix-matrix multiplication, and
norms are used in the computation of forward and backward propagation in neural networks.
- Tensor operations such as tensor contraction and tensor product are used in convolutional
neural networks and recurrent neural networks to extract features from images and
sequences.
- Linear algebra concepts and operations are also used in optimization algorithms such as
gradient descent and stochastic gradient descent to adjust the parameters of the neural
network.

Lecture 2: Calculus Refresher

Calculus is essential in the field of deep learning, as it is used to optimise the parameters of deep
neural networks and to study the properties of activation functions used in these networks.

- The derivative of a function is a measure of the rate of change of the function at a certain
point.
o f'(a) = df(a)/dx = lim(x→a) (f(x) − f(a)) / (x − a) = lim(h→0) (f(a+h) − f(a)) / h
o f′(x) is called the prime notation, and df(x)/dx is called the Leibniz notation.
o There are several rules for computing the derivatives of the basic functions and the
combined functions:

o A partial derivative is the derivative of a multivariable function with respect to one
variable, while keeping the other variables constant. It measures the rate of change
of the output of the function with respect to one of its inputs, while ignoring the
effect of the other inputs.
1. f'_xi(x1, x2, …, xn) = ∂f/∂xi (x1, x2, …, xn)
o A gradient is a vector of partial derivatives of a multivariable function.
1. It represents the direction of the steepest ascent of the function, and can
be used in optimisation algorithms like gradient descent to update the
parameters of a model and improve its accuracy.
2. Let f(x1, x2, …, xn); then the gradient of f is ∇f = [∂f/∂x1, ∂f/∂x2, …, ∂f/∂xn]
3. Example
 Let f(x, y) = x² − y²
 Partial derivatives:
o ∂f/∂x = 2x
o ∂f/∂y = −2y
 Gradient: ∇f = [∂f/∂x, ∂f/∂y] = [2x, −2y]
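The gradient just derived can be checked numerically against the limit definition of the derivative; the evaluation point and step size below are illustrative choices:

```python
def f(x, y):
    return x ** 2 - y ** 2

x0, y0, h = 1.5, -2.0, 1e-6

# Finite-difference approximations of the two partial derivatives,
# holding the other variable constant each time.
df_dx = (f(x0 + h, y0) - f(x0, y0)) / h   # analytic value: 2 * x0 = 3.0
df_dy = (f(x0, y0 + h) - f(x0, y0)) / h   # analytic value: -2 * y0 = 4.0
grad = [df_dx, df_dy]
```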
- Chain rule
o The derivative of the composition of two or more functions is equal to the derivative
of the outer function evaluated at the inner function, multiplied by the derivative of
the inner function.
o d f(g(x))/dx = df(u)/du · du/dx, where u = g(x)
o Example
1. f(x) = sin(x), g(x) = x²
2. d f(g(x))/dx = cos(x²) · 2x
o The chain rule is a crucial concept in deep learning because it allows us to compute the
gradient of complex functions, which are often represented as the composition of
multiple simpler functions.
o The gradient is used in optimisation algorithms like gradient descent to update the
weights of a deep learning model and improve its accuracy.
o By applying the chain rule, we can find the gradient of the loss function with respect
to the parameters of the model, which can be used to update the parameters in a
direction that reduces the loss.
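The chain-rule example above can be verified numerically; the evaluation point is an illustrative choice:

```python
import math

# f(x) = sin(x), g(x) = x**2, so d/dx f(g(x)) = cos(x**2) * 2x.
def composed(x):
    return math.sin(x ** 2)

x0, h = 0.7, 1e-7
numeric = (composed(x0 + h) - composed(x0)) / h  # finite-difference slope
analytic = math.cos(x0 ** 2) * 2 * x0            # chain-rule result
```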
Author: donjaschipper (Radboud Universiteit Nijmegen)