Exam (elaborations)

IE-456/556 & EEE-448/548 Reinforcement Learning and Dynamic Programming Final Exam - Summer 2022

Pages: 8
Grade: A+
Uploaded on: 25-09-2025
Written in: 2025/2026

Institution
Revision
Course
Revision










Document information

Contains
Questions & answers


Content preview

IE-456/556 & EEE-448/548
Reinforcement Learning and Dynamic Programming
Final Exam - Summer 2022
Duration: 150 minutes


Name Surname: Bilkent ID: Signature:


Q1: Pacman Bonus Level!




[Figure: the 5 x 1 bonus-level grid, with cells numbered 1 to 5 from left to right and dots in the cells.]
Pacman is in a bonus level! With no ghosts around, he can eat as many dots as he wants. He is in
the $5 \times 1$ grid shown above, where the cells are numbered from left to right, that is, $s \in \{1, \ldots, 5\}$.
In cells 1 through 4, the actions available are to move Right ($R$) or to Fly ($F$) out of the bonus
level. The action Right deterministically lands Pacman in the cell to the right (and he eats the
dot there), while the Fly action deterministically lands him in a terminal state and ends the game.
From cell 5, Fly is the only action. Eating a dot gives a reward of +10, while flying out gives a
reward of +20.
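
The dynamics described above can be written down as a small transition function. This is only a sketch: the names `TERMINAL`, `actions`, and `step` are illustrative choices, and labelling the terminal state as 0 is an assumption that does not come from the exam.

```python
# Hypothetical encoding of the bonus-level MDP; all names are illustrative.
TERMINAL = 0  # assumed label for the terminal state


def actions(s):
    """Actions available in cell s (cells are numbered 1..5)."""
    return ["R", "F"] if 1 <= s <= 4 else ["F"]


def step(s, a):
    """Deterministic transition: returns (next_state, reward)."""
    if a not in actions(s):
        raise ValueError(f"action {a!r} is not available in cell {s}")
    if a == "R":
        return s + 1, 10   # move right and eat the dot there
    return TERMINAL, 20    # fly out and end the game


print(step(1, "R"))
print(step(5, "F"))
```

Because every transition is deterministic, the return of any policy from a given cell is a single discounted sum of rewards rather than an expectation.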

(a) (4 pts) How many deterministic policies are there in the above MDP?

$2^4 = 16$
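
The count can be cross-checked by brute-force enumeration. The sketch below assumes a deterministic policy is one action choice per cell, with two choices in cells 1 to 4 and one in cell 5; the dictionary `available` is an illustrative encoding, not part of the exam.

```python
from itertools import product

# Assumed encoding: cells 1-4 offer R or F, cell 5 offers only F.
available = {1: "RF", 2: "RF", 3: "RF", 4: "RF", 5: "F"}

# A deterministic policy picks exactly one available action in every cell.
policies = [
    dict(zip(available, choice))
    for choice in product(*available.values())
]

print(len(policies))  # 16
```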
Consider the following policies for $0 \le i \le 4$: $\pi_i(s) = R$ if $s \le i$, $F$ otherwise.

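(b) (12 pts) Find the value functions $v_{\pi_0}(1)$, $v_{\pi_2}(1)$, and $v_{\pi_4}(1)$ for the discount of $\gamma = 1$, and
fill out the table. Show your work.

$v_{\pi_0}(1)$ | 20
$v_{\pi_2}(1)$ | 40
$v_{\pi_4}(1)$ | 60

$v_{\pi_0}(1) = 20 + \gamma(0) = 20$
$v_{\pi_2}(1) = 10 + \gamma(10) + \gamma^2(20) = 40$
$v_{\pi_4}(1) = 10 + \gamma(10) + \gamma^2(10) + \gamma^3(10) + \gamma^4(20) = 60$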
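
These values can be reproduced by rolling each policy out from cell 1. The sketch assumes the policy family is "move Right while the cell index is at most i, Fly otherwise"; the function names are illustrative.

```python
def policy(i, s):
    """Assumed policy pi_i: move Right in cells s <= i, Fly otherwise."""
    return "R" if s <= i else "F"


def value_from_cell_1(i, gamma):
    """Discounted return of pi_i when Pacman starts in cell 1."""
    s, total, discount = 1, 0.0, 1.0
    while True:
        if policy(i, s) == "R":
            total += discount * 10   # eat the dot in the next cell
            s += 1
        else:
            total += discount * 20   # fly out; the episode ends
            return total
        discount *= gamma


for i in (0, 2, 4):
    print(i, value_from_cell_1(i, gamma=1.0))
```

With `gamma=1.0` the rollout gives 20, 40, and 60 for i = 0, 2, and 4, matching the table.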

(c) (10 pts) For what range of $\gamma$ is $\pi_4$ the optimal policy (that is, $\pi_4$ is strictly better than
$\pi_0$, $\pi_1$, $\pi_2$, and $\pi_3$)?

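$\pi_4 \to 10 + 10\gamma + 10\gamma^2 + 10\gamma^3 + 20\gamma^4$
$\pi_3 \to 10 + 10\gamma + 10\gamma^2 + 20\gamma^3$
$\pi_2 \to 10 + 10\gamma + 20\gamma^2$
$\pi_1 \to 10 + 20\gamma$
$\pi_0 \to 20$

For $v_{\pi_4}(1) > v_{\pi_3}(1)$ we need to have
$10 + 10\gamma + 10\gamma^2 + 10\gamma^3 + 20\gamma^4 > 10 + 10\gamma + 10\gamma^2 + 20\gamma^3 \Rightarrow \gamma > 1/2$.

For $v_{\pi_3}(1) > v_{\pi_2}(1)$ we need $\gamma > 1/2$.
For $v_{\pi_2}(1) > v_{\pi_1}(1)$ we need $\gamma > 1/2$.
For $v_{\pi_1}(1) > v_{\pi_0}(1)$ we need $\gamma > 1/2$.

So, for $\gamma > 1/2$, we have $v_{\pi_4}(1) > v_{\pi_3}(1) > v_{\pi_2}(1) > v_{\pi_1}(1) > v_{\pi_0}(1)$,
and $\pi_4$ is the optimal policy.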
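
The threshold can be sanity-checked numerically. The sketch below assumes the closed form "i dots eaten, then fly" for the value of $\pi_i$ from cell 1, and compares $\pi_4$ only against $\pi_0$ through $\pi_3$, as the question does; `v1` and `pi4_strictly_best` are illustrative names.

```python
def v1(i, gamma):
    """Assumed closed form of v_{pi_i}(1): eat i dots, then fly out."""
    return sum(10 * gamma**k for k in range(i)) + 20 * gamma**i


def pi4_strictly_best(gamma):
    """True if pi_4 beats pi_0..pi_3 strictly when starting in cell 1."""
    return all(v1(4, gamma) > v1(i, gamma) for i in range(4))


for gamma in (0.4, 0.5, 0.6, 1.0):
    print(gamma, pi4_strictly_best(gamma))
```

At a discount of exactly 0.5 all five policies are worth 20 from cell 1, so the comparison is not strict there; it holds for the sampled values above 0.5 and fails for 0.4.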
