100% tevredenheidsgarantie Direct beschikbaar na je betaling Lees online óf als PDF Geen vaste maandelijkse kosten 4.2 TrustPilot
logo-home
Tentamen (uitwerkingen)

Stanford University STATS 231 hw1-solutions.

Beoordeling
-
Verkocht
-
Pagina's
10
Cijfer
A+
Geüpload op
06-07-2021
Geschreven in
2020/2021

Homework 1 solutions CS229T/STATS231 (Fall) 1. Value of labeled data (11 points) In many applications, labeled data is expensive and therefore limited, while unlabeled data is cheap and therefore abundant. For example, there are tons of images on the web, but getting labeled images is much harder. What is the statistical value of having labeled data versus unlabeled data? This problem will explore this formally using asymptotics. Specifically, suppose we have an exponential family model over a discrete latent variable h and a discrete observed variable x: pθ(h; x) = expfθ · φ(h; x) − A(θ)g; where A(θ) = log Ph;x expfθ · φ(h; x)g is the usual log-partition function. Suppose that n examples (h(1); x(1)); : : : ; (h(n); x(n)) are drawn i.i.d. from some true distribution pθ∗. Define the following two estimators: ^ θ sup = arg max θ2Rd 1 n nX i =1 log pθ(h(i); x(i)) (1) ^ θ unsup = arg max θ2Rd 1 n nX i =1 log X h pθ(h; x(i)): (2) The supervised estimator θ^sup uses the variable h(i) and maximizes the joint likelihood, while the unsupervised estimator θ^ unsup marginalizes out the latent variable h. One important caveat: our results will hold when we assume that data is actually generated from our model family and that unsupervised learning is possible. Otherwise, labeled data is worth a lot more. a. (2 points) (supervised asymptotic variance) Compute the asymptotic variance of θ^sup: that is, given that pn(θ^sup − θ∗) −! N d (0; Vsup), write an expression for Vsup that depends on expectations/variances involving φ. Solution: Using notation from class, let ‘ denote the log-likelihood and L be the expected log-likelihood. Recall that V sup = r2L(θ∗)−1 Covθ∗[r‘(z; θ∗)]r2L(θ∗)−1 = Covθ∗[r‘(z; θ∗)]−1; where the second equality follows from Bartlett’s identity since we have assumed the model is well specified. Now, letting z = (h; x), r‘(z; θ∗) = r(θ∗ · φ(h; x) − A(θ∗)) = φ(h; x) − Eθ∗[φ(h; x)]; so V sup = Covθ∗ [φ(h; x) − Eθ∗[r‘(z; θ∗)]]−1 = Covθ∗[φ(h; x)]−1 : 1b. (2 points) (unsupervised asymptotic variance) Compute the asymptotic variance of θ^unsup: that is, given that pn(θ^unsup − θ∗) −! N d (0; Vunsup), write an expression for Vunsup that depends on expectations/variances involving φ. Solution: Similar to the previous part, we have r‘(z; θ∗) = r log X h expfθ∗ · φ(h; x) − A(θ∗)g = Ph pθ∗(h; x) · (φ(h; x) − Eθ∗[φ(h; x)]) Ph pθ∗(h; x) = Eθ∗[φ(h; x) j x] − Eθ∗[φ(h; x)]: Therefore, V unsup = Covθ∗[Eθ∗ [φ(h; x) j x] − Eθ∗[φ(h; x)]]−1 = Covθ∗ [Eθ∗[φ(h; x) j x]]−1 : c. (3 points) (comparing estimators) Prove that θ^sup has lower (or equal) asymptotic variance compared to θ^unsup. That is, show that V sup  Vunsup; Solution: We have, V −1 sup = Covθ∗[φ(h; x)] = Eθ∗[Covθ∗[φ(h; x) j x]] + Covθ∗[Eθ∗[φ(h; x) j x]] = Eθ∗[Covθ∗[φ(h; x) j x]] + Vunsup −1  V −1 unsup; where the last inequality follows since the covariance is positive semi-definite. Therefore, Vsup  Vunsup; i.e., ^

Meer zien Lees minder
Instelling
Vak









Oeps! We kunnen je document nu niet laden. Probeer het nog eens of neem contact op met support.

Geschreven voor

Instelling
Vak

Documentinformatie

Geüpload op
6 juli 2021
Aantal pagina's
10
Geschreven in
2020/2021
Type
Tentamen (uitwerkingen)
Bevat
Vragen en antwoorden

Onderwerpen

Voorbeeld van de inhoud

Homework 1 solutions
CS229T/STATS231 (Fall 2018–2019)
Note: please do not copy or distribute.

Due date: Wed, Oct 10, 11pm




1. Value of labeled data (11 points)
In many applications, labeled data is expensive and therefore limited, while unlabeled data is cheap and
therefore abundant. For example, there are tons of images on the web, but getting labeled images is much
harder. What is the statistical value of having labeled data versus unlabeled data? This problem will explore
this formally using asymptotics.
Specifically, suppose we have an exponential family model over a discrete latent variable h and a discrete
observed variable x:
pθ (h, x) = exp{θ · φ(h, x) − A(θ)},
P
where A(θ) = log h,x exp{θ · φ(h, x)} is the usual log-partition function.
Suppose that n examples (h(1) , x(1) ), . . . , (h(n) , x(n) ) are drawn i.i.d. from some true distribution pθ∗ .
Define the following two estimators:
n
1X
θ̂sup = arg max log pθ (h(i) , x(i) ) (1)
θ∈Rd n i=1
n
1X X
θ̂unsup = arg max log pθ (h, x(i) ). (2)
θ∈Rd n i=1
h

The supervised estimator θ̂sup uses the variable h(i) and maximizes the joint likelihood, while the unsuper-
vised estimator θ̂unsup marginalizes out the latent variable h.
One important caveat: our results will hold when we assume that data is actually generated from our
model family and that unsupervised learning is possible. Otherwise, labeled data is worth a lot more.



a. (2 points) (supervised asymptotic variance) Compute the asymptotic variance of θ̂sup : that is,
√ d
given that n(θ̂sup − θ∗ ) −
→ N (0, Vsup ), write an expression for Vsup that depends on expectations/variances
involving φ.

Solution:
Using notation from class, let ` denote the log-likelihood and L be the expected log-likelihood. Recall that

Vsup = ∇2 L(θ∗ )−1 Covθ∗ [∇`(z, θ∗ )]∇2 L(θ∗ )−1 = Covθ∗ [∇`(z, θ∗ )]−1 ,

where the second equality follows from Bartlett’s identity since we have assumed the model is well specified.
Now, letting z = (h, x),

∇`(z, θ∗ ) = ∇(θ∗ · φ(h, x) − A(θ∗ )) = φ(h, x) − Eθ∗ [φ(h, x)],

so
−1
Vsup = Covθ∗ [φ(h, x) − Eθ∗ [∇`(z, θ∗ )]] = Covθ∗ [φ(h, x)]−1 .


1

, b. (2 points) (unsupervised asymptotic variance) Compute the asymptotic variance of θ̂unsup : that
√ d
is, given that n(θ̂unsup − θ∗ ) −
→ N (0, Vunsup ), write an expression for Vunsup that depends on expectation-
s/variances involving φ.

Solution:
Similar to the previous part, we have
X
∇`(z, θ∗ ) = ∇ log exp{θ∗ · φ(h, x) − A(θ∗ )}
h
P
· (φ(h, x) − Eθ∗ [φ(h, x)])
h pθ (h, x)P

=
h pθ (h, x)



= Eθ∗ [φ(h, x) | x] − Eθ∗ [φ(h, x)].

Therefore,
−1 −1
Vunsup = Covθ∗ [Eθ∗ [φ(h, x) | x] − Eθ∗ [φ(h, x)]] = Covθ∗ [Eθ∗ [φ(h, x) | x]] .




c. (3 points) (comparing estimators) Prove that θ̂sup has lower (or equal) asymptotic variance
compared to θ̂unsup . That is, show that

Vsup  Vunsup ,


Solution:
We have,
−1
Vsup = Covθ∗ [φ(h, x)] = Eθ∗ [Covθ∗ [φ(h, x) | x]] + Covθ∗ [Eθ∗ [φ(h, x) | x]]
−1
= Eθ∗ [Covθ∗ [φ(h, x) | x]] + Vunsup
−1
 Vunsup ,

where the last inequality follows since the covariance is positive semi-definite. Therefore, Vsup  Vunsup ; i.e.,
θ̂sup has lower asymptotic variance.



d. (4 points)
Consider the exponential family
1
pθ (h, x) = exp(θhx),
Z
where h, x ∈ {0, 1} and Z = h,x∈{0,1}2 exp(θhx).1 Essentially, (h, x) is a pair of correlated biased coin
P

flips, where pθ (1, 1) = exp(θ)/Z and pθ (0, 0) = pθ (0, 1) = pθ (1, 0) = 1/Z.
1Z is often referred to as the partition function.


2
€8,30
Krijg toegang tot het volledige document:

100% tevredenheidsgarantie
Direct beschikbaar na je betaling
Lees online óf als PDF
Geen vaste maandelijkse kosten


Ook beschikbaar in voordeelbundel

Maak kennis met de verkoper

Seller avatar
De reputatie van een verkoper is gebaseerd op het aantal documenten dat iemand tegen betaling verkocht heeft en de beoordelingen die voor die items ontvangen zijn. Er zijn drie niveau’s te onderscheiden: brons, zilver en goud. Hoe beter de reputatie, hoe meer de kwaliteit van zijn of haar werk te vertrouwen is.
Themanehoppe American Intercontinental University Online
Volgen Je moet ingelogd zijn om studenten of vakken te kunnen volgen
Verkocht
292
Lid sinds
4 jaar
Aantal volgers
223
Documenten
3485
Laatst verkocht
3 maanden geleden

3,4

48 beoordelingen

5
21
4
5
3
7
2
3
1
12

Recent door jou bekeken

Waarom studenten kiezen voor Stuvia

Gemaakt door medestudenten, geverifieerd door reviews

Kwaliteit die je kunt vertrouwen: geschreven door studenten die slaagden en beoordeeld door anderen die dit document gebruikten.

Niet tevreden? Kies een ander document

Geen zorgen! Je kunt voor hetzelfde geld direct een ander document kiezen dat beter past bij wat je zoekt.

Betaal zoals je wilt, start meteen met leren

Geen abonnement, geen verplichtingen. Betaal zoals je gewend bent via iDeal of creditcard en download je PDF-document meteen.

Student with book image

“Gekocht, gedownload en geslaagd. Zo makkelijk kan het dus zijn.”

Alisha Student

Veelgestelde vragen