Exercise solutions
Linear text classification
1. Let x be a bag-of-words vector such that $\sum_{j=1}^{V} x_j = 1$. Verify that the multinomial probability $p_{\mathrm{mult}}(x; \phi)$, as defined in Equation 2.12, is identical to the probability of the same document under a categorical distribution, $p_{\mathrm{cat}}(w; \phi)$.
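A quick numerical check of this equivalence, as a minimal Python/NumPy sketch; the vocabulary size V and the parameter vector φ below are arbitrary illustrative values, not taken from the text.

```python
# Verify: for a single-token document (sum of x equals 1), the multinomial
# probability reduces to the categorical probability of that one word.
import numpy as np
from math import factorial

V = 4                                     # illustrative vocabulary size
rng = np.random.default_rng(0)
phi = rng.dirichlet(np.ones(V))           # a valid probability vector over the vocabulary

for j in range(V):
    x = np.zeros(V, dtype=int)
    x[j] = 1                              # bag-of-words vector with exactly one token, word j

    # multinomial pmf: n! / prod_j(x_j!) * prod_j(phi_j ** x_j), with n = sum(x) = 1
    n = int(x.sum())
    coef = factorial(n) / np.prod([factorial(int(c)) for c in x])
    p_mult = coef * np.prod(phi ** x)

    p_cat = phi[j]                        # categorical probability of word j
    assert np.isclose(p_mult, p_cat)

print("p_mult(x; phi) == p_cat(w; phi) for all single-token documents")
```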
2. Suppose you have a single feature x, with the following conditional distribution:
p(x \mid y) = \begin{cases} \alpha, & X = 0, Y = 0 \\ 1 - \alpha, & X = 1, Y = 0 \\ 1 - \beta, & X = 0, Y = 1 \\ \beta, & X = 1, Y = 1 \end{cases} \qquad [B.23]
Further suppose that the prior is uniform, $\Pr(Y = 0) = \Pr(Y = 1) = \frac{1}{2}$, and that both $\alpha > \frac{1}{2}$ and $\beta > \frac{1}{2}$. Given a Naïve Bayes classifier with accurate parameters,
what is the probability of making an error?
Answer:
\hat{y}(X = 0) = 0 \qquad [B.24]
\hat{y}(X = 1) = 1 \qquad [B.25]
\Pr(\hat{y} = 0 \mid Y = 1) = \Pr(X = 0 \mid Y = 1) = 1 - \beta \qquad [B.26]
\Pr(\hat{y} = 1 \mid Y = 0) = \Pr(X = 1 \mid Y = 0) = 1 - \alpha \qquad [B.27]
\Pr(\hat{y} \neq y) = \frac{1}{2}\left((1 - \beta) + (1 - \alpha)\right) \qquad [B.28]
= 1 - \frac{1}{2}(\alpha + \beta) \qquad [B.29]
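The analytic error rate in [B.29] can be sanity-checked by simulation; the sketch below uses arbitrary values of α and β (both above 1/2) and is not part of the original solution.

```python
# Monte Carlo check of Pr(error) = 1 - (alpha + beta) / 2 under a uniform prior.
import numpy as np

alpha, beta = 0.8, 0.7                    # illustrative values, both > 1/2
rng = np.random.default_rng(1)
n = 200_000

y = rng.integers(0, 2, size=n)            # uniform prior: Pr(Y=0) = Pr(Y=1) = 1/2
p_x1 = np.where(y == 0, 1 - alpha, beta)  # Pr(X = 1 | Y)
x = (rng.random(n) < p_x1).astype(int)

y_hat = x                                 # decision rule from [B.24]-[B.25]: predict y = x
empirical_error = np.mean(y_hat != y)
analytic_error = 1 - 0.5 * (alpha + beta)
print(empirical_error, analytic_error)    # with alpha=0.8, beta=0.7 both are close to 0.25
```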
3. Derive the maximum-likelihood estimate for the parameter µ in Naı̈ve Bayes.
Answer:
L(\mu) = \sum_{i=1}^{N} \log p_{\mathrm{cat}}(y^{(i)}; \mu) \qquad [B.30]
= \sum_{i=1}^{N} \log \mu_{y^{(i)}} \qquad [B.31]
\ell(\mu) = \sum_{i=1}^{N} \log \mu_{y^{(i)}} - \lambda \left( \sum_{y=1}^{K} \mu_y - 1 \right) \qquad [B.32]
\frac{\partial \ell(\mu)}{\partial \mu_y} = \sum_{i=1}^{N} \frac{\delta(y^{(i)} = y)}{\mu_y} - \lambda \qquad [B.33]
Setting this derivative to zero and choosing \lambda so that \sum_{y=1}^{K} \mu_y = 1 gives
\mu_y \propto \sum_{i=1}^{N} \delta(y^{(i)} = y), \qquad [B.34]
i.e., \mu_y is the relative frequency of label y in the training data.
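Equivalently, [B.34] says the estimate is the empirical label frequency. A minimal numerical illustration (the labels below are made up) compares it against randomly drawn alternatives:

```python
# The normalized solution of [B.34]: mu_y = count(y) / N, which attains the
# highest log-likelihood among candidate parameter vectors.
import numpy as np

y = np.array([0, 2, 1, 1, 0, 1, 2, 1])    # hypothetical training labels, K = 3 classes
K = 3

counts = np.bincount(y, minlength=K)
mu_mle = counts / counts.sum()             # relative frequencies: [0.25, 0.5, 0.25]

def log_likelihood(mu):
    return np.sum(np.log(mu[y]))           # sum_i log mu_{y^{(i)}}, as in [B.31]

rng = np.random.default_rng(2)
for _ in range(1000):
    mu_other = rng.dirichlet(np.ones(K))   # an arbitrary alternative parameter vector
    assert log_likelihood(mu_other) <= log_likelihood(mu_mle) + 1e-12

print(mu_mle)
```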
4. The classification models in the text have a vector of weights for each possible label.
While this is notationally convenient, it is overdetermined: for any linear classifier
that can be obtained with K × V weights, an equivalent classifier can be constructed
using (K − 1) × V weights.
a) Describe how to construct this classifier. Specifically, if given a set of weights
θ and a feature function f (x, y), explain how to construct alternative weights
and feature function $\theta'$ and $f'(x, y)$, such that
\forall y, y' \in \mathcal{Y}, \quad \theta \cdot f(x, y) - \theta \cdot f(x, y') = \theta' \cdot f'(x, y) - \theta' \cdot f'(x, y'). \qquad [B.35]
b) Explain how your construction justifies the well-known alternative form for binary logistic regression, $\Pr(Y = 1 \mid x; \theta) = \frac{1}{1 + \exp(-\theta' \cdot x)} = \sigma(\theta' \cdot x)$, where $\sigma$ is the sigmoid function.
Answer:
a) Let $\theta_{K,j}$ indicate the weight for base feature $j$ in class $K$. Then set $\theta'_{k,j} = \theta_{k,j} - \theta_{K,j}$ and $f'(x, y) = f(x, y)$ for all $y < K$. This means that $\theta' \cdot f'(x, K) = 0$.
b) In binary classification, $\theta' = \theta_0 - \theta_1$.
\Pr(Y = 0 \mid x; \theta) = \frac{\exp(\theta \cdot f(x, 0))}{\exp(\theta \cdot f(x, 0)) + \exp(\theta \cdot f(x, 1))} \qquad [B.36]
= \frac{1}{1 + \exp(\theta \cdot f(x, 1) - \theta \cdot f(x, 0))} \qquad [B.37]
= \frac{1}{1 + \exp(-\theta' \cdot x)}. \qquad [B.38]
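The construction can also be checked numerically: shifting every class's weights by the last class's weights leaves the softmax probabilities unchanged, and the two-class case collapses to the sigmoid form in [B.38]. The weights and features below are random, illustrative values, not from the text.

```python
# Softmax probabilities are invariant to the (K-1) x V reparameterization, and the
# binary case reduces to a sigmoid of the weight difference.
import numpy as np

def softmax(scores):
    e = np.exp(scores - scores.max())      # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(3)
K, V = 4, 5
theta = rng.normal(size=(K, V))            # one weight vector per label
x = rng.normal(size=V)                     # base features

p_full = softmax(theta @ x)
theta_prime = theta - theta[-1]            # theta'_k = theta_k - theta_K; last row is now zero
p_reduced = softmax(theta_prime @ x)
assert np.allclose(p_full, p_reduced)

# Binary case: with theta' = theta_0 - theta_1, Pr(Y = 0 | x) = sigmoid(theta' . x)
theta2 = rng.normal(size=(2, V))
p0 = softmax(theta2 @ x)[0]
theta_diff = theta2[0] - theta2[1]
assert np.isclose(p0, 1.0 / (1.0 + np.exp(-theta_diff @ x)))
print("probabilities unchanged under the (K-1) x V construction")
```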
5. Suppose you have two labeled datasets D1 and D2, with the same features and labels.
   • Let θ^{(1)} be the unregularized logistic regression (LR) coefficients from training on dataset D1.
   • Let θ^{(2)} be the unregularized LR coefficients (same model) from training on dataset D2.
   • Let θ^* be the unregularized LR coefficients from training on the combined dataset D1 ∪ D2.
   Under these conditions, prove that for any feature j,
   \theta^*_j \geq \min\left(\theta^{(1)}_j, \theta^{(2)}_j\right)
   \theta^*_j \leq \max\left(\theta^{(1)}_j, \theta^{(2)}_j\right).
6. Let $\hat{\theta}$ be the solution to an unregularized logistic regression problem, and let $\theta^*$ be the solution to the same problem with $L_2$ regularization. Prove that $\|\theta^*\|_2^2 \leq \|\hat{\theta}\|_2^2$.
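A rough numerical illustration of this claim on a toy problem; the data, step size, and regularization strength below are arbitrary choices, and the logistic regression is fit by plain gradient descent rather than any code from the text.

```python
# On non-separable synthetic data, the L2-regularized solution has a smaller
# squared norm than the unregularized one.
import numpy as np

rng = np.random.default_rng(4)
n, d = 500, 3
X = rng.normal(size=(n, d))
true_w = np.array([2.0, -1.0, 0.5])
p = 1.0 / (1.0 + np.exp(-(X @ true_w)))
y = (rng.random(n) < p).astype(float)      # noisy labels, so the data are not separable

def fit_logreg(X, y, l2=0.0, lr=0.1, steps=20_000):
    """Batch gradient descent on the mean negative log-likelihood (+ optional L2 penalty)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        q = 1.0 / (1.0 + np.exp(-(X @ w)))
        grad = X.T @ (q - y) / len(y) + l2 * w
        w -= lr * grad
    return w

theta_hat = fit_logreg(X, y, l2=0.0)       # unregularized solution
theta_star = fit_logreg(X, y, l2=0.5)      # L2-regularized solution

print(np.sum(theta_star ** 2), np.sum(theta_hat ** 2))
assert np.sum(theta_star ** 2) <= np.sum(theta_hat ** 2)
```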