
Exam (elaborations) TEST BANK FOR Pattern Classification 2nd Edition


(a) We assume, without loss of generality, that for the particular $x$ in question $P(\omega_2|x) \ge P(\omega_1|x)$, and thus $P(\text{error}|x) = P(\omega_1|x)$. We have, moreover, the normalization condition $P(\omega_1|x) = 1 - P(\omega_2|x)$. Together these imply $P(\omega_2|x) \ge 1/2$, that is $2P(\omega_2|x) \ge 1$, and hence

$$2P(\omega_2|x)P(\omega_1|x) \ge P(\omega_1|x) = P(\text{error}|x).$$

This is true at every $x$, and hence the integrals obey

$$\int 2P(\omega_2|x)P(\omega_1|x)\,dx \ge \int P(\text{error}|x)\,dx.$$

In short, $2P(\omega_2|x)P(\omega_1|x)$ provides an upper bound for $P(\text{error}|x)$.

(b) From part (a) we have $P(\omega_2|x) \ge 1/2$; for $\alpha P(\omega_1|x)P(\omega_2|x)$ to bound $P(\text{error}|x) = P(\omega_1|x)$ we would need $P(\omega_2|x) \ge 1/\alpha$, which is not guaranteed when $\alpha < 2$. Take as an example $\alpha = 4/3$ and $P(\omega_1|x) = 0.4$, hence $P(\omega_2|x) = 0.6$. In this case $P(\text{error}|x) = 0.4$, yet

$$\alpha P(\omega_1|x)P(\omega_2|x) = \tfrac{4}{3}\times 0.6\times 0.4 = 0.32 < P(\text{error}|x).$$

Thus $\alpha P(\omega_1|x)P(\omega_2|x)$ does not provide an upper bound for all values of $P(\omega_1|x)$.

(c) Let $P(\text{error}|x) = P(\omega_1|x)$ as above. Since $P(\omega_2|x) \le 1$, for all $x$ we have

$$P(\omega_2|x)P(\omega_1|x) \le P(\omega_1|x) = P(\text{error}|x), \qquad \int P(\omega_2|x)P(\omega_1|x)\,dx \le \int P(\text{error}|x)\,dx,$$

and we have a lower bound.

(d) The solution to part (b) also applies here.

Section 2.2

2. We are given that the density is of the form

$$p(x|\omega_i) = k\,e^{-|x-a_i|/b_i}.$$

(a) We seek $k$ so that the function is normalized, as required of a true density. We integrate and set the result equal to one,

$$k\left[\int_{-\infty}^{a_i} e^{(x-a_i)/b_i}\,dx + \int_{a_i}^{\infty} e^{-(x-a_i)/b_i}\,dx\right] = 1,$$

which yields $2b_ik = 1$, or $k = 1/(2b_i)$. Note that the normalization is independent of $a_i$, which corresponds to a shift along the $x$ axis and is hence indeed irrelevant to normalization. The distribution is therefore written

$$p(x|\omega_i) = \frac{1}{2b_i}\,e^{-|x-a_i|/b_i}.$$

(b) The likelihood ratio can be written directly:

$$\frac{p(x|\omega_1)}{p(x|\omega_2)} = \frac{b_2}{b_1}\exp\left[-\frac{|x-a_1|}{b_1} + \frac{|x-a_2|}{b_2}\right].$$

(c) For the case $a_1 = 0$, $a_2 = 1$, $b_1 = 1$ and $b_2 = 2$, the likelihood ratio is

$$\frac{p(x|\omega_1)}{p(x|\omega_2)} = \begin{cases} 2e^{(x+1)/2} & x \le 0 \\ 2e^{(1-3x)/2} & 0 < x \le 1 \\ 2e^{-(x+1)/2} & x > 1, \end{cases}$$

as shown in the figure.

[Figure: the piecewise-exponential likelihood ratio $p(x|\omega_1)/p(x|\omega_2)$ plotted as a function of $x$.]

Section 2.3

3. We are to use the standard zero-one classification cost, that is $\lambda_{11} = \lambda_{22} = 0$ and $\lambda_{12} = \lambda_{21} = 1$.

(a) We have the priors $P(\omega_1)$ and $P(\omega_2) = 1 - P(\omega_1)$. The Bayes risk is given by Eqs. 12 and 13 in the text:

$$R(P(\omega_1)) = P(\omega_1)\int_{\mathcal{R}_2} p(x|\omega_1)\,dx + (1 - P(\omega_1))\int_{\mathcal{R}_1} p(x|\omega_2)\,dx.$$

To obtain the prior for which this risk is extremal, we take the derivative with respect to $P(\omega_1)$ and set it to zero,

$$\frac{d}{dP(\omega_1)}R(P(\omega_1)) = \int_{\mathcal{R}_2} p(x|\omega_1)\,dx - \int_{\mathcal{R}_1} p(x|\omega_2)\,dx = 0,$$

which gives the desired result:

$$\int_{\mathcal{R}_2} p(x|\omega_1)\,dx = \int_{\mathcal{R}_1} p(x|\omega_2)\,dx.$$

(b) This solution is not always unique, as the following simple counterexample shows. Let $P(\omega_1) = P(\omega_2) = 0.5$ and

$$p(x|\omega_1) = \begin{cases} 1 & -0.5 \le x \le 0.5 \\ 0 & \text{otherwise} \end{cases} \qquad\quad p(x|\omega_2) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & \text{otherwise.} \end{cases}$$

It is easy to verify that both the choice $\mathcal{R}_1 = [-0.5, 0.25]$ and the choice $\mathcal{R}_1 = [0, 0.5]$ (with $\mathcal{R}_2$ the complement in each case) satisfy the equation in part (a); thus the solution is not unique.
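As a small illustration of Problem 2, the snippet below (our own check, not part of the manual; it assumes only NumPy, and the helper names are ours) evaluates the two normalized densities of part (a) directly and confirms the piecewise form of the likelihood ratio given in part (c).

```python
import numpy as np

def p(x, a, b):
    """Normalized double-exponential density from part (a): (1/(2b)) exp(-|x-a|/b)."""
    return np.exp(-np.abs(x - a) / b) / (2 * b)

a1, b1, a2, b2 = 0.0, 1.0, 1.0, 2.0

def ratio_piecewise(x):
    """Piecewise closed form of p(x|w1)/p(x|w2) from part (c)."""
    return np.where(x <= 0, 2 * np.exp((x + 1) / 2),
           np.where(x <= 1, 2 * np.exp((1 - 3 * x) / 2),
                            2 * np.exp(-(x + 1) / 2)))

x = np.linspace(-3, 3, 13)          # points in all three regions
direct = p(x, a1, b1) / p(x, a2, b2)
print(np.allclose(direct, ratio_piecewise(x)))   # True
```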
4. Consider the minimax criterion for a two-category classification problem.

(a) The total risk is obtained by integrating, over each of the two decision regions $\mathcal{R}_i$, the class-conditional densities weighted by their priors and the corresponding losses:

$$R = \int_{\mathcal{R}_1}\big[\lambda_{11}P(\omega_1)p(x|\omega_1) + \lambda_{12}P(\omega_2)p(x|\omega_2)\big]\,dx + \int_{\mathcal{R}_2}\big[\lambda_{21}P(\omega_1)p(x|\omega_1) + \lambda_{22}P(\omega_2)p(x|\omega_2)\big]\,dx.$$

We use $\int_{\mathcal{R}_2} p(x|\omega_2)\,dx = 1 - \int_{\mathcal{R}_1} p(x|\omega_2)\,dx$, $\int_{\mathcal{R}_1} p(x|\omega_1)\,dx = 1 - \int_{\mathcal{R}_2} p(x|\omega_1)\,dx$ and $P(\omega_2) = 1 - P(\omega_1)$, and regroup to find

$$R = \lambda_{22} + (\lambda_{12} - \lambda_{22})\int_{\mathcal{R}_1} p(x|\omega_2)\,dx + P(\omega_1)\left[(\lambda_{11} - \lambda_{22}) + (\lambda_{21} - \lambda_{11})\int_{\mathcal{R}_2} p(x|\omega_1)\,dx + (\lambda_{22} - \lambda_{12})\int_{\mathcal{R}_1} p(x|\omega_2)\,dx\right].$$

(b) Consider an arbitrary prior $0 < P^*(\omega_1) < 1$, and assume the decision boundary has been set so as to achieve the minimal (Bayes) error for that prior. If one holds that decision boundary fixed but changes the prior probability $P(\omega_1)$, then the error changes linearly, as given by the formula in part (a). The true Bayes error, however, must be less than or equal to that (linearly bounded) value, since one has the freedom to readjust the decision boundary at each value of $P(\omega_1)$. Moreover, the Bayes error is 0 at $P(\omega_1) = 0$ and at $P(\omega_1) = 1$, since under those conditions the Bayes decision rule is to always decide $\omega_2$ or $\omega_1$, respectively, which gives zero error. Thus the curve of the Bayes error rate is concave down for all prior probabilities.

[Figure: the Bayes error $E(P(\omega_1))$ versus the prior $P(\omega_1)$, lying below the straight line that gives the error of the decision boundary fixed for the prior $P^*(\omega_1)$.]

(c) According to the general minimax equation in part (a), for our case (i.e., $\lambda_{11} = \lambda_{22} = 0$ and $\lambda_{12} = \lambda_{21} = 1$) the decision boundary is chosen to satisfy

$$\int_{\mathcal{R}_2} p(x|\omega_1)\,dx = \int_{\mathcal{R}_1} p(x|\omega_2)\,dx.$$

We assume that a single decision point suffices, and thus we seek the $x^*$ such that

$$\int_{x^*}^{\infty} N(\mu_1, \sigma_1^2)\,dx = \int_{-\infty}^{x^*} N(\mu_2, \sigma_2^2)\,dx,$$

where, as usual, $N(\mu_i, \sigma_i^2)$ denotes a Gaussian. We assume for definiteness, and without loss of generality, that $\mu_2 > \mu_1$ and that the single decision point lies between the means. Recall the definition of the error function, given by Eq. 96 in the Appendix of the text,

$$\mathrm{erf}(u) = \frac{2}{\sqrt{\pi}}\int_0^u e^{-t^2}\,dt,$$

in terms of which a Gaussian tail probability is $\int_z^{\infty} N(0,1)\,dt = \tfrac{1}{2}\big[1 - \mathrm{erf}(z/\sqrt{2})\big]$. The condition above can therefore be rewritten as

$$\mathrm{erf}\!\left[\frac{x^* - \mu_1}{\sqrt{2}\,\sigma_1}\right] = \mathrm{erf}\!\left[\frac{\mu_2 - x^*}{\sqrt{2}\,\sigma_2}\right].$$

Since the error function is monotonic, the arguments must be equal, that is $(x^* - \mu_1)/\sigma_1 = (\mu_2 - x^*)/\sigma_2$, and solving for $x^*$ gives the value of the decision point

$$x^* = \frac{\mu_2\sigma_1 + \mu_1\sigma_2}{\sigma_1 + \sigma_2}.$$

(d) Because the minimax error rate is independent of the prior probabilities, we can choose a particularly simple case to evaluate it, for instance $P(\omega_1) = 0$. In that case the error is the probability mass of $\omega_2$ falling in $\mathcal{R}_1$, which at the minimax point equals the $\omega_1$ tail beyond $x^*$:

$$E = \int_{x^*}^{\infty} N(\mu_1, \sigma_1^2)\,dx = \frac{1}{2}\left[1 - \mathrm{erf}\!\left(\frac{x^* - \mu_1}{\sqrt{2}\,\sigma_1}\right)\right] = \frac{1}{2}\left[1 - \mathrm{erf}\!\left(\frac{\mu_2 - \mu_1}{\sqrt{2}\,(\sigma_1 + \sigma_2)}\right)\right].$$

(e) We substitute the values given in the problem ($\mu_1 = 0$, $\sigma_1 = 1$, $\mu_2 = 1/2$, $\sigma_2 = 1/2$) into the formula in part (c) and find

$$x^* = \frac{\mu_2\sigma_1 + \mu_1\sigma_2}{\sigma_1 + \sigma_2} = \frac{1/2 + 0}{1 + 1/2} = \frac{1}{3}.$$

The error from part (d) is then

$$E = \frac{1}{2}\left[1 - \mathrm{erf}\!\left(\frac{1/3}{\sqrt{2}}\right)\right] \simeq 0.369.$$

(f) Note that in this case the distributions have the same form (and, in particular, the same variance). Thus, by symmetry, the Bayes error for $P(\omega_1) = P^*$ (for some value $P^*$) must be the same as for $P(\omega_2) = P^*$. Because $P(\omega_2) = 1 - P(\omega_1)$, the curve analogous to the one in part (b) is symmetric about $P(\omega_1) = 0.5$, and because it is concave down it must peak at $P(\omega_1) = 0.5$, that is, at equal priors. The tangent to the graph of error versus $P(\omega_1)$ is thus horizontal at $P(\omega_1) = 0.5$. For this case of equal priors, the Bayes decision point can be stated simply: it is the point midway between the means of the two distributions, that is, $x^* = 5.5$.
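As a quick numerical cross-check of parts (c) through (e), the sketch below (our own illustration, assuming NumPy and SciPy are available; the function names are not the manual's) solves the minimax condition for the two Gaussians of part (e) by root finding and confirms the closed-form threshold and the minimax error.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

# Class-conditional densities of part (e): N(0, 1) and N(1/2, (1/2)^2).
mu1, s1 = 0.0, 1.0
mu2, s2 = 0.5, 0.5

def balance(x):
    # Minimax condition of part (c): equal conditional errors,
    # P(x > x* | w1) = P(x < x* | w2), for a single threshold x*.
    return norm.sf(x, loc=mu1, scale=s1) - norm.cdf(x, loc=mu2, scale=s2)

x_star = brentq(balance, mu1, mu2)                # numerical root between the means
x_closed = (mu2 * s1 + mu1 * s2) / (s1 + s2)      # closed form from part (c)
minimax_error = norm.sf(x_star, loc=mu1, scale=s1)

print(x_star, x_closed)     # both 0.3333...
print(minimax_error)        # about 0.369, independent of the priors
```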
5. We seek to generalize the notion of minimax criteria to the case where two independent prior probabilities are set.

[Figure: the three weighted triangle densities $p(x|\omega_i)P(\omega_i)$, centered at $\mu_1 < \mu_2 < \mu_3$ with half-widths $\delta_1, \delta_2, \delta_3$, and the two decision points $x_1^*$ and $x_2^*$.]

(a) We use the triangle distributions and the conventions shown in the figure, that is, $p(x|\omega_i) = (\delta_i - |x - \mu_i|)/\delta_i^2$ for $|x - \mu_i| \le \delta_i$ and zero elsewhere. Assuming that the decision boundary consists of just two points, and being careful to keep the signs correct, the first decision point $x_1^*$ (between $\mu_1$ and $\mu_2$) satisfies

$$P(\omega_1)\,\frac{\delta_1 - (x_1^* - \mu_1)}{\delta_1^2} = P(\omega_2)\,\frac{\delta_2 - (\mu_2 - x_1^*)}{\delta_2^2},$$

which has solution

$$x_1^* = \frac{P(\omega_1)\delta_2^2\delta_1 + P(\omega_1)\delta_2^2\mu_1 - P(\omega_2)\delta_1^2\delta_2 + P(\omega_2)\delta_1^2\mu_2}{P(\omega_1)\delta_2^2 + P(\omega_2)\delta_1^2}.$$

An analogous derivation for the other decision point $x_2^*$ (between $\mu_2$ and $\mu_3$) starts from

$$P(\omega_2)\,\frac{\delta_2 - (x_2^* - \mu_2)}{\delta_2^2} = P(\omega_3)\,\frac{\delta_3 - (\mu_3 - x_2^*)}{\delta_3^2},$$

which has solution

$$x_2^* = \frac{P(\omega_2)\delta_3^2\delta_2 + P(\omega_2)\delta_3^2\mu_2 - P(\omega_3)\delta_2^2\delta_3 + P(\omega_3)\delta_2^2\mu_3}{P(\omega_2)\delta_3^2 + P(\omega_3)\delta_2^2}.$$

(b) From the normalization condition $\sum_{i=1}^{3} P(\omega_i) = 1$ we can express all the priors in terms of just two independent ones, which we choose to be $P(\omega_1)$ and $P(\omega_2)$. We could substitute the values of $x_i^*$ and integrate, but it is a bit simpler to go directly to the calculation of the error $E$ as a function of the priors $P(\omega_1)$ and $P(\omega_2)$ by adding the four contributions (each the area of a small triangle):

$$E = P(\omega_1)\frac{1}{2\delta_1^2}\big[\mu_1 + \delta_1 - x_1^*\big]^2 + P(\omega_2)\frac{1}{2\delta_2^2}\big[\delta_2 - \mu_2 + x_1^*\big]^2 + P(\omega_2)\frac{1}{2\delta_2^2}\big[\mu_2 + \delta_2 - x_2^*\big]^2 + \underbrace{\big(1 - P(\omega_1) - P(\omega_2)\big)}_{P(\omega_3)}\frac{1}{2\delta_3^2}\big[\delta_3 - \mu_3 + x_2^*\big]^2.$$

To obtain the minimax solution, we take the two partial derivatives with respect to the independent priors and set them to zero. The first, $\partial E/\partial P(\omega_1) = 0$, yields

$$\left[\frac{\mu_1 + \delta_1 - x_1^*}{\delta_1}\right]^2 = \left[\frac{\delta_3 - \mu_3 + x_2^*}{\delta_3}\right]^2 \qquad\text{or}\qquad \frac{\mu_1 + \delta_1 - x_1^*}{\delta_1} = \frac{\delta_3 - \mu_3 + x_2^*}{\delta_3}.$$

Likewise, the second, $\partial E/\partial P(\omega_2) = 0$, yields

$$\left[\frac{\delta_2 - \mu_2 + x_1^*}{\delta_2}\right]^2 + \left[\frac{\mu_2 + \delta_2 - x_2^*}{\delta_2}\right]^2 = \left[\frac{\delta_3 - \mu_3 + x_2^*}{\delta_3}\right]^2.$$

These simultaneous equations, the first linear and the second quadratic in the decision points, have solutions of the general form

$$x_i^* = \frac{b_i \pm \sqrt{c_i}}{a_i}, \qquad i = 1, 2,$$

with the sign of each square root chosen so that $x_1^*$ lies between $\mu_1$ and $\mu_2$ and $x_2^*$ between $\mu_2$ and $\mu_3$. After a straightforward but tedious calculation, and writing $K_1 = \delta_1 + \delta_2 + \mu_1 - \mu_2$ and $K_2 = \delta_2 + \delta_3 + \mu_2 - \mu_3$ for brevity, we find

$$a_1 = a_2 = \delta_1^2 - \delta_2^2 + \delta_3^2,$$

$$b_1 = a_1(\mu_1 + \delta_1) - \delta_1(\delta_1 K_1 + \delta_3 K_2), \qquad c_1 = \delta_1^2\left[\delta_2^2(K_1^2 + K_2^2) - (\delta_3 K_1 - \delta_1 K_2)^2\right],$$

$$b_2 = a_2(\mu_3 - \delta_3) + \delta_3(\delta_1 K_1 + \delta_3 K_2), \qquad c_2 = \delta_3^2\left[\delta_2^2(K_1^2 + K_2^2) - (\delta_3 K_1 - \delta_1 K_2)^2\right].$$

(c) For $\{\mu_i, \delta_i\} = \{0, 1\}, \{0.5, 0.5\}, \{1, 1\}$ for $i = 1, 2, 3$, respectively, we substitute into the above equations and find $x_1^* = 0.2612$ and $x_2^* = 0.7388$. It is a simple matter to confirm that these two decision points indeed suffice for the classification problem, that is, that no more than two points are needed.
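As a sanity check on part (c), the short script below (our own illustration, assuming NumPy is available; the variable names are not the manual's) evaluates the compact coefficients given above for the part (c) parameters, picks the root of each quadratic lying between the appropriate means, and confirms that the resulting points satisfy the two stationarity equations of part (b).

```python
import numpy as np

# Triangle-density parameters from part (c): (mu_i, delta_i).
mu = np.array([0.0, 0.5, 1.0])
d  = np.array([1.0, 0.5, 1.0])

K1 = d[0] + d[1] + mu[0] - mu[1]
K2 = d[1] + d[2] + mu[1] - mu[2]
a  = d[0]**2 - d[1]**2 + d[2]**2

b1 = a * (mu[0] + d[0]) - d[0] * (d[0] * K1 + d[2] * K2)
c1 = d[0]**2 * (d[1]**2 * (K1**2 + K2**2) - (d[2] * K1 - d[0] * K2)**2)
b2 = a * (mu[2] - d[2]) + d[2] * (d[0] * K1 + d[2] * K2)
c2 = d[2]**2 * (d[1]**2 * (K1**2 + K2**2) - (d[2] * K1 - d[0] * K2)**2)

# Choose the root of each quadratic that lies in the relevant interval.
x1 = next(x for x in ((b1 + s * np.sqrt(c1)) / a for s in (+1, -1)) if mu[0] < x < mu[1])
x2 = next(x for x in ((b2 + s * np.sqrt(c2)) / a for s in (+1, -1)) if mu[1] < x < mu[2])
print(x1, x2)   # 0.2612..., 0.7388...

# Check the two stationarity equations of part (b).
lhs1 = (mu[0] + d[0] - x1) / d[0]
rhs1 = (d[2] - mu[2] + x2) / d[2]
lhs2 = ((d[1] - mu[1] + x1)**2 + (mu[1] + d[1] - x2)**2) / d[1]**2
rhs2 = (d[2] - mu[2] + x2)**2 / d[2]**2
print(np.isclose(lhs1, rhs1), np.isclose(lhs2, rhs2))   # True True
```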
[Figure: the two weighted densities $p(x|\omega_i)P(\omega_i)$ plotted versus $x$ for means $\mu_1 < \mu_2$, showing the decision point $x^*$ and the error contribution $E_1$.]

6. We let $x^*$ denote our decision boundary and take $\mu_2 > \mu_1$, as shown in the figure.

(a) The error for classifying a pattern that is actually in $\omega_1$ as if it were in $\omega_2$ is

$$\int_{\mathcal{R}_2} p(x|\omega_1)P(\omega_1)\,dx = \frac{1}{2}\int_{x^*}^{\infty} N(\mu_1, \sigma_1^2)\,dx \le E_1.$$

Our problem demands that this error be less than or equal to the bound $E_1$. The resulting constraint on $x^*$ is a function of $E_1$, and the corresponding $x^*$ can be obtained from tables of the cumulative normal distribution or by simple numerical integration.

(b) Likewise, the error for categorizing a pattern that is in $\omega_2$ as if it were in $\omega_1$ is

$$E_2 = \int_{\mathcal{R}_1} p(x|\omega_2)P(\omega_2)\,dx = \frac{1}{2}\int_{-\infty}^{x^*} N(\mu_2, \sigma_2^2)\,dx.$$

(c) The total error is simply the sum of these two contributions:

$$E = E_1 + E_2 = \frac{1}{2}\int_{x^*}^{\infty} N(\mu_1, \sigma_1^2)\,dx + \frac{1}{2}\int_{-\infty}^{x^*} N(\mu_2, \sigma_2^2)\,dx.$$

(d) For $p(x|\omega_1) \sim N(-1, 1)$, $p(x|\omega_2) \sim N(1, 1)$ and $E_1 = 0.05$, we have (by simple numerical integration) $x^* = 0.2815$, and thus

$$E = 0.05 + \frac{1}{2}\int_{-\infty}^{0.2815} \frac{1}{\sqrt{2\pi}}\exp\!\left[-\frac{(x-1)^2}{2}\right]dx = 0.168.$$

(e) The decision boundary for the (minimum-error) Bayes case is clearly at $x^* = 0$. The Bayes error for this problem is

$$E_B = 2\int_0^{\infty}\frac{1}{2}\,N(-1, 1)\,dx = \int_1^{\infty} N(0, 1)\,dz \simeq 0.159,$$

which of course is lower than the error for the Neyman-Pearson criterion case. Note that if the Bayes error were lower than $2\times 0.05 = 0.1$ in this problem, we would use the Bayes decision point for the Neyman-Pearson case as well, since it too would ensure that the Neyman-Pearson criterion was obeyed and it would give the lowest total error.

7. We proceed as in Problem 6, with the figure below.

[Figure: the two weighted Cauchy densities $p(x|\omega_i)P(\omega_i)$ plotted versus $x$, showing the decision point $x^*$ and the error contribution $E_1$.]

(a) Recall that the Cauchy density is

$$p(x|\omega_i) = \frac{1}{\pi b}\,\frac{1}{1 + \left(\frac{x - a_i}{b}\right)^2}.$$

If we denote our (single) decision boundary point by $x^*$ and note that $P(\omega_i) = 1/2$, then the error for misclassifying an $\omega_1$ pattern as $\omega_2$ is

$$E_1 = \int_{x^*}^{\infty} p(x|\omega_1)P(\omega_1)\,dx = \frac{1}{2}\int_{x^*}^{\infty}\frac{1}{\pi b}\,\frac{1}{1 + \left(\frac{x - a_1}{b}\right)^2}\,dx.$$

We substitute $(x - a_1)/b = y$ and $\sin\theta = 1/\sqrt{1 + y^2}$ to get

$$E_1 = \frac{1}{2\pi}\int_{\theta = 0}^{\tilde\theta} d\theta = \frac{1}{2\pi}\sin^{-1}\!\left[\frac{b}{\sqrt{b^2 + (x^* - a_1)^2}}\right],$$

where $\tilde\theta = \sin^{-1}\!\left[b/\sqrt{b^2 + (x^* - a_1)^2}\right]$. Solving for the decision point gives

$$x^* = a_1 + b\sqrt{\frac{1}{\sin^2[2\pi E_1]} - 1} = a_1 + \frac{b}{\tan[2\pi E_1]}.$$

(b) The error for the converse case is found similarly. Since the decision point lies below $a_2$, the same change of variables gives

$$E_2 = \frac{1}{2}\int_{-\infty}^{x^*}\frac{1}{\pi b}\,\frac{1}{1 + \left(\frac{x - a_2}{b}\right)^2}\,dx = \frac{1}{2\pi}\sin^{-1}\!\left[\frac{b}{\sqrt{b^2 + (x^* - a_2)^2}}\right].$$

(c) The total error is merely the sum of the component errors:

$$E = E_1 + E_2 = E_1 + \frac{1}{2\pi}\sin^{-1}\!\left[\frac{b}{\sqrt{b^2 + (x^* - a_2)^2}}\right],$$

where, for the values given in the problem ($E_1 = 0.1$, $a_1 = -1$, $a_2 = 1$ and $b = 1$), the numerical value of the decision point is

$$x^* = a_1 + \frac{b}{\tan[2\pi E_1]} = 0.376.$$

(d) We add the errors and find

$$E = 0.1 + \frac{1}{2\pi}\sin^{-1}\!\left[\frac{b}{\sqrt{b^2 + (x^* - a_2)^2}}\right] = 0.1 + 0.161 = 0.261.$$

(e) For the Bayes case, the decision point is midway between the peaks of the two distributions, i.e., at $x^* = 0$ (cf. Problem 6). The Bayes error is then

$$E_B = 2\int_0^{\infty}\frac{1}{\pi b}\,\frac{1}{1 + \left(\frac{x - a_1}{b}\right)^2}\,P(\omega_1)\,dx = \frac{1}{4} = 0.25.$$

This is indeed lower than for the Neyman-Pearson case, as it must be. Note that if the Bayes error were lower than $2\times 0.1 = 0.2$ in this problem, we would use the Bayes decision point for the Neyman-Pearson case as well, since it too would ensure that the Neyman-Pearson criterion was obeyed and it would give the lowest total error.
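The following sketch (our own check, not part of the manual, assuming SciPy is available) reproduces the Neyman-Pearson numbers quoted in Problems 6 and 7: the threshold and total error for the Gaussian case, and the threshold, total error and Bayes error for the Cauchy case.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm, cauchy

# Problem 6: p(x|w1) = N(-1,1), p(x|w2) = N(1,1), equal priors, constraint E1 = 0.05.
E1 = 0.05
x_star = brentq(lambda x: 0.5 * norm.sf(x, -1, 1) - E1, -1, 5)
E_total = E1 + 0.5 * norm.cdf(x_star, 1, 1)
print(x_star, E_total)            # about 0.2815 and 0.168

# Bayes error with the boundary at x = 0 (equal priors, symmetric densities).
print(norm.sf(0, -1, 1))          # about 0.159

# Problem 7: Cauchy densities with a1 = -1, a2 = 1, b = 1, constraint E1 = 0.1.
a1, a2, b, E1c = -1.0, 1.0, 1.0, 0.1
x_c = a1 + b / np.tan(2 * np.pi * E1c)          # closed form from part (a)
E2 = 0.5 * cauchy.cdf(x_c, a2, b)               # error on class 2
print(x_c, E1c + E2)                            # about 0.376 and 0.261
print(cauchy.sf(0, a1, b))                      # Bayes error = 0.25
```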


Document information

Uploaded on: 13 February 2022
Number of pages: 443
Written in: 2021/2022
Type: Exam (elaborations)
Contains: Questions and answers


Preview of the content

Solution Manual to accompany

Pattern Classification (2nd ed.)
by R. O. Duda, P. E. Hart and D. G. Stork

David G. Stork
Copyright 2001. All rights reserved.

This Manual is for the sole use of designated educators
and must not be distributed to students
except in short, isolated portions and in conjunction
with the use of Pattern Classification (2nd ed.)

June 18, 2003

Contents

Preface

1 Introduction 5

2 Bayesian decision theory 7
Problem Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Computer Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

3 Maximum likelihood and Bayesian parameter estimation 77
Problem Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Computer Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

4 Nonparametric techniques 131
Problem Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Computer Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

5 Linear discriminant functions 177
Problem Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
Computer Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217

6 Multilayer neural networks 219
Problem Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
Computer Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254

7 Stochastic methods 255
Problem Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
Computer Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276

8 Nonmetric methods 277
Problem Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
Computer Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294

9 Algorithm-independent machine learning 295
Problem Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
Computer Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304

10 Unsupervised learning and clustering 305
Problem Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
Computer Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355

Sample final exams and solutions 357



Worked examples 415

Errata and amendments in the text 417
First and second printings . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
Fifth printing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443