CHAPTER 2. BAYESIAN DECISION THEORY
Problem Solutions

1. (a) We assume, without loss of generality, that for a given particular x we have P(ω2|x) ≥ P(ω1|x), and thus P(error|x) = P(ω1|x). We have, moreover, the normalization condition P(ω1|x) = 1 − P(ω2|x). Together these imply P(ω2|x) > 1/2, or 2P(ω2|x) > 1, and hence 2P(ω2|x)P(ω1|x) > P(ω1|x) = P(error|x). This is true at every x, and hence the integrals obey

  ∫ 2P(ω2|x)P(ω1|x) dx ≥ ∫ P(error|x) dx.

In short, 2P(ω2|x)P(ω1|x) provides an upper bound for P(error|x).

(b) From part (a), we have that P(ω2|x) > 1/2, but in the current conditions not greater than 1/α for α < 2. Take as an example α = 4/3 and P(ω1|x) = 0.4, and hence P(ω2|x) = 0.6. In this case P(error|x) = 0.4. Moreover, we have αP(ω1|x)P(ω2|x) = 4/3 × 0.6 × 0.4 < P(error|x). Thus αP(ω1|x)P(ω2|x) does not provide an upper bound for all values of P(ω1|x).

(c) Let P(error|x) = P(ω1|x). In that case, for all x we have

  P(ω2|x)P(ω1|x) < P(ω1|x) = P(error|x)
  ∫ P(ω2|x)P(ω1|x) dx < ∫ P(error|x) dx,

and we have a lower bound.

(d) The solution to part (b) also applies here.

Section 2.2

2. We are given that the density is of the form p(x|ωi) = k e^{−|x−ai|/bi}.

(a) We seek k so that the function is normalized, as required by a true density. We integrate this function and set the result equal to 1.0:

  k [ ∫_{−∞}^{ai} exp[(x − ai)/bi] dx + ∫_{ai}^{∞} exp[−(x − ai)/bi] dx ] = 1,

which yields 2bik = 1, or k = 1/(2bi). Note that the normalization is independent of ai, which corresponds to a shift along the axis and is hence indeed irrelevant to normalization. The distribution is therefore written

  p(x|ωi) = (1/(2bi)) e^{−|x−ai|/bi}.

(b) The likelihood ratio can be written directly:

  p(x|ω1)/p(x|ω2) = (b2/b1) exp[ −|x − a1|/b1 + |x − a2|/b2 ].

(c) For the case a1 = 0, a2 = 1, b1 = 1 and b2 = 2, the likelihood ratio is

  p(x|ω1)/p(x|ω2) = 2e^{(x+1)/2}   for x ≤ 0,
                    2e^{(1−3x)/2}  for 0 < x ≤ 1,
                    2e^{−(x+1)/2}  for x > 1,

as shown in the figure.

[Figure: the likelihood ratio p(x|ω1)/p(x|ω2) plotted as a function of x for −2 ≤ x ≤ 2.]

Section 2.3

3. We are to use the standard zero-one classification cost, that is λ11 = λ22 = 0 and λ12 = λ21 = 1.

(a) We have the priors P(ω1) and P(ω2) = 1 − P(ω1). The Bayes risk is given by Eqs. 12 and 13 in the text:

  R(P(ω1)) = P(ω1) ∫_{R2} p(x|ω1) dx + (1 − P(ω1)) ∫_{R1} p(x|ω2) dx.

To obtain the prior with the minimum risk, we take the derivative with respect to P(ω1) and set it to 0, that is

  d/dP(ω1) R(P(ω1)) = ∫_{R2} p(x|ω1) dx − ∫_{R1} p(x|ω2) dx = 0,

which gives the desired result:

  ∫_{R2} p(x|ω1) dx = ∫_{R1} p(x|ω2) dx.

(b) This solution is not always unique, as shown by a simple counterexample. Let P(ω1) = P(ω2) = 0.5 and

  p(x|ω1) = 1 for −0.5 ≤ x ≤ 0.5, and 0 otherwise,
  p(x|ω2) = 1 for 0 ≤ x ≤ 1, and 0 otherwise.

It is easy to verify that both choices of decision region, R1 = [−0.5, 0.25] and R1 = [0, 0.5], satisfy the equation in part (a); thus the solution is not unique.

4. Consider the minimax criterion for a two-category classification problem.

(a) The total risk is the integral over the two regions Ri of the posteriors times their costs:

  R = ∫_{R1} [λ11 P(ω1) p(x|ω1) + λ12 P(ω2) p(x|ω2)] dx
    + ∫_{R2} [λ21 P(ω1) p(x|ω1) + λ22 P(ω2) p(x|ω2)] dx.

We use ∫_{R2} p(x|ω2) dx = 1 − ∫_{R1} p(x|ω2) dx (and likewise ∫_{R1} p(x|ω1) dx = 1 − ∫_{R2} p(x|ω1) dx) together with P(ω2) = 1 − P(ω1), and regroup to find:

  R = λ22 + λ12 ∫_{R1} p(x|ω2) dx − λ22 ∫_{R1} p(x|ω2) dx
      + P(ω1) [ (λ11 − λ22) − λ11 ∫_{R2} p(x|ω1) dx − λ12 ∫_{R1} p(x|ω2) dx + λ21 ∫_{R2} p(x|ω1) dx + λ22 ∫_{R1} p(x|ω2) dx ]

    = λ22 + (λ12 − λ22) ∫_{R1} p(x|ω2) dx
      + P(ω1) [ (λ11 − λ22) + (λ21 − λ11) ∫_{R2} p(x|ω1) dx + (λ22 − λ12) ∫_{R1} p(x|ω2) dx ].
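
The regrouped expression can be spot-checked numerically. The sketch below is a check under assumed values, not part of the original solution: the Gaussian class-conditional densities, loss matrix, threshold x* and prior are all illustrative choices, and the script simply confirms that the regrouped form agrees with the direct two-integral definition of the risk.

```python
# Numerical check of the regrouped risk expression in Problem 4(a).
# The densities, loss values, threshold and prior below are illustrative
# assumptions, not values taken from the problem.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

l11, l12, l21, l22 = 0.0, 2.0, 1.0, 0.5          # arbitrary loss matrix
p1 = lambda x: norm.pdf(x, loc=-1.0, scale=1.0)  # p(x|w1)
p2 = lambda x: norm.pdf(x, loc=1.0, scale=0.5)   # p(x|w2)
xstar = 0.3                                      # R1 = (-inf, x*), R2 = (x*, inf)
P1 = 0.35                                        # prior P(w1)

# Direct definition: integrate the two bracketed integrands over R1 and R2.
direct = (quad(lambda x: l11*P1*p1(x) + l12*(1 - P1)*p2(x), -np.inf, xstar)[0]
          + quad(lambda x: l21*P1*p1(x) + l22*(1 - P1)*p2(x), xstar, np.inf)[0])

# Regrouped form, with I1 = int_{R2} p(x|w1) dx and I2 = int_{R1} p(x|w2) dx.
I1 = quad(p1, xstar, np.inf)[0]
I2 = quad(p2, -np.inf, xstar)[0]
regrouped = l22 + (l12 - l22)*I2 + P1*((l11 - l22) + (l21 - l11)*I1 + (l22 - l12)*I2)

print(direct, regrouped)   # the two values agree to numerical precision
```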
(b) Consider an arbitrary prior 0 < P*(ω1) < 1, and assume the decision boundary has been set so as to achieve the minimal (Bayes) error for that prior. If one holds that decision boundary fixed but changes the prior probability P(ω1), then the error changes linearly in P(ω1), as given by the formula in part (a). The true Bayes error, however, must be less than or equal to that (linearly bounded) value, since one has the freedom to change the decision boundary at each value of P(ω1). Moreover, we note that the Bayes error is 0 at P(ω1) = 0 and at P(ω1) = 1, since the Bayes decision rule under those conditions is to always decide ω2 or ω1, respectively, and this gives zero error. Thus the curve of the Bayes error rate is concave down for all prior probabilities.

[Figure: the Bayes error E(P(ω1)) as a function of the prior P(ω1); the straight line tangent at P*(ω1) bounds the concave error curve from above.]

(c) According to the general minimax equation in part (a), for our case (i.e., λ11 = λ22 = 0 and λ12 = λ21 = 1) the decision boundary is chosen to satisfy

  ∫_{R2} p(x|ω1) dx = ∫_{R1} p(x|ω2) dx.

We assume that a single decision point suffices, and thus we seek to find x* such that

  ∫_{x*}^{∞} N(μ1, σ1²) dx = ∫_{−∞}^{x*} N(μ2, σ2²) dx,

where, as usual, N(μi, σi²) denotes a Gaussian. We assume for definiteness and without loss of generality that μ2 > μ1, and that the single decision point lies between the means. Recall the definition of the error function, given by Eq. 96 in the Appendix of the text, that is,

  erf(x) = (2/√π) ∫_0^x e^{−t²} dt.

Writing each side in terms of erf and using the monotonicity of erf (the common constant factors cancel from both sides), the condition reduces to

  (x* − μ1)/σ1 = (μ2 − x*)/σ2,

and solving for x* gives the value of the decision point

  x* = (μ2σ1 + μ1σ2)/(σ1 + σ2).

(d) Because the minimax error rate is independent of the prior probabilities, we can choose a particularly simple case to evaluate the error, for instance P(ω1) = 0. In that case our error becomes

  E = 1/2 − erf[(x* − μ1)/σ1] = 1/2 − erf[(μ2 − μ1)/(σ1 + σ2)].

(e) We substitute the values given in the problem into the formula in part (c) and find

  x* = (μ2σ1 + μ1σ2)/(σ1 + σ2) = (1/2 + 0)/(1 + 1/2) = 1/3.

The error from part (d) is then

  E = 1/2 − erf[(1/3 − 0)/1] = 1/2 − erf[1/3] = 0.1374.

(f) Note that the distributions have the same form (in particular, the same variance). Thus, by symmetry, the Bayes error for P(ω1) = P* (for some value P*) must be the same as for P(ω2) = P*. Because P(ω2) = 1 − P(ω1), we know that the curve, analogous to the one in part (b), is symmetric around the point P(ω1) = 0.5. Because the curve is concave down, it must therefore peak at P(ω1) = 0.5, that is, at equal priors. The tangent to the graph of the error versus P(ω1) is thus horizontal at P(ω1) = 0.5. For this case of equal priors, the Bayes decision point for this problem can be stated simply: it is the point midway between the means of the two distributions, that is, x* = 5.5.
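
As a quick numerical check of parts (c) and (e), one can solve for the point at which the two conditional error probabilities are equal and compare it with the closed form above. The parameter values μ1 = 0, σ1 = 1, μ2 = 1/2, σ2 = 1/2 used in this sketch are inferred from the arithmetic in part (e); they are not stated explicitly in this excerpt.

```python
# Check of the minimax decision point in Problem 4(c)/(e): find x* such that
# P(x > x* | w1) = P(x < x* | w2), and compare with the closed form
# x* = (mu2*sigma1 + mu1*sigma2)/(sigma1 + sigma2).
# The parameter values below are assumptions inferred from part (e).
from scipy.optimize import brentq
from scipy.stats import norm

mu1, s1 = 0.0, 1.0
mu2, s2 = 0.5, 0.5

# Difference of the two conditional errors; its root is the minimax point.
g = lambda x: norm.sf(x, mu1, s1) - norm.cdf(x, mu2, s2)
xstar = brentq(g, mu1, mu2)          # root bracketed between the two means

print(xstar, (mu2*s1 + mu1*s2)/(s1 + s2))   # both print 0.3333... = 1/3
```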
5. We seek to generalize the notion of minimax criteria to the case where two independent prior probabilities are set.

[Figure: three triangular densities p(x|ωi)P(ωi) centered at μ1, μ2, μ3 with half-widths δ1, δ2, δ3, and the two decision points x1* and x2*.]

(a) We use the triangle distributions and conventions in the figure. We solve for the decision points as follows (being sure to keep the signs correct, and assuming that the decision boundary consists of just two points). At the first decision point,

  P(ω1) [δ1 − (x1* − μ1)]/δ1² = P(ω2) [δ2 − (μ2 − x1*)]/δ2²,

which has solution

  x1* = [P(ω1)δ2²δ1 + P(ω1)δ2²μ1 − P(ω2)δ1²δ2 + P(ω2)δ1²μ2] / [P(ω1)δ2² + P(ω2)δ1²].

An analogous derivation for the other decision point gives

  P(ω2) [δ2 − (x2* − μ2)]/δ2² = P(ω3) [δ3 − (μ3 − x2*)]/δ3²,

which has solution

  x2* = [P(ω2)δ3²δ2 + P(ω2)δ3²μ2 − P(ω3)δ2²δ3 + P(ω3)δ2²μ3] / [P(ω2)δ3² + P(ω3)δ2²].

(b) Note that from our normalization condition, Σ_{i=1}^{3} P(ωi) = 1, we can express all priors in terms of just two independent ones, which we choose to be P(ω1) and P(ω2). We could substitute the values for xi* and integrate, but it is just a bit simpler to go directly to the calculation of the error E as a function of the priors P(ω1) and P(ω2) by considering the four contributions:

  E = P(ω1) (1/(2δ1²)) [μ1 + δ1 − x1*]²
    + P(ω2) (1/(2δ2²)) [δ2 − μ2 + x1*]²
    + P(ω2) (1/(2δ2²)) [μ2 + δ2 − x2*]²
    + (1 − P(ω1) − P(ω2)) (1/(2δ3²)) [δ3 − μ3 + x2*]²,

where the last factor in parentheses is P(ω3). To obtain the minimax solution, we take the two partial derivatives and set them to zero. The first of the derivative equations, ∂E/∂P(ω1) = 0, yields

  [(μ1 + δ1 − x1*)/δ1]² = [(δ3 − μ3 + x2*)/δ3]²,  or  (μ1 + δ1 − x1*)/δ1 = (δ3 − μ3 + x2*)/δ3.

Likewise, the second of the derivative equations, ∂E/∂P(ω2) = 0, yields

  [(δ2 − μ2 + x1*)/δ2]² + [(μ2 + δ2 − x2*)/δ2]² = [(δ3 − μ3 + x2*)/δ3]².

These simultaneous quadratic equations have solutions of the general form

  xi* = (bi ± √ci)/ai,   i = 1, 2

(the + root applies for x1* and the − root for x2*). After a straightforward, but very tedious calculation, we find for the first decision point:

  a1 = δ1² − δ2² + δ3²,
  b1 = −δ1²δ2 − δ1δ2² − δ1δ2δ3 − δ2²μ1 + δ3²μ1 + δ1²μ2 − δ1δ3μ2 + δ1δ3μ3,
  c1 = δ1² ( 2δ1δ2³ + 2δ2⁴ + 2δ1δ2²δ3 + 2δ2³δ3 + δ1δ2²μ1 + 2δ2³μ1 + 2δ1δ2δ3μ1 − 2δ2δ3²μ1 + δ2²μ1² − δ3²μ1²
      − 2δ1²δ2μ2 − 2δ1δ2²μ2 + 2δ2²δ3μ2 + 2δ2δ3²μ2 − 2δ2²μ1μ2 + 2δ1δ3μ1μ2 + 2δ3²μ1μ2
      − δ1²μ2² + 2δ2²μ2² − 2δ1δ3μ2² − δ3²μ2² + 2δ1²δ2μ3 − 2δ2³μ3 − 2δ1δ2δ3μ3 − 2δ2²δ3μ3
      − 2δ1δ3μ1μ3 + 2δ1²μ2μ3 − 2δ2²μ2μ3 + 2δ1δ3μ2μ3 − δ1²μ3² + δ2²μ3² ).

An analogous calculation gives a2 = δ1² − δ2² + δ3² and

  b2 = δ1δ2δ3 + δ2²δ3 + δ2δ3² + δ1δ3μ1 − δ1δ3μ2 + δ3²μ2 + δ1²μ3 − δ2²μ3,

with a corresponding expression for c2. Equivalently, b2 and c2 follow from b1 and c1 by the mirror symmetry of the problem: reflecting x → −x exchanges the roles of ω1 and ω3, so one substitutes δ1 ↔ δ3, μ1 → −μ3, μ2 → −μ2, μ3 → −μ1 and negates the resulting root.

(c) For {μi, δi} = {0, 1}, {0.5, 0.5}, {1, 1} for i = 1, 2, 3, respectively, we substitute into the above equations to find x1* = 0.2612 and x2* = 0.7388. It is a simple matter to confirm that indeed these two decision points suffice for the classification problem, that is, that no more than two points are needed.
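
The decision points quoted in part (c) can also be reproduced by solving the two minimax conditions numerically with a general-purpose root finder, rather than through the closed-form quadratics above. A minimal sketch:

```python
# Numerical check of Problem 5(c): solve the two minimax conditions for the
# given triangular densities and recover the quoted decision points.
from scipy.optimize import fsolve

mu = [0.0, 0.5, 1.0]      # mu_1, mu_2, mu_3
d  = [1.0, 0.5, 1.0]      # delta_1, delta_2, delta_3

def conditions(x):
    x1, x2 = x
    # dE/dP(w1) = 0:  [(mu1 + d1 - x1)/d1]^2 = [(d3 - mu3 + x2)/d3]^2
    f1 = ((mu[0] + d[0] - x1) / d[0])**2 - ((d[2] - mu[2] + x2) / d[2])**2
    # dE/dP(w2) = 0:  [(d2 - mu2 + x1)/d2]^2 + [(mu2 + d2 - x2)/d2]^2
    #                 = [(d3 - mu3 + x2)/d3]^2
    f2 = (((d[1] - mu[1] + x1) / d[1])**2
          + ((mu[1] + d[1] - x2) / d[1])**2
          - ((d[2] - mu[2] + x2) / d[2])**2)
    return [f1, f2]

x1, x2 = fsolve(conditions, x0=[0.3, 0.7])
print(round(x1, 4), round(x2, 4))   # 0.2612 0.7388
```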
6. We let x* denote our decision boundary and assume μ2 > μ1, as shown in the figure.

[Figure: two Gaussian densities p(x|ωi)P(ωi) with means μ1 and μ2, the decision point x*, and the shaded error area E1.]

(a) The error for classifying a pattern that is actually in ω1 as if it were in ω2 is:

  ∫_{R2} p(x|ω1)P(ω1) dx = (1/2) ∫_{x*}^{∞} N(μ1, σ1²) dx ≤ E1.

Our problem demands that this error be less than or equal to E1. Thus the bound on x* is a function of E1, and can be obtained from tables of cumulative normal distributions, or by simple numerical integration.

(b) Likewise, the error for categorizing a pattern that is in ω2 as if it were in ω1 is:

  E2 = ∫_{R1} p(x|ω2)P(ω2) dx = (1/2) ∫_{−∞}^{x*} N(μ2, σ2²) dx.

(c) The total error is simply the sum of these two contributions:

  E = E1 + E2 = (1/2) ∫_{x*}^{∞} N(μ1, σ1²) dx + (1/2) ∫_{−∞}^{x*} N(μ2, σ2²) dx.

(d) For p(x|ω1) ∼ N(−1/2, 1) and p(x|ω2) ∼ N(1/2, 1) and E1 = 0.05, we have (by simple numerical integration) x* = 0.2815, and thus

  E = 0.05 + (1/2) ∫_{−∞}^{0.2815} N(μ2, σ2²) dx
    = 0.05 + (1/2) ∫_{−∞}^{0.2815} (1/(√(2π) 0.5)) exp[−(x − 0.5)²/(2(0.5)²)] dx = 0.168.

(e) The decision boundary for the (minimum-error) Bayes case is clearly at x* = 0. The Bayes error for this problem is:

  EB = 2 ∫_0^{∞} (1/2) N(μ1, σ1²) dx = ∫_0^{∞} N(μ1, σ1²) dx = 0.159,

which of course is lower than the error for the Neyman-Pearson criterion case. Note that if the Bayes error were lower than 2 × 0.05 = 0.1 in this problem, we would use the Bayes decision point for the Neyman-Pearson case, since it too would ensure that the Neyman-Pearson criteria were obeyed and would give the lowest total error.

7. We proceed as in Problem 6, with the figure below.

[Figure: two Cauchy densities p(x|ωi)P(ωi), the decision point x*, and the shaded error area E1.]

(a) Recall that the Cauchy density is

  p(x|ωi) = (1/(πb)) · 1/[1 + ((x − ai)/b)²].

If we denote our (single) decision boundary point as x*, and note that P(ωi) = 1/2, then the error for misclassifying a ω1 pattern as ω2 is:

  E1 = ∫_{x*}^{∞} p(x|ω1)P(ω1) dx = (1/2) ∫_{x*}^{∞} (1/(πb)) · 1/[1 + ((x − a1)/b)²] dx.

We substitute (x − a1)/b = y and sin θ = 1/√(1 + y²) to get:

  E1 = (1/(2π)) ∫_{θ=0}^{θ=θ̃} dθ = (1/(2π)) sin⁻¹[ b/√(b² + (x* − a1)²) ],

where θ̃ = sin⁻¹[b/√(b² + (x* − a1)²)]. Solving for the decision point gives

  x* = a1 + b √(1/sin²[2πE1] − 1) = a1 + b/tan[2πE1].

(b) The error for the converse case is found similarly. With the same substitution, and noting that here x* lies to the left of a2,

  E2 = (1/2) ∫_{−∞}^{x*} (1/(πb)) · 1/[1 + ((x − a2)/b)²] dx = (1/(2π)) sin⁻¹[ b/√(b² + (x* − a2)²) ].

(c) The total error is merely the sum of the component errors:

  E = E1 + E2 = E1 + (1/(2π)) sin⁻¹[ b/√(b² + (x* − a2)²) ],

where the numerical value of the decision point is x* = a1 + b/tan[2πE1] = 0.376.

(d) We add the errors (for b = 1) and find

  E = 0.1 + (1/(2π)) sin⁻¹[ b/√(b² + (x* − a2)²) ] = 0.2607.

(e) For the Bayes case, the decision point is midway between the peaks of the two distributions, i.e., at x* = 0 (cf. Problem 6). The Bayes error is then

  EB = 2 ∫_0^{∞} (1/(πb)) · 1/[1 + ((x − a1)/b)²] P(ω1) dx = 0.2489.

This is indeed lower than for the Neyman-Pearson case, as it must be. Note that if the Bayes error were lower than 2 × 0.1 = 0.2 in this problem, we would use the Bayes decision point for the Neyman-Pearson case, since it too would ensure that the Neyman-Pearson criteria were obeyed and would give the lowest total error.
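
The figures quoted in parts (c) and (d) can be checked numerically. The values a1 = −1, a2 = 1, b = 1 and E1 = 0.1 used in the sketch below are assumptions inferred from the quoted results; the actual parameters come from the textbook's problem statement, which is not included in this excerpt.

```python
# Numerical check of the Neyman-Pearson results in Problem 7, under the
# assumed parameters a1 = -1, a2 = 1, b = 1 and E1 = 0.1 (inferred, not
# stated in this excerpt).
import numpy as np
from scipy.stats import cauchy

a1, a2, b, E1 = -1.0, 1.0, 1.0, 0.1

# Decision point from part (a): x* = a1 + b / tan(2*pi*E1).
xstar = a1 + b / np.tan(2 * np.pi * E1)

# Verify the omega_1 error equals E1, then add the omega_2 error.
E1_check = 0.5 * cauchy.sf(xstar, loc=a1, scale=b)   # (1/2) P(x > x* | w1)
E2 = 0.5 * cauchy.cdf(xstar, loc=a2, scale=b)        # (1/2) P(x < x* | w2)

# Prints 0.376, 0.1 and a total error of about 0.261, in line with the
# values quoted in parts (c) and (d).
print(round(xstar, 3), round(E1_check, 3), round(E1_check + E2, 3))
```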

Solution Manual to accompany

Pattern Classification (2nd ed.)
by R. O. Duda, P. E. Hart and D. G. Stork

David G. Stork
Copyright 2001. All rights reserved.

This Manual is for the sole use of designated educators
and must not be distributed to students
except in short, isolated portions and in conjunction
with the use of Pattern Classification (2nd ed.)

June 18, 2003

Contents

Preface

1 Introduction

2 Bayesian decision theory
  Problem Solutions
  Computer Exercises

3 Maximum likelihood and Bayesian parameter estimation
  Problem Solutions
  Computer Exercises

4 Nonparametric techniques
  Problem Solutions
  Computer Exercises

5 Linear discriminant functions
  Problem Solutions
  Computer Exercises

6 Multilayer neural networks
  Problem Solutions
  Computer Exercises

7 Stochastic methods
  Problem Solutions
  Computer Exercises

8 Nonmetric methods
  Problem Solutions
  Computer Exercises

9 Algorithm-independent machine learning
  Problem Solutions
  Computer Exercises

10 Unsupervised learning and clustering
  Problem Solutions
  Computer Exercises

Sample final exams and solutions

Worked examples

Errata and amendments in the text
  First and second printings
  Fifth printing