DATA MINING FINAL EXAM SET: QUESTIONS WITH CORRECT ANSWERS
What is the output of the following algorithm? - Answer-Frequent itemset involving
rare items
Select True or False - Answer-How to avoid the rare item problem? The value of
support must be lower.
HW 3 Question 18
Given the following set of market-basket transactions: (a) Find two rules that have
60% sup and 75% conf. (b) For the following rules, calculate the sup and conf,
respectively: {Diaper} → {Milk, Beer}, {Milk} → {Diaper, Beer}. - Answer-(a) {Bread} →
{Milk}, {Diaper} → {Beer}
(b) {Diaper} → {Milk, Beer}: sup = 40%, conf = 50%
{Milk} → {Diaper, Beer}: sup = 40%, conf = 50%
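The transaction table for this question is not reproduced in this set. As a minimal sketch, the five baskets below are an assumption (a common textbook market-basket example), chosen because they reproduce the answers above. The helper names count, support and confidence are illustrative only.

```python
# ASSUMED data: the original transaction table is missing from this set.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(antecedent, consequent):
    """sup(X -> Y) = count(X u Y) / N."""
    return count(antecedent | consequent) / len(transactions)


def confidence(antecedent, consequent):
    """conf(X -> Y) = count(X u Y) / count(X)."""
    return count(antecedent | consequent) / count(antecedent)


rules = [
    ({"Bread"}, {"Milk"}),
    ({"Diaper"}, {"Beer"}),
    ({"Diaper"}, {"Milk", "Beer"}),
    ({"Milk"}, {"Diaper", "Beer"}),
]
for x, y in rules:
    print(sorted(x), "->", sorted(y),
          f"sup={support(x, y):.0%} conf={confidence(x, y):.0%}")
```

Under the assumed table, the first two rules come out at 60% sup and 75% conf, and the last two at 40% sup and 50% conf.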
Given the following data table (right panel): (i) Calculate the sup of {apple, beer, rice}.
(ii) Calculate the conf of {apple} → {beer}. - Answer-(i) sup = 25%
(ii) conf = 75%
Suppose that {2, 3, 4} is frequent in a dataset with sup = 50%. Its proper
nonempty subsets {2,3}, {2,4}, {3,4}, {2}, {3}, {4} have sup = 50%, 50%, 75%, 75%,
75%, 75%, respectively. They generate these association rules: 2,3 → 4 with conf
= 100%, 2,4 → 3 with conf = 100%, 3,4 → 2 with conf = 67%, 2 → 3,4 with conf = 67%,
3 → 2,4 with conf = 67%, 4 → 2,3 with conf = 67%.
What would be the percentage of sup that all rules have, ____________%? -
Answer-50 (every rule generated from {2, 3, 4} has the support of {2, 3, 4} itself)
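The confidences in this question follow from conf(X → Y) = sup(X ∪ Y) / sup(X). The sketch below regenerates the six rules from the supports given in the question; the function name rules_from_itemset is a hypothetical helper, not part of any library.

```python
from itertools import combinations

# Supports taken from the question text.
support = {
    frozenset({2, 3, 4}): 0.50,
    frozenset({2, 3}): 0.50,
    frozenset({2, 4}): 0.50,
    frozenset({3, 4}): 0.75,
    frozenset({2}): 0.75,
    frozenset({3}): 0.75,
    frozenset({4}): 0.75,
}


def rules_from_itemset(itemset, support):
    """Return (antecedent, consequent, sup, conf) for every proper nonempty split."""
    itemset = frozenset(itemset)
    rules = []
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            x = frozenset(antecedent)
            y = itemset - x
            sup = support[itemset]       # same for every rule from this itemset
            conf = sup / support[x]      # conf(X -> Y) = sup(X u Y) / sup(X)
            rules.append((x, y, sup, conf))
    return rules


rules = rules_from_itemset({2, 3, 4}, support)
for x, y, sup, conf in rules:
    print(sorted(x), "->", sorted(y), f"sup={sup:.0%} conf={conf:.0%}")
```

Each rule shares sup = 50% because the rule's support is the support of the whole itemset; only the denominator sup(X) changes between rules.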
ID    Sequences
S1    {a, b}, {c}, {f}, {g}, {e}
S2    {a, d}, {c}, {b}, {a, b, e, f}
S3    {a}, {b}, {f}, {e}
S4    {b}, {f, g}
S5    {b}, {g}
Given the set of data above, calculate the sup and conf of the rules {a, b} →
{e} and {b} → {g}. - Answer-{a, b} → {e}: sup = 60%, conf = 100%
{b} → {g}: sup = 60%, conf = 60%
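As a minimal sketch, the answer can be checked under one common reading of a sequential rule, which is an assumption here since the question does not define it: X → Y holds in a sequence when every item of X has appeared (in any order) in itemsets that come strictly before itemsets containing every item of Y. Confidence divides by the number of sequences that contain all items of X. The helper names are illustrative.

```python
sequences = {
    "S1": [{"a", "b"}, {"c"}, {"f"}, {"g"}, {"e"}],
    "S2": [{"a", "d"}, {"c"}, {"b"}, {"a", "b", "e", "f"}],
    "S3": [{"a"}, {"b"}, {"f"}, {"e"}],
    "S4": [{"b"}, {"f", "g"}],
    "S5": [{"b"}, {"g"}],
}


def contains_all(seq, items):
    """True if every item occurs somewhere in the sequence."""
    return items <= set().union(*seq)


def rule_holds(seq, antecedent, consequent):
    """ASSUMED semantics: all antecedent items occur before all consequent items."""
    seen = set()
    for i, element in enumerate(seq):
        seen |= element & antecedent
        if seen == antecedent:
            # Earliest point where X is complete; Y must lie strictly after it.
            later = set().union(*seq[i + 1:])
            return consequent <= later
    return False


def sup_conf(antecedent, consequent):
    matches = sum(rule_holds(s, antecedent, consequent) for s in sequences.values())
    with_antecedent = sum(contains_all(s, antecedent) for s in sequences.values())
    return matches / len(sequences), matches / with_antecedent


print(sup_conf({"a", "b"}, {"e"}))
print(sup_conf({"b"}, {"g"}))
```

Under this reading, {a, b} → {e} holds in S1, S2 and S3, and {b} → {g} holds in S1, S4 and S5, which matches the stated answer.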
T1    1, 3, 4, 7
T2    2, 3, 5
T3    1, 2, 3, 5, 8
T4    2, 5
T5    1, 7
Using the Apriori algorithm, find all k-item frequent itemsets from the dataset
above. Consider k=3.
First, you need to show all the scanning steps. Then, the final result for k=3 would be
________. - Answer-
Scan T, C1: {1}:3, {2}:3, {3}:3, {4}:1, {5}:3, {7}:2, {8}:1
F1: {1}:3, {2}:3, {3}:3, {5}:3
C2: {1,2}, {1,3}, {1,5}, {2,3}, {2,5}, {3,5}
Scan T, C2: {1,2}:1, {1,3}:2, {1,5}:1, {2,3}:2, {2,5}:3, {3,5}:2
F2: {1,3}:2, {2,3}:2, {2,5}:3, {3,5}:2
C3: {2,3,5}
Scan T, C3: {2,3,5}:2
F3: {2,3,5}
Resulting k=3 itemset is {2,3,5}
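The scanning steps can be sketched in code as a level-wise Apriori loop. The question does not state a minimum support, so the threshold min_count = 2 below is an assumption, picked because F2 and F3 keep itemsets with count 2. Under that threshold {7} and {1,7} also count as frequent at k=1 and k=2, but the k=3 result is unchanged. The function name apriori is illustrative.

```python
from itertools import combinations

transactions = [
    {1, 3, 4, 7},     # T1
    {2, 3, 5},        # T2
    {1, 2, 3, 5, 8},  # T3
    {2, 5},           # T4
    {1, 7},           # T5
]


def apriori(transactions, min_count):
    """Return {k: {itemset: count}} for every level that has frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # First scan: count single items (C1), then filter to F1.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}

    levels = {}
    k = 1
    while frequent:
        levels[k] = frequent
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Scan: count each surviving candidate against the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_count}
    return levels


levels = apriori(transactions, min_count=2)  # min_count=2 is an ASSUMED threshold
for k, itemsets in levels.items():
    print(k, {tuple(sorted(s)): c for s, c in itemsets.items()})
```

The prune step is what keeps C3 down to the single candidate {2,3,5}: every other 3-item union has at least one infrequent 2-item subset.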
HW 3 Question 22
Input dataset X
Initialize the rule r
While the termination criterion is not satisfied
    d=Scan(X)
    v=FindFrequentPatterns(d,r,o)
    r=FindAssociationRules(v)
End
Output r
------
What would be the possible output of r? ____________. - Answer-All association
rules
Let minsup = 20% and minconf = 60%. The following are two examples of class
association rules: Student, School → Education; game → Sport.
According to the mining of class association rules (CAR), what would be the sup and
conf for both of these rules, respectively? - Answer-First part: Student, School →
Education
sup = 29%
conf = 100%
Second part: game → Sport
sup = 29%
conf = 67%