MIDTERM
CS522, Winter 2011



1. Consider the following FP-tree

[FP-tree]

Let minimum support count min_sup = 2.

(a) (15pt) List all the frequent itemsets with their support counts.

(b) (5pt) List all the closed frequent itemsets with their support counts.

(c) (5pt) List all the maximal frequent itemsets.
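As a reminder of the definitions used in parts (b) and (c): a frequent itemset is closed if no proper superset has the same support, and maximal if no proper superset is frequent at all. The sketch below checks these conditions on a made-up collection of frequent itemsets — it is not derived from the FP-tree above, only an illustration of the two tests.

```python
# Hypothetical frequent itemsets with support counts (NOT from the tree above).
freq = {
    frozenset("a"): 3, frozenset("b"): 3, frozenset("ab"): 3,
    frozenset("ac"): 2, frozenset("c"): 2,
}

def closed(freq):
    # Closed: no proper superset in freq has the same support count.
    return {s for s in freq
            if not any(s < t and freq[t] == freq[s] for t in freq)}

def maximal(freq):
    # Maximal: no proper superset in freq at all.
    return {s for s in freq if not any(s < t for t in freq)}
```

Every maximal itemset is closed, but not conversely; in this toy collection the two coincide.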

2. (15pt) Consider the following sequences

<(ab)(c)>
<(ac)(bc)>
<(b)(b)(bc)>
<(ad)(b)(d)>
<(cd)(c)(d)>

Let minimum support count min_sup = 2. Use the GSP algorithm to find all the frequent sequences.
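The support-counting step of GSP hinges on sequence containment: a candidate is contained in a data sequence if its itemsets map, in order, into itemsets of the data sequence as subsets. A minimal sketch of that check over the five sequences above (this illustrates the counting rule only, not the full GSP candidate-generation/pruning loop):

```python
# The five data sequences from the question; each sequence is a tuple of
# itemsets, each itemset a frozenset.
data = [
    (frozenset("ab"), frozenset("c")),
    (frozenset("ac"), frozenset("bc")),
    (frozenset("b"), frozenset("b"), frozenset("bc")),
    (frozenset("ad"), frozenset("b"), frozenset("d")),
    (frozenset("cd"), frozenset("c"), frozenset("d")),
]

def contains(seq, cand):
    # Greedy left-to-right match: advance through cand whenever its next
    # itemset is a subset of the current itemset of seq.
    i = 0
    for itemset in seq:
        if i < len(cand) and cand[i] <= itemset:
            i += 1
    return i == len(cand)

def support(cand):
    return sum(contains(s, cand) for s in data)
```

For example, the candidate <(b)(c)> is contained in the first and third sequences only, so its support count is 2 and it survives at min_sup = 2.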

3. Consider the following dataset

a1  a2  a3  Class
0   1   0   1
1   1   0   0
0   0   1   1
1   0   1   0
0   0   0   0

(a) (20pt) Construct a Decision Tree to classify the record (0,1,1,?). Use entropy-based Information Gain to determine the best splits for the decision tree.

(b) (20pt) Use Naive Bayesian Classification to classify the record (0,1,1,?).
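The two computations in Question 3 can be sketched directly on the table above: part (a) compares the information gain of each attribute at the root, and part (b) multiplies the class prior by the per-attribute conditional probabilities. This is an illustration of the formulas, not an answer key, and it omits the recursive tree construction below the root.

```python
from math import log2

# Rows of the table above: (a1, a2, a3, Class).
rows = [(0, 1, 0, 1), (1, 1, 0, 0), (0, 0, 1, 1), (1, 0, 1, 0), (0, 0, 0, 0)]

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n)
                for c in (labels.count(v) for v in set(labels)))

def info_gain(attr):
    # attr is 0, 1, or 2 for a1, a2, a3; gain = H(Class) - weighted child entropy.
    rem = sum(entropy([r[3] for r in rows if r[attr] == v])
              * sum(r[attr] == v for r in rows) / len(rows)
              for v in {r[attr] for r in rows})
    return entropy([r[3] for r in rows]) - rem

def naive_bayes(x):
    # x = (a1, a2, a3); returns the unnormalized score P(C) * prod P(a_i | C).
    scores = {}
    for c in (0, 1):
        sub = [r for r in rows if r[3] == c]
        p = len(sub) / len(rows)
        for i, v in enumerate(x):
            p *= sum(r[i] == v for r in sub) / len(sub)
        scores[c] = p
    return scores
```

With this data, a1 has the largest gain at the root, and naive_bayes((0, 1, 1)) returns the two unnormalized class scores for the record to be classified.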

4. Use the BBN shown on Slide #21 of the Lecture Notes on Bayesian Classification to compute the following probabilities:

(a) (10pt) P(HD=Yes|E=Yes)

(b) (10pt) P(HD=Yes|E=Yes,CP=No)
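Both probabilities can be obtained by inference by enumeration: sum the joint distribution over the unobserved variables and normalize by the probability of the evidence. The sketch below uses variable names matching the question (E, HD, CP, plus assumed parents D and Hb), but the network structure and all CPT numbers are placeholders — substitute the actual values from Slide #21; only the enumeration procedure is the point.

```python
from itertools import product

# Placeholder CPTs (1 = Yes, 0 = No). Structure assumed: E, D -> HD;
# D -> Hb; HD, Hb -> CP. Replace every number with the slide's values.
P_E = {1: 0.7, 0: 0.3}                                            # P(E=e)
P_D = {1: 0.25, 0: 0.75}                                          # P(D=d)
P_HD = {(1, 1): 0.25, (1, 0): 0.45, (0, 1): 0.55, (0, 0): 0.75}   # P(HD=1|E,D)
P_Hb = {1: 0.2, 0: 0.85}                                          # P(Hb=1|D)
P_CP = {(1, 1): 0.8, (1, 0): 0.6, (0, 1): 0.4, (0, 0): 0.1}       # P(CP=1|HD,Hb)

def joint(e, d, hd, hb, cp):
    p = P_E[e] * P_D[d]
    p *= P_HD[(e, d)] if hd else 1 - P_HD[(e, d)]
    p *= P_Hb[d] if hb else 1 - P_Hb[d]
    p *= P_CP[(hd, hb)] if cp else 1 - P_CP[(hd, hb)]
    return p

def query(hd_val, evidence):
    # P(HD=hd_val | evidence): sum the joint over assignments consistent
    # with the evidence, then normalize.
    num = den = 0.0
    for e, d, hd, hb, cp in product((0, 1), repeat=5):
        w = dict(E=e, D=d, HD=hd, Hb=hb, CP=cp)
        if any(w[k] != v for k, v in evidence.items()):
            continue
        p = joint(e, d, hd, hb, cp)
        den += p
        if hd == hd_val:
            num += p
    return num / den
```

Part (a) is then query(1, {"E": 1}) and part (b) is query(1, {"E": 1, "CP": 0}); with real CPTs the same two calls give the required answers.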