1. Apply the Apriori Algorithm
Tasks:
Apply the Apriori Algorithm to the following data set:
Trans ID | Items Purchased |
101 | milk, bread, eggs |
102 | milk, juice |
103 | juice, butter |
104 | milk, bread, eggs |
105 | coffee, eggs |
106 | coffee |
107 | coffee, juice |
108 | milk, bread, cookies, eggs |
109 | cookies, butter |
110 | milk, bread |
The set of items is {milk, bread, cookies, eggs, butter, coffee, juice}. Use 2 for the minimum support value. You must show all candidate and large itemsets during the process: C1, L1, C2, L2, etc., until the algorithm terminates.
Sol)
Given: minimum support count = 2
Step 1 :- K=1
Create a table containing the support count of each item present in the dataset. This gives the candidate set of length 1, called C1.
C1 :
Itemset | Sup_count |
milk | 5 |
bread | 4 |
eggs | 4 |
juice | 3 |
butter | 2 |
coffee | 3 |
cookies | 2 |
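The C1 counts can be checked with a short Python sketch. The variable names (`transactions`, `c1`) are illustrative choices and not part of the problem statement; the support count of an item is taken to be the number of transactions that contain it.

```python
from collections import Counter

# Transactions copied from the problem statement (Trans ID -> items purchased).
transactions = {
    101: {"milk", "bread", "eggs"},
    102: {"milk", "juice"},
    103: {"juice", "butter"},
    104: {"milk", "bread", "eggs"},
    105: {"coffee", "eggs"},
    106: {"coffee"},
    107: {"coffee", "juice"},
    108: {"milk", "bread", "cookies", "eggs"},
    109: {"cookies", "butter"},
    110: {"milk", "bread"},
}

# Support count of an item = number of transactions containing it.
c1 = Counter(item for items in transactions.values() for item in items)

for item, count in sorted(c1.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{item} | {count} |")
```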
Compare the support count of each itemset in C1 with the minimum support count (2 in our case). Remove every candidate itemset whose support count is less than the minimum support count. The remaining itemsets are the frequent itemsets of length 1, called L1.
No itemset in C1 has a support count less than the minimum support count, so L1 is the same as C1.
L1 :
Itemset | Sup_count |
milk | 5 |
bread | 4 |
eggs | 4 |
juice | 3 |
butter | 2 |
coffee | 3 |
cookies | 2 |
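The pruning step from C1 to L1 can be sketched as a simple filter. The helper name `prune` is hypothetical; the counts are those from the C1 table.

```python
MIN_SUPPORT = 2  # minimum support count given in the problem

# Support counts from the C1 table.
c1 = {
    "milk": 5, "bread": 4, "eggs": 4, "juice": 3,
    "butter": 2, "coffee": 3, "cookies": 2,
}

def prune(candidates, min_support):
    """Keep only the itemsets whose support count is at least min_support."""
    return {itemset: count for itemset, count in candidates.items()
            if count >= min_support}

l1 = prune(c1, MIN_SUPPORT)
print(l1 == c1)  # True: no item falls below the minimum support count
```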
Step 2 :- K=2
To generate the candidate set of length 2 (C2), join L1 with L1. The condition for joining Lk-1 with Lk-1 is that the two itemsets have (K-2) elements in common (here K=2, so no common elements are required).
Check whether all subsets of each candidate itemset are frequent (the Apriori property); if not, remove that itemset. Here every 1-item subset is in L1, so no candidate is removed.
Then find the support count of each itemset by scanning the dataset.
C2 :
Itemset | Sup_count |
{milk, bread} | 4 |
{milk, eggs} | 3 |
{milk, juice} | 1 |
{milk, butter} | 0 |
{milk, coffee} | 0 |
{milk, cookies} | 1 |
{bread, eggs} | 3 |
{bread, juice} | 0 |
{bread, butter} | 0 |
{bread, coffee} | 0 |
{bread, cookies} | 1 |
{eggs, juice} | 0 |
{eggs, butter} | 0 |
{eggs, coffee} | 1 |
{eggs, cookies} | 1 |
{juice, butter} | 1 |
{juice, coffee} | 1 |
{juice, cookies} | 0 |
{butter, coffee} | 0 |
{butter, cookies} | 1 |
{coffee, cookies} | 0 |
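A minimal sketch of how C2 can be built and counted, assuming the join for K=2 is simply every pair of items from L1. The names `l1_items` and `c2` are illustrative.

```python
from itertools import combinations

# Transactions from the problem statement (IDs omitted, order preserved).
transactions = [
    {"milk", "bread", "eggs"},
    {"milk", "juice"},
    {"juice", "butter"},
    {"milk", "bread", "eggs"},
    {"coffee", "eggs"},
    {"coffee"},
    {"coffee", "juice"},
    {"milk", "bread", "cookies", "eggs"},
    {"cookies", "butter"},
    {"milk", "bread"},
]

# Items of L1, listed in the same order as the L1 table.
l1_items = ["milk", "bread", "eggs", "juice", "butter", "coffee", "cookies"]

# Join L1 with L1: every unordered pair of frequent items is a candidate.
# Support count = number of transactions that contain both items.
c2 = {
    pair: sum(1 for t in transactions if set(pair) <= t)
    for pair in combinations(l1_items, 2)
}

for pair, count in c2.items():
    print("{" + ", ".join(pair) + "} | " + str(count) + " |")
```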
Now compare the support count of each itemset in C2 with the minimum support count (2) and remove the itemsets whose support count is less than the minimum support count. The remaining itemsets are the frequent itemsets of length 2, called L2.
L2:
Itemset | Sup_count |
{milk, bread} | 4 |
{milk, eggs} | 3 |
{bread, eggs} | 3 |
Step 3 :- K=3
To generate the candidate set of length 3 (C3), join L2 with L2. The condition for joining Lk-1 with Lk-1 is that the two itemsets have (K-2) elements in common (here 3-2 = 1, so the first element must match).
{milk, bread, eggs} (milk matches between the first two itemsets of L2, {milk, bread} and {milk, eggs})
Then check the Apriori property for the itemsets produced by the join: every subset of a frequent itemset must itself be frequent. If a candidate has a subset that is not frequent, remove it.
In {milk, bread, eggs}, every subset {milk, bread}, {milk, eggs}, {bread, eggs} is frequent.
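The join and prune steps for K=3 can be sketched as follows, assuming each itemset in L2 is stored as a tuple in a fixed item order (milk, bread, eggs) so that "first K-2 elements match" becomes a prefix comparison. The names `l2`, `joined` and `c3` are illustrative.

```python
from itertools import combinations

# L2 from the table above, each itemset kept in a fixed item order.
l2 = [("milk", "bread"), ("milk", "eggs"), ("bread", "eggs")]
k = 3

# Join step: merge two (K-1)-itemsets whose first K-2 elements agree.
joined = [
    a + (b[-1],)
    for a, b in combinations(l2, 2)
    if a[:k - 2] == b[:k - 2]
]

# Prune step (Apriori property): keep a candidate only if every
# (K-1)-item subset is itself in L2.
frequent = set(l2)
c3 = [
    cand for cand in joined
    if all(sub in frequent for sub in combinations(cand, k - 1))
]

print(c3)  # [('milk', 'bread', 'eggs')]
```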
Then find the support count of each itemset by scanning the dataset.
C3 :
Itemset | Sup_count |
{milk, bread, eggs} | 3 |
Compare the support count of each itemset in C3 with the minimum support count and remove the itemsets whose support count is less than the minimum support count. The remaining itemsets are the frequent itemsets of length 3, called L3.
C3 has only one itemset, and its support count (3) is at least the minimum support count, so
L3 :
Itemset | Sup_count |
{milk, bread, eggs} | 3 |
The algorithm stops here: L3 contains only one itemset, so no candidate of length 4 (C4) can be generated and there are no further frequent itemsets.
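The whole procedure can be cross-checked with a compact Python sketch. This is one possible implementation, not a reference one: the function name `apriori` is hypothetical, and for brevity the join takes the union of any two (K-1)-itemsets that yields a K-itemset instead of the prefix-based join described in the steps. After the Apriori pruning check it yields the same candidates on this dataset.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return [L1, L2, ...], each a dict mapping frozenset -> support count."""
    transactions = [frozenset(t) for t in transactions]

    # C1: count every single item, then prune to get L1.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}

    levels = []
    k = 2
    while current:
        levels.append(current)
        # Join + prune: a union of two (k-1)-itemsets is a candidate only if
        # it has k items and all of its (k-1)-item subsets are frequent.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Scan the dataset to count each candidate, then prune by support.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        k += 1
    return levels

data = [
    {"milk", "bread", "eggs"}, {"milk", "juice"}, {"juice", "butter"},
    {"milk", "bread", "eggs"}, {"coffee", "eggs"}, {"coffee"},
    {"coffee", "juice"}, {"milk", "bread", "cookies", "eggs"},
    {"cookies", "butter"}, {"milk", "bread"},
]

levels = apriori(data, min_support=2)
for k, level in enumerate(levels, start=1):
    print(f"L{k}:", {tuple(sorted(s)): c for s, c in level.items()})
```

On this dataset the loop produces three levels, matching L1, L2 and L3 above, and stops when no length-4 candidate can be formed.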