Question 3 (10 marks) A large supermarket tracks sales data by stock-keeping unit (SKU): each item, such as “pasta”, “rice”, “egg”, or “soup”, is identified by a numerical SKU. The supermarket has a database of transactions where each transaction is a set of SKUs that were bought together. The database of transactions consists of the following itemsets:
Transaction ID | Itemsets |
T1 | egg, soup |
T2 | rice, egg, soup |
T3 | rice, soup |
T4 | pasta, rice, egg, soup |
T5 | rice, egg |
T6 | pasta, rice, soup |
T7 | pasta, rice |
Solve the above problem using the Apriori algorithm (association rule mining) with a minimum support of 3, and write out every step.
Given minimum support is 3
Solving the question using Apriori Algorithm:
Initially, valid itemset = {“pasta”, “rice”, “egg”, “soup”}
Step-1:
Itemsets of size 1 and their support counts (the number of transactions containing the item) are as follows:
Item | Count |
pasta | 3 |
rice | 6 |
egg | 4 |
soup | 5 |
Since every item has a count greater than or equal to the minimum support of 3 (pasta meets it exactly), there is no need to delete any item from the itemset.
Valid itemset = {“pasta”, “rice”, “egg”, “soup”}
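The Step-1 counts can be checked with a short script. This is only a minimal sketch: the names `transactions`, `MIN_SUPPORT`, `item_counts` and `frequent_items` are illustrative choices, and the data is copied from the transaction table in the question.

```python
from collections import Counter

# Transactions T1..T7, copied from the question.
transactions = [
    {"egg", "soup"},
    {"rice", "egg", "soup"},
    {"rice", "soup"},
    {"pasta", "rice", "egg", "soup"},
    {"rice", "egg"},
    {"pasta", "rice", "soup"},
    {"pasta", "rice"},
]
MIN_SUPPORT = 3  # absolute count, as given in the question

# Count in how many transactions each single item appears.
item_counts = Counter(item for t in transactions for item in t)

# Keep the items whose count is at least the minimum support.
frequent_items = {item for item, n in item_counts.items() if n >= MIN_SUPPORT}

print(dict(item_counts))
print(sorted(frequent_items))
```

Each transaction is a set, so an item is counted at most once per transaction, which matches the definition of support count used in the table.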
Step-2:
Candidate itemsets of size 2, formed from the valid items, and their support counts are as follows:
Item | Count |
pasta,rice | 3 |
pasta,egg | 1 |
pasta,soup | 2 |
rice,egg | 3 |
rice,soup | 4 |
egg,soup | 3 |
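Since "pasta,egg" (count 1) and "pasta,soup" (count 2) are below the minimum support, delete them from the candidates.
So the frequent itemsets of size 2 = {{pasta,rice},{rice,egg},{rice,soup},{egg,soup}}. Every item still appears in at least one of them, so the valid items remain {“pasta”, “rice”, “egg”, “soup”}.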
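The pair counts in Step-2 can be reproduced the same way. The sketch below assumes the four items that survived Step-1; the helper `support` and the other names are hypothetical and only serve this illustration.

```python
from itertools import combinations

# Transactions T1..T7, copied from the question.
transactions = [
    {"egg", "soup"},
    {"rice", "egg", "soup"},
    {"rice", "soup"},
    {"pasta", "rice", "egg", "soup"},
    {"rice", "egg"},
    {"pasta", "rice", "soup"},
    {"pasta", "rice"},
]
MIN_SUPPORT = 3
items = ["pasta", "rice", "egg", "soup"]  # frequent items from Step-1


def support(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


# Every 2-item combination of the frequent items is a candidate.
pair_counts = {pair: support(pair) for pair in combinations(items, 2)}

# Keep only the pairs that meet the minimum support.
frequent_pairs = [pair for pair, n in pair_counts.items() if n >= MIN_SUPPORT]

for pair, n in pair_counts.items():
    print(",".join(pair), n)
```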
Step-3:
Candidate itemsets of size 3 and their support counts are as follows:
Item | Count |
pasta,rice,egg | 1 |
pasta,rice,soup | 2 |
pasta,egg,soup | 1 |
rice,egg,soup | 2 |
Since every size-3 candidate has a count below the minimum support, there is no frequent itemset of size 3 and the itemset becomes empty.
Valid itemset = {}
Since the itemset is empty, we cannot proceed further.
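Strictly speaking, Apriori only needs to count a size-3 candidate when every one of its size-2 subsets was frequent in Step-2. On this data that pruning rule leaves a single candidate, {rice, egg, soup}; the other three contain "pasta,egg" or "pasta,soup". The following is a minimal sketch of that pruning step, with illustrative variable names that are not part of the original answer.

```python
from itertools import combinations

# Transactions T1..T7, copied from the question.
transactions = [
    {"egg", "soup"},
    {"rice", "egg", "soup"},
    {"rice", "soup"},
    {"pasta", "rice", "egg", "soup"},
    {"rice", "egg"},
    {"pasta", "rice", "soup"},
    {"pasta", "rice"},
]
MIN_SUPPORT = 3

# Frequent 2-itemsets found in Step-2.
frequent_pairs = {
    frozenset(p)
    for p in [("pasta", "rice"), ("rice", "egg"), ("rice", "soup"), ("egg", "soup")]
}
items = sorted(set().union(*frequent_pairs))

# A 3-item candidate survives only if all of its 2-item subsets are frequent.
candidates = [
    frozenset(c)
    for c in combinations(items, 3)
    if all(frozenset(s) in frequent_pairs for s in combinations(c, 2))
]

# Count the surviving candidates against the transactions.
counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
frequent_triples = [c for c, n in counts.items() if n >= MIN_SUPPORT]

print([sorted(c) for c in candidates])
print(frequent_triples)
```

The result is the same as counting all four triples by hand: no size-3 itemset reaches a count of 3.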
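Now group all valid itemsets from Step-2 to the final step.
Valid frequent patterns of size 2 or more are as follows: {{pasta,rice},{rice,egg},{rice,soup},{egg,soup}}. The single items {pasta}, {rice}, {egg} and {soup} from Step-1 are also frequent.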
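Putting the steps together, the whole level-by-level search can be sketched as one function. This is an illustrative implementation under simple assumptions (absolute support counts, transactions held in memory as sets); the function name `apriori` and its parameters are chosen here and do not come from any library. It returns every frequent itemset, including the single items from Step-1.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset with count >= min_support."""

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: every single item is a candidate.
    singles = {frozenset([i]) for i in set().union(*transactions)}
    current = {s: count(s) for s in singles}
    current = {s: n for s, n in current.items() if n >= min_support}

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join: merge frequent (k-1)-itemsets that differ in one item.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        counted = {c: count(c) for c in candidates}
        current = {c: n for c, n in counted.items() if n >= min_support}
    return frequent


# Transactions T1..T7, copied from the question.
transactions = [
    {"egg", "soup"},
    {"rice", "egg", "soup"},
    {"rice", "soup"},
    {"pasta", "rice", "egg", "soup"},
    {"rice", "egg"},
    {"pasta", "rice", "soup"},
    {"pasta", "rice"},
]

result = apriori(transactions, min_support=3)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

The loop stops as soon as a level produces no frequent itemset, which on this data happens at size 3.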
Mention in the comments if any mistakes or errors are found. Thank you.