The document discusses implementing the Apriori algorithm for association rule mining using the Weka data mining tool. It describes Apriori as a classical bottom-up algorithm for mining frequent itemsets and relevant association rules from transactional databases. It also outlines how to create a sample dataset in Excel, convert it to ARFF format, load it into Weka, apply the Apriori algorithm to generate association rules, and interpret the results.
2. A-PRIORI ALGORITHM
• A classical algorithm.
• Used for mining frequent item sets and relevant association rules.
• Uses a "bottom-up" approach: frequent itemsets are extended one item at a time.
• Designed to operate on databases containing a large number of transactions.
• Produces association rules.
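The bottom-up idea can be sketched in a few lines of Python. This is only an illustrative sketch, not WEKA's implementation: the function name `apriori_frequent_itemsets`, the `min_support` threshold, and the sample transactions are all assumptions made for the example.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    total = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / total

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([item]) for item in items}
    current = {c for c in current if support(c) >= min_support}

    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical market-basket data, invented for illustration.
transactions = [
    {"onion", "potato", "burger"},
    {"onion", "potato", "burger"},
    {"onion", "potato"},
    {"potato", "milk"},
    {"burger", "milk"},
]
frequent = apriori_frequent_itemsets(transactions, min_support=0.4)
for itemset, value in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), value)
```

Each pass only builds candidates from itemsets that were already frequent at the previous size, which is what keeps the search from enumerating every possible combination.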
Implementing A-priori algorithm using weka 12/10/2018 2
3. ATTRIBUTE TYPES IN A-PRIORI
To run the a-priori algorithm, every attribute must be of one of these types –
Nominal
Binary
Unary
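In an ARFF file, a nominal attribute is declared by listing its allowed values in braces. The sketch below writes such a file from rows of categorical values. The helper name `to_arff`, the relation name, and the sample rows are hypothetical; values containing spaces or commas would need quoting, which this sketch does not handle.

```python
def to_arff(relation, attribute_names, rows):
    """Render rows of categorical values as ARFF text with nominal attributes."""
    lines = ["@relation " + relation, ""]
    for index, name in enumerate(attribute_names):
        # A nominal attribute lists every value it may take inside braces.
        values = sorted({row[index] for row in rows})
        lines.append("@attribute " + name + " {" + ",".join(values) + "}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in rows]
    return "\n".join(lines)


# Hypothetical rows, invented for illustration.
rows = [
    ("aged", "willBuy"),
    ("young", "wontBuy"),
    ("aged", "willBuy"),
]
arff_text = to_arff("purchases", ["age", "purchase"], rows)
print(arff_text)
```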
4. ASSOCIATION RULE
• A prominent and well-explored method for determining relations among variables in large databases.
• Helps to uncover relationships between seemingly unrelated data in a relational database.
• It has two parts –
Antecedent (if)
Consequent (then)
• Example –
Consider the association rule
{Onion, Potato} => {Burger}
which means that customers who buy onion and potato are also likely to buy a burger.
• Rules are created by analyzing data for frequent if/then patterns and using the support and confidence criteria to
identify the most important relationships.
5. SUPPORT
• The support of an itemset X, supp(X), is the proportion of transactions in the database in which the itemset
X appears. It signifies the popularity of an itemset.
• $\mathrm{supp}(X) = \dfrac{\text{Number of transactions in which } X \text{ appears}}{\text{Total number of transactions}}$
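As a small worked example, support can be computed directly from the definition. The function name `support` and the sample transactions are assumptions made for illustration only.

```python
def support(itemset, transactions):
    """Proportion of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


# Hypothetical market-basket data, invented for illustration.
transactions = [
    {"onion", "potato", "burger"},
    {"onion", "potato", "burger"},
    {"onion", "potato"},
    {"potato", "milk"},
    {"burger", "milk"},
]

# {onion, potato} appears in 3 of the 5 transactions.
supp_onion_potato = support({"onion", "potato"}, transactions)
print(supp_onion_potato)
```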
6. CONFIDENCE
• Signifies the likelihood of item Y being purchased when item X is purchased.
• $\mathrm{conf}(X \rightarrow Y) = \dfrac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}$
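The confidence formula can be checked on a toy example. The function names and the sample transactions below are hypothetical and serve only to illustrate the arithmetic.

```python
def support(itemset, transactions):
    """Proportion of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """conf(X -> Y) = supp(X union Y) / supp(X)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Hypothetical market-basket data, invented for illustration.
transactions = [
    {"onion", "potato", "burger"},
    {"onion", "potato", "burger"},
    {"onion", "potato"},
    {"potato", "milk"},
    {"burger", "milk"},
]

# {onion, potato} appears in 3 transactions; 2 of them also contain burger.
conf_value = confidence({"onion", "potato"}, {"burger"}, transactions)
print(round(conf_value, 3))
```

Here two of the three transactions containing onion and potato also contain a burger, so the confidence of the rule is 2/3.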
7. AVAILABLE TOOLS
Popular tools used for data mining include –
• Weka
• Keel
In this presentation, we will use the WEKA data mining tool.
8. WHAT IS WEKA
• Waikato Environment for Knowledge Analysis (Weka).
• A collection of machine learning algorithms for data mining tasks.
• Contains tools for –
data pre-processing
classification
regression
clustering
association rules
visualization
• Open-source software released under the GNU General Public License.
9. DATASET IN WEKA
• A dataset can be –
created
downloaded
• For this presentation, we have created our own dataset using Microsoft Excel.
10. CREATING DATASET IN MICROSOFT EXCEL
14. LOADING (.ARFF) FILE IN WEKA
15. APPLYING ASSOCIATION RULE
In WEKA, a-priori is the default association rule algorithm.
Before running the a-priori algorithm, we checked that all attributes are nominal, binary, or unary.
19. INTERPRETATION OF RULES
Let us interpret our first rule –
age = aged 5 ==> purchase = willBuy 5
The first 5 is the number of instances where age = aged; the second 5 is the number of those instances where purchase = willBuy also holds. In other words, every one of the 5 aged customers will buy.
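The two counts in a rule of this form can be reproduced by simple counting. The sketch below uses a hypothetical dataset (the actual rows from the presentation are not shown here), and the helper name `rule_counts` is an assumption for the example.

```python
def rule_counts(rows, antecedent, consequent):
    """Return (instances matching the antecedent, instances matching both sides)."""
    matches_antecedent = [
        row for row in rows
        if all(row[name] == value for name, value in antecedent.items())
    ]
    matches_both = [
        row for row in matches_antecedent
        if all(row[name] == value for name, value in consequent.items())
    ]
    return len(matches_antecedent), len(matches_both)


# Hypothetical dataset, invented so that the counts match the rule above.
rows = [{"age": "aged", "purchase": "willBuy"} for _ in range(5)]
rows += [
    {"age": "young", "purchase": "willBuy"},
    {"age": "young", "purchase": "wontBuy"},
    {"age": "young", "purchase": "wontBuy"},
]

antecedent_count, both_count = rule_counts(
    rows, {"age": "aged"}, {"purchase": "willBuy"}
)
rule_confidence = both_count / antecedent_count
print(antecedent_count, both_count, rule_confidence)
```

When the two counts are equal, as in this rule, the confidence is 1: the consequent holds in every instance where the antecedent holds.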