1
Association Rules (market basket analysis): Retail shops are often interested in associations between the different items that people buy. Someone who buys bread is quite likely also to buy milk. A person who bought the book Database System Concepts is quite likely also to buy the book Operating System Concepts. Association information can be used in several ways. E.g. when a customer buys a particular book, an online shop may suggest associated books. Association rules: bread $\Rightarrow$ milk; DB-Concepts, OS-Concepts $\Rightarrow$ Networks. Left-hand side: antecedent; right-hand side: consequent. An association rule must have an associated population; the population consists of a set of instances. E.g. each transaction (sale) at a shop is an instance, and the set of all transactions is the population.
2
Association Rule Definitions: Set of items: $I = \{I_1, I_2, \ldots, I_m\}$. Transactions: $D = \{t_1, t_2, \ldots, t_n\}$, $t_j \subseteq I$. Itemset: $\{I_{i1}, I_{i2}, \ldots, I_{ik}\} \subseteq I$. Support of an itemset: percentage of transactions that contain that itemset. Large (frequent) itemset: itemset whose number of occurrences is above a threshold.
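The support definition above can be sketched in a few lines of Python. The function names `support` and `is_large`, the `baskets` data, and the choice of `>=` for the threshold comparison are assumptions made for illustration; they are not taken from the slides.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)  # subset test
    return hits / len(transactions)


def is_large(itemset: Iterable[str], transactions: List[FrozenSet[str]],
             min_support: float) -> bool:
    """An itemset is large (frequent) if its support reaches the threshold.

    Assumption: the threshold is inclusive (>=).
    """
    return support(itemset, transactions) >= min_support


# Hypothetical baskets, invented for this sketch.
baskets = [
    frozenset({"Bread", "Milk"}),
    frozenset({"Bread", "Butter"}),
    frozenset({"Milk"}),
    frozenset({"Bread", "Milk", "Butter"}),
]

print(support({"Bread", "Milk"}, baskets))  # 0.5 (2 of 4 baskets)
```

Representing each transaction as a `frozenset` keeps the containment check a single subset comparison.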
3
Association Rules Example I = { Beer, Bread, Jelly, Milk, PeanutButter}
4
Association Rule Definitions: Association Rule (AR): implication $X \Rightarrow Y$ where $X, Y \subseteq I$ and $X \cap Y = \emptyset$ (the null set). Support of AR ($s$) $X \Rightarrow Y$: percentage of transactions that contain $X \cup Y$. Confidence of AR $X \Rightarrow Y$: ratio of the number of transactions that contain $X \cup Y$ to the number that contain $X$.
5
Association Rules Ex (cont’d)
6
Support of Bread $\Rightarrow$ PeanutButter: of the 5 transactions, 3 involve both Bread and PeanutButter, so support = 3/5 = 60%. Confidence: of the 4 transactions that involve Bread, 3 also involve PeanutButter, so confidence = 3/4 = 75%.
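The arithmetic on this slide can be reproduced with a short Python sketch. The transaction table itself did not survive in this transcript, so the five `transactions` below are hypothetical, chosen only to be consistent with the stated counts (4 transactions with Bread, 3 with both Bread and PeanutButter). The helper names are likewise illustrative.

```python
# Hypothetical transactions consistent with the counts quoted on the slide.
transactions = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]


def count(itemset, transactions):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


def rule_support(x, y, transactions):
    """Support of X => Y: share of transactions containing X union Y."""
    return count(set(x) | set(y), transactions) / len(transactions)


def rule_confidence(x, y, transactions):
    """Confidence of X => Y: count(X union Y) / count(X)."""
    x_count = count(x, transactions)
    if x_count == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return count(set(x) | set(y), transactions) / x_count


print(rule_support({"Bread"}, {"PeanutButter"}, transactions))     # 0.6
print(rule_confidence({"Bread"}, {"PeanutButter"}, transactions))  # 0.75
```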
7
Association Rule Problem: Given a set of items $I = \{I_1, I_2, \ldots, I_m\}$ and a database of transactions $D = \{t_1, t_2, \ldots, t_n\}$ where $t_i = \{I_{i1}, I_{i2}, \ldots, I_{ik}\}$ and $I_{ij} \in I$, the Association Rule Problem is to identify all association rules $X \Rightarrow Y$ with a minimum support and confidence (supplied by the user). NOTE: The support of $X \Rightarrow Y$ is the same as the support of $X \cup Y$.
8
Association Rule Algorithm (Basic Idea): Find large itemsets. Generate rules from the large (frequent) itemsets. This is the simple naïve algorithm; better algorithms exist.
9
Association Rule Algorithm: We are generally only interested in association rules with reasonably high support (e.g. support of 2% or greater). Naïve algorithm: 1. Consider all possible sets of relevant items. 2. For each set, find its support (i.e. count how many transactions purchase all items in the set). Large itemsets: sets with sufficiently high support. Use large itemsets to generate association rules. From itemset $A$ generate the rule $A - \{b\} \Rightarrow b$ for each $b \in A$. Support of rule = support($A$). Confidence of rule = support($A$) / support($A - \{b\}$).
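Steps 1 and 2 of the naïve algorithm can be sketched as a brute-force enumeration. The function name `naive_large_itemsets`, the inclusive threshold, and the sample `transactions` are assumptions made for illustration.

```python
from itertools import combinations


def naive_large_itemsets(transactions, min_support):
    """Enumerate every non-empty itemset and keep those with enough support.

    Returns a dict mapping each large itemset (frozenset) to its support.
    """
    items = sorted(set().union(*transactions))
    n = len(transactions)
    large = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            c = frozenset(candidate)
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:  # assumption: threshold is inclusive
                large[c] = s
    return large


# Hypothetical transactions, invented for this sketch.
transactions = [
    {"Bread", "Jelly", "PeanutButter"},
    {"Bread", "PeanutButter"},
    {"Bread", "Milk", "PeanutButter"},
    {"Beer", "Bread"},
    {"Beer", "Milk"},
]

large = naive_large_itemsets(transactions, min_support=0.4)
for itemset, s in large.items():
    print(sorted(itemset), s)
```

With m distinct items the loops visit 2^m - 1 candidate sets, each requiring a pass over the transactions, which is why this approach is only workable for very small item sets.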
10
From itemset $A$ generate the rule $A - \{b\} \Rightarrow b$ for each $b \in A$. Support of rule = support($A$). Confidence of rule = support($A$) / support($A - \{b\}$). Let's say itemset $A$ = {Bread, Butter, Milk}. Then $A - \{b\} \Rightarrow b$ for each $b \in A$ includes 3 possibilities: {Bread, Butter} $\Rightarrow$ Milk; {Bread, Milk} $\Rightarrow$ Butter; {Butter, Milk} $\Rightarrow$ Bread.
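The rule-generation step for the {Bread, Butter, Milk} example can be sketched as follows. The function name `rules_from_itemset` and the five `transactions` are hypothetical; only the itemset and the support and confidence formulas come from the slide.

```python
def count(itemset, transactions):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def rules_from_itemset(a, transactions):
    """For each b in A, build the rule A - {b} => b.

    Returns tuples (antecedent, consequent, support, confidence).
    Itemsets with fewer than two items yield no rules here, since the
    antecedent would be empty.
    """
    a = frozenset(a)
    n = len(transactions)
    rules = []
    if len(a) < 2:
        return rules
    count_a = count(a, transactions)
    for b in sorted(a):
        antecedent = a - {b}
        count_ante = count(antecedent, transactions)
        if count_ante == 0:
            continue  # confidence undefined when the antecedent never occurs
        rules.append((antecedent, b, count_a / n, count_a / count_ante))
    return rules


# Hypothetical transactions, invented for this sketch.
transactions = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Milk"},
    {"Milk"},
]

rules = rules_from_itemset({"Bread", "Butter", "Milk"}, transactions)
for antecedent, consequent, s, c in rules:
    print(sorted(antecedent), "=>", consequent, "support", s, "confidence", c)
```

All three rules share the same support, because each is built from the same itemset; only the confidence differs.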
11
Apriori Large Itemset Property: Any subset of a large itemset is large. Contrapositive: If an itemset is not large, none of its supersets are large.
12
Large Itemset Property
13
If {B} is not frequent, then none of the supersets of {B} can be frequent. If {ACD} is frequent, then all of its two-item subsets ({AC}, {AD}, {CD}) must be frequent, and so must all of its one-item subsets ({A}, {C}, {D}).
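One way the large itemset property is commonly exploited is a level-wise search that prunes any candidate with a subset that is not large. The sketch below is an illustration under that assumption, not the algorithm as given in these slides; the function name `apriori` and the `transactions` data are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise search for large itemsets with subset-based pruning."""
    n = len(transactions)

    def sup(c):
        return sum(1 for t in transactions if c <= t) / n

    items = set().union(*transactions)
    # Level 1: large single items.
    current = {frozenset([i]) for i in items if sup(frozenset([i])) >= min_support}
    large = set(current)
    k = 1
    while current:
        # Join: combine large k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k + 1}
        # Prune: every k-item subset of a candidate must itself be large.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k))
        }
        # Count support only for the candidates that survived pruning.
        current = {c for c in candidates if sup(c) >= min_support}
        large |= current
        k += 1
    return large


# Hypothetical transactions over items A, B, C, D.
transactions = [
    {"A", "C", "D"},
    {"A", "C", "D"},
    {"A", "C"},
    {"B"},
    {"A", "D"},
]

result = apriori(transactions, min_support=0.4)
for itemset in sorted(result, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset))
```

In this data {B} falls below the threshold at the first level, so no superset of {B} is ever generated or counted.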
14
My Personal View of Association Rules: A vastly overstudied problem, of dubious utility.
15
Student Presentations: Starting next week, students will be giving presentations. The presentation can be on the student project or on a paper chosen by the student (subject to my approval). The presentation should last 8 to 15 minutes. You need to tell me in advance how long the talk will be. You must email me the slides by midnight before the talk. There will be a signup sheet (topic and date) on my door tomorrow.
16
Tips for Giving a Good Talk. Winter 2003. Dr Eamonn Keogh, Computer Science & Engineering Department, University of California - Riverside, Riverside, CA 92521, eamonn@cs.ucr.edu. Modified from the notes of Edward R. Tufte, Craig S. Kaplan, Eamonn Keogh and others.
17
Outline: Advice on giving talks; General advice; Organization; Making clear overheads; Avoiding common pitfalls; Conclusion.
18
General Advice I: Show up early. You may have a chance to head off some technical or ergonomic problem. Have a backup plan. If your lecture is based on a PowerPoint presentation, have overhead backups of each page. Check out the room ahead of time. Before your talk, check out the room, and make sure it has everything you need.
19
General Advice II: Never apologize. Most people wouldn't have noticed the issues for which you're apologizing, and it just sounds lame. Invest in a laser pointer. They are inexpensive and extremely useful. Rehearse timing. Poor timing is the most common sin!
20
Overheads I: Use large fonts. Use the biggest fonts realistically possible. Small fonts are hard to read. Use highly contrasting colors. Avoid busy backgrounds. Too much in the background makes the text hard to read.
21
Overheads II: Avoid using red text. Red text is often hard to read. AVOID ALL CAPS! All caps look like you're shouting. Include a good combination of words, pictures, and graphics. A variety keeps the presentation interesting.
22
Overheads III: Be terse. Instead of "The sales forecasts show an increase on the horizon", write "Sales are up". Use bullets or numbered items appropriately. Bullets for unordered items, e.g. Goals: Ease of use; Reusability; Reliability. Numbers for ordered steps, e.g. Outline of our method: 1. Design 2. Implementation 3. Testing.
23
Overheads IV: Begin with an introduction slide (who you are, why you are giving a talk, the title of the talk). Next, give an outline ("roadmap"). For a short talk, you might want to combine this with the introduction slide. State your point (one simple slide). Demonstrate your point (a few slides). Review your point (one simple slide).
24
Overheads V: End with a slide that reviews the entire talk, e.g.: We introduced the TSP problem. We explained why it is an important problem. We explained why it is a hard problem. We introduced a new heuristic to solve TSP. We empirically demonstrated the utility of our approach. End "cleanly"; don't fade away.
25
Overheads VI: Avoid using "standard" clipart, backgrounds, etc. I have seen this at least 20 times in conference presentations.
26
Overheads VII: Be careful with acronyms… [Slide example of unexplained labels: C_max, C_min, Range_i, Diameter_i, R_1, D_1, R_2, D_2, "Neighboring Unlabeled Token: sskh f dhfa"]
27
Annoying Personal Habits I (this means you): Playing with jewelry; licking and/or biting your lips; constantly adjusting your glasses; popping the top of a pen; playing with facial hair (men); playing with or twirling your hair (women).
28
Annoying Personal Habits II (this means you): Jingling change in your pocket; leaning against anything for support; fillers such as "ah", "um", and "and" ("basically" and "essentially" seem to be the current favorites); starting every sentence with the same word; sticky floor syndrome; avoiding eye contact; lack of enthusiasm.
29
Conclusion: We have motivated the need for a high-quality talk. We have seen various tips on creating high-quality overheads. We have seen various hints on avoiding common pitfalls.
30
Questions? Dr Eamonn Keogh, Computer Science & Engineering Department, University of California - Riverside, Riverside, CA 92521, eamonn@cs.ucr.edu