COMP5331 Knowledge Discovery in Databases: Overview
Prepared by Raymond Wong
Presented by Raymond Wong


Course Details
Reference books/materials:
- Papers
- Data Mining: Concepts and Techniques. Jiawei Han and Micheline Kamber. Morgan Kaufmann Publishers (3rd edition).
- Introduction to Data Mining. Pang-Ning Tan, Michael Steinbach, Vipin Kumar. Boston: Pearson Addison Wesley (2006).

Area
DB or AI
This course can count towards ONLY one of these areas and cannot be double-counted towards the required credits.

Course Details
Grading Scheme:
- Assignment: 30%
- Project: 30%
- Final Exam: 40%

Assignment
If a student correctly answers a selected question in class, I will give him/her one coupon for each correct answer.
A coupon can be used to waive one question in an assignment, which means that s/he gets full marks for that question without answering it.

Assignment Guideline
- For each assignment, each student can waive at most one question.
- S/he can waive any question s/he wants and obtain full marks for it (whether or not s/he answers it).
- S/he may still answer the waived question. We will mark it, but it will receive full marks regardless.
- When submitting the assignment, please staple the coupon to the submitted assignment and write on the coupon the number of the question to be waived.

Project
Each project is completed by a group.
The number of students in a group depends on the class size.
The duration of each presentation depends on the class size. It will be announced soon.

Project
Project Type (one of the following):
- Survey: your group only needs to read about 2~5 papers.
- Implementation-oriented Project: your group only needs to read about 1~2 papers.
- Research-oriented Project: you can read some papers and conduct research.

Project
Project Type (one of the following), with deliverables and full score:
- Survey: 1. Proposal 2. Presentation 3. Final report. Full Score = 80%
- Implementation-oriented Project: 1. Proposal 2. Presentation 3. Final report 4. Coding. Full Score = 90%
- Research-oriented Project: 1. Proposal 2. Presentation 3. Final report (containing your proposed methodology) 4. Coding (if any). Full Score = 100%

Project
Project Topic:
- Some pre-selected topics/papers
- Your own choice
For fairness, please do not choose a topic that is closely related to your own research.

Exam
You are allowed to bring a calculator with you.
Please remember to prepare a calculator for the exam.

Major Topics
1. Association
2. Clustering
3. Classification
4. Data Warehouse
5. Data Mining over Data Streams
6. Web Databases
7. Multi-criteria Decision Making

Association

Customer   Apple   Orange   Milk
Raymond    Apple   Orange
Ada                Orange   Milk
Grace      Apple   Orange
…          …       …        …

Items/Itemsets     Frequency
Apple              2
Orange             3
Milk               1
{Apple, Orange}    2
{Orange, Milk}     1

We are interested in the items/itemsets with frequency >= 2:
- Apple: Frequent Pattern (or Frequent Item)
- Orange: Frequent Pattern (or Frequent Item)
- {Apple, Orange}: Frequent Pattern (or Frequent Itemset)

Association
Using the same transactions and frequencies as above, we are interested in the items/itemsets with frequency >= 2.

Association Rule:
1. Apple → Orange (customers who buy apple will probably buy orange): 2/2 = 100%
2. Orange → Apple (customers who buy orange will probably buy apple): 2/3 (about 66.7%)

Problem: to find all frequent patterns and association rules
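
The counting on this slide can be sketched in a few lines of Python. This is a brute-force illustration only, not an algorithm the slides prescribe. The function names are hypothetical, and the rule percentage is assumed to be the frequency of both sides together divided by the frequency of the left-hand side.

```python
from collections import Counter
from itertools import combinations

# Transactions from the slide's example table (the "…" rows are omitted).
transactions = {
    "Raymond": {"Apple", "Orange"},
    "Ada": {"Orange", "Milk"},
    "Grace": {"Apple", "Orange"},
}


def count_itemsets(transactions, max_size=2):
    """Count how many customers bought each itemset of up to max_size items."""
    counts = Counter()
    for items in transactions.values():
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(items), size):
                counts[frozenset(itemset)] += 1
    return counts


def frequent_itemsets(counts, min_frequency):
    """Keep only the itemsets whose frequency reaches the threshold."""
    return {s: c for s, c in counts.items() if c >= min_frequency}


def rule_percentage(counts, lhs, rhs):
    """Fraction of customers buying lhs who also buy rhs (assumed definition)."""
    both = frozenset(lhs) | frozenset(rhs)
    return counts[both] / counts[frozenset(lhs)]


counts = count_itemsets(transactions)
frequent = frequent_itemsets(counts, min_frequency=2)

for itemset, frequency in sorted(frequent.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), frequency)

print(rule_percentage(counts, {"Apple"}, {"Orange"}))  # Apple -> Orange
print(rule_percentage(counts, {"Orange"}, {"Apple"}))  # Orange -> Apple
```

Enumerating every itemset like this is only practical for tiny examples such as the one on the slide.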

Clustering

           Computer   History
Raymond
Louis      90         45
Wyman      20         95
…          …          …

(Figure: plot of the students with axes Computer and History.)
Cluster 1 (e.g. High Score in Computer and Low Score in History)
Cluster 2 (e.g. High Score in History and Low Score in Computer)

Problem: to find all clusters
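
Here is a minimal sketch of one way to find two such clusters. It uses k-means, which the slide does not name and which is assumed here purely for illustration. The scores for Louis and Wyman come from the slide's table. The other four students and the choice of starting centers are hypothetical.

```python
import math

# (Computer, History) scores. Louis and Wyman are from the slide;
# the remaining students are hypothetical, added so each cluster has members.
scores = {
    "Louis": (90, 45),
    "Wyman": (20, 95),
    "student_c": (85, 30),
    "student_d": (95, 40),
    "student_e": (25, 90),
    "student_f": (15, 85),
}


def kmeans(points, centers, iterations=10):
    """Plain k-means: assign each point to its nearest center, then re-average."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        centers = [
            tuple(sum(v) / len(c) for v in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centers


points = list(scores.values())
# Hypothetical starting centers: the two students given on the slide.
clusters, centers = kmeans(points, centers=[scores["Louis"], scores["Wyman"]])

print(clusters[0])  # high Computer, low History
print(clusters[1])  # high History, low Computer
```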

Classification

Decision tree:
root
  child=yes: 100% Yes, 0% No
  child=no
    Income=high: 100% Yes, 0% No
    Income=low: 0% Yes, 100% No

Suppose there is a person.

Race    Income   Child   Insurance
white   high     no      ?
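
The lookup for the new person can be sketched as a walk down the tree. The sketch assumes the leaves attach in the order the slide lists them: child=yes gives 100% Yes, Income=high gives 100% Yes, and Income=low gives 100% No. The nested-dictionary representation and the function name are hypothetical.

```python
# Decision tree from the slide, under the leaf-order assumption stated above.
tree = {
    "attribute": "child",
    "branches": {
        "yes": {"label": "yes"},
        "no": {
            "attribute": "income",
            "branches": {
                "high": {"label": "yes"},
                "low": {"label": "no"},
            },
        },
    },
}


def classify(node, person):
    """Follow the branch matching the person's attribute value until a leaf."""
    while "label" not in node:
        node = node["branches"][person[node["attribute"]]]
    return node["label"]


# The person from the slide: race is not tested anywhere in this tree.
person = {"race": "white", "income": "high", "child": "no"}
prediction = classify(tree, person)
print(prediction)
```

Under that assumption, the person goes down the child=no branch and then the Income=high branch, so the predicted Insurance value is yes.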

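Data Warehouse
Without a data warehouse: users send a query to the databases and need to wait for a long time (e.g., 1 day to 1 week).
With a data warehouse: the data warehouse holds pre-computed results from the databases, and users send their query to the data warehouse.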

Data Mining over Static Data
Input: Static Data
1. Association
2. Clustering
3. Classification
Output (Data Mining Results)

Data Mining over Data Streams
Input: Unbounded Data (…)
Real-time Processing:
1. Association
2. Clustering
3. Classification
Output (Data Mining Results)

Web Databases
Raymond Wong

How to rank the webpages?

Multi-criteria Decision Making

Hotel   Price   Distance to beach (km)
a       1000    4
b       2400    5
c

Suppose we want to look for a hotel which is close to a beach. We have two attributes. Which hotel should we select?

Suppose we compare hotel a and hotel b.
We know that hotel a is "better" than hotel b because
1. Price of hotel a is smaller
2. Distance of hotel a is smaller

Multi-criteria Decision Making
Using the same hotel table and the same question (which hotel should we select?):

Suppose we compare hotel a and hotel c.
We cannot determine that hotel a is "better" than hotel c (w.r.t. the two attributes).
We cannot determine that hotel c is "better" than hotel a (w.r.t. the two attributes).
This is because
1. Price of hotel a is smaller
2. Distance of hotel c is smaller
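
The two comparisons can be sketched as a "dominance" test. One hotel is treated as better than another when it is no worse in every attribute and strictly smaller in at least one. That is a common formalization assumed here, since the slides only compare the hotels informally. The values for hotels a and b come from the table. The values for hotel c are not recoverable from the transcript, so the ones below are hypothetical and chosen only to match the slide's statements (the price of a is smaller, the distance of c is smaller).

```python
# (price, distance to beach in km); smaller is better for both attributes.
hotels = {
    "a": (1000, 4),
    "b": (2400, 5),
    "c": (3000, 1),  # hypothetical values, not from the slide
}


def dominates(x, y):
    """True if x is no worse than y everywhere and strictly better somewhere."""
    no_worse = all(xi <= yi for xi, yi in zip(x, y))
    strictly_better = any(xi < yi for xi, yi in zip(x, y))
    return no_worse and strictly_better


def non_dominated(hotels):
    """Hotels that no other hotel is 'better' than in the sense above."""
    return sorted(
        name
        for name, attrs in hotels.items()
        if not any(dominates(other, attrs) for other in hotels.values())
    )


print(dominates(hotels["a"], hotels["b"]))  # a vs b
print(dominates(hotels["a"], hotels["c"]))  # a vs c
print(dominates(hotels["c"], hotels["a"]))  # c vs a
print(non_dominated(hotels))
```

With these assumed values, hotel a dominates hotel b, while neither a nor c dominates the other, so both a and c remain as candidates.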