An Evaluation of Commercial Data Mining
Proposed and Presented by Emily Davis
Supervisor: John Ebden

Statement of the Problem: An evaluation of commercial data mining capabilities, taking Oracle9i’s Data Mining Suite as the example.

Background: Data mining is a relatively new offshoot of database technology that has arisen from the ability of computers to:
- store vast quantities of data in data warehouses;
- implement ingenious algorithms for mining data;
- use these algorithms to analyse these vast quantities of data in a reasonable amount of time.

Data mining discovers the patterns in data that represent knowledge. Of interest are which algorithms data mining suites use, how well each category of algorithm performs on data, and what kind of results it produces. Another important issue is the usability of the algorithms.

Random Number Example, taken from ng/mining1.html

Table columns: #, data a, data b, data c (row values not preserved).

Data a and data b are random numbers generated in Excel. Data c = 2*(data a) + 3*(data b).

51st value calculated by Excel:
Value calculated using Knowledge Miner, a Macintosh data mining tool:
and the equation: 1.97*(data a) *(data b)
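The first experiment can be sketched as an ordinary least-squares fit. This is only an illustration, not Knowledge Miner's algorithm, which the slides do not describe. It assumes 50 training rows of uniform random numbers for data a and data b, fits data c = w_a*(data a) + w_b*(data b) with no intercept, and then predicts a withheld 51st row. The function name, seed, and value range are assumptions made for the example.

```python
import random


def fit_two_coefficients(a, b, c):
    """Least-squares fit of c ~ w_a*a + w_b*b (no intercept) via the normal equations."""
    s_aa = sum(x * x for x in a)
    s_bb = sum(y * y for y in b)
    s_ab = sum(x * y for x, y in zip(a, b))
    s_ac = sum(x * z for x, z in zip(a, c))
    s_bc = sum(y * z for y, z in zip(b, c))
    det = s_aa * s_bb - s_ab * s_ab
    if det == 0:
        raise ValueError("data a and data b are collinear; coefficients are not unique")
    # Cramer's rule on the 2x2 normal equations.
    w_a = (s_ac * s_bb - s_ab * s_bc) / det
    w_b = (s_aa * s_bc - s_ab * s_ac) / det
    return w_a, w_b


rng = random.Random(0)  # fixed seed (assumption) so the run is repeatable
data_a = [rng.uniform(0, 100) for _ in range(51)]
data_b = [rng.uniform(0, 100) for _ in range(51)]
data_c = [2 * x + 3 * y for x, y in zip(data_a, data_b)]

# Fit on the first 50 rows, then predict the withheld 51st row.
w_a, w_b = fit_two_coefficients(data_a[:50], data_b[:50], data_c[:50])
predicted_51 = w_a * data_a[50] + w_b * data_b[50]
actual_51 = data_c[50]
print(f"coefficients: {w_a:.4f}, {w_b:.4f}")
print(f"51st value: actual {actual_51:.4f}, predicted {predicted_51:.4f}")
```

Because data c in this sketch is exactly linear in data a and data b, the fit recovers the coefficients 2 and 3 up to floating-point error. The slide reports that Knowledge Miner's equation used 1.97 as the coefficient of data a.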

The experiment was repeated using three columns of random numbers and this equation: Data d = 23*(data a) - 4.5*(data b) + (data a + data c). The last five entries for data d were left out of the column so that the tool had to predict them.

These were generated by Excel:
These are what Knowledge Miner predicted:
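The second experiment can be sketched in the same spirit. The formula simplifies to data d = 24*(data a) - 4.5*(data b) + (data c), so a three-variable least-squares fit on the rows that still have data d should recover those coefficients and fill in the five withheld entries. This is a sketch under stated assumptions, not Knowledge Miner's method: the row count of 50, the seed, the value range, and the helper names are all invented for the example.

```python
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("columns are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_least_squares(columns, target):
    """Fit target ~ sum(w_j * columns[j]) with no intercept, via the normal equations."""
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]
    moments = [sum(p * t for p, t in zip(ci, target)) for ci in columns]
    return solve_linear_system(gram, moments)


rng = random.Random(1)  # fixed seed (assumption) for repeatability
rows = 50               # row count is an assumption; the slides do not state it
data_a = [rng.uniform(0, 100) for _ in range(rows)]
data_b = [rng.uniform(0, 100) for _ in range(rows)]
data_c = [rng.uniform(0, 100) for _ in range(rows)]
data_d = [23 * a - 4.5 * b + (a + c) for a, b, c in zip(data_a, data_b, data_c)]

# Withhold the last five entries of data d, as in the experiment.
train = rows - 5
weights = fit_least_squares(
    [data_a[:train], data_b[:train], data_c[:train]], data_d[:train]
)
predicted = [
    weights[0] * a + weights[1] * b + weights[2] * c
    for a, b, c in zip(data_a[train:], data_b[train:], data_c[train:])
]
for actual, guess in zip(data_d[train:], predicted):
    print(f"actual {actual:10.4f}   predicted {guess:10.4f}")
```

The normal equations are solved by hand here only to keep the sketch free of third-party libraries; a numerical library's least-squares routine would normally be the better choice.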

Plan of Action:
- Literature survey (and other resources)
- Install software for Oracle
- Get to know the Oracle suite
- Evaluate Oracle9i’s Data Mining Suite

Install Software for Oracle:
- Including JDeveloper.
- May be extended to the installation of other commercial data mining suites, e.g.:
  - DB2’s Intelligent Miner
  - Informix’s Data Mine

Investigate Oracle9i’s Data Mining Suite:
- Two major algorithm types: supervised and unsupervised learning.
- A medical example:
  - Supervised learning: researchers input medical profiles into a leukaemia model to predict propensity for the disease.
  - Unsupervised learning: searches for clusters of related information in data sets to reveal insights about diseases and patient populations.
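The difference between the two algorithm types can be illustrated with a small sketch. It does not use Oracle's algorithms: a nearest-centroid classifier stands in for supervised learning and a basic k-means loop stands in for unsupervised learning, both chosen only for brevity. The records are synthetic two-feature points invented for the example, and all function names are hypothetical.

```python
import math


def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def nearest(point, centres):
    """Index of the centre closest to point (Euclidean distance)."""
    return min(range(len(centres)), key=lambda i: math.dist(point, centres[i]))


def train_nearest_centroid(points, labels):
    """Supervised: use the known labels to build one centroid per class."""
    classes = sorted(set(labels))
    centres = [
        centroid([p for p, lab in zip(points, labels) if lab == c]) for c in classes
    ]
    return classes, centres


def k_means(points, k, iterations=20):
    """Unsupervised: group points into k clusters without using any labels."""
    centres = list(points[:k])  # naive initialisation: the first k points
    for _ in range(iterations):
        assignment = [nearest(p, centres) for p in points]
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # keep the old centre if a cluster empties out
                centres[i] = centroid(members)
    return [nearest(p, centres) for p in points]


# Synthetic two-feature records, invented purely for illustration.
points = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.8), (4.9, 5.3)]
labels = ["low", "low", "low", "high", "high", "high"]

# Supervised: the labels are given, and a new record is classified.
classes, centres = train_nearest_centroid(points, labels)
prediction = classes[nearest((4.7, 5.0), centres)]

# Unsupervised: no labels are given, and the grouping is discovered.
clusters = k_means(points, k=2)

print(prediction)
print(clusters)
```

The supervised half needs the `labels` list to train; the unsupervised half never sees it and returns only cluster indices, which have no meaning until someone interprets them.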

Get to know the Oracle DM Suite (a major task):
- Explore JDeveloper, Oracle9i’s Java-based API.
- JDeveloper complies with JDM (Java Data Mining), used by Oracle, Sun, IBM and others.
- Explore DM4J (Data Mining for Java), the new graphical user interface for Oracle DM.

Addressing the Problem: Run the different algorithms available in the data mining suite. Document and analyse the results in terms of the performance and effectiveness of each algorithm.
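One way to record performance and effectiveness side by side is a small harness that times training and measures accuracy on held-out rows. The sketch below is an assumption about how such a comparison might be organised, not part of the proposal. The two algorithms are hypothetical stand-ins (a real study would wrap the suite's own algorithms), and the rows are synthetic.

```python
import time


def evaluate(name, train_fn, predict_fn, train_rows, test_rows):
    """Time the training step and measure accuracy on held-out rows."""
    start = time.perf_counter()
    model = train_fn(train_rows)
    train_seconds = time.perf_counter() - start
    hits = sum(
        1 for features, label in test_rows if predict_fn(model, features) == label
    )
    return {
        "algorithm": name,
        "train_seconds": train_seconds,
        "accuracy": hits / len(test_rows),
    }


# Two hypothetical stand-in algorithms.
def train_majority(rows):
    """Baseline: remember the most common label."""
    labels = [label for _, label in rows]
    return max(set(labels), key=labels.count)


def predict_majority(model, features):
    return model


def train_threshold(rows):
    """Cut midway between the class means of the first feature.

    Assumes label 1 tends to have the larger feature value.
    """
    ones = [features[0] for features, label in rows if label == 1]
    zeros = [features[0] for features, label in rows if label == 0]
    return (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2


def predict_threshold(model, features):
    return 1 if features[0] > model else 0


# Synthetic rows of (features, label), invented for illustration only.
train_rows = [((1.0,), 0), ((2.0,), 0), ((3.0,), 0), ((7.0,), 1), ((8.0,), 1)]
test_rows = [((1.5,), 0), ((2.5,), 0), ((6.5,), 1), ((9.0,), 1)]

results = [
    evaluate("majority baseline", train_majority, predict_majority, train_rows, test_rows),
    evaluate("threshold rule", train_threshold, predict_threshold, train_rows, test_rows),
]
for row in results:
    print(
        f"{row['algorithm']:<18} accuracy={row['accuracy']:.2f} "
        f"train={row['train_seconds']:.6f}s"
    )
```

Including a trivial baseline gives the accuracy figures a reference point: an algorithm is only shown to be effective if it clearly beats always guessing the most common label.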

Expected Results: The ability to say conclusively whether Oracle's data mining capabilities are inferior or superior to competing products in the marketplace, and to justify that conclusion.

Possible Extensions to the Project:
- To have sufficient knowledge of the topic to give recommendations or feedback:
  - to Oracle regarding their data mining suite;
  - to IT customers wanting to purchase data mining suites.
- Explore the field of random stereograms: could a computer see them? If not, why not?

Literature Survey:
- Principles of Data Mining, by David Hand, Heikki Mannila and Padhraic Smyth. Cambridge, Massachusetts: MIT Press, 2001 (algorithmic concepts).
- Data Mining: Concepts and Techniques, by Jiawei Han and Micheline Kamber. San Francisco, California: Morgan Kaufmann, 2001 (algorithmic evaluations).
- Data Mining: A Tutorial-Based Primer, by Richard J. Roiger and Michael W. Geatz. Boston, Massachusetts: Addison Wesley (practical knowledge and processing).
- Data Mining, by Pieter Adriaans and Dolf Zantinge. Harlow, England: Addison Wesley, 1996 (real-life application).
- Data Mining and Statistical Analysis Using SQL, by Robert P. Trueblood and John N. Lovett, Jnr. USA: Apress, 2001 (statistical principles).
- Data Mining Using SAS Applications, by George Fernandez. USA: Chapman and Hall/CRC (methodologies).
- Mastering Data Mining: The Art and Science of Customer Relationship Management, by Michael J.A. Berry and Gordon S. Linoff. USA: Wiley Computer Publishing, 2000 (building effective models).
- Data Preparation for Data Mining, by Dorian Pyle. San Francisco, California: Morgan Kaufmann, 2000 (demo code, 10 Golden Rules).
- The white paper Data Mining: Beyond Algorithms, by Dr Akeel Al-Attar (available online).
- Summary from the KDD-03 Panel, Data Mining: The Next Ten Years, available at s/issue5-2/pnl_10yrs_final1.pdf
- Oracle website
- Oracle Magazine