Data Mining with Oracle using Classification and Clustering Algorithms Presented by Nhamo Mdzingwa Supervisor: John Ebden.

Overview of Presentation
- Recap of Proposal
- Classification of Data Mining & DM Algorithms
- Oracle Data Mining
- Data Mining Process
- Evaluation of Results
- Progress so far
- Updated Timeline
- Plans

Objective
- Investigate two types of algorithms available in Oracle10g for data mining (ODM).
- Apply the two algorithms to actual data.
- Analyse and evaluate the results in terms of performance.

Classification of Data Mining
- Directed data mining (supervised learning) builds a model that describes one particular attribute in terms of the rest of the data.
- Undirected data mining (unsupervised learning) builds a model that establishes the relationships amongst all the input attributes by grouping them.
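
As an illustration of directed data mining, the sketch below builds a small Naive Bayes model that describes one attribute (the target) in terms of the others. It is plain Python, not the Oracle Data Mining API; the function names, the Laplace smoothing, and the toy customer records are assumptions made for this example only.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    domains = defaultdict(set)           # attribute -> values seen in training
    for row in rows:
        for attr, value in row.items():
            if attr == target:
                continue
            value_counts[(attr, row[target])][value] += 1
            domains[attr].add(value)
    return class_counts, value_counts, domains


def predict(model, row):
    """Return the class with the highest (log) posterior score for `row`."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # class prior
        for attr, value in row.items():
            if attr not in domains:
                continue  # ignore attributes never seen in training
            count = value_counts[(attr, label)][value]
            # Laplace smoothing (an assumption) avoids zero probabilities.
            score += math.log((count + 1) / (n + len(domains[attr])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training records; "buys" is the attribute being described.
training_rows = [
    {"age": "young", "income": "low", "buys": "no"},
    {"age": "young", "income": "high", "buys": "yes"},
    {"age": "old", "income": "high", "buys": "yes"},
    {"age": "old", "income": "low", "buys": "no"},
    {"age": "old", "income": "high", "buys": "yes"},
]
model = train_naive_bayes(training_rows, target="buys")
print(predict(model, {"age": "young", "income": "high"}))
```

An undirected method would receive the same records without a designated target attribute and group them by similarity instead.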

Classification of Data Mining algorithms
DM strategies
- Unsupervised learning (input attributes but no output attributes)
  - Clustering: k-Means, O-Cluster
- Supervised learning (input attributes and one or more output attributes)
  - Classification: Naive Bayes, Model Seeker, Adaptive Bayes
  - Estimation
  - Prediction: Predictive variance
- Association Discovery
- Visualization

Algorithms offered in Oracle10g
Classification
1. Adaptive Bayes Network
2. Naive Bayes
3. Model Seeker
Clustering
1. k-Means
2. O-Cluster
3. Predictive variance
Association rules
1. Apriori
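
To make the clustering entry concrete, here is a minimal sketch of the general k-Means idea in plain Python. It does not reproduce Oracle's implementation: the function name, the choice of the first k points as starting centroids, the iteration cap, and the sample points are all assumptions for illustration.

```python
import math


def k_means(points, k, iterations=20):
    """Group points into k clusters by alternating assignment and update."""
    # Assumption: seed with the first k points so the result is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignments


# Hypothetical two-dimensional data with two visible groups.
sample_points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, assignments = k_means(sample_points, k=2)
print(assignments)
```

No output attribute is supplied; the cluster labels come only from the distances between the input attributes.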

Evaluation of Results
- Evaluation of supervised learning models involves determining the level of predictive accuracy, measured on test data sets.
- Compare the confidence and support levels of models created from the same training data to determine accuracy.
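
The measures named on this slide can be sketched in a few lines of plain Python. These are the standard textbook definitions of test-set accuracy and of support and confidence for an association rule, not functions of the Oracle Data Mining API; the function names and the sample data are assumptions for illustration.

```python
def accuracy(actual, predicted):
    """Fraction of test cases whose predicted class matches the actual class."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    antecedent_support = support(transactions, antecedent)
    if antecedent_support == 0:
        return 0.0  # the rule never applies in this data
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / antecedent_support


# Hypothetical test-set labels and model predictions.
actual_labels = ["yes", "no", "yes", "yes"]
predicted_labels = ["yes", "no", "no", "yes"]
print(accuracy(actual_labels, predicted_labels))

# Hypothetical market-basket transactions for the rule bread -> milk.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))
print(confidence(baskets, {"bread"}, {"milk"}))
```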

Progress
- Literature Survey
- Oracle10g installed on Athena in Hons Lab
- Exploring the Oracle9i and 10g Suite including JDeveloper
- Member of MetaLink (Oracle’s online support service)

Updated Timeline
- Continuation from literature and tutorials: done
- Investigate Clustering & Classification algorithms (theory): done
- Find suitable computerised case studies of the use of above algorithms, with or without Oracle: done
- Search datasets for testing (possibilities: AIDS data & faculty data): in progress
- Apply algorithms to data found, then critically analyse & assess results: second semester
- Write up paper: September vacation and 3rd term
- Final project write up: due 7/11