Published by Emmeline Ward. Modified over 9 years ago.
1 Data Mining Database Systems Timothy Vu
2 Mining Mining is the extraction of valuable minerals or other geological materials from the earth, such as bauxite, coal, diamonds, iron, precious metals, lead, limestone, nickel, phosphate, rock salt, tin, and uranium, as well as petroleum, natural gas, and even water. The extracted material is usually something valuable, rare, or useful.
3 What is Data Mining Data Mining, also known as Knowledge Discovery in Databases (KDD), is the process of automatically searching large volumes of data for patterns. To do this, data mining uses computational techniques from statistics, machine learning, and pattern recognition. Machine learning - a method for creating computer programs by analyzing data sets. Pattern recognition - classifying data (patterns) based on either a priori knowledge or statistical information extracted from the patterns.
4 Why Data Mining Data mining helps individuals and companies find useful information in large amounts of data so they can make better decisions. - Reduce risks - Find problems and issues - Save money - Make high-confidence predictions - Simplify information
5 Discussion Topics 1) Classification 2) Regression 3) Association 4) Clustering
6 Classifiers Decision-Tree Classifiers – each leaf node has an associated class and each internal node has a predicate. Bayesian Classifiers – find the distribution of attribute values for each class in the training data, then predict the class with the maximum probability. Neural Net Classifiers – use the training data to train artificial neural nets.
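As a rough illustration of the Bayesian classifier idea, the sketch below counts attribute values per class in a small training set and predicts the class with the highest probability. It is a naive Bayes variant, which assumes the attributes are independent given the class. The training rows, the attribute meanings (income, degree), the function names `train` and `predict`, and the add-one smoothing are all assumptions made for this example and do not come from the slides.

```python
from collections import Counter, defaultdict

def train(rows):
    """rows: list of (attribute_tuple, class_label) pairs."""
    class_counts = Counter(label for _, label in rows)
    # value_counts[label][i][value] = number of rows of that class
    # whose i-th attribute equals value
    value_counts = defaultdict(lambda: defaultdict(Counter))
    for attrs, label in rows:
        for i, value in enumerate(attrs):
            value_counts[label][i][value] += 1
    return class_counts, value_counts

def predict(model, attrs):
    """Return the class with the maximum (unnormalized) probability."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior probability of the class
        for i, value in enumerate(attrs):
            # Add-one smoothing; the "+ 2" assumes each attribute
            # has exactly two possible values (true for this toy data).
            score *= (value_counts[label][i][value] + 1) / (count + 2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data: (income, has_degree) -> credit risk class
rows = [
    (("high", "yes"), "good"),
    (("high", "no"), "good"),
    (("low", "yes"), "good"),
    (("low", "no"), "bad"),
    (("low", "no"), "bad"),
    (("high", "no"), "bad"),
]
model = train(rows)
print(predict(model, ("high", "yes")))
print(predict(model, ("low", "no")))
```

The scores are not normalized into true probabilities, because only the ranking of the classes matters for the prediction.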
7 Regression Regression – deals with the prediction of a value rather than a class. Linear Regression – predicts a value with a linear polynomial of the input attributes. Curve fitting means finding the coefficients that give the best fit to the data.
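For the simplest case, a straight line y = a0 + a1*x through one input attribute, the coefficients can be found with ordinary least squares. The sketch below shows that calculation. The function name `fit_line` and the sample data are invented for illustration, and least squares is only one common definition of "best".

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a0 + a1 * x; returns (a0, a1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means
    a0 = mean_y - a1 * mean_x
    return a0, a1

# Hypothetical observations that roughly follow y = 1 + 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
a0, a1 = fit_line(xs, ys)

# Predict a value (not a class) for an unseen input
prediction = a0 + a1 * 5.0
print(a0, a1, prediction)
```

With more than one input attribute the same idea applies, but the coefficients are usually found with a linear algebra library rather than by hand.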
8 Associations Finding the association or relationship between two or more items. Support – a measure of what fraction of the population satisfies both the antecedent and the consequent of the rule. Example: MILK => Screwdrivers. Confidence – how often the consequent is true when the antecedent is true. Example: MILK => Bread.
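Support and confidence can both be computed by counting over a set of transactions. The sketch below does this for the two example rules. The basket contents and the function names `support` and `confidence` are made up for illustration.

```python
# Hypothetical market-basket data; each set is one purchase.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk", "screwdrivers"},
    {"bread", "eggs"},
    {"milk", "bread", "butter"},
]

def support(antecedent, consequent, baskets):
    """Fraction of all baskets that contain both sides of the rule."""
    both = antecedent | consequent
    return sum(1 for b in baskets if both <= b) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Among baskets containing the antecedent, the fraction that
    also contain the consequent."""
    with_antecedent = [b for b in baskets if antecedent <= b]
    if not with_antecedent:
        return 0.0
    both = antecedent | consequent
    return sum(1 for b in with_antecedent if both <= b) / len(with_antecedent)

# MILK => Screwdrivers: only 1 of 5 baskets contains both items
print(support({"milk"}, {"screwdrivers"}, transactions))
# MILK => Bread: 3 of the 4 baskets with milk also contain bread
print(confidence({"milk"}, {"bread"}, transactions))
```

In this toy data the rule MILK => Screwdrivers has low support, while MILK => Bread has fairly high confidence.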
9 Clustering Clustering is the classification of similar objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait - often proximity according to some defined distance measure.
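One common way to partition a data set by proximity is k-means, sketched below with Euclidean distance as the distance measure. The slides do not name a specific algorithm, so the choice of k-means, the sample points, and the starting centers are all assumptions for this example.

```python
import math

def kmeans(points, centers, max_iterations=100):
    """Partition points into len(centers) clusters by Euclidean distance.

    Returns (centers, clusters). If the loop stops because the centers
    no longer move, the clusters match the returned centers exactly.
    """
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center)
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Hypothetical 2-D points forming two visible groups
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0),
          (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
centers, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centers)
print(clusters)
```

The result depends on the starting centers and on the number of clusters, both of which are chosen by the user here.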
10 Applications of Data Mining 1. Predictions - Stock Market - Earthquakes - NBA games 2. Association - Store Inventory - Fashion Trends 3. Descriptive Patterns - Disease Analysis - Image Recognition - Fraud Detection
11 Gather Data
12 Electrocardiogram
13 Disease Analysis
14 References A. Silberschatz, H.F. Korth, S. Sudarshan: Database System Concepts, 5th ed., McGraw-Hill, 2006. Runge, Marschall, Magnus Ohman, and Frank Netter: Netter's Cardiology (Netter Clinical Science), W.B. Saunders Company, 2004. "Data mining", Wikipedia, 4/1/2006.