Intelligent Data Analysis (IDA) by Josipa Kern, PhD, Andrija Stampar School of Public Health, Medical School, University of Zagreb, Zagreb, Croatia.



Interest and Excitement for Intelligent Data Analysis
• Decision making requires information and knowledge
• Data processing can provide both
• Multidimensional problems call for methods of adequate, in-depth data processing and analysis

Learning Objectives
• To understand the concept of IDA
• To become familiar with websites and literature on IDA
• To become familiar with some tools for IDA
• To learn how to use IDA tools and how to validate IDA results

Performance Objectives
• Recognize problems that call for IDA
• Prepare data and carry out the analysis
• Validate and interpret the results of IDA

IDA is…
• an interdisciplinary study concerned with the effective analysis of data;
• used for extracting useful information from large quantities of online data;
• used for extracting desirable knowledge or interesting patterns from existing databases.

IDA or …
• Data mining
• Knowledge acquisition from data
• Genetic algorithm-based rule discovery
• Knowledge discovery
• Learning classifier system
• Machine learning
• etc.

IDA gives knowledge …

Knowledge is …
• the distillation of information that has been collected, classified, organized, integrated, abstracted and value-added;
• at a level of abstraction higher than the data and information on which it is based, and can be used to deduce new information and new knowledge;
• usually in the context of human expertise used in solving problems.

Knowledge acquisition …
• The process of eliciting, analyzing, transforming, classifying, organizing and integrating knowledge and representing that knowledge in a form that can be used in a computer system.

Knowledge in a domain can be expressed as a number of rules

Rule is … A formal way of specifying a recommendation, directive, or strategy, expressed as "IF premise THEN conclusion" or "IF condition THEN action".

How to discover rules hidden in the data?

Some tools for IDA …
• See5 - a program for analyzing data and generating classifiers in the form of decision trees and/or rule sets.

Some tools for IDA …
• Cubist - analyzes data and generates rule-based piecewise linear models: collections of rules, each with an associated linear expression for computing a target value.

Some tools for IDA …
• ILLM - constructs classification models in the form of rules that represent knowledge about relations hidden in data.

Some tools for IDA …
• Magnum Opus - finds association rules, providing competitive advantage by revealing underlying interactions between factors within the data.

Evaluation of IDA results
• Absolute & relative accuracy
• Sensitivity & specificity
• False positive & false negative
• Error rate
• Reliability of rules
• Etc.
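Several of the measures listed above can be computed from the four cells of a two-class confusion matrix. The following is a minimal sketch, not taken from the slides: the function name `evaluate` and the example counts are hypothetical, and it assumes one class has been designated "positive".

```python
def evaluate(tp, fn, fp, tn):
    """Compute basic evaluation measures from confusion-matrix counts.

    tp: positives classified as positive (true positives)
    fn: positives classified as negative (false negatives)
    fp: negatives classified as positive (false positives)
    tn: negatives classified as negative (true negatives)
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,     # share of correctly classified cases
        "error_rate": (fp + fn) / total,   # share of misclassified cases
        "sensitivity": tp / (tp + fn),     # share of positives that were found
        "specificity": tn / (tn + fp),     # share of negatives that were found
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
measures = evaluate(tp=45, fn=5, fp=10, tn=40)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Accuracy and error rate always sum to 1, so reporting sensitivity and specificity alongside them shows whether errors are concentrated in one class.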

Example of IDA: an illustration of IDA using See5

See5…application…
• application.names - lists the classes to which cases may belong and the attributes used to describe each case.
• Attributes are of two types: discrete attributes have a value drawn from a set of possibilities, and continuous attributes have numeric values.

See5…application…
• application.data - provides information on the training cases from which See5 will extract patterns.
• The entry for each case consists of one or more lines that give the values for all attributes.

See5…application…
• application.test - provides information on the test cases (used for evaluation of results).
• The entry for each case consists of one or more lines that give the values for all attributes.

See5…application…example…
• Epidemiological study ( )
• Sample of examinees who died from cardiovascular diseases during the period
• Question: Did they know they were ill?
  1 – they were healthy
  2 – they were ill (drug treatment, positive clinical and laboratory findings)

See5…application…example…
• application.names – example
Goal.
gender:M,F
activity:1,2,3
age: continuous
smoking: No,Yes
…
Goal:1,2
…

See5…application…example…
• application.data – example
M,1,59,Yes,0,0,0,0,119,73,103,86,247,87,1 5979,?,?,?,1,73,2.5
M,1,66,Yes,0,0,0,0,132,81,183,239,?,783,1 4403,27221,19153,23187,1,73,2.6
M,1,61,No,0,0,0,0,130,79,148,86,209,115,2 1719,12324,10593,11458,1,74,2.5
…
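Cases in this format are comma-separated attribute values, with "?" standing in for an unknown value. The sketch below shows one way such cases could be read into Python dictionaries. It is an illustration only: the attribute list is a hypothetical five-column subset (the full attribute list is elided in the slides), the sample text is made up, and `parse_cases` is not part of See5.

```python
import csv
import io

# Hypothetical subset of attributes; the slides elide the full list.
ATTRIBUTES = ["gender", "activity", "age", "smoking", "Goal"]
CONTINUOUS = {"age"}  # attributes holding numeric values


def parse_cases(text):
    """Turn comma-separated case lines into a list of dictionaries."""
    cases = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        case = {}
        for name, raw in zip(ATTRIBUTES, row):
            raw = raw.strip()
            if raw == "?":
                case[name] = None        # unknown value
            elif name in CONTINUOUS:
                case[name] = float(raw)  # continuous attribute
            else:
                case[name] = raw         # discrete attribute kept as text
        cases.append(case)
    return cases


# Made-up sample in the same style as the example above.
sample = "M,1,59,Yes,1\nF,2,?,No,2\n"
cases = parse_cases(sample)
print(cases)
```

Keeping unknown values as `None` rather than dropping the case leaves the decision about how to handle missing data to the analysis step.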

See5…application…example…
• Results – example
Rule 1: (cover 26)
    gender = M
    SBP > 111
    oil_fat > 2.9
    -> class 1 [0.929]

See5…application…example…
• Results – example
Rule 4: (cover 14)
    smoking = Yes
    SBP > 131
    glucose > 93
    glucose <= 118
    oil_fat <= 2.9
    -> class 2 [0.938]

See5…application…example…
• Results – example
Rule 15: (cover 2)
    SBP <= 111
    oil_fat > 2.9
    -> class 2 [0.750]
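The three rules shown in the results can be written down as data and applied to a new case. The sketch below is a simplification under stated assumptions: when several rules match, it simply takes the one with the highest bracketed value, which is not necessarily how See5 itself combines rules. The example cases are hypothetical, and every case is assumed to have a value for each attribute used.

```python
import operator

# Comparison operators used in the rule conditions.
OPS = {"=": operator.eq, ">": operator.gt, "<=": operator.le}

# (name, conditions, predicted class, bracketed value from the results)
RULES = [
    ("Rule 1", [("gender", "=", "M"), ("SBP", ">", 111),
                ("oil_fat", ">", 2.9)], 1, 0.929),
    ("Rule 4", [("smoking", "=", "Yes"), ("SBP", ">", 131),
                ("glucose", ">", 93), ("glucose", "<=", 118),
                ("oil_fat", "<=", 2.9)], 2, 0.938),
    ("Rule 15", [("SBP", "<=", 111), ("oil_fat", ">", 2.9)], 2, 0.750),
]


def matching_rules(case):
    """Return every rule whose conditions all hold for the case."""
    return [
        rule for rule in RULES
        if all(OPS[op](case[attr], value) for attr, op, value in rule[1])
    ]


def classify(case):
    """Pick the class of the best-scoring matching rule, or None."""
    hits = matching_rules(case)
    if not hits:
        return None  # no rule applies to this case
    best = max(hits, key=lambda rule: rule[3])
    return best[2]


# Hypothetical case, not taken from the study data.
case = {"gender": "M", "smoking": "Yes", "SBP": 140,
        "glucose": 100, "oil_fat": 2.5}
print([rule[0] for rule in matching_rules(case)], classify(case))
```

Storing conditions as (attribute, operator, value) triples keeps each rule readable as "IF premise THEN conclusion" and makes it easy to list which rules fired for a given case.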

See5…application…example…
• Results – example
Evaluation on training data (199 cases):
(a) (b) <-classified as
(a): class 1
(b): class 2

See5…application…example…
• Results – example (training set)
Sensitivity = 0.97
Specificity = 0.81

See5…application…example…
• Results – example
Evaluation on test data (73 cases):
(a) (b) <-classified as
(a): class 1
(b): class 2

See5…application…example…
• Results – example (test set)
Sensitivity = 0.98
Specificity = 0.90

All the suggested IDA tools are available at the mentioned URLs, at least as demo versions. Try your own IDA… Thank you!