University of Economics, Prague: MLNET-related activities of the Laboratory for Intelligent Systems and the Dept. of Information and Knowledge Engineering

(c) Petr Berka, LISp

Research
- probabilistic methods: decomposable probability models and Bayesian networks
- symbolic methods: generalized association rules and decision rules
- logical calculi for knowledge discovery in databases

People
Jiří Ivánek, Radim Jiroušek, Petr Berka, Jan Rauch, Tomáš Kočka, Vojtěch Svátek

Software: LISp-Miner
- two data mining procedures: 4FT-Miner (generalized association rules) and KEX (decision rules)
- a large preprocessing module, including SQL
- rules are output in database format, so users can implement their own interpretation procedures

LISp-Miner procedures
- 4FT-Miner (GUHA procedure): generalized association rules in the form Ant ~ Suc / Cond
- KEX: weighted decision rules in the form Ant ==> C (weight)

4FT-Miner

Data matrix:

          CLIENTS              | LOANS
Id  Age  Sex  Salary  District | Amount  Payment  Months  Quality
1   45   F            Prague   |                          good
         M            Brno     |                          bad

Problem: Are there segments of clients SC and segments of loans SL such that being in SC is 90% equivalent to having a loan from SL, and there are at least 100 such clients?

"Ant is at 90% equivalent to Suc":
Ant ⇔_{0.90, 100} Suc is true iff a/(a+b+c) ≥ 0.9 ∧ a ≥ 100

Four-fold table:

        Suc   ¬Suc
Ant     a     b
¬Ant    c     d

a - number of objects satisfying both Ant and Suc
b - number of objects satisfying Ant and not satisfying Suc
c - number of objects not satisfying Ant and satisfying Suc
d - number of objects satisfying neither Ant nor Suc
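The quantifier test on this slide can be sketched in a few lines of Python. This is only an illustration of the stated condition a/(a+b+c) ≥ 0.9 ∧ a ≥ 100, not the LISp-Miner implementation; the names `FourFoldTable`, `four_fold_table` and `quantifier_holds`, and the idea of passing Ant and Suc as predicates over rows, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Mapping


@dataclass(frozen=True)
class FourFoldTable:
    """Counts a, b, c, d as defined on the slide (hypothetical helper type)."""
    a: int  # Ant and Suc
    b: int  # Ant and not Suc
    c: int  # not Ant and Suc
    d: int  # neither Ant nor Suc


Row = Mapping[str, object]


def four_fold_table(rows: Iterable[Row],
                    ant: Callable[[Row], bool],
                    suc: Callable[[Row], bool]) -> FourFoldTable:
    """Count the objects of a data matrix into the four cells."""
    a = b = c = d = 0
    for row in rows:
        in_ant, in_suc = ant(row), suc(row)
        if in_ant and in_suc:
            a += 1
        elif in_ant:
            b += 1
        elif in_suc:
            c += 1
        else:
            d += 1
    return FourFoldTable(a, b, c, d)


def quantifier_holds(t: FourFoldTable, p: float = 0.9, base: int = 100) -> bool:
    """True iff a/(a+b+c) >= p and a >= base; d plays no role in this test."""
    covered = t.a + t.b + t.c
    return covered > 0 and t.a / covered >= p and t.a >= base
```

With made-up counts, a table such as `FourFoldTable(a=120, b=5, c=5, d=870)` passes the test (120/130 is above 0.9 and 120 is at least 100), while `FourFoldTable(a=90, b=0, c=0, d=10)` fails it because fewer than 100 objects satisfy both Ant and Suc.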

4FT-Miner

Input: data matrix, quantifier ⇔_{0.90, 100}
- Derived attributes for SC (possible Ant): Age (7 values), Sex (2 values), Salary (3 values), District (77 values)
- Derived attributes for SL (possible Suc): Amount (6 values), Duration (5 values), Quality (2 values)

Output: all Ant ⇔_{0.90, 100} Suc true in the data matrix (5 equivalences out of about 5 million possible relations)

Example:
Age( ) ∧ Sex(F) ∧ Salary(low) ∧ District(Prague) ⇔_{0.90, 100} Amount<20,50) ∧ Quality(Bad)

        Suc   ¬Suc
Ant     950
¬Ant

a/(a+b+c) = 0.95 ≥ 0.9
a = 950 ≥ 100
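The figures in the example can be checked by hand. Reading a = 950 from the comparison 950 ≥ 100 (an interpretation of the slide, since the other cells of the table are not given), the stated ratio fixes the sum of the two off-diagonal cells:

```latex
\[
\frac{a}{a+b+c} = 0.95,\quad a = 950
\;\Longrightarrow\;
a+b+c = \frac{950}{0.95} = 1000
\;\Longrightarrow\;
b+c = 50 .
\]
```

Under that reading, 50 objects satisfy exactly one of Ant and Suc, against 950 that satisfy both. How the 50 split between b and c cannot be recovered from the slide.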

KEX - classification

KEX - learning

LISp-Miner

4FT-Miner and KEX applications
- truck reliability assessment
- quality control in a brewery
- segmentation of clients of a bank
- short-term electric load prediction

LISp-Miner references
- Berka, P., Ivánek, J.: Automated Knowledge Acquisition for PROSPECTOR-like Expert Systems. In: (Bergadano, De Raedt, eds.) Proc. ECML'94, Springer, 1994.
- Berka, P., Rauch, J.: Data Mining using GUHA and KEX. In: (Callaos, Yang, Aguilar, eds.) 4th Int. Conf. on Information Systems, Analysis and Synthesis ISAS'98, 1998, Vol. 2.
- Rauch, J.: Classes of Four Fold Table Quantifiers. In: (Zytkow, Quafafou, eds.) Principles of Data Mining and Knowledge Discovery. Springer, 1998.

Datasets
PKDD'99 Discovery Challenge data
- financial data: clients of a bank, their accounts, transactions, loans, etc.
- medical data: patients with collagen disease

Financial data

Medical data

Other activities
- Organized conferences
- Teaching (in Czech)
- KDD
- KDD seminar
- ML

New projects
SOL-EU-NET project "Data Mining and Decision Support for Business Competitiveness: A European Virtual Enterprise" (supported by EU grant IST )