D3M: Domain-Driven Data Mining. An Overview of Domain-Driven Data Mining: Toward Actionable Knowledge Discovery (AKD). Longbing Cao, Faculty of Engineering.

Presentation transcript:

D3M: Domain-Driven Data Mining
An Overview of Domain-Driven Data Mining: Toward Actionable Knowledge Discovery (AKD)
Longbing Cao
Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia

D3M: Domain-Driven Data Mining
The Smart Lab: datamining.it.uts.edu.au
15 December 2008. Cao, L: D3M at DDDM2008, joint with ICDM

Outline
• Why Do We Need D3M
• What Is D3M
• The D3M Framework
• D3M Theoretical Underpinnings
• D3M Research Issues
• D3M Applications
• D3M References

Why Do We Need D3M
• A common scenario in deploying data mining algorithms
  I find something interesting!
  - "Many patterns are found"
  - "They satisfy the technical metric thresholds well"
  What do business people say?
  - "So what?"
  - "They are just commonsense"
  - "I don't care about them"
  - "I don't understand them"
  - "How can I use them?"
  - "Am I wrong? What can I do better for my business mate?"

Why Do We Need D3M
• Where is something wrong?
  Gap:
  - Academic objectives || business goals
  - Technical outputs || business expectations
  Macro-level methodological and fundamental issues
  - Academic: technical interest; innovative algorithms & patterns
  - Practitioner: social, environmental, organizational factors and impact; getting a problem solved properly
  Micro-level technical and engineering issues
  - System dynamics, system environment, and interaction in a system
  - Business processes, organizational factors, and constraints
  - Human and domain knowledge involvement

• An example: the problem with association mining
  - Existing association rule mining algorithms are specifically designed to find strong patterns that have high predictive accuracy or correlation.
  - Business people often regard frequent patterns as commonsense knowledge; they are eager to discover new and hidden patterns in databases.
  - Many patterns are found.
• How can associations be taken over seamlessly by business people and turned into operationalizable actions?
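To make the "strong patterns" point concrete, the following is a minimal sketch of the two standard technical metrics used in association rule mining, support and confidence. The basket data, thresholds, and function names are made up for illustration and are not taken from the talk. The toy run shows how a rule can pass both thresholds and still be plain commonsense (bread goes with butter).

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


def strong_rules(transactions, min_support, min_confidence):
    """Enumerate single-item rules lhs -> rhs that pass both technical thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support(transactions, {lhs, rhs})
            c = confidence(transactions, {lhs}, {rhs})
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules


# Toy baskets, invented for this example only.
baskets = [frozenset(t) for t in (
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
)]
rules = strong_rules(baskets, min_support=0.5, min_confidence=0.9)
for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

Both surviving rules are technically strong, yet neither tells a shop owner anything new, which is the gap the slide describes.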

What Is D3M
• Next-generation data mining methodologies, frameworks, algorithms, evaluation systems, tools and decision support that:
  - Cater for the business environment
  - Satisfy business needs
  - Deliver business-friendly and decision-making rules and actions that are of solid technical and business significance
  - Can be understood and taken over by business people to make decisions
• Aims to promote the paradigm shift from data-centered hidden pattern mining to domain-driven actionable knowledge discovery (AKD)

• Involve and synthesize ubiquitous intelligence
  - Human intelligence
  - Domain intelligence
  - Data intelligence
  - Network intelligence
  - Organizational and social intelligence
  - Meta-synthesis of the above ubiquitous intelligence

The D3M Framework
• AKD-based problem-solving

• Interestingness & actionability

• Conflicts & tradeoff

• A framework for AKD
  - Post-analysis-based AKD

D3M Theoretical Underpinnings
• Artificial intelligence and intelligent systems
• Behavior informatics and analytics
• Business modeling
• Business process management
• Cognitive sciences
• Data integration
• Human-machine interaction
• Human-centered computing
• Knowledge representation and management
• Machine learning
• Ontological engineering
• Organizational and social computing
• Project management methodology
• Social network analysis
• Statistics
• System simulation, and so on

D3M Research Issues
• Data intelligence: deep knowledge in complex data structures; mining in-depth data patterns, and mining structured & informative knowledge in complex data
• Domain intelligence: domain & prior knowledge, business processes/logics/workflow, constraints, and business interestingness; their representation, modeling and involvement in KDD
• Network intelligence: network-based data, knowledge, communities and resources; information retrieval, text mining, web mining, semantic web, ontological engineering techniques, and web knowledge management
• Human intelligence: empirical and implicit knowledge, expert knowledge and thoughts, group/collective intelligence; human-machine interaction, representation and involvement of human intelligence
• Social intelligence: organizational/social factors, laws/policies/protocols, trust/utility/benefit-cost; collective intelligence, social network analysis, and social cognition interaction
• Intelligence metasynthesis: synthesize ubiquitous intelligence in KDD; metasynthetic interaction (m-interaction) as the working mechanism, and metasynthetic space (m-space) as an AKD-based problem-solving system

• How to reach an interest tradeoff
  - Balance between technical and business interests
  - Suppose there are multiple metrics for each aspect
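The slide's own formulas did not survive in this transcript, so the following is only a rough sketch of one way such a tradeoff could be checked. It assumes each pattern carries a set of technical metrics and a set of business metrics, and that a pattern counts as actionable only when it clears the minimum on every metric of both aspects. The `Pattern` class, the metric names, and the conjunctive rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Pattern:
    name: str
    technical: dict = field(default_factory=dict)  # e.g. {"support": 0.30}
    business: dict = field(default_factory=dict)   # e.g. {"benefit": 0.80}


def meets(scores, thresholds):
    """True when every thresholded metric is present and reaches its minimum."""
    return all(scores.get(m, float("-inf")) >= t for m, t in thresholds.items())


def classify(pattern, tech_min, biz_min):
    """Label a pattern by which side(s) of the tradeoff it satisfies."""
    tech_ok = meets(pattern.technical, tech_min)
    biz_ok = meets(pattern.business, biz_min)
    if tech_ok and biz_ok:
        return "actionable"
    if tech_ok:
        return "technically interesting only"
    if biz_ok:
        return "business interesting only"
    return "rejected"


# Hypothetical thresholds and patterns, invented for this example.
tech_min = {"support": 0.10, "confidence": 0.70}
biz_min = {"benefit": 0.50}

p1 = Pattern("P1", {"support": 0.30, "confidence": 0.90}, {"benefit": 0.80})
p2 = Pattern("P2", {"support": 0.40, "confidence": 0.95}, {"benefit": 0.10})
p3 = Pattern("P3", {"support": 0.02, "confidence": 0.60}, {"benefit": 0.90})

for p in (p1, p2, p3):
    print(p.name, "->", classify(p, tech_min, biz_min))
```

A weighted score across the two aspects would be another reasonable design; the conjunctive check is used here only because it keeps the two kinds of interest visibly separate.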

• Actionable knowledge discovery through m-spaces
  - Acquiring and representing unstructured, ill-structured and uncertain domain/human knowledge
  - Supporting dynamic involvement of business experts and their knowledge/intelligence
  - Acquiring and representing expert thinking, such as imaginary thinking and creative thinking, in group heuristic discussions during KDD modeling
  - Acquiring and representing group/collective interaction behavior and impact emergence
  - Building infrastructure supporting the involvement and synthesis of ubiquitous intelligence

D3M Applications
• Real-world data mining
• Our recent case studies
  Capital markets
  - Actionable trading agents
  - Actionable trading strategies
  Social security
  - Activity mining
  - Combined mining

Actionable Trading Evidence for Brokerage Firms
• Trading strategy/evidence
• Actionable trading evidence

• Domain factors

• Business interest

• Developing in-depth trading strategy


Activity Mining for Australian Commonwealth Governmental Debt Prevention
• Impact-targeted activity mining

• Impact-targeted activity mining
  - Frequent impact-targeted activity sequences
  - Impact-contrasted activity sequences
  - Impact-reversed activity sequences
  - Impact-targeted combined association clusters

• Data intelligence
  - Activity data
  - Itemset imbalance
  - Impact imbalance
  - Seasonal effect
  - Demographic data
  - Transactional data
• Itemset/tuple selection/construction

• Domain intelligence
  - Business process/event for activity selection
  - Domain knowledge
  - Feature selection
  - Sequence construction
  - Impact target
• Positive impact
• Negative impact
• Multi-level impacts
• Feature/attribute selection
• Interestingness definition
• New pattern structures

• Organizational/social factors
  - Operational/intervention activities
  - Seasonal business requirement/interaction changes
  - Business cost (debt amount/duration)
  - Business benefit (saving/preventing debt amount or reducing debt duration)
  - Deliverable format

• Impact-reversed pattern pair
  - Underlying pattern 1:
  - Derivative pattern 2:
• Impact-targeted combined association clusters

• Conditional impact ratio (Cir)
• Conditional Piatetsky-Shapiro's (P-S) ratio (Cps)
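The formulas for these two measures are missing from the transcript. As a hedged illustration only, the sketch below assumes they are conditional analogues of two standard measures, lift and Piatetsky-Shapiro leverage, evaluated among sequences that already contain an underlying pattern P. Here Q is an additional activity sequence and T an impact. This reading, the function name, and the counts are assumptions and may differ from the definitions in the authors' papers.

```python
def conditional_ratios(n_p, n_pq, n_pt, n_pqt):
    """
    Assumed conditional lift (Cir) and conditional P-S leverage (Cps).

    n_p   : number of sequences containing the underlying pattern P
    n_pq  : of those, how many also contain Q
    n_pt  : of those, how many lead to impact T
    n_pqt : of those, how many contain Q and lead to impact T
    """
    if n_p <= 0:
        raise ValueError("n_p must be positive")
    p_q = n_pq / n_p      # estimate of P(Q | P)
    p_t = n_pt / n_p      # estimate of P(T | P)
    p_qt = n_pqt / n_p    # estimate of P(Q, T | P)
    expected = p_q * p_t  # joint probability if Q and T were independent given P
    cps = p_qt - expected
    cir = p_qt / expected if expected > 0 else float("nan")
    return cir, cps


# Hypothetical counts: Q and T co-occur more often than independence predicts.
cir, cps = conditional_ratios(n_p=100, n_pq=50, n_pt=50, n_pqt=40)
print(f"Cir={cir:.2f}, Cps={cps:.2f}")

# Under conditional independence the two measures sit at their neutral values.
neutral_cir, neutral_cps = conditional_ratios(n_p=100, n_pq=50, n_pt=50, n_pqt=25)
```

Under this reading, a Cir above 1 (equivalently, a Cps above 0) would suggest that adding Q to P is associated with the impact T more strongly than chance.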

• Interestingness: tech & biz

• The process

• Impact-reversed sequential activity patterns

• Demographic + transactional combined pattern

D3M References
Books:
• Cao, L., Yu, P.S., Zhang, C., Zhao, Y. Domain Driven Data Mining, Springer.
• Cao, L., Yu, P.S., Zhang, C., Zhang, H. (eds.) Data Mining for Business Applications, Springer.
Workshops:
• Domain-driven data mining 2008, joint with ICDM2008.
• Domain-driven data mining 2007, joint with SIGKDD2007.
Special issues:
• Domain-driven data mining, IEEE Trans. Knowledge and Data Engineering.
• Domain-driven, actionable knowledge discovery, IEEE Intelligent Systems, Department, 22(4): 78-89.
Some relevant papers:
• Cao, L., Zhao, Y., Zhang, H., Luo, D., Zhang, C. Flexible Frameworks for Actionable Knowledge Discovery, submitted to IEEE Trans. on Knowledge and Data Engineering.
• Cao, L., Zhang, H., Zhao, Y., Zhang, C. Combined Mining: Discovering More Informative Knowledge in e-Government Services, submitted to ACM TKDD.
• Cao, L., Dai, R., Zhou, M. Metasynthesis, M-Space and M-Interaction for Open Complex Giant Systems, technical report.
• Cao, L., Ou, Y. Market Microstructure Patterns Powering Trading and Surveillance Agents, Journal of Universal Computer Science, 2008 (to appear).
• Cao, L., He, T. Developing Actionable Trading Agents, Knowledge and Information Systems: An International Journal.
• Cao, L. Developing Actionable Trading Strategies, in edited book: Intelligent Agents in the Evolution of WEB and Applications, Springer, 2008.

Some relevant papers:
• Cao, L., Zhao, Y., Zhang, C. (2008). Mining Impact-Targeted Activity Patterns in Imbalanced Data, IEEE Trans. Knowledge and Data Engineering, Vol. 20, No. 8.
• Cao, L., Yu, P., Zhang, C., Zhao, Y., Williams, G. DDDM2007: Domain Driven Data Mining, ACM SIGKDD Explorations Newsletter, 9(2): 84-86.
• Cao, L., Zhang, C. Knowledge Actionability: Satisfying Technical and Business Interestingness, International Journal of Business Intelligence and Data Mining, 2(4).
• Cao, L., Zhang, C. The Evolution of KDD: Towards Domain-Driven Data Mining, International Journal of Pattern Recognition and Artificial Intelligence, 21(4).
• Cao, L. Domain-Driven Actionable Knowledge Discovery, IEEE Intelligent Systems, 22(4): 78-89.
• Cao, L., Zhang, C. Domain-driven data mining: A practical methodology, International Journal of Data Warehousing and Mining (IJDWM), IGI Global, 2(4): 49-65, 2006.

Thank you!
Longbing CAO
Faculty of Engineering and IT, University of Technology, Sydney, Australia
Homepage: www-staff.it.uts.edu.au/~lbcao/
The Smart Lab: datamining.it.uts.edu.au