1
D3M: Domain-Driven Data Mining
An Overview of Domain-Driven Data Mining: Toward Actionable Knowledge Discovery (AKD)
Longbing Cao
Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia
2
D3M: Domain-Driven Data Mining
The Smart Lab: datamining.it.uts.edu.au
15 December 2008
Cao, L: D3M at DDDM2008, joint with ICDM2008
Outline
  Why Do We Need D3M
  What Is D3M
  The D3M Framework
  D3M Theoretical Underpinnings
  D3M Research Issues
  D3M Applications
  D3M References
3
Why Do We Need D3M
A common scenario in deploying data mining algorithms
Data miner: “I find something interesting!”
  “Many patterns are found”
  “They satisfy the technical metric thresholds well”
What do business people say?
  “So what?”
  “They are just commonsense”
  “I don’t care about them”
  “I don’t understand them”
  “How can I use them?”
Data miner: “Am I wrong? What can I do better for my business mate?”
4
Why Do We Need D3M
Where is something wrong?
Gap: academic objectives vs. business goals; technical outputs vs. business expectations
Macro-level methodological and fundamental issues
  Academic: technical interest; innovative algorithms and patterns
  Practitioner: social, environmental, and organizational factors and impact; getting a problem solved properly
Micro-level technical and engineering issues
  System dynamics, system environment, and interaction in a system
  Business processes, organizational factors, and constraints
  Human and domain knowledge involvement
5
An example: the problem with association mining
Existing association rule mining algorithms are specifically designed to find strong patterns that have high predictive accuracy or correlation.
Frequent patterns often amount to commonsense knowledge, whereas business people are eager to discover new and hidden patterns in databases.
Many patterns are found.
How can associations be taken over seamlessly by business people and turned into operationalizable actions?
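To make "strong patterns" concrete: association rules are conventionally scored by support and confidence. The sketch below computes both for one rule over a handful of baskets. The basket contents and function names are hypothetical and chosen only for illustration; nothing here comes from the slides.

```python
from typing import Iterable, List, Set


def support(transactions: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[Set[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


# Hypothetical toy baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(f"bread -> milk: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule such as bread -> milk can clear both thresholds and still be exactly the kind of commonsense result that business people dismiss, which is the gap this slide points at.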
6
What Is D3M
Next-generation data mining methodologies, frameworks, algorithms, evaluation systems, tools, and decision support that:
  Cater for the business environment
  Satisfy business needs
  Deliver business-friendly, decision-making rules and actions of solid technical and business significance
  Can be understood and taken over by business people to make decisions
Aim: to promote the paradigm shift from data-centered hidden pattern mining to domain-driven actionable knowledge discovery (AKD)
7
Involve and synthesize ubiquitous intelligence
  Human intelligence
  Domain intelligence
  Data intelligence
  Network intelligence
  Organizational and social intelligence
  Meta-synthesis of the above ubiquitous intelligence
8
The D3M Framework
AKD-based problem-solving
9
Interestingness & actionability
10
Conflicts & tradeoff
11
A framework for AKD
Post-analysis-based AKD
12
D3M Theoretical Underpinnings
Artificial intelligence and intelligent systems, behavior informatics and analytics, business modeling, business process management, cognitive sciences, data integration, human-machine interaction, human-centered computing, knowledge representation and management, machine learning, ontological engineering, organizational and social computing, project management methodology, social network analysis, statistics, system simulation, and so on.
13
D3M Research Issues
  Data intelligence: deep knowledge in complex data structures; mining in-depth data patterns, and mining structured and informative knowledge in complex data
  Domain intelligence: domain and prior knowledge, business processes/logics/workflow, constraints, and business interestingness; their representation, modeling, and involvement in KDD
  Network intelligence: network-based data, knowledge, communities, and resources; information retrieval, text mining, web mining, semantic web, ontological engineering techniques, and web knowledge management
  Human intelligence: empirical and implicit knowledge, expert knowledge and thoughts, group/collective intelligence; human-machine interaction, representation and involvement of human intelligence
  Social intelligence: organizational/social factors, laws/policies/protocols, trust/utility/benefit-cost; collective intelligence, social network analysis, and social cognition interaction
  Intelligence metasynthesis: synthesize ubiquitous intelligence in KDD; metasynthetic interaction (m-interaction) as the working mechanism, and metasynthetic space (m-space) as an AKD-based problem-solving system
14
How to reach an interest tradeoff
Balance between technical and business interests
Suppose there are multiple metrics for each aspect
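The formulas on this slide did not survive extraction. As a minimal sketch of the idea only, the code below assumes that each pattern carries several technical metrics and several business metrics, and that a pattern counts as actionable when every metric on both sides meets its threshold. The metric names, thresholds, and the all-thresholds rule are illustrative assumptions, not the author's definitions.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Pattern:
    name: str
    technical: Dict[str, float]  # e.g. support, confidence (hypothetical names)
    business: Dict[str, float]   # e.g. profit, saved cost (hypothetical names)


def meets(scores: Dict[str, float], thresholds: Dict[str, float]) -> bool:
    """True if every thresholded metric is present and at least its threshold."""
    return all(scores.get(m, float("-inf")) >= t for m, t in thresholds.items())


def classify(
    pattern: Pattern,
    tech_thresholds: Dict[str, float],
    biz_thresholds: Dict[str, float],
) -> str:
    """Assumed rule: actionable only if both sides are satisfied."""
    tech_ok = meets(pattern.technical, tech_thresholds)
    biz_ok = meets(pattern.business, biz_thresholds)
    if tech_ok and biz_ok:
        return "actionable"
    if tech_ok:
        return "technical interest only"
    if biz_ok:
        return "business interest only"
    return "rejected"


# Hypothetical thresholds and patterns, for illustration only.
tech_min = {"support": 0.1, "confidence": 0.7}
biz_min = {"profit": 1000.0}

patterns = [
    Pattern("A", {"support": 0.30, "confidence": 0.80}, {"profit": 1200.0}),
    Pattern("B", {"support": 0.30, "confidence": 0.90}, {"profit": 50.0}),
    Pattern("C", {"support": 0.02, "confidence": 0.75}, {"profit": 5000.0}),
]
labels = {p.name: classify(p, tech_min, biz_min) for p in patterns}
print(labels)
```

Patterns B and C are the conflict cases: each satisfies one side only, and deciding what to do with them is the tradeoff the slide refers to.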
15
Actionable knowledge discovery through m-spaces
  Acquiring and representing unstructured, ill-structured, and uncertain domain/human knowledge
  Supporting dynamic involvement of business experts and their knowledge/intelligence
  Acquiring and representing expert thinking, such as imaginary thinking and creative thinking, in group heuristic discussions during KDD modeling
  Acquiring and representing group/collective interaction behavior and impact emergence
  Building infrastructure supporting the involvement and synthesis of ubiquitous intelligence
16
D3M Applications
Real-world data mining
Our recent case studies
  Capital markets
    Actionable trading agents
    Actionable trading strategies
  Social security
    Activity mining
    Combined mining
17
Actionable Trading Evidence for Brokerage Firms
Trading strategy/evidence
Actionable trading evidence
18
Domain factors
19
Business interest
20
Developing in-depth trading strategy
23
Activity mining for Australian Commonwealth Governmental Debt Prevention
Impact-targeted activity mining
24
Impact-targeted activity mining
  Frequent impact-targeted activity sequences
  Impact-contrasted activity sequences
  Impact-reversed activity sequences
  Impact-targeted combined association clusters
25
Data intelligence
  Activity data
  Itemset imbalance
  Impact imbalance
  Seasonal effect
  Demographic data
  Transactional data
  Itemset/tuple selection/construction
26
Domain intelligence
  Business process/event for activity selection
  Domain knowledge
  Feature selection
  Sequence construction
  Impact target
  Positive impact
  Negative impact
  Multi-level impacts
  Feature/attribute selection
  Interestingness definition
  New pattern structures
27
Organizational/social factors
  Operational/intervention activities
  Seasonal business requirement/interaction changes
  Business cost (debt amount/duration)
  Business benefit (saving/preventing debt amount or reducing debt duration)
  Deliverable format
28
Impact-reversed pattern pair
  Underlying pattern 1:
  Derivative pattern 2:
Impact-targeted combined association clusters
29
Conditional impact ratio (Cir)
Conditional Piatetsky-Shapiro’s (P-S) ratio (Cps)
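The definitions of Cir and Cps were on the slide as images and are not recoverable from this transcript. The sketch below therefore shows only one plausible reading: conditional analogues of lift and of Piatetsky-Shapiro leverage, P(A,B) - P(A)P(B), computed among records that already contain an underlying pattern P. The function name, the set-based (non-sequential) records, and the formulas themselves are assumptions and may differ from the definitions in the cited papers.

```python
from typing import List, Optional, Set, Tuple

# A record is (set of activities, whether the target impact occurred).
Record = Tuple[Set[str], bool]


def conditional_measures(
    records: List[Record], p: Set[str], q: Set[str]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (ratio, leverage) for Q and impact T, conditioned on P.

    Assumed definitions, not taken from the slides:
      ratio    = P(Q,T|P) / (P(Q|P) * P(T|P))   (lift-style)
      leverage = P(Q,T|P) - P(Q|P) * P(T|P)     (P-S style)
    """
    given_p = [(acts, hit) for acts, hit in records if p <= acts]
    n = len(given_p)
    if n == 0:
        return None, None
    prob_q = sum(1 for acts, _ in given_p if q <= acts) / n
    prob_t = sum(1 for _, hit in given_p if hit) / n
    prob_qt = sum(1 for acts, hit in given_p if q <= acts and hit) / n
    expected = prob_q * prob_t
    ratio = prob_qt / expected if expected > 0 else None
    return ratio, prob_qt - expected


# Hypothetical activity records, for illustration only.
records: List[Record] = [
    ({"P", "Q"}, True),
    ({"P", "Q"}, True),
    ({"P"}, False),
    ({"P"}, False),
    ({"Q"}, True),
]
ratio, leverage = conditional_measures(records, {"P"}, {"Q"})
print(ratio, leverage)
```

Under these assumed definitions, a ratio above 1 (or a leverage above 0) means that adding Q to P co-occurs with the impact more often than independence would predict.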
30
Interestingness: tech & biz
31
The process
32
Impact-reversed sequential activity patterns
33
Demographic + transactional combined pattern
34
D3M References
Books:
  Cao, L., Yu, P.S., Zhang, C., Zhao, Y. Domain Driven Data Mining, Springer, 2009.
  Cao, L., Yu, P.S., Zhang, C., Zhang, H. (eds.) Data Mining for Business Applications, Springer, 2008.
Workshops:
  Domain-driven data mining 2008, joint with ICDM2008.
  Domain-driven data mining 2007, joint with SIGKDD2007.
Special issues:
  Domain-driven data mining, IEEE Trans. Knowledge and Data Engineering, 2009.
  Domain-driven, actionable knowledge discovery, IEEE Intelligent Systems, Department, 22(4): 78-89, 2007.
Some relevant papers:
  Cao, L., Zhao, Y., Zhang, H., Luo, D., Zhang, C. Flexible Frameworks for Actionable Knowledge Discovery, submitted to IEEE Trans. on Knowledge and Data Engineering.
  Cao, L., Zhang, H., Zhao, Y., Zhang, C. Combined Mining: Discovering More Informative Knowledge in e-Government Services, submitted to ACM TKDD, 2008.
  Cao, L., Dai, R., Zhou, M. Metasynthesis, M-Space and M-Interaction for Open Complex Giant Systems, technical report, 2008.
  Cao, L., Ou, Y. Market Microstructure Patterns Powering Trading and Surveillance Agents, Journal of Universal Computer Science, 2008 (to appear).
  Cao, L., He, T. Developing Actionable Trading Agents, Knowledge and Information Systems: An International Journal, 2008.
  Cao, L. Developing Actionable Trading Strategies, in edited book: Intelligent Agents in the Evolution of WEB and Applications, Springer, 2008.
35
Some relevant papers:
  Cao, L., Zhao, Y., Zhang, C. Mining Impact-Targeted Activity Patterns in Imbalanced Data, IEEE Trans. Knowledge and Data Engineering, 20(8): 1053-1066, 2008.
  Cao, L., Yu, P., Zhang, C., Zhao, Y., Williams, G. DDDM2007: Domain Driven Data Mining, ACM SIGKDD Explorations Newsletter, 9(2): 84-86, 2007.
  Cao, L., Zhang, C. Knowledge Actionability: Satisfying Technical and Business Interestingness, International Journal of Business Intelligence and Data Mining, 2(4): 496-514, 2007.
  Cao, L., Zhang, C. The Evolution of KDD: Towards Domain-Driven Data Mining, International Journal of Pattern Recognition and Artificial Intelligence, 21(4): 677-692, 2007.
  Cao, L. Domain-Driven Actionable Knowledge Discovery, IEEE Intelligent Systems, 22(4): 78-89, 2007.
  Cao, L., Zhang, C. Domain-Driven Data Mining: A Practical Methodology, International Journal of Data Warehousing and Mining (IJDWM), IGI Global, 2(4): 49-65, 2006.
36
Thank you!
Longbing CAO
Faculty of Engineering and IT, University of Technology, Sydney, Australia
Tel: 61-2-9514 4477
Fax: 61-2-9514 1807
Email: lbcao@it.uts.edu.au
Homepage: www-staff.it.uts.edu.au/~lbcao/
The Smart Lab: datamining.it.uts.edu.au