1 D3M: Domain-Driven Data Mining. An Overview of Domain-Driven Data Mining: Toward Actionable Knowledge Discovery (AKD). Longbing Cao, Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia

2 D3M: Domain-Driven Data Mining. The Smart Lab: datamining.it.uts.edu.au. 15 December 2008. Cao, L: D3M at DDDM2008, joint with ICDM2008.
Outline
- Why Do We Need D3M
- What Is D3M
- The D3M Framework
- D3M Theoretical Underpinnings
- D3M Research Issues
- D3M Applications
- D3M References

3 Why Do We Need D3M
- A common scenario in deploying data mining algorithms
  “I find something interesting!”
  - “Many patterns are found”
  - “They satisfy the technical metric thresholds well”
  What do business people say?
  - “So what?”
  - “They are just commonsense”
  - “I don’t care about them”
  - “I don’t understand them”
  - “How can I use them?”
  - “Am I wrong? What can I do better for my business mate?”

4 Why Do We Need D3M
- Where is something wrong?
  Gap:
  - Academic objectives || business goals
  - Technical outputs || business expectations
  Macro-level methodological and fundamental issues:
  - Academic: technical interest; innovative algorithms and patterns
  - Practitioner: social, environmental, and organizational factors and impact; getting a problem solved properly
  Micro-level technical and engineering issues:
  - System dynamics, system environment, and interaction in a system
  - Business processes, organizational factors, and constraints
  - Human and domain knowledge involvement

5 An example: the problem with association mining
- Existing association rule mining algorithms are specifically designed to find strong patterns that have high predictive accuracy or correlation.
- Frequent patterns are often regarded as commonsense knowledge, whereas business people are eager to discover new and hidden patterns in databases.
- Many patterns are found.
- How can associations be taken over seamlessly by business people and converted into operationalizable actions?
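A minimal Python sketch of the point made on this slide, not taken from the talk: it computes the standard support, confidence, and lift of one association rule over a toy transaction set. The transactions, the item names, and the helper names `support` and `rule_metrics` are all hypothetical. The rule scores well on every technical metric, yet it tells a shop owner nothing new, which is the gap the slide describes.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Standard support, confidence, and lift for antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    s_ac = support(transactions, set(antecedent) | set(consequent))
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Hypothetical toy data: "bread -> butter" is strong but commonsense.
transactions = [
    frozenset(t)
    for t in (
        {"bread", "butter"},
        {"bread", "butter", "milk"},
        {"bread", "butter"},
        {"milk"},
        {"bread", "butter", "jam"},
    )
]

metrics = rule_metrics(transactions, {"bread"}, {"butter"})
print(metrics)
```

Nothing in these three numbers says whether the rule is new to the business or what action it supports, so a purely technical threshold cannot separate it from a useful rule.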

6 What Is D3M
- Next-generation data mining methodologies, frameworks, algorithms, evaluation systems, tools and decision support that:
  - Cater for the business environment
  - Satisfy business needs
  - Deliver business-friendly decision-making rules and actions that are of solid technical and business significance
  - Can be understood and taken over by business people to make decisions
- Aim: to promote the paradigm shift from data-centered hidden pattern mining to domain-driven actionable knowledge discovery (AKD)

7 Involve and synthesize ubiquitous intelligence
- Human intelligence
- Domain intelligence
- Data intelligence
- Network intelligence
- Organizational and social intelligence
- Meta-synthesis of the above ubiquitous intelligence

8 The D3M Framework
- AKD-based problem-solving

9 Interestingness & actionability

10 Conflicts & tradeoff

11 A framework for AKD
- Post-analysis-based AKD

12 D3M Theoretical Underpinnings
- Artificial intelligence and intelligent systems
- Behavior informatics and analytics
- Business modeling
- Business process management
- Cognitive sciences
- Data integration
- Human-machine interaction
- Human-centered computing
- Knowledge representation and management
- Machine learning
- Ontological engineering
- Organizational and social computing
- Project management methodology
- Social network analysis
- Statistics
- System simulation, and so on

13 D3M Research Issues
- Data intelligence: deep knowledge in complex data structures; mining in-depth data patterns, and mining structured and informative knowledge in complex data
- Domain intelligence: domain and prior knowledge, business processes/logics/workflow, constraints, and business interestingness; representation, modeling and involvement of them in KDD
- Network intelligence: network-based data, knowledge, communities and resources; information retrieval, text mining, web mining, semantic web, ontological engineering techniques, and web knowledge management
- Human intelligence: empirical and implicit knowledge, expert knowledge and thoughts, group/collective intelligence; human-machine interaction, representation and involvement of human intelligence
- Social intelligence: organizational/social factors, laws/policies/protocols, trust/utility/benefit-cost; collective intelligence, social network analysis, and social cognition interaction
- Intelligence metasynthesis: synthesize ubiquitous intelligence in KDD; metasynthetic interaction (m-interaction) as the working mechanism, and metasynthetic space (m-space) as an AKD-based problem-solving system

14 How to reach an interest tradeoff
- Balance between technical and business interests
- Suppose there are multiple metrics for each aspect
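One way to read "multiple metrics for each aspect" is sketched below in Python. This is an illustration under stated assumptions, not the formulation used in the talk: a pattern counts as actionable only when it clears every technical threshold and every business threshold. The `Pattern` class, the metric names, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Pattern:
    name: str
    technical: dict = field(default_factory=dict)  # e.g. support, confidence
    business: dict = field(default_factory=dict)   # e.g. expected saving


def meets_all(values, thresholds):
    """True when every thresholded metric is present and large enough."""
    return all(values.get(metric, 0.0) >= minimum
               for metric, minimum in thresholds.items())


def classify(pattern, tech_thresholds, biz_thresholds):
    """Label a pattern by which side of the tradeoff it satisfies."""
    tech_ok = meets_all(pattern.technical, tech_thresholds)
    biz_ok = meets_all(pattern.business, biz_thresholds)
    if tech_ok and biz_ok:
        return "actionable"
    if tech_ok:
        return "technical interest only"
    if biz_ok:
        return "business interest only"
    return "uninteresting"


# Hypothetical thresholds and patterns.
TECH = {"support": 0.05, "confidence": 0.6}
BIZ = {"expected_saving": 1000.0}

rule_a = Pattern("rule A", {"support": 0.20, "confidence": 0.9},
                 {"expected_saving": 50.0})
rule_b = Pattern("rule B", {"support": 0.08, "confidence": 0.7},
                 {"expected_saving": 4000.0})

labels = {p.name: classify(p, TECH, BIZ) for p in (rule_a, rule_b)}
print(labels)
```

A hard conjunction of thresholds is the simplest choice; a weighted score or a Pareto ranking across the two metric groups would be alternative ways to express the same balance.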

15 Actionable knowledge discovery through m-spaces
- Acquiring and representing unstructured, ill-structured and uncertain domain/human knowledge
- Supporting dynamic involvement of business experts and their knowledge/intelligence
- Acquiring and representing expert thinking, such as imaginary thinking and creative thinking, in group heuristic discussions during KDD modeling
- Acquiring and representing group/collective interaction behavior and impact emergence
- Building infrastructure supporting the involvement and synthesis of ubiquitous intelligence

16 D3M Applications
- Real-world data mining
- Our recent case studies
  Capital markets
  - Actionable trading agents
  - Actionable trading strategies
  Social security
  - Activity mining
  - Combined mining

17 Actionable Trading Evidence for Brokerage Firms
- Trading strategy/evidence
- Actionable trading evidence

18 Domain factors

19 Business interest

20 Developing in-depth trading strategy

23 Activity mining for Australian Commonwealth Governmental Debt Prevention
- Impact-targeted activity mining

24 Impact-targeted activity mining
- Frequent impact-targeted activity sequences
- Impact-contrasted activity sequences
- Impact-reversed activity sequences
- Impact-targeted combined association clusters
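The slide names these pattern types without defining them. As a rough illustration only, the Python sketch below takes one plausible reading of an impact-contrasted activity sequence: an ordered activity pattern that is frequent among sequences ending in the target impact and rare among the rest. The activity codes, the impact labels `debt` and `no_debt`, and the function names are hypothetical, and the cited papers may define the contrast differently.

```python
def contains(sequence, pattern):
    """True if pattern occurs in sequence as an ordered subsequence."""
    remaining = iter(sequence)
    return all(activity in remaining for activity in pattern)


def class_support(labelled_sequences, pattern, impact):
    """Share of sequences with the given impact that contain the pattern."""
    in_class = [seq for seq, label in labelled_sequences if label == impact]
    if not in_class:
        return 0.0
    return sum(contains(seq, pattern) for seq in in_class) / len(in_class)


def contrast(labelled_sequences, pattern, target="debt", other="no_debt"):
    """Support difference between the target impact and the other impact."""
    return (class_support(labelled_sequences, pattern, target)
            - class_support(labelled_sequences, pattern, other))


# Hypothetical activity sequences, each labelled with its observed impact.
data = [
    (["a1", "a2", "a3"], "debt"),
    (["a1", "a3"], "debt"),
    (["a2", "a1"], "no_debt"),
    (["a1", "a2", "a4"], "no_debt"),
    (["a3"], "no_debt"),
]

score = contrast(data, ["a1", "a3"])
print(score)
```

Computing support per impact class, rather than over the whole data set, is also a simple way to cope with the impact imbalance that the case study mentions.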

25 Data intelligence
- Activity data
- Itemset imbalance
- Impact imbalance
- Seasonal effect
- Demographic data
- Transactional data
- Itemset/tuple selection/construction

26 Domain intelligence
- Business process/event for activity selection
- Domain knowledge
- Feature selection
- Sequence construction
- Impact target
  - Positive impact
  - Negative impact
  - Multi-level impacts
- Feature/attribute selection
- Interestingness definition
- New pattern structures

27 Organizational/social factors
- Operational/intervention activities
- Seasonal business requirement/interaction changes
- Business cost (debt amount/duration)
- Business benefit (saving/preventing debt amount or reducing debt duration)
- Deliverable format

28 Impact-reversed pattern pair
- Underlying pattern 1:
- Derivative pattern 2:
Impact-targeted combined association clusters

29 Conditional impact ratio (Cir); conditional Piatetsky-Shapiro (P-S) ratio (Cps)
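The formulas on this slide did not survive extraction. The Python sketch below is therefore an assumption, not the author's definition: it treats Cir as a lift-style ratio and Cps as the Piatetsky-Shapiro leverage, both computed only over records that already contain an underlying pattern P, for an extra activity set Q and an impact T. Activities are treated as unordered sets for brevity, and the record layout, labels, and function name are hypothetical. The exact definitions should be checked against the TKDE 2008 paper listed in the references.

```python
def conditional_metrics(records, p, q, impact):
    """Return (cir, cps) for adding q to p with respect to impact.

    Assumed forms, conditioned on records that contain p:
      cir = Prob(q, impact | p) / (Prob(q | p) * Prob(impact | p))
      cps = Prob(q, impact | p) - Prob(q | p) * Prob(impact | p)
    """
    given_p = [(acts, label) for acts, label in records if p <= acts]
    n = len(given_p)
    if n == 0:
        return None
    prob_q = sum(q <= acts for acts, _ in given_p) / n
    prob_t = sum(label == impact for _, label in given_p) / n
    prob_qt = sum(q <= acts and label == impact for acts, label in given_p) / n
    expected = prob_q * prob_t  # joint probability if q and impact were independent
    cir = prob_qt / expected if expected else float("nan")
    cps = prob_qt - expected
    return cir, cps


# Hypothetical records: (set of activities, observed impact).
records = [
    ({"p"}, "debt"),
    ({"p"}, "debt"),
    ({"p", "q"}, "no_debt"),
    ({"p", "q"}, "no_debt"),
    ({"q"}, "no_debt"),
]

cir, cps = conditional_metrics(records, {"p"}, {"q"}, "no_debt")
print(cir, cps)
```

Under this reading, a Cir above 1 together with a positive Cps suggests that adding Q to P shifts outcomes toward the reversed impact, which is the kind of evidence an impact-reversed pattern pair needs.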

30 Interestingness: tech & biz

31 The process

32 Impact-reversed sequential activity patterns

33 Demographic + transactional combined pattern

34 D3M References
Books:
- Cao, L., Yu, P.S., Zhang, C., Zhao, Y. Domain Driven Data Mining, Springer, 2009.
- Cao, L., Yu, P.S., Zhang, C., Zhang, H. (ed.) Data Mining for Business Applications, Springer, 2008.
Workshops:
- Domain-driven data mining 2008, joint with ICDM2008.
- Domain-driven data mining 2007, joint with SIGKDD2007.
Special issues:
- Domain-driven data mining, IEEE Trans. Knowledge and Data Engineering, 2009.
- Domain-driven, actionable knowledge discovery, IEEE Intelligent Systems, Department, 22(4): 78-89, 2007.
Some relevant papers:
- Cao, L., Zhao, Y., Zhang, H., Luo, D., Zhang, C. Flexible Frameworks for Actionable Knowledge Discovery, submitted to IEEE Trans. on Knowledge and Data Engineering.
- Cao, L., Zhang, H., Zhao, Y., Zhang, C. Combined Mining: Discovering More Informative Knowledge in e-Government Services, submitted to ACM TKDD, 2008.
- Cao, L., Dai, R., Zhou, M. Metasynthesis, M-Space and M-Interaction for Open Complex Giant Systems, technical report, 2008.
- Cao, L. and Ou, Y. Market Microstructure Patterns Powering Trading and Surveillance Agents, Journal of Universal Computer Science, 2008 (to appear).
- Cao, L. and He, T. Developing Actionable Trading Agents, Knowledge and Information Systems: An International Journal, 2008.
- Cao, L. Developing Actionable Trading Strategies, in edited book: Intelligent Agents in the Evolution of WEB and Applications, Springer, 2008.

35 Some relevant papers (continued):
- Cao, L., Zhao, Y., Zhang, C. Mining Impact-Targeted Activity Patterns in Imbalanced Data, IEEE Trans. Knowledge and Data Engineering, Vol. 20, No. 8, pp. 1053-1066, 2008.
- Cao, L., Yu, P., Zhang, C., Zhao, Y., Williams, G. DDDM2007: Domain Driven Data Mining, ACM SIGKDD Explorations Newsletter, 9(2): 84-86, 2007.
- Cao, L., Zhang, C. Knowledge Actionability: Satisfying Technical and Business Interestingness, International Journal of Business Intelligence and Data Mining, 2(4): 496-514, 2007.
- Cao, L., Zhang, C. The Evolution of KDD: Towards Domain-Driven Data Mining, International Journal of Pattern Recognition and Artificial Intelligence, 21(4): 677-692, 2007.
- Cao, L. Domain-Driven Actionable Knowledge Discovery, IEEE Intelligent Systems, 22(4): 78-89, 2007.
- Cao, L. and Zhang, C. Domain-Driven Data Mining: A Practical Methodology, International Journal of Data Warehousing and Mining (IJDWM), IGI Global, 2(4): 49-65, 2006.

36 Thank you!
Longbing CAO
Faculty of Engineering and IT, University of Technology, Sydney, Australia
Tel: 61-2-9514 4477
Fax: 61-2-9514 1807
Email: lbcao@it.uts.edu.au
Homepage: www-staff.it.uts.edu.au/~lbcao/
The Smart Lab: datamining.it.uts.edu.au

