Published by Nyasia Ricard. Modified over 10 years ago.
1
www.monash.edu.au
Advanced Topics in Data Mining and Research Directions
CSE5610 Intelligent Software Systems
Semester 1, 2006
2
Outline
Mining Different Data Types
– Spatial, Temporal, Time Series, Data Streams, Multimedia, XML, Web, Text etc.
Distributed Data Mining (DDM)
Mobile & Ubiquitous Data Mining (UDM)
Data Mining E-Services
Anytime, Anywhere Data Mining E-Services
3
Generations of Data Mining
Four Generations of Data Mining Systems – Robert Grossman
First Generation – Stand-alone, centralised, single algorithm
Second Generation – Integration with databases, support for high-dimensionality and complex data types
Third Generation – Distribution and heterogeneity
Fourth Generation – Support for mining embedded, mobile and ubiquitous data sources
4
Distributed Data Mining
5
Distributed Data Mining
Inherently distributed data
Multinational corporations (MNCs) + global markets => physical/geographical separation of users from the data sources
The traditional data mining model, which assumes co-location of users, data and computational resources, is inadequate
6
Distributed Data Mining (DDM)
The inherent distribution of data and other resources as a result of organisations being distributed.
The large volumes of data, the transfer of which results in exorbitant communication costs.
The need to mine heterogeneous data, the integration of which is both non-trivial and expensive.
The performance and scalability bottlenecks of data mining.
7
Distributed Data Mining (DDM)
DDM = Data Mining (DM) + Knowledge Integration (KI)
DM – Performing traditional knowledge discovery at each distributed data site.
KI – Merging the results generated from the individual sites into a body of cohesive and unified knowledge.
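The DM + KI split can be illustrated with a minimal Python sketch. It assumes a toy setting that is not in the slides: each site holds market-basket transactions, the DM step counts small itemsets locally, and the KI step merges the per-site counts and applies a global support threshold. The function names `mine_local` and `integrate` are hypothetical, and exchanging exhaustive counts is a simplification chosen for brevity, not a recommended knowledge integration technique.

```python
from collections import Counter
from itertools import combinations


def mine_local(transactions, max_size=2):
    """DM step: count itemsets of up to max_size items at one site."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    # Only the counts and the number of transactions leave the site.
    return counts, len(transactions)


def integrate(local_results, min_support):
    """KI step: merge per-site counts into globally frequent itemsets."""
    total = Counter()
    n_transactions = 0
    for counts, size in local_results:
        total.update(counts)
        n_transactions += size
    return {
        itemset: count / n_transactions
        for itemset, count in total.items()
        if count / n_transactions >= min_support
    }


# Two hypothetical sites; the raw transactions never move between them.
site_a = [["bread", "milk"], ["bread", "eggs"], ["milk"]]
site_b = [["bread", "milk"], ["bread"], ["eggs", "milk"]]

results = [mine_local(site_a), mine_local(site_b)]
frequent = integrate(results, min_support=0.5)
print(frequent)
```

In this sketch only summaries cross the network, which is the point of the DM + KI decomposition: the communication cost depends on the size of the local results rather than on the size of the raw data.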
8
Parallel Data Mining (PDM)
Principal distinction between DDM & Parallel DM
– parallel mining involves parallel processors with or without shared memory
Parallel data mining also includes the development of parallel versions of traditional data mining techniques.
The two can be integrated, e.g. DecisionCentre
9
DDM – Algorithms & Architectures
Research in distributed data mining can be divided into two broad categories [Fu01]:
Data Mining Algorithms.
– focus on efficient techniques for knowledge integration.
Distributed Data Mining Architectures.
– focus on development of distributed data mining architectures
– emphasizes the processes and technologies that support construction of software systems to perform distributed data mining
10
Taxonomy of DDM Architectures
11
Classification – DDM Systems
DDM Architectural Models | DDM Systems
Client-server | DecisionCentre [CDG99], IntelliMiner [PaS99, PaS01], InterAct [PaD02]
Agents (Mobile Agent, Stationary Agent) | JAM [SPT97], Infosleuth [UMG98, MUU99], BODHI [KPH99], Papyrus [Ram98], PADMA [KHS97a, KHS97b]
12
Client-Server DDM
13
Mobile Agent Model for DDM
14
Hybrid Model for DDM
15
Ubiquitous Data Mining
16
Ubiquitous Data Mining (UDM)
Mining data in a resource-constrained environment to support the time-critical information needs of mobile users
Typical Characteristics
– Mobile User – frequent disconnections
– Handheld Device -> Resource constraints – memory, battery, processor, screen real-estate
– Time critical
– Real-time & On-line
– Data Streams
Example Scenarios
Many Challenges
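To give a feel for on-line processing of a data stream under tight memory constraints, the sketch below keeps a running mean and variance in constant memory using Welford's one-pass method. This is an illustration only and is not one of the techniques named in these slides; the class name `RunningStats` and the sample readings are hypothetical.

```python
class RunningStats:
    """One-pass mean and sample variance in O(1) memory (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one stream element into the summary; x is not stored."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance of the elements seen so far."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


# Hypothetical sensor readings arriving one at a time on a handheld device.
stats = RunningStats()
for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(reading)

print(stats.n, stats.mean, stats.variance)
```

Each element is examined once and then discarded, so memory use does not grow with the length of the stream, which is the property a handheld device with limited memory and battery needs.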
17
Current Research
Kargupta’s Group
– MobiMine
@CSSE, Monash Univ.
– AgentUDM
– Adaptive, cost-efficient & light-weight data mining techniques for data streams
> Mohamed Medhat
> LWC, LWF & LWClass
> Watch this space!!!
18
Data Mining E-Services
19
Data Mining E-Services
“…data analysis and mining functions themselves will be offered as business intelligence e-services that accept operational data from clients and return models or rules”
Umesh Dayal, 2001
Why?
– Knowledge is a key resource
– Cost of data mining infrastructure
20
Data Mining E-Services
Current Commercial Landscape
– Several application service providers (ASPs) -> DigiMine, Information Discovery, WhiteCross Systems, ListAnalyst.com etc.
– Mode of Operation
Hybrid Model & Data Mining ASPs
– Optimise Response Time
> Leads to improved throughput
– QoS Estimation
– Location Preferences of Clients
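One way to picture how a hybrid model might optimise response time is to estimate the cost of each delivery strategy and pick the cheaper one. The slides do not give a cost model, so everything below is an assumption made for illustration: the `Task` fields, the two estimator functions and the simple transfer-plus-mining formulas are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    data_mb: float         # size of the client's data set
    agent_mb: float        # size of the mobile agent carrying the algorithm
    result_mb: float       # size of the mined model sent back
    bandwidth_mb_s: float  # link speed between client site and ASP
    server_mb_s: float     # mining throughput at the ASP's server
    client_mb_s: float     # mining throughput at the client's site


def estimate_client_server(t: Task) -> float:
    """Ship the data to the ASP, mine there, return the results."""
    return (t.data_mb / t.bandwidth_mb_s
            + t.data_mb / t.server_mb_s
            + t.result_mb / t.bandwidth_mb_s)


def estimate_mobile_agent(t: Task) -> float:
    """Ship the agent to the data, mine on site, return the results."""
    return (t.agent_mb / t.bandwidth_mb_s
            + t.data_mb / t.client_mb_s
            + t.result_mb / t.bandwidth_mb_s)


def choose_strategy(t: Task):
    """Return the strategy with the lower estimated response time (seconds)."""
    cs = estimate_client_server(t)
    ma = estimate_mobile_agent(t)
    return ("client-server", cs) if cs <= ma else ("mobile-agent", ma)


# Large data set over a slow link: moving the agent is cheaper.
large = Task(data_mb=500, agent_mb=1, result_mb=0.5,
             bandwidth_mb_s=2, server_mb_s=50, client_mb_s=10)
# Small data set and a slow client machine: moving the data is cheaper.
small = Task(data_mb=10, agent_mb=1, result_mb=0.5,
             bandwidth_mb_s=2, server_mb_s=50, client_mb_s=1)

print(choose_strategy(large))
print(choose_strategy(small))
```

A real service would also have to fold in the QoS estimates and client location preferences listed above; this sketch covers response time only.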
22
Anytime, Anywhere Data Mining E-Services
23
My Thoughts
Data is a commodity, analysis is a service
Access anytime, anywhere
By anyone…
– From large corporations to small businesses to individuals
From home buyers to mobile salespersons to grocery shoppers…
24
My Thoughts
A preliminary model for delivery
– Datacentric Grids
25
References
26
References
http://www.csse.monash.edu.au/projects/MobileComponents/projects/dame/
http://www.csse.monash.edu.au/~shonali/research.html
http://www.csee.umbc.edu/~hillol/DDMBIB/
http://www.csee.umbc.edu/~hillol/diadic.html
http://www.csse.monash.edu.au/~mgaber/main.html