Intelligent Internet Agents for Distributed Data Mining


1 Intelligent Internet Agents for Distributed Data Mining
Yanqing Zhang, Scott Owen, Sushil Prasad and Raj Sunderraman, Department of Computer Science, Georgia State University ({yzhang, sowen, sprasad, raj}@cs.gsu.edu)
George Vachtsevanos, School of Electrical and Computer Engineering, Georgia Institute of Technology (gjv@ece.gatech.edu)

2 Outline
- Motivation
- Architecture of Intelligent Internet Agents
- Program Libraries of Intelligent Middleware
- Smart Web Search Agents
- Intelligent Soft Computing Agents
- Benefits
- Deliverables
- Conclusion

3 Motivation
- Distributed Web KDD: useful information and knowledge are mined from distributed Web databases
- QoS (efficiency, Web speed, user time): huge amounts of useless data flow on the Internet
- From Data Web to Information Web: upgrade the current data-flow-oriented Internet to a future information-flow-oriented Internet
- Intelligent Web Middleware: reusable, portable and scalable intelligent functionality
- Smart E-Business: use intelligent Web agents to do better E-Business on the Internet

4 Architecture of Intelligent Internet Agents
- Application Layer: E-Commerce, E-Education, other E-Business
- Intelligent Layer: Data Mining, Soft Computing, ES, etc.
- Network Layer: Backbone, gigaPoPs, other hardware

5 Program Libraries of Intelligent Middleware
1. Binary Association Rule Generator
2. Fuzzy Association Rule Generator
3. Neural-Net-based Data Classifier and Pattern Generator
4. Fuzzy c-means Program for Data Clustering
5. Genetic Algorithms for Data Refinement and Optimization
6. Granular Neural Nets for Linguistic Data Mining
7. XML-based Smart Web Search Sub-Programs
8. Connection Programs between Database and Middle Layer
9. Local Cache Database Manager
10. Local Cache Informationbase Manager
11. Basic GUI Programs
12. Client-Server Creation and Communication Programs
13. Distributed Operation Manager
14. Distributed Data Mining Synchronization
15. Web Customer Log Miner, and so on.
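As a rough illustration of what item 1, a binary association rule generator, might compute, the following Python sketch derives pairwise rules from market-basket data using support and confidence. The function name `pair_rules`, the thresholds, and the sample baskets are assumptions made for illustration; the presentation does not describe the actual library interface.

```python
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) tuples for item pairs.

    Hypothetical helper: only rules with one item on each side are generated.
    """
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    items = sorted(set().union(*baskets))
    # Support of a single item: fraction of baskets that contain it.
    item_support = {i: sum(i in b for b in baskets) / n for i in items}

    rules = []
    for a, b in combinations(items, 2):
        # Support of the pair: fraction of baskets containing both items.
        pair_support = sum(a in t and b in t for t in baskets) / n
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence of lhs -> rhs: support(lhs, rhs) / support(lhs).
            confidence = pair_support / item_support[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


if __name__ == "__main__":
    sample = [["bread", "milk"], ["bread", "milk", "eggs"],
              ["milk"], ["bread", "eggs"]]
    for rule in pair_rules(sample):
        print(rule)
```

On the sample baskets, the rule eggs -> bread has support 0.5 and confidence 1.0, because every basket containing eggs also contains bread.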

6 Smart Web Search Agents
Data Search Engines >> Information Search Agents
- Traditional searching on the Web is done using one of the following three:
  - Directories (Yahoo, Lycos, etc.)
  - Search Engines (AltaVista, NorthernLight, etc.)
  - Metasearch Engines (MetaCrawler, SavvySearch, AskJeeves, etc.)
- All of these involve keyword searches
- Drawback: not easily personalized, too many results (although many give relevancy factors)

7 Smart Search Agents will provide
- more personalized searches
- domain-based searches
- more efficient searches

8 Smart Search Agents will employ
- local cache databases (containing frequently asked queries/results; possibly updated periodically, e.g. nightly)
- local cache information base (containing mined information and discovered knowledge for efficient personal use)
- domain-based agents (e.g. Job Search; Sports-NBA Stats; Bibliography-Digital Libraries)
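The local cache database idea above can be sketched in a few lines of Python: results for frequently asked queries are stored locally and refreshed once they are older than a chosen age. The class name `QueryCache`, the `search_fn` backend, and the one-day default are assumptions for illustration, not the design used in the projects described here.

```python
import time


class QueryCache:
    """Cache search results per normalized query and refresh stale entries."""

    def __init__(self, search_fn, max_age_seconds=24 * 60 * 60, clock=time.time):
        self._search_fn = search_fn       # hypothetical backend search function
        self._max_age = max_age_seconds   # "nightly" refresh is roughly one day
        self._clock = clock               # injectable clock, handy for testing
        self._entries = {}                # normalized query -> (timestamp, results)

    def search(self, query):
        # Normalize case and whitespace so near-identical queries share an entry.
        key = " ".join(query.lower().split())
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]               # fresh cached answer
        results = self._search_fn(query)  # miss or stale entry: ask the backend
        self._entries[key] = (now, results)
        return results
```

A domain-based agent could wrap its remote search in such a cache so that repeated queries are answered locally until the next refresh.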

9 Some initial results:
- M. Nagarajan, Metagenie - A metasearch engine for multi-databases, M.S. thesis, GSU (July 1999). Domains: Jobs, Books
- S. Ahmed, EXACT-FINDER: A cache-based meta-search engine, M.S. thesis, GSU (May 2000). Local cache database storing personalized frequently asked queries and results, updated periodically
- R. Sunderraman, ReQueSS: Relational Querying of semi-structured data, ICDE 2000 (demo session), San Diego, CA, March 2000.
- X. Li, Querying unified sources of Web data, M.S. thesis, GSU (July 1999). Data wrappers for Web sources (NBA stats/box scores, DBLP Bibliography database)

10 Intelligent Tools for E-Business
- Computational Intelligence, Neural Networks, Fuzzy Logic, Genetic Algorithms, Hybrid Systems
- Learning Algorithms, Heuristic Searching
- Data Analysis and Modeling, Data Fusion and Mining, Knowledge Discovery
- Prediction & Time Series Analysis
- Information Retrieval, Intelligent User Interface
- Intelligent Agents, Distributed IA and Multi-Agents, Cooperative Knowledge-based Systems

11 Enhancing E-Business Process Through Data Mining
- Quality of discovered knowledge
  - Having the right data
  - Having appropriate data mining tools!!!
- Traditional Data Mining Tools
  - Simple query and reporting
  - Visualization-driven data exploration tools, OLAP
  - Discovery process is user driven

12 Intelligent Data Mining Tools
- Automate the process of discovering patterns/knowledge in data
- Require hypothesis, exploration
- Derive business knowledge (patterns) from data
- Combine business knowledge of users with results of discovery algorithms

13 Intelligent Information Agents
- The Data Mining Problem:
  - Clustering/Classification
  - Association
  - Sequencing
- Viewed as an Optimization Problem
- Tools: Genetic Algorithms

14 Fuzzy Rule Discovery
- Rule discovery: the discovery of associations between business events, e.g. which items are purchased together
- To support flexible querying and intelligent searching, fuzzy queries are developed to uncover potentially valuable knowledge
- A fuzzy query uses fuzzy terms such as tall, small, and near to define linguistic concepts and formulate a query
- The automated search for fuzzy rules is carried out by discovering fuzzy clusters or segments in the data
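To make the fuzzy query idea concrete, here is a minimal Python sketch in which a linguistic term such as "tall" is modeled as a membership function and records are ranked by how well they match. The linear ramp shape, the 180 cm and 200 cm breakpoints, and the names `ramp_up` and `fuzzy_query` are assumptions chosen for illustration only.

```python
def ramp_up(x, low, high):
    """Membership that rises linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def fuzzy_query(records, field, membership, threshold=0.5):
    """Return (degree, record) pairs with degree >= threshold, best match first."""
    scored = [(membership(r[field]), r) for r in records]
    matches = [(d, r) for d, r in scored if d >= threshold]
    # Sort on the degree only, so records themselves are never compared.
    return sorted(matches, key=lambda m: m[0], reverse=True)


if __name__ == "__main__":
    players = [
        {"name": "A", "height_cm": 210},
        {"name": "B", "height_cm": 190},
        {"name": "C", "height_cm": 175},
    ]
    # Assumed definition of "tall": not at all below 180 cm, fully above 200 cm.
    tall = lambda h: ramp_up(h, 180, 200)
    for degree, player in fuzzy_query(players, "height_cm", tall):
        print(player["name"], degree)
```

Unlike a crisp filter such as `height_cm > 195`, the query returns a graded degree of match, so borderline records can still be ranked and shown.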

15 Fuzzy Decision Making: Match Users with Dynamic Products, Services, and Pricing
Risk-Response-Retention (R³) Model
[Figure: fuzzy inputs Loss Ratio (Risk), Response, and Persistency (Retention), with levels Low, Medium, High]
Example rule: Low Risk, High Response, High Retention -> Customer: Preferred; Pricing: according to life-time value; Cross-Selling: bundle extra liability insurance
Example of 3 Service Provider's Features
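The example rule on this slide (Low Risk, High Response, High Retention -> Preferred customer) can be evaluated with a standard fuzzy AND, taken here as the minimum of the three membership degrees. This is only a sketch: the inputs are assumed to be scores normalized to [0, 1], and the piecewise-linear `low` and `high` membership functions are illustrative choices, not the ones used by the authors.

```python
def low(x):
    """Assumed membership for 'Low': 1 at x = 0, falling to 0 at x = 0.5."""
    return max(0.0, min(1.0, 1.0 - 2.0 * x))


def high(x):
    """Assumed membership for 'High': 0 at x = 0.5, rising to 1 at x = 1."""
    return max(0.0, min(1.0, 2.0 * x - 1.0))


def preferred_degree(risk, response, retention):
    """Firing strength of: Low Risk AND High Response AND High Retention.

    The fuzzy AND is modeled as the minimum of the three membership degrees.
    """
    return min(low(risk), high(response), high(retention))


if __name__ == "__main__":
    # A customer with low loss ratio, strong response, and fairly high persistency.
    print(preferred_degree(risk=0.25, response=1.0, retention=0.75))
```

The resulting degree could then be used to decide how strongly the "Preferred" pricing and cross-selling actions apply to a given customer.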

16 Measuring Performance of Intelligent Agents
- Accuracy: distance or variance measure of the IAs' performance from their goal, e.g. fuzzy entropy
- Speed: latency of response
- Cost: resources consumed, consequences of failures
- Benefit: payoff for goals achieved
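The slide names fuzzy entropy as an accuracy measure without giving a formula. One common definition, assumed here for illustration, is the De Luca-Termini entropy, normalized so that it is 0 for fully crisp memberships and 1 when every membership degree is 0.5. The function name `fuzzy_entropy` is hypothetical.

```python
import math


def fuzzy_entropy(memberships):
    """Normalized De Luca-Termini fuzzy entropy of membership degrees in [0, 1]."""

    def term(m):
        # By convention 0 * log(0) = 0, so crisp degrees contribute nothing.
        if m <= 0.0 or m >= 1.0:
            return 0.0
        return -(m * math.log(m) + (1.0 - m) * math.log(1.0 - m))

    if not memberships:
        return 0.0
    # Dividing by n * ln(2) scales the result into the range [0, 1].
    return sum(term(m) for m in memberships) / (len(memberships) * math.log(2))


if __name__ == "__main__":
    print(fuzzy_entropy([0.0, 1.0, 1.0]))   # crisp assignment
    print(fuzzy_entropy([0.5, 0.5]))        # maximally fuzzy assignment
```

Under this reading, an agent whose cluster or class assignments are sharper (lower entropy) would be scored as closer to its goal.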

17 Performance Assessment, Learning and Optimization
[Diagram: Goals/Objectives, Performance Evaluation Module, Learning/Adaptation]

18 Examples
- Product Information Clustering
  - Use a GA as the heuristic search engine
  - Apply the GA selection and inversion operators
  - Evaluate information content
  - Estimate system entropy
  - Apply reinforcement learning strategy
- Dynamic Pricing
  - In addition to the above steps, explore association and sequencing relations
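The first two steps above, a GA used as a heuristic search engine with selection and inversion operators, can be sketched in Python as follows. Everything specific is an assumption for illustration: candidates are orderings of products, the stand-in fitness counts neighbouring products that share a category label, selection is a two-way tournament, and the best ordering is carried over unchanged each generation. The information-content, entropy, and reinforcement-learning steps from the slide are not modeled.

```python
import random


def adjacency_fitness(order, label):
    """Count neighbouring items that share a label (higher = better grouped)."""
    return sum(label[a] == label[b] for a, b in zip(order, order[1:]))


def invert(order, rng):
    """Inversion operator: reverse a randomly chosen segment of the ordering."""
    i, j = sorted(rng.sample(range(len(order) + 1), 2))
    return order[:i] + order[i:j][::-1] + order[j:]


def tournament(population, fitness, rng, k=2):
    """Selection operator: pick the fittest of k randomly drawn candidates."""
    return max(rng.sample(population, k), key=fitness)


def evolve(items, label, generations=50, pop_size=20, seed=0):
    """Search for an ordering of `items` that groups equal labels together."""
    rng = random.Random(seed)
    fitness = lambda order: adjacency_fitness(order, label)
    # Start from random permutations of the items.
    population = [rng.sample(items, len(items)) for _ in range(pop_size)]
    for _ in range(generations):
        best = max(population, key=fitness)  # keep the best ordering (elitism)
        children = [invert(tournament(population, fitness, rng), rng)
                    for _ in range(pop_size - 1)]
        population = [best] + children
    return max(population, key=fitness)


if __name__ == "__main__":
    products = list("abcdef")
    category = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
    best = evolve(products, category)
    print(best, adjacency_fitness(best, category))
```

Because inversion only rearranges an ordering, every candidate remains a valid permutation of the products, which is why it suits ordering-style encodings better than bit-flip mutation would.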

19 The "New Technology" Paradigm
[Figure: Internet Related Technologies plotted against Time, with phases labeled Euphoria/Optimism, Reality, Back to Basics]

20 INFORMATION IS SELLING NOW! Intelligent Agents will give your information product bargaining power

21 Benefits
- Better QoS:
  - Web users get information (not raw data)
  - Smart agents can make decisions for users
  - Smart agents can save users' surfing time
- Faster Internet:
  - Information flows on the Internet quickly (e.g., 1k information << 100k raw data)
  - Reduce data redundancy on the Internet
  - Reduce Web communication congestion

22 Deliverables
- Intelligent Middle Layer
  - Data Mining Program Libraries
  - Soft Computing Program Libraries (e.g., Neural Networks, Fuzzy Logic, Genetic Algorithms, Neuro-fuzzy Systems)
- Application Layer
  - Smart Web Search Agents
  - Intelligent Soft Computing Agents

23 Conclusion
To make the future Internet more intelligent and more efficient, it is necessary to design relevant "Intelligent Middleware" between network hardware and high-level Web application systems. We will first design a basic intelligent middle layer with basic intelligent functionality, and then implement two Web application systems for distributed data mining and E-Business.

