Khaled Shaban, PhD Candidate. Supervisors: Dr. Otman Basir, Dr. Mohammad Kamel.


1 Khaled Shaban, PhD Candidate
Supervisors: Dr. Otman Basir, Dr. Mohammad Kamel

2 Previous work
MSc. Thesis, 2002, “Information Fusion in a Cooperative Multiagent System for Web Information Retrieval”
K. B. Shaban, O. A. Basir, K. Hassanein, and M. Kamel, “Intelligent Information Fusion Approach in Cooperative Multiagent Systems”, World Automation Congress, June 2002.
K. B. Shaban, O. A. Basir, K. Hassanein, and M. Kamel, “Information Fusion in a Cooperative Multi-agent System for Web Information Retrieval”, The Fifth International Conference on Information Fusion, July 2002.

3 System vision
Envisioned view of the system
[Figure: User, Personal Agent (shown twice), Intermediate “Fusion” Agent, Resource “Information Retrieval” Agent, Environment “The Web”]

4 Decision Fusion
[Figure: three team configurations: (a) Markovian team, (b) Centralized team, (c) Consensus team. Diagram labels: A1, A2, ..., An; Z1, Z2, ..., Zn; R1, R2, ..., Rn; RG; Decision Maker; Environment]

5 Implementation
[Figure: AltaVista, Excite, AltaVista Retrieval Agent, Fusion Agent, Personal Agent]

6 Current Project
“Semantic-based Document Clustering”

7 Project Goals
Cluster documents based on the semantic similarity of their contents
Lend ideas to other mining projects
PhD thesis by 2005/2006!

8 Document Clustering
Clustering groups documents into document clusters with:
–High intra-cluster similarity
–Low inter-cluster similarity
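The two criteria on this slide can be checked numerically. The following is a minimal sketch, assuming documents are already represented as numeric vectors and that cosine similarity is the measure; the function names and the toy vectors are illustrative assumptions, not part of the original presentation.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_intra_similarity(clusters):
    """Average similarity over all pairs of documents in the same cluster."""
    sims = [cosine(u, v) for c in clusters for u, v in combinations(c, 2)]
    return sum(sims) / len(sims) if sims else 0.0


def mean_inter_similarity(clusters):
    """Average similarity over all pairs of documents in different clusters."""
    sims = [cosine(u, v)
            for a, b in combinations(clusters, 2)
            for u in a for v in b]
    return sum(sims) / len(sims) if sims else 0.0


# Toy data (hypothetical): two clusters of two 2-dimensional document vectors.
toy_clusters = [[(1.0, 0.0), (2.0, 0.1)], [(0.0, 1.0), (0.1, 3.0)]]
print(mean_intra_similarity(toy_clusters), mean_inter_similarity(toy_clusters))
```

A good clustering in the sense of this slide gives a mean intra-cluster similarity well above the mean inter-cluster similarity.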

9 Applications
Improve the performance of information retrieval systems
Improve the organization and viewing of documents
Accelerate nearest-neighbour search
Generate hierarchical cluster directories
Improve automatic speech recognition systems

10 Existing Schemes
Data representation models
–Documents as bags-of-words (Vector Space Model (VSM))
–N-grams
–Latent Semantic Indexing (LSI)
–Phrase-based
Similarity measures
–Euclidean distances
–Minkowski distances
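As a rough illustration of the bag-of-words model and the distance measures listed above, here is a minimal sketch using raw term counts. The tokenizer (lowercase alphabetic runs) and the function names are assumptions made for the example; practical systems usually add term weighting such as TF-IDF.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Represent a document as term counts; word order is discarded."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def minkowski(a, b, p=2.0):
    """Minkowski distance of order p between two term-count vectors.

    p=2 gives the Euclidean distance, p=1 the Manhattan distance.
    """
    vocab = set(a) | set(b)
    # Counter returns 0 for terms that are missing from a document.
    return sum(abs(a[w] - b[w]) ** p for w in vocab) ** (1.0 / p)


d1 = bag_of_words("the cat sat")
d2 = bag_of_words("the dog sat")
print(minkowski(d1, d2, p=2))  # Euclidean distance
print(minkowski(d1, d2, p=1))  # Manhattan distance
```

The Euclidean distance is the special case p = 2 of the Minkowski distance, which is why a single function covers both entries on the slide.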

11 Existing Schemes, Cont.
Clustering algorithms
–Partitioning (k-means & Fuzzy C-means)
–Geometric (Self-Organizing Maps (SOM), LSI)
–Probabilistic (Expectation Maximization (EM), Probabilistic LSI)
Evaluation methods
–Entropy
–F-measure
–Overall Similarity
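Two of the evaluation methods above can be sketched briefly. This is a minimal sketch that follows one common formulation from the document clustering literature, assuming each document has a known class label: entropy is the size-weighted class entropy of each cluster (lower is better), and the F-measure takes, for each class, the best F score over all clusters, weighted by class size (higher is better). The slides do not say which variant is intended, so treat the definitions and names as assumptions.

```python
import math
from collections import Counter


def clustering_entropy(labels, clusters):
    """Size-weighted average of the class entropy inside each cluster."""
    n = len(labels)
    total = 0.0
    for c in set(clusters):
        members = [lab for lab, k in zip(labels, clusters) if k == c]
        counts = Counter(members)
        entropy = -sum((m / len(members)) * math.log2(m / len(members))
                       for m in counts.values())
        total += len(members) / n * entropy
    return total


def clustering_f_measure(labels, clusters):
    """Class-size-weighted best F score of each class over all clusters."""
    n = len(labels)
    class_sizes = Counter(labels)
    cluster_sizes = Counter(clusters)
    joint = Counter(zip(labels, clusters))
    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = joint[(cls, clu)]
            if n_ij == 0:
                continue
            precision = n_ij / n_j
            recall = n_ij / n_i
            best = max(best, 2 * precision * recall / (precision + recall))
        total += n_i / n * best
    return total


labels = ["a", "a", "b", "b"]  # hypothetical ground-truth classes
print(clustering_entropy(labels, [0, 0, 1, 1]),
      clustering_f_measure(labels, [0, 0, 1, 1]))
```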

12 Shortcomings
Ignoring meaning produces wrong results!
–Ex. “John eats the apple standing beside the tree” vs. “The apple tree stands beside John’s house”
–“John is an intelligent boy” vs. “John is a brilliant son”
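The shortcoming can be reproduced with a plain bag-of-words comparison of the slide's own sentences. This is a minimal sketch, assuming raw term counts, a simple lowercase tokenizer, and cosine similarity; none of these choices come from the presentation.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Term counts over lowercase alphabetic tokens; word order is lost."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Pair 1: many shared words, different meanings.
different_meaning = cosine_similarity(
    bag_of_words("John eats the apple standing beside the tree"),
    bag_of_words("The apple tree stands beside John's house"),
)
# Pair 2: few shared words, similar meanings.
similar_meaning = cosine_similarity(
    bag_of_words("John is an intelligent boy"),
    bag_of_words("John is a brilliant son"),
)
print(different_meaning, similar_meaning)
```

Under these assumptions the first pair scores higher than the second, even though only the second pair is close in meaning, which is the failure the slide describes.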

13 Proposed Approach
[Figure: Documents -> Syntactic analysis -> Parse Tree -> Semantic analysis -> Knowledge Representation scheme -> Semantic-based document clustering -> Document Cluster]

14 Proposed Approach - Steps
Preprocess text
–Remove tags, hyperlinks, etc.
Morphological analysis
–Identifying words, punctuation, etc.
Syntactic analysis
–Building the grammatical structure of sentences (Parse Tree)
Semantic analysis
–Assigning meaning to words
–Discourse integration
–Pragmatic analysis
–Knowledge representation structure
Clustering using the produced representations
–New similarity measures
–New clustering algorithm
–Better document clustering results (hopefully!)
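The first two steps above (preprocessing and identifying words and punctuation) can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the regular expressions, function names, and sample input are hypothetical, and a real system would likely use an HTML parser and a proper tokenizer. The later syntactic and semantic steps are not covered here.

```python
import re
from html import unescape

TAG_RE = re.compile(r"<[^>]+>")          # any HTML/XML tag
URL_RE = re.compile(r"https?://\S+")     # bare hyperlinks left in the text
TOKEN_RE = re.compile(r"\w+(?:'\w+)?|[^\w\s]")  # words, or single punctuation marks


def preprocess(raw):
    """Strip tags and bare hyperlinks, decode entities, normalize whitespace."""
    text = TAG_RE.sub(" ", raw)
    text = unescape(text)
    text = URL_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split text into word tokens and punctuation tokens."""
    return TOKEN_RE.findall(text)


raw = '<p>John eats the <a href="http://example.com/a">apple</a>. See http://example.com/x now</p>'
print(tokenize(preprocess(raw)))
```

Keeping punctuation as separate tokens is a deliberate choice in this sketch, since a later parsing step would typically need sentence boundaries.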

15 Illustration
–“John eats the apple standing beside the tree.” vs. “The apple tree stands beside John’s house.”
[Figure: Parse Trees. Sentence 1 (sent 1) is split into clause 1 and clause 2; sentence 2 (sent 2) has a single clause 1. Node labels used: np, vg, n, v, det, adv, prep, apos]

16 Illustration, Cont.
[Figure: Knowledge Representations. “John eats the apple standing beside the tree” is annotated with the labels Act 1, Act 2, Obj 1, Obj 2, Obj 3; “The apple tree stands beside John’s house” is annotated with the labels St 1, Obj 1]

17 Relation to LORNET?
–Findings can be applied to Learning Objects (LO) mining
[Figure: Knowledge Representations, Clustering, Classification, Retrieval, Knowledge Sharing]

18 Milestones
[Figure: timeline marked Jan 03, Jan 04, Jan 05, divided into Phase 1, Phase 2, Phase 3. Activities shown: Grad. courses, Lit. review, Proposal, Comp. Exam, Development, Experimentations, Evaluations, Reporting, Thesis writing, Defence]

19 Thank you! Questions?

