CARROT II: Collaborative Agent-based Routing and Retrieval of Text, Version 2
R. Scott Cost, CADIP, UMBC
CADIP Fall Research Symposium, Sept 20-21, 2001


1 Title slide
CARROT II: Collaborative Agent-based Routing and Retrieval of Text, Version 2. CADIP Fall Research Symposium.

2 Overview
- A distributed, agent-based system for large-scale, high-bandwidth information retrieval and visualization.
- Carrot I, implemented ~1997, demonstrated the distribution of queries to various backend systems through a single broker, using Telltale, with TKQML as a communication mechanism.

3 Outline
- Project Review
- Goals
- Overview
- Issues
- Architecture
- Progress Report

4 C2 Project Goals
- Build a powerful, high-bandwidth distributed IR system
- Create a testbed for research in a variety of IR issues
- Foster new and ongoing IR research at UMBC

5 Basic C2 Approach
- A client submits a query to some agent in a distributed C2 system.
- That agent uses metadata about its collection and the collections around it to decide whether to handle the query or forward it to another agent.
- Results are assembled and returned to the client.

6 How does it work?
- Single IR engine is replicated across multiple machines
- Each engine gets a portion of the total document collection
- Engines exchange metadata describing their collections
- Engines receive queries, and either answer or forward them as appropriate
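
The answer-or-forward decision described above can be illustrated with a minimal sketch. It is not the C2 implementation: the `Agent` class, the term-count form of the metadata, and the scoring rule are all assumptions made for illustration, since the slides do not specify the metadata form or the routing rule.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent holding a term-count summary of its corpus."""
    name: str
    metadata: dict                      # term -> count in the local collection
    peers: list = field(default_factory=list)

    def score(self, query_terms):
        # Crude relevance estimate: total local frequency of the query terms.
        return sum(self.metadata.get(t, 0) for t in query_terms)

    def route(self, query):
        """Return the name of the agent that should answer the query."""
        terms = query.lower().split()
        # Peers' metadata is assumed to have been exchanged in advance.
        candidates = [self] + self.peers
        # On a tie, max() keeps the first candidate, i.e. the local agent.
        best = max(candidates, key=lambda a: a.score(terms))
        return best.name


a = Agent("A", {"agent": 40, "routing": 12})
b = Agent("B", {"protein": 55, "genome": 30})
a.peers.append(b)
b.peers.append(a)

print(a.route("genome protein"))   # forwarded to B
print(a.route("agent routing"))    # handled locally by A
```

Listing the local agent first means a query that no collection matches stays where it arrived, which is one simple way to avoid forwarding loops.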

7 Research Issues/Questions
- Heterogeneity (information sources)
- Metadata (form, order, comparison)
- Query Management (routing, standing)
- Results Fusion
- Corpus Management
- Integration with Parallel Telltale, RAMA
- Index-based parallelism
- Storage-based parallelism

8 Flexible System
Form of system can change dramatically, based on:
- How system is distributed
- How metadata is distributed
- How queries are handled
- How fusion is handled
- Whether or not system adapts dynamically to query performance and/or load…

9 Some example scenarios
- Two peer agents, each managing a corpus (IR System is MG). Each agent advertises metadata to the other. Queries directed at either, routed to appropriate agent.
- Based on TREC WT10g Collection: ~1,700,000 documents from the WWW. N agents, one for each of the ~12,000 servers represented in collection. Topology of system inferred from link topology in collection of web pages.
- An agent starts and runs a C2 system for a specific purpose.
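
The WT10g scenario (one agent per server, with the topology taken from the links between pages) could be set up along the following lines. This is a sketch under assumptions: the function names and the example URLs are invented for illustration, and the slides do not say how documents were actually assigned or how the link graph was processed.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def partition_by_server(urls):
    """Group document URLs by host, so each host maps to one agent's corpus."""
    corpora = defaultdict(list)
    for url in urls:
        host = urlsplit(url).hostname or "unknown"
        corpora[host].append(url)
    return dict(corpora)


def infer_topology(links):
    """Derive agent-to-agent edges from page-to-page hyperlinks."""
    edges = set()
    for src, dst in links:
        s, d = urlsplit(src).hostname, urlsplit(dst).hostname
        if s and d and s != d:          # ignore links within one server
            edges.add((s, d))
    return edges


# Made-up URLs for illustration; not taken from WT10g.
docs = [
    "http://www.example.org/a.html",
    "http://www.example.org/b.html",
    "http://data.example.net/index.html",
]
links = [
    ("http://www.example.org/a.html", "http://www.example.org/b.html"),
    ("http://www.example.org/a.html", "http://data.example.net/index.html"),
]

corpora = partition_by_server(docs)
topology = infer_topology(links)
print(len(corpora), sorted(topology))
```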

10 C2 Architecture
- C2 Agents: Form the core of the C2 system
- C2 Infrastructure Elements: Provide effective communication and control support
- C2 Support Elements: Control and provide access to system

11 C2 Agent
- Java-based software agent
- Communicates using the Jackal system
- Runs a local corpus and metadata engine (currently MG)

12 Basic Node Architecture
(Diagram. Labels, in extraction order: Agent, Jackal, Other nodes, IR Engine Wrapper, Decision Interface, IR System: Manages local corpus and metadata.)

13 C2 Infrastructure
- Provides for efficient control of system
- Hierarchical
- Several Types of Agent: Master, Node, Platform, Cluster

14 Infrastructure
(Hierarchy diagram, top to bottom.)
- Master Agent
- Node Agent: Controls one physical node (Next Node…)
- Platform: Controls one JVM (Next Platform…)
- Cluster Agent: Controls one Jackal instance (Next Cluster…)
- C2 Agent

15 Infrastructure…
- Infrastructure hierarchy allows for efficient propagation of control information
- Communication and coordination is localized to reduce overhead
- Shape of tree can be modified to change performance
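
The way a control hierarchy localizes communication can be shown with a small sketch in which each level talks only to its direct children. The `ControlNode` class, the role labels, and the message counting are assumptions for illustration; in C2 itself, agents communicate through Jackal, which is not modeled here.

```python
class ControlNode:
    """Hypothetical node in a Master > Node > Platform > Cluster > C2 Agent tree."""

    def __init__(self, role, name):
        self.role = role
        self.name = name
        self.children = []
        self.received = []

    def add(self, child):
        self.children.append(child)
        return child

    def propagate(self, command):
        """Record a command here, then pass it to direct children only.

        Returns the number of parent-to-child messages sent in this subtree.
        """
        self.received.append(command)
        sent = 0
        for child in self.children:
            sent += 1 + child.propagate(command)
        return sent


master = ControlNode("master", "master")
node = master.add(ControlNode("node", "host-1"))
platform = node.add(ControlNode("platform", "jvm-1"))
cluster = platform.add(ControlNode("cluster", "jackal-1"))
agents = [cluster.add(ControlNode("c2", f"agent-{i}")) for i in range(3)]

print(master.propagate("shutdown"))   # one message per tree edge
```

In this sketch the master sends a single message no matter how many C2 agents sit below it. Changing how many children each level has shifts the load between levels, which is the sense in which the shape of the tree affects performance.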

16 C2 Support
- Master: Controls the C2 system
- ANS: White pages communications support
- Collection Manager: Controls distribution of documents/collections to C2 Agents
- Logger Agent: Logs system operational information

17 C2 Tools
- Query Agent: Supports the controlled presentation, collection and analysis of large batches of queries
- C2 System Visualizer: Presents a graphical view of the flow of queries through the system

18 C2 Tools: Visualizer (screen shot)

19 For More Information…
For more details on the goals and design of the project, see the documents on the project site: http://acm.org/~cost/carrot2/info.htm

20 3/6/12 Plan (From 9/2000)
- 3: Clear design, working prototype.
- 6: Fully operational system, testing on real data.
- 12: Publication ready results for one or more research questions. Tentative target of CIKM.
50-75% complete: System still in test with scalability issues, design publications in press.

21 3/6/12 Plan (From 9/2001)
- 3: Exercise system and prepare initial results for publication.
- 6: Expand system.
- 12: To be determined.

22 External Publication Plans
- WWW 2002
- Autonomous Agents 2002
- SIGIR 2002

23 Academic Milestones
- Monitoring and Control of a Distributed IR System: M.S. Thesis, Srikanth Kallurkar (Fall ’01)
- Integrating C2 as an Information Source for ITTALKS: M.S. Project, Yogesh Nagappa (Fall ’01)
- Integrating Telltale into the C2 System: 691 Project, Jonathan Kessler and Matt Siegel (Fall ’01)
- Visualization of a Distributed IR System: 691 Project, Tom Laufert (Fall ’01)
- Data Fusion in C2 Agents: 691 Project, Mithun Sheshagiri (Fall ’01)
- Query Caching in the C2 System: M.S. Thesis, Hemali Majithia (Spring ’02)
- A User-friendly interface to the C2 System: Jacquelyn Nicole Winston, High School Intern (Spring ’01)

