Agent Technology for Data Analysis. Tony Johnson, SLAC. 21st October 1998. WORKSHOP ON SCIENTIFIC DATA MANAGEMENT PROBLEMS AND SOLUTIONS.


2 Agent Technology for Data Analysis. Tony Johnson, SLAC. 21st October 1998. WORKSHOP ON SCIENTIFIC DATA MANAGEMENT PROBLEMS AND SOLUTIONS

3 Motivation and Disclaimer
- Many efforts to use supernetworks to link supercomputers to transfer huge datasets
- Few efforts to make effective use of existing real-world networks
    - Allow university users to access remote data
- I am not an agent technology expert
    - We do have a prototype application
    - I’m hoping some of you are!

4 Outline
- Overview of problem
    - Network constraints
- Why agent technology?
- Why Java
    - For Agent Technology?
    - For Data Analysis?
- Analysis Studio application
- More information

5 What Problem are we trying to solve?
- Widely distributed users who need access to petabyte datasets
    - Many university users with mediocre networks
    - Most universities have no way to handle petabyte data samples
- Physicist needs unfettered access to data
    - Would like effective use of desktop machine
    - Canned analysis won't do
- CPU/data access requirements are infinite

6 Faster networks?
    - Faster networks will not solve our problems anytime soon
    - No matter how fast networks are they are always saturated.
    - As networks become saturated latency becomes high

7 Why Agent Technology?
- By encapsulating the user's analysis code as a “user agent” we can send it to the data, so wide-area network bandwidth requirements become trivial
    - Analysis modules are typically small (< 10s of kBytes)
    - HEP output is typically (binned) histograms and scatterplots, both of which are small
- GUI-based analysis of large datasets is possible over a 28.8 kbit/s modem connection
- Gives the user the impression that the analysis is running locally
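The idea on this slide can be illustrated with a minimal Java sketch. It is not the Java Analysis Studio API: the `UserAgent` interface, `HistogramAgent`, and `DataServerSketch` are hypothetical names invented for illustration, and the "data server" is just a loop over an in-memory array. The point it shows is that the agent's result is a fixed number of bin counts, so its size does not grow with the number of events processed.

```java
import java.util.Arrays;

// Hypothetical interface (an assumption, not the real JAS API).
interface UserAgent {
    // Called once per event, on the machine that holds the data.
    void processEvent(double[] event);

    // Small summary that would be sent back over the wide-area network.
    int[] result();
}

// Fixed-binning 1D histogram: result size depends on the bin count only.
final class HistogramAgent implements UserAgent {
    private final double min;
    private final double max;
    private final int[] bins;

    HistogramAgent(int nBins, double min, double max) {
        this.min = min;
        this.max = max;
        this.bins = new int[nBins];
    }

    @Override
    public void processEvent(double[] event) {
        double x = event[0];
        if (x < min || x >= max) {
            return; // ignore underflow and overflow in this sketch
        }
        int i = (int) ((x - min) / (max - min) * bins.length);
        bins[Math.min(i, bins.length - 1)]++; // guard against rounding at the upper edge
    }

    @Override
    public int[] result() {
        return bins.clone();
    }
}

// Stand-in for a data server: runs the agent next to the data.
final class DataServerSketch {
    static int[] run(UserAgent agent, double[][] events) {
        for (double[] event : events) {
            agent.processEvent(event);
        }
        return agent.result();
    }
}

final class AgentDemo {
    public static void main(String[] args) {
        double[][] events = { {0.5}, {1.5}, {1.7}, {3.2}, {9.0} };
        int[] counts = DataServerSketch.run(new HistogramAgent(4, 0.0, 4.0), events);
        System.out.println(Arrays.toString(counts)); // prints [1, 2, 0, 1]
    }
}
```

In a real deployment the events would stay on the server's disks and only the `int[]` would cross the network; that transport step is omitted here.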

8 Why Java for Agent Technology?
- Java produces machine-independent bytecodes
    - Trivial to move from one machine to another
    - Network handling and Remote Method Invocation (RMI, cf. CORBA) built in
    - (Remote) dynamic loading built in
    - Multithreaded servers easy to write
    - Built-in Java “Sandbox” can be used to restrict agents
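As a rough illustration of moving an agent between machines, the sketch below uses standard Java object serialization, which is the mechanism RMI uses to pass non-remote objects by value. The classes `CutCountAgent`, `AgentTransport`, and `TransportDemo` are hypothetical, and the "network" is just a byte array in memory. It covers only the agent's state: shipping the class bytecode itself (the remote dynamic loading mentioned above) and sandboxing are not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical agent: counts events whose first value exceeds a cut.
final class CutCountAgent implements Serializable {
    private static final long serialVersionUID = 1L;
    private final double cut;
    private int passed;

    CutCountAgent(double cut) {
        this.cut = cut;
    }

    void processEvent(double[] event) {
        if (event[0] > cut) {
            passed++;
        }
    }

    int passed() {
        return passed;
    }
}

// Hypothetical helper: converts an agent to bytes and back.
final class AgentTransport {
    // Serialize the agent, as would happen before sending it.
    static byte[] pack(Serializable agent) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(agent);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild the agent on the receiving side.
    static Object unpack(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("agent class not available on receiver", e);
        }
    }
}

final class TransportDemo {
    public static void main(String[] args) {
        byte[] wire = AgentTransport.pack(new CutCountAgent(2.0));
        CutCountAgent remoteCopy = (CutCountAgent) AgentTransport.unpack(wire);
        for (double[] event : new double[][] { {1.0}, {2.5}, {3.0} }) {
            remoteCopy.processEvent(event);
        }
        System.out.println(wire.length + " bytes sent, "
                + remoteCopy.passed() + " events passed the cut");
    }
}
```

The `ClassNotFoundException` branch marks the place where a real system would need the receiver to obtain the agent's class, for example through a network class loader.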

9 Why Java for Data Analysis
- Easy to learn yet very powerful, fully OO language
    - Very wide industry support
    - Just In Time compilation = Fast
    - Dynamic Optimization = Faster
    - Very fast code, load, test, fix cycle
    - Built-in debugger, including remote debugging
    - Numerical functionality good
        - Java Grande Forum enhancing numerical support

10 “Java Analysis Studio”
[Architecture diagram. Labels: Desktop Client (DIM, Local Data); Network; Data Server (DIM, Remote Data); Network; Data Controller; Distributed Data (six Data Server / DIM nodes)]

11 Demo

12 Network Performance
[Diagram. Labels: View (Histogram); View Adapter; Model Adapter; Model (Data Source)]
- Caching
- Prefetching of data
- Data clumping, streaming
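A minimal sketch of how a model adapter might combine caching with data clumping is given below. All names (`RemoteModel`, `CountingModel`, `CachingModelAdapter`) are hypothetical and not taken from Java Analysis Studio, the remote source is simulated in memory, and the code uses present-day Java collections. Prefetching and streaming are left out; the sketch only shows that fetching bins in clumps and caching them bounds the number of network round trips.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical remote data source: each fetch stands for one round trip.
interface RemoteModel {
    int size();

    double[] fetch(int first, int count);
}

// In-memory stand-in that counts how often it is asked for data.
final class CountingModel implements RemoteModel {
    private final double[] bins;
    int roundTrips = 0;

    CountingModel(double[] bins) {
        this.bins = bins.clone();
    }

    @Override
    public int size() {
        return bins.length;
    }

    @Override
    public double[] fetch(int first, int count) {
        roundTrips++;
        return Arrays.copyOfRange(bins, first, first + count);
    }
}

// Adapter between view and model: fetches whole clumps and caches them,
// so repeated or neighbouring reads need no further round trips.
final class CachingModelAdapter {
    private final RemoteModel remote;
    private final int clumpSize;
    private final Map<Integer, double[]> cache = new HashMap<>();

    CachingModelAdapter(RemoteModel remote, int clumpSize) {
        this.remote = remote;
        this.clumpSize = clumpSize;
    }

    double bin(int index) {
        int clump = index / clumpSize;
        double[] data = cache.get(clump);
        if (data == null) {
            int first = clump * clumpSize;
            int count = Math.min(clumpSize, remote.size() - first);
            data = remote.fetch(first, count);
            cache.put(clump, data);
        }
        return data[index - clump * clumpSize];
    }
}

final class CacheDemo {
    public static void main(String[] args) {
        CountingModel model = new CountingModel(new double[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
        CachingModelAdapter adapter = new CachingModelAdapter(model, 4);
        double total = 0;
        for (int pass = 0; pass < 2; pass++) {
            for (int i = 0; i < model.size(); i++) {
                total += adapter.bin(i);
            }
        }
        // 20 bin reads, but only 3 clumps were ever fetched.
        System.out.println(total + " summed in " + model.roundTrips + " round trips");
    }
}
```

The clump size trades latency against wasted bandwidth: larger clumps mean fewer round trips on a high-latency link, at the cost of fetching bins the view may never display.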

13 More Information
- Java: http://java.sun.com
- Java Analysis Studio: http://www-sldnt.slac.stanford.edu/jas
- Java Grande Forum (numeric computing in Java): http://www.javagrande.org/
    - Desktop access to remote resources
        - http://www-fp.mcs.anl.gov/~gregor/datorr/

