Published by Violet Neal. Modified over 9 years ago.
1
Interactive Data Analysis on the “Grid”
Tech-X/SLAC/PPDG:CS-11
Balamurali Ananthan (bala@txcorp.com)
David Alexander (alexanda@txcorp.com)
Tony Johnson (tony_johnson@slac.stanford.edu)
Victor Serbo (serbo@slac.stanford.edu)
Presented at Computing in High Energy Physics, Interlaken, Switzerland, September 2004
2
Focus of our work
Interactive Data Analysis on the Grid
- Very quick (<1 second to 100s of seconds) turnaround
- Intermediate results presented in real time
  - Plots update as analysis proceeds
  - Output from analysis displayed immediately
- High degree of interactivity
  - Change cuts/binning etc. and see immediate results
Goal – seamless interactive computing on the web.
3
Starting Point
JAS2 analysis client supports
- Local Analysis
  - Data, analysis code and GUI client live on the same machine
- Client-Server Analysis
  - Data and analysis run on a remote machine, GUI client runs on the local machine (uses Java RMI as network protocol)
In 2002 we added Grid-based analysis
- GUI client runs on the local machine
- Data and analysis run in parallel on a farm of remote machines
- Initial implementation used Globus2 + Java RMI
In all three modes the goal is for the physicist to feel that he is interacting with his local machine
- All three modes look almost identical to use
- Try to hide as much of the Grid from the end user as practical
4
JAS2 Grid Client
5
Current Project
Builds on Earlier Work
Grid Services based on OGSI/Globus 3
Switch to using WS-RF (Globus 4?) in future
- Reuse existing Globus facilities where possible
- Define new services if not already available
Design loosely-coupled services to encourage re-use
Separate interface from implementation
- Interfaces: Collaborate with CS-11/PPDG/ARDA
- Reference Implementation: JAS-DAGS (Dataset Analysis Grid Service)
Use JAS3 as reference analysis client
Currently in development
- Plan for initial use for International Linear Collider Simulation Studies
6
Dataset Catalog Service
First component developed
- Interface collaboratively designed as part of PPDG-CS11 project
- Aims to separate interface from implementation
We have a reference implementation
- Based on Java and simple “in-memory” XML database
Designed to make it easy to put the same interface on top of other existing data catalog systems
Has also been deployed as a Clarens service
7
Dataset Catalog Service
Allows user to “browse” dataset hierarchy
Allows user to “search” using “meta-data” associated with each dataset
Output
- Grid Service Handle (GSH) of the Dataset Locator
  - The Locator service knows the actual location of the dataset
- String ID of the Dataset
  - An opaque string interpreted only by the dataset locator
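The browse/search/output contract on this slide can be sketched roughly as follows. This is only an illustration: the names `DatasetRef`, `DatasetCatalog` and `InMemoryCatalog`, their method signatures, and the plain Java list standing in for the XML database are all assumptions, not the actual CS-11 interface or the reference implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these names are illustrative, not the CS-11 interface.

/** What a catalog lookup returns: where to ask, and what to ask for. */
final class DatasetRef {
    final String locatorHandle; // GSH of the Dataset Locator service
    final String datasetId;     // opaque string; only the locator interprets it

    DatasetRef(String locatorHandle, String datasetId) {
        this.locatorHandle = locatorHandle;
        this.datasetId = datasetId;
    }
}

/** The interface, kept separate from any particular implementation. */
interface DatasetCatalog {
    /** Browse: paths of the entries directly under a folder such as "/ilc". */
    List<String> browse(String folderPath);

    /** Search: datasets whose meta-data contains key = value. */
    List<DatasetRef> search(String key, String value);
}

/** Toy in-memory implementation backed by a list (assumption, not an XML database). */
final class InMemoryCatalog implements DatasetCatalog {
    private static final class Entry {
        final String path;
        final Map<String, String> meta;
        final DatasetRef ref;

        Entry(String path, Map<String, String> meta, DatasetRef ref) {
            this.path = path;
            this.meta = meta;
            this.ref = ref;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(String path, Map<String, String> meta, DatasetRef ref) {
        entries.add(new Entry(path, meta, ref));
    }

    @Override
    public List<String> browse(String folderPath) {
        String prefix = folderPath.endsWith("/") ? folderPath : folderPath + "/";
        List<String> children = new ArrayList<>();
        for (Entry e : entries) {
            if (!e.path.startsWith(prefix)) continue;
            // Keep only the next path segment below the folder.
            String rest = e.path.substring(prefix.length());
            int slash = rest.indexOf('/');
            String child = prefix + (slash < 0 ? rest : rest.substring(0, slash));
            if (!children.contains(child)) children.add(child);
        }
        return children;
    }

    @Override
    public List<DatasetRef> search(String key, String value) {
        List<DatasetRef> hits = new ArrayList<>();
        for (Entry e : entries) {
            if (value.equals(e.meta.get(key))) hits.add(e.ref);
        }
        return hits;
    }
}
```

Because the client only sees `DatasetCatalog`, the same calls could in principle sit on top of another catalog system, which is the re-use the slide aims for.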
8
DAGS
Dataset Analysis Grid Service
- Aim to produce complete interactive data analysis system
  - Loosely based on CS-11 APIs
  - Migrate from RMI to OGSA in stages to maintain a working system at each stage
Key design goals
- Only requires Globus (+JavaVM) on worker nodes
  - Everything else dynamically deployed
- Specialized analysis services only need to be installed on specific gateway nodes
- Few services need to be visible outside firewall
- No Grid software on client node (except Java COG)
9
DAGS Conceptual Diagram
[Figure. Labelled components: JAS3 Client, DAGS client, Data Chooser Plugin, Proxy Login Plugin, Firewall, Dataset Analysis Manager Service, Dataset Catalog Service, Index Service, Dataset Locator Service, Data Splitter Service, Result Merging Service (AIDA based), Caching Service, Analysis Server, Reliable File Transfer Service, Managed Job Service, Worker Node 1, Worker Node 2. Labelled flows: Dataset query, Dataset ID, Analysis Task, Analysis Job Description, Results.]
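The split, analyze and merge roles named in the diagram (Data Splitter Service, worker nodes, Result Merging Service) can be illustrated with a small local sketch. It assumes a thread pool in place of remote worker nodes and a plain `long[]` histogram in place of AIDA objects; the class and method names are hypothetical and nothing here reflects the actual DAGS code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical local stand-in for the splitter / worker / merger roles.
final class SplitAnalyzeMerge {

    /** Splitter: cut the dataset into at most `parts` contiguous chunks. */
    static List<double[]> split(double[] data, int parts) {
        if (parts < 1) throw new IllegalArgumentException("parts must be >= 1");
        List<double[]> chunks = new ArrayList<>();
        int size = (data.length + parts - 1) / parts; // ceiling division
        for (int start = 0; start < data.length; start += size) {
            int end = Math.min(start + size, data.length);
            chunks.add(Arrays.copyOfRange(data, start, end));
        }
        return chunks;
    }

    /** Worker: fill a fixed-range histogram from one chunk. */
    static long[] analyze(double[] chunk, int bins, double min, double max) {
        long[] counts = new long[bins];
        for (double x : chunk) {
            if (x < min || x >= max) continue; // the "cut": skip out-of-range values
            int bin = (int) ((x - min) / (max - min) * bins);
            counts[Math.min(bin, bins - 1)]++; // guard against rounding at the top edge
        }
        return counts;
    }

    /** Merger: add a partial histogram into the running total, bin by bin. */
    static void merge(long[] total, long[] partial) {
        for (int i = 0; i < total.length; i++) total[i] += partial[i];
    }

    /** Run the whole pipeline, one task per chunk, merging results as they arrive. */
    static long[] run(double[] data, int workers, int bins, double min, double max) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<long[]>> partials = new ArrayList<>();
            for (double[] chunk : split(data, workers)) {
                partials.add(pool.submit(() -> analyze(chunk, bins, min, max)));
            }
            long[] total = new long[bins];
            for (Future<long[]> partial : partials) {
                merge(total, partial.get()); // an interactive client could redraw here
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("analysis failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Summing histograms bin by bin gives the same result regardless of how the data was split, which is what lets partial results be shown while other chunks are still running.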
10
Performance
JAS2 system used Java Remote Method Invocation (RMI).
Current system still uses RMI in some areas, but intention is to migrate to OGSA
Performance is a real problem:
- Trivial Service Invocation (AuctionService) over 10Mbit LAN
- All times for 100 calls, excluding first call
  - RMI: 100 calls - 96 ms
  - Globus 3.2 (non-secure): 100 calls - 22 seconds
  - Globus 3.2 (secure): 100 calls - 112 seconds
- Problems may be partly related to Globus implementation, but are clearly also partly fundamental problems with XML encoding/decoding and web-service protocol
- Possible workarounds
  - “fast web services” http://java.sun.com/developer/technicalArticles/WebServices/fastWS/
  - or “clarens + xml-rpc” or …
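To put the totals above on a per-call footing: 96 ms, 22 s and 112 s for 100 calls work out to about 0.96 ms, 220 ms and 1120 ms per call, so the Globus 3.2 invocations are roughly 229 and 1167 times slower than RMI. The sketch below redoes that arithmetic and shows one way such a timing could be taken. The `CallOverhead` class and its methods are hypothetical and are not the benchmark the authors ran.

```java
// Hypothetical helper: per-call arithmetic for the slide's totals, plus a timing loop.
final class CallOverhead {

    /** Mean cost of one call in milliseconds, given the total for `calls` calls. */
    static double perCallMillis(double totalMillis, int calls) {
        return totalMillis / calls;
    }

    /** How many times slower `otherMillis` is than `baselineMillis`. */
    static double slowdown(double baselineMillis, double otherMillis) {
        return otherMillis / baselineMillis;
    }

    /** Time `calls` invocations after one untimed first call, as the slide describes. */
    static double timeCallsMillis(Runnable call, int calls) {
        call.run(); // first call excluded (presumably it carries one-off setup costs)
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            call.run();
        }
        return (System.nanoTime() - start) / 1e6;
    }

    public static void main(String[] args) {
        // Totals for 100 calls, taken from the slide and converted to milliseconds.
        double rmi = 96;
        double globusPlain = 22_000;
        double globusSecure = 112_000;

        System.out.printf("RMI:                 %.2f ms/call%n", perCallMillis(rmi, 100));
        System.out.printf("Globus 3.2 (plain):  %.2f ms/call, %.0fx RMI%n",
                perCallMillis(globusPlain, 100), slowdown(rmi, globusPlain));
        System.out.printf("Globus 3.2 (secure): %.2f ms/call, %.0fx RMI%n",
                perCallMillis(globusSecure, 100), slowdown(rmi, globusSecure));
    }
}
```

At over 200 ms per call, a client that issues a handful of service calls per user action already eats into the sub-second turnaround target, which is why the per-call figure matters more here than the totals.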
11
Plans
Deploy Dataset Catalog Interface with some real data sources
- International Linear Collider Simulation Data
- Some interest in interface to POOL
Deploy full DAGS system and try with real users
- First target will be linear collider simulation studies
Work on interoperability with other systems
- Clarens/Rendezvous service
- gLite?
- One goal of switching to OGSI was to use interoperable modules
  - This requires development of “standard” interfaces which provide for flexibility in the way in which they will be used
  - It is unclear that the HEP community has the motivation to do this
12
Conclusion
We are making progress on developing a Globus 3 based interactive data system
- Aim to have usable system by end 2004
Globus/OGSI/WS-RF is certainly not the easiest way to implement interactive data analysis
- Performance is a problem
  - Workarounds exist
  - Not clear if/when this will be addressed by core Globus software
  - Looking at other technologies for better performance
- Interoperability and Component Reuse
  - Some progress, but so far not as effective as was hoped for
13
Links
DAGS
- http://www.slac.stanford.edu/~banantha/dags
- http://grid.txcorp.com/dags
CS11
- http://www.ppdg.net/pa/ppdg-pa/idat/
JAS3
- http://jas.freehep.org/jas3/
AIDA
- http://aida.freehep.org/
- http://java.freehep.org/JAIDA/
Clarens
- http://clarens.sourceforge.net/
14
Screenshots
15
Some Screenshots
Starting Work Manager..
Starting Grid Service Manager..
16
Screenshots (cont…)
Starting MMJFS on the end nodes…
17
Starting JAS Client..
18
JAS Client..
19
Resulting Histogram…