1 Establishing an inter-organisational OGSA Grid: Lessons Learned Wolfgang Emmerich London Software Systems, Dept. of Computer Science University College.

1 Establishing an inter-organisational OGSA Grid: Lessons Learned
Wolfgang Emmerich
London Software Systems, Dept. of Computer Science
University College London
Gower St, London WC1E 6BT, U.K.
http://www.sse.ucl.ac.uk/UK-OGSA

2 An Experimental UK OGSA Testbed
Established 12/03-12/04
Four nodes:
  – UCL (coordinator)
  – NeSC
  – NEReSC
  – LeSC
Deployed Globus Toolkit 3.2 throughout onto heterogeneous HW/OS
  – Linux
  – Solaris
  – Windows XP

3 Experience with GT3.2 Installation
Different levels of experience within team
Heterogeneity
  – HW (Intel/SPARC)
  – Operating system (Windows/Solaris/Linux)
  – Servlet container (Tomcat/GT3 container)
Interaction with previous GT versions
Departure from web service standards prevented standard tool use
  – JMeter
  – Development environments (Eclipse)
  – Exception management tools (Amberpoint)
Interaction with system administration
Platform dependencies

4 Performance and Scalability
Developed GTMark
  – Server-side load model: SciMark 2.0 (http://math.nist.gov/SciMark)
  – Client-side load model, configuration and metrics collection based on the J2EE benchmark StockOnline
Configurable benchmark
  – Static vs dynamic discovery of nodes
  – Load applied for a fixed period of time or until steady state is obtained
  – Constant or varying number of concurrent requests
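
The "until steady state obtained" option can be illustrated with a small sketch. The slides do not say how GTMark detects steady state, so the rule below (the last few throughput samples all lie within a tolerance of their mean) is an assumption, and the names `reached_steady_state`, `run_until_steady`, `window` and `tolerance` are hypothetical.

```python
from statistics import mean


def reached_steady_state(throughputs, window=5, tolerance=0.05):
    """Return True when the last `window` samples are all within
    `tolerance` (relative) of their own mean. Assumed rule, not GTMark's."""
    if len(throughputs) < window:
        return False
    recent = throughputs[-window:]
    avg = mean(recent)
    if avg == 0:
        return False
    return all(abs(x - avg) / avg <= tolerance for x in recent)


def run_until_steady(sample_throughput, max_samples=100, window=5, tolerance=0.05):
    """Collect throughput samples until steady state or `max_samples`.

    `sample_throughput` is a caller-supplied function that applies load for
    one measurement interval and returns the observed requests per second.
    """
    samples = []
    for _ in range(max_samples):
        samples.append(sample_throughput())
        if reached_steady_state(samples, window, tolerance):
            break
    return samples
```

A fixed-period run would simply call `sample_throughput` a set number of times instead of checking the stopping rule.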

5 Performance Results

6 Scalability Results

7 Performance Results
Performance and scalability of GT3.2 with Tomcat/Axis surprisingly good
Performance overhead of security is negligible
Good scalability: reached 96% of theoretical maximum
Tomcat performs better than GT3.2 container on slow machines
Surprising results on raw CPU performance
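
A figure such as "96% of theoretical maximum" can be read as a ratio of measured to ideal throughput. The slides do not define the theoretical maximum; the sketch below assumes it is the sum of each node's standalone throughput. The function name and the numbers in the usage note are invented purely to show the arithmetic and are not measurements from the testbed.

```python
def scaling_efficiency(measured_throughput, per_node_throughputs):
    """Ratio of measured aggregate throughput to the assumed theoretical
    maximum (sum of standalone per-node throughputs)."""
    theoretical_max = sum(per_node_throughputs)
    if theoretical_max <= 0:
        raise ValueError("per-node throughputs must sum to a positive value")
    return measured_throughput / theoretical_max
```

With four hypothetical nodes that each sustain 100 requests per second on their own, a measured aggregate of 384 requests per second gives `scaling_efficiency(384.0, [100.0] * 4)`, i.e. an efficiency of 0.96.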

8 Reliability
Tomcat more reliable than GT3.2 container
  – Tomcat container sustained 100% reliability under load
  – GT3.2 container failed once every 300 invocations (99.67% reliability)
Denial of Service attack possible by
  – Concurrently invoking an operation on the same service instance (they are not thread safe!)
  – Fully exhausting resources
Problem of hosting more than one service in one container
  – Trade-off between reliability and reuse of containers across multiple users/services
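
The thread-safety problem can be sketched with a stateful service whose operation performs a read-modify-write. The class below is hypothetical and is not GT3.2 code; guarding the operation with a per-instance lock is one common mitigation and is not something the slides prescribe.

```python
import threading


class CounterService:
    """Hypothetical stateful service instance (not a GT3.2 class)."""

    def __init__(self):
        self.count = 0
        self._lock = threading.Lock()

    def invoke(self):
        # The lock makes the read-modify-write atomic with respect to
        # other callers of the same instance.
        with self._lock:
            current = self.count
            self.count = current + 1


def invoke_concurrently(service, threads=4, calls_per_thread=1000):
    """Call service.invoke() from several threads and return the final count."""

    def worker():
        for _ in range(calls_per_thread):
            service.invoke()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return service.count
```

Without the lock, two callers can read the same value of `count` and one update is lost; in a shared container that kind of corrupted state is what lets concurrent invocation degrade a service for every user.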

9 Security
Interesting effect of firewalls on testing and debugging
Accountability and audit trails demand that users be given individual accounts on each node
Overhead of node and user certificates (they always expire at the wrong time)
Current security model does not scale:
  – Assuming cost of £18/admin hour
  – 10 users per node (site)
  – It will cost approx. £300,000 to set up a 100 node grid with 1000 users
  – It will be prohibitively expensive to scale up to 1,000 nodes (with admin costs in excess of £6M)
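
The 100-node estimate is consistent with a simple per-account cost model, sketched below. The slides give only the hourly rate and the totals, so the model itself (every user needs an account on every node) and the figure of 10 admin minutes per account are assumptions chosen because they reproduce the £300,000 figure. The sketch does not attempt to reproduce the 1,000-node estimate, whose workings are not stated.

```python
def account_setup_cost(nodes, users, minutes_per_account, hourly_rate_gbp=18.0):
    """Admin cost in GBP if every user needs an individual account on every node.

    `minutes_per_account` is an assumed effort per account; the slides do
    not state it.
    """
    accounts = nodes * users
    return accounts * minutes_per_account * hourly_rate_gbp / 60


# 100 nodes x 1000 users = 100,000 accounts; at 10 minutes each and
# 18 GBP per hour that is 3 GBP per account.
cost_100_nodes = account_setup_cost(nodes=100, users=1000, minutes_per_account=10)
```

Because the number of accounts is the product of nodes and users, the cost grows quadratically when both scale together, which is the sense in which the model does not scale.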

10 Deployment
How do admins get grid middleware deployed systematically onto grid nodes?
How can users get the services onto remote hosts?
We tried out SmartFrog (http://www.smartfrog.org)
Worked very well inside a node
Impossible across organisations:
  – SmartFrog daemon would need to execute actions with root privileges, which some site admins just did not agree to
  – Security paramount (SmartFrog would be the perfect virus distribution engine)
  – SmartFrog’s security infrastructure incompatible with GT 3.2 infrastructure

11 Looking Ahead
Installation efforts need to be reduced significantly
  – Binary distributions
  – For a few selected HW/OS platforms
Standards compliance
  – Track standards by all means
  – Otherwise no economies of scale
Management console
  – Add / remove grid hosts
  – Need to be able to monitor status of grid resources
  – Across organisational boundaries
More lightweight security model needed
  – Role-based Access Control
  – Trust-delegation
Deployment is a first-class citizen
  – Avoid adding as an afterthought
  – Needs to be built into middleware stack

12 Conclusions
Very interesting experience
Building a distributed system across organisational boundaries is different from building a system over a LAN
Insights that might prove useful for
  – OMII
  – Globus
  – ETF
There is a lot more work to do before we realize the vision of the Grid!

13 Acknowledgements
A large number of people have helped with this project, including
  – Dave Berry (NeSC)
  – Paul Brebner (UCL, now CSIRO)
  – Tom Jones (UCL, now Symantec)
  – Oliver Malham (NeSC)
  – David McBride (LeSC)
  – Savas Parastatidis (NEReSC)
  – Steven Newhouse (OMII)
  – Jake Wu (NEReSC)
For further details (including IGR) check out http://sse.cs.ucl.ac.uk/UK-OGSA

