Tycho: A Resource Discovery and Messaging Framework for Distributed Applications Matthew Grove Viva Presentation, November 2006.

1 Outline Research Goals, An Overview of Tycho, Comparative Benchmarks, Applications of Tycho, Tycho Swarm, a File Distribution Utility (Demo), Summary.

2 Some Background Two key services for distributed systems are a mechanism for discovering remote components (such as a registry) and a mechanism for sending messages between these components: –These two services are interdependent. Current solutions require application scientists to assemble their systems from a diverse range of services. One approach has been to produce toolkits that bundle pre-selected sets of services together, for example Globus.

3 Research Goals The thesis of this research work is that by combining registry and messaging into a single software framework, the task of binding together distributed systems can be simplified. The proposed solution uses an Internet-based architecture that keeps complexity at the edges of a robust and secure set of core services - a novel approach! This framework facilitates extensibility while limiting the installation and management costs of using the software. The design and development of the framework - known as Tycho - has an overarching goal of reducing the complexity of developing distributed applications.

4 High-level Requirements These are the desirable features for Tycho, as argued in the dissertation: –Scalability: able to cope with the sizes typical of modern distributed systems, –High performance, –Extensibility: able to add new features and interoperate with other systems, –Security out of the box, –Manageability: ease of installation and use, for example by minimizing elements such as software dependencies, firewall requirements and the amount of configuration needed to deploy Tycho.

5 The Tycho Implementation Tycho is the reference implementation of the framework developed during the PhD. The Tycho components are: –Mediators, –Clients (Producers and Consumers), –Utilities. The Tycho mediator provides services that allow clients to discover each other using a Virtual Registry (VR) made up of a network of mediators; this also aids communication over both LAN and WAN. Utilities are extensions to Tycho’s functionality. Tycho used to be called javaGMA or jGMA (poor choice of name!)
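
The discovery role of the mediators can be illustrated with a minimal sketch. It is written in Python for brevity (Tycho itself is Java) and is not Tycho's API: the `Mediator` class, its methods, and the entry names are all hypothetical. It only shows the idea of producers registering with a local mediator and a consumer's query being answered by the local mediator plus its peers in the VR.

```python
from dataclasses import dataclass, field


@dataclass
class Mediator:
    """Hypothetical mediator: a local registry plus links to peer mediators."""
    name: str
    entries: dict = field(default_factory=dict)  # producer name -> attributes
    peers: list = field(default_factory=list)    # other mediators in the VR

    def register(self, producer, attributes):
        """A producer advertises itself to its local mediator."""
        self.entries[producer] = dict(attributes)

    def query_local(self, key, value):
        """Return local producers whose attribute `key` equals `value`."""
        return sorted(
            name for name, attrs in self.entries.items()
            if attrs.get(key) == value
        )

    def query(self, key, value):
        """Consumer-facing query: answer locally, then ask each peer mediator."""
        found = [(self.name, p) for p in self.query_local(key, value)]
        for peer in self.peers:
            found.extend((peer.name, p) for p in peer.query_local(key, value))
        return found


# Two mediators joined into one virtual registry (names are illustrative).
site_a = Mediator("mediator-a")
site_b = Mediator("mediator-b")
site_a.peers.append(site_b)
site_b.peers.append(site_a)

site_a.register("webcam-1", {"type": "webcam"})
site_b.register("webcam-2", {"type": "webcam"})
site_b.register("logger-1", {"type": "log"})

results = site_a.query("type", "webcam")
print(results)
```

A consumer attached to `mediator-a` sees producers registered at either site, which is the sense in which the registry is "virtual" in this sketch.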

6 Tycho’s Architecture

7 General Design Philosophy Reuse existing software components, if possible, rather than reinvent existing services or functionality. Try to make use of existing software infrastructure. Ensure that Tycho is simple to install, configure and use. Provide a ‘basic release’ whose functionality can be extended with further, more sophisticated components: Tycho utilities. Because we require portability and interoperability with other distributed systems, Java was a good choice of implementation language.

8 Tycho Mediator Implementation Tycho provides a choice of implementations for each core service. Tycho’s design is described in a paper for the "Work-in-Progress Novel Grid Technologies" track of the IEEE International Symposium on Cluster Computing and the Grid 2005 (CCGrid 2005).

9 Tycho Clients & Utilities The Tycho Connector provides the API for building producers and consumers. Extra functionality can be added as utilities.

10 An Example of Tycho’s Setup

11 Tycho Benchmarks Three rounds of benchmarking measured the performance of Tycho against state-of-the-art and widely used systems: –Communications: measured the performance of inter-client and inter-mediator messaging for Tycho and NaradaBrokering. –Virtual Registry tests: measured and compared the performance of the Tycho VR to Globus MDS4 and gLite R-GMA. –Component tests: different components of the VR were tested in various configurations. The results are presented in a paper in the proceedings of the IEEE International Conference on Cluster Computing 2006 (Cluster 2006).

12 Sample VR Benchmark Results [Figure: sample VR benchmark results, annotated "MDS4 out of memory"]

13 Benchmark Results Summary Tycho has better performance and client scalability than R-GMA, MDS4 and NaradaBrokering. R-GMA, MDS4 and NaradaBrokering all crashed during testing when they exceeded the maximum memory available for the tests (1.5 Gbytes). Memory management in Java systems is an issue: –Without bounded buffering or flow control, messages can consume the entire Java heap. Storing information internally as XML seems to be a source of some of these memory problems: –Java database solutions such as HSQLDB can provide a high-performance way of off-loading some of the storage requirements to disk.
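
The point about buffering and flow control can be made concrete with a small sketch. It is in Python rather than Java and is not taken from Tycho or any of the benchmarked systems; the `BoundedMailbox` class and its method names are hypothetical. It shows one common approach: cap the number of buffered messages and tell the sender when the buffer is full, so that memory use stays bounded instead of growing until the heap is exhausted.

```python
import queue


class BoundedMailbox:
    """Hypothetical message buffer that refuses messages once it is full."""

    def __init__(self, max_messages):
        self._queue = queue.Queue(maxsize=max_messages)
        self.rejected = 0

    def offer(self, message):
        """Try to enqueue without blocking; False tells the sender to back off."""
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.rejected += 1
            return False

    def take(self):
        """Remove and return the oldest buffered message."""
        return self._queue.get_nowait()

    def size(self):
        """Number of messages currently buffered."""
        return self._queue.qsize()


mailbox = BoundedMailbox(max_messages=3)

# A fast sender offers five messages; only three fit in the buffer.
accepted = [mailbox.offer(f"msg-{i}") for i in range(5)]

# Once the receiver drains one message, the sender can make progress again.
oldest = mailbox.take()
retry_ok = mailbox.offer("msg-5")
print(accepted, oldest, retry_ok)
```

Rejecting is only one policy; a blocking `put` with a timeout would give the same bound on memory while slowing the sender down rather than dropping work.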

14 Tycho Core – Future Work Some further performance improvements: –Caching of local mediator queries to reduce response times, –A hybrid VR interconnect that uses IRC for query routing and HTTP for transporting large responses. Additional functionality can be added to provide advanced services: –WS-based transport handlers for interoperability.
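
As an illustration of the query-caching idea, here is a minimal time-to-live cache in Python. This is a sketch under assumptions, not the planned Tycho design: the `QueryCache` class, the expiry policy, and the query string format are all hypothetical. The clock is injectable so that the behaviour can be checked without waiting.

```python
import time


class QueryCache:
    """Hypothetical cache that remembers query results for a fixed time."""

    def __init__(self, lookup, ttl_seconds, clock=time.monotonic):
        self._lookup = lookup      # the slow function that answers a query
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}           # query -> (expiry time, result)
        self.misses = 0

    def get(self, query):
        """Return a cached result if still fresh, otherwise look it up again."""
        now = self._clock()
        cached = self._store.get(query)
        if cached is not None and cached[0] > now:
            return cached[1]
        self.misses += 1
        result = self._lookup(query)
        self._store[query] = (now + self._ttl, result)
        return result


# A fake clock and a stand-in for a registry query, for demonstration only.
fake_time = [0.0]
calls = []


def slow_registry_lookup(query):
    calls.append(query)
    return [f"result-for-{query}"]


cache = QueryCache(slow_registry_lookup, ttl_seconds=10, clock=lambda: fake_time[0])
first = cache.get("type=webcam")    # miss: goes to the registry
second = cache.get("type=webcam")   # hit: served from the cache
fake_time[0] = 11.0
third = cache.get("type=webcam")    # entry expired: goes to the registry again
print(len(calls), cache.misses)
```

The trade-off in any such cache is staleness: a longer time-to-live reduces response times further but delays the visibility of newly registered or removed entries.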

15 Tycho Applications We developed a number of applications to further validate the implementation. These include: –Demonstrations of publishing and discovering distributed webcams, –Remote resource discovery for the VOTechBroker project: The broker is part of the European Virtual Observatory project; Tycho provides automatic resource discovery for job submission. –Binding components together for the Semantic Log Analyser (Slogger) project: Here Tycho helps locate and gather distributed logs for analysis.

16 Content Distribution With Tycho We wanted to develop a Tycho utility that would demonstrate and validate the utility concept: –We wanted to create something useful! We created a content distribution system called the Tycho swarm utility. The swarm utility provides content distribution similar to BitTorrent and overcomes the common ‘2 Gigabyte file size problem’. Content is split into ‘chunks’ and the VR is used to store chunk availability. Peers use the VR to locate each other and decide which chunks to download. Tycho messages are used to transfer the chunks between peers, and peers cooperate to distribute the content throughout the swarm.
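
The chunking scheme described on this slide can be sketched as follows. The code is Python, not the swarm utility's Java source, and every name in it (`split_into_chunks`, `ChunkRegistry`, the peer names) is hypothetical. The slide does not say how a peer chooses its next chunk; the rarest-first rule used here is an assumption borrowed from BitTorrent. The dictionary stands in for the VR, and the assignment inside the loop stands in for a chunk transferred as a Tycho message.

```python
def split_into_chunks(data, chunk_size):
    """Split a byte string into fixed-size chunks; the last may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class ChunkRegistry:
    """Hypothetical stand-in for the VR: records which peers hold each chunk."""

    def __init__(self, total_chunks):
        self.holders = {index: set() for index in range(total_chunks)}

    def announce(self, peer, index):
        """A peer publishes that it now holds chunk `index`."""
        self.holders[index].add(peer)

    def next_chunk_for(self, peer):
        """Pick a chunk the peer lacks and someone else holds, rarest first.

        Ties are broken by the lowest chunk index. Rarest-first is an
        assumption of this sketch, not a documented Tycho policy.
        """
        candidates = [
            (len(peers), index)
            for index, peers in self.holders.items()
            if peers and peer not in peers
        ]
        return min(candidates)[1] if candidates else None


content = b"abcdefghij"
chunks = split_into_chunks(content, 4)

registry = ChunkRegistry(len(chunks))
for index in range(len(chunks)):
    registry.announce("seed", index)   # the seed holds every chunk
registry.announce("peer-a", 0)         # another peer already has chunk 0

first_choice = registry.next_chunk_for("peer-b")

# peer-b keeps fetching until nothing is missing, announcing each chunk.
received = {}
while (index := registry.next_chunk_for("peer-b")) is not None:
    received[index] = chunks[index]    # stands in for a message transfer
    registry.announce("peer-b", index)

rebuilt = b"".join(received[i] for i in sorted(received))
print(first_choice, rebuilt)
```

Because each peer announces a chunk as soon as it has it, later peers can fetch from any holder rather than only from the original seed, which is the cooperative behaviour the slide describes.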

17 Swarm Utility Architecture

18 Swarm Utility Summary. The utility was developed to test the potential of Tycho utilities and also further stress test the overall infrastructure: –By simultaneously utilising the VR and messaging functionality, –Storing and updating thousands of entry records in the VR, –Sending thousands of multi-megabyte messages between clients. Its potential uses include: –Distributing files for collaboration purposes, –Staging data for computation, –Mirroring and managing large data sets.

19 Swarm Utility Demo

20 Summary The reference implementation of Tycho has been completed. Tycho has been released under the LGPL Open Source license: –The focus now is on developing Tycho utilities to provide more feature-rich functionality. This work has been summarised in a paper accepted for a special issue of The Journal of Supercomputing.

21 Research Goals Scalability and high performance have been demonstrated by the benchmarking. Extensibility has been shown with the development of the swarm utility and the different services and protocols supported by Tycho. Tycho has security ‘out of the box’, using HTTPS and passwords or certificates for wide-area access control and encryption; no comparable system we reviewed currently has this. Manageability has been maximised: Tycho requires one firewall port, has no external dependencies other than a JVM and can run with zero configuration.

22 Some Experiences / Observations Java developers should think carefully about how memory is used in their applications. Systems which store their data internally as XML will probably have relatively poor performance and require large amounts of memory and resources to work. If you use a servlet container, Jetty offers much better performance than Apache Tomcat. Instead of using a separate database, consider the Java-based HSQLDB: we have shown it can achieve excellent performance, and it removes an external dependency from your software. Java is not a magic bullet for portability; systems such as R-GMA are evidence of this.

23 Links Project Web page: – The DSG Web page: – The ACET Web page: –