1 Object Level Physics Data Replication in the Grid
Koen Holtman, Caltech/CMS
ACAT’2000, Fermilab, October 16-20, 2000

2 Introduction
- Object replication: replicate (parts of) particular events from a big store to a smaller store
- Doing this in a convenient way is important for making tier 3-4 regional centre systems useful
- A prototype object replication tool is now working
- Uses Grid middleware (Globus) and Objectivity
- Replicates objects from a big data store to a desktop machine
- Demo at SC’2000
- Will do some integration into the CMS physics analysis system (ORCA)

3 Data model
- Physics objects are read-only (versioning, not overwriting)
- Objects are organised as a sparse (SQL) table
- Rows = events; event ID = 'numbering domain + 64 bit integer'
- Columns = object types + versions; column ID = a string (or URL)
- Software tools know how to resolve row IDs + column IDs to the objects
- This can be done quickly in an iterator
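The data model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the prototype's implementation (which uses Objectivity): the names `EventID`, `SparseObjectTable`, `resolve`, and `iterate`, and the column ID `"tracks/v1"`, are hypothetical, and an in-memory dictionary stands in for the real store.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional, Tuple


@dataclass(frozen=True)
class EventID:
    """Row ID: a numbering domain plus a 64 bit integer (hypothetical type)."""
    domain: str
    number: int

    def __post_init__(self) -> None:
        if not 0 <= self.number < 2 ** 64:
            raise ValueError("event number must fit in 64 bits")


class SparseObjectTable:
    """Sparse table: rows are events, columns are object type + version IDs."""

    def __init__(self) -> None:
        # Only cells that actually hold an object are stored.
        self._cells: Dict[Tuple[EventID, str], bytes] = {}

    def add(self, event: EventID, column: str, obj: bytes) -> None:
        key = (event, column)
        if key in self._cells:
            # Read-only objects: a change means a new version, i.e. a new column ID.
            raise ValueError("object exists; store a new version under a new column ID")
        self._cells[key] = obj

    def resolve(self, event: EventID, column: str) -> Optional[bytes]:
        """Map a row ID + column ID to the object, or None if the cell is empty."""
        return self._cells.get((event, column))

    def iterate(self, events: Iterable[EventID], column: str) -> Iterator[Tuple[EventID, bytes]]:
        """Yield the objects of one column for the given events, skipping empty cells."""
        for event in events:
            obj = self.resolve(event, column)
            if obj is not None:
                yield event, obj


table = SparseObjectTable()
e1, e2 = EventID("domainA", 1), EventID("domainA", 2)
table.add(e1, "tracks/v1", b"payload-1")
found = list(table.iterate([e1, e2], "tracks/v1"))  # e2 has no object in this column
```

Refusing to overwrite a cell mirrors the "versioning, not overwriting" rule: a modified object would get a new column ID rather than replace the old one.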

4 Most basic use case
- Replicate a subset of objects to a user desktop machine
- Do not ship individual objects, but (database) files containing object sets
- Works as follows…
- Note: as of two weeks ago, ORCA also supports something like this

5 More complex use case
- Replicate some more objects
- This is where tool support (and strong catalog technology) really starts to become useful!
- The catalog uses per-file object indices sorted on the 64 bit number of the corresponding event
- This allows fast set operations
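Why sorted indices make set operations fast can be shown with a short sketch. It assumes each per-file index is an ascending list of 64 bit event numbers without duplicates; the function names and the idea of computing "objects still to be shipped" as a set difference are illustrative assumptions, not details taken from the prototype.

```python
from typing import List


def sorted_difference(wanted: List[int], have: List[int]) -> List[int]:
    """Event numbers in `wanted` that are not in `have`.

    Both inputs must be sorted ascending without duplicates, so a single
    linear merge pass is enough (no hashing, no repeated searching).
    """
    result: List[int] = []
    i = j = 0
    while i < len(wanted):
        if j >= len(have) or wanted[i] < have[j]:
            result.append(wanted[i])  # not present locally
            i += 1
        elif wanted[i] == have[j]:
            i += 1                    # already present, skip
            j += 1
        else:
            j += 1                    # advance the local index
    return result


def sorted_intersection(a: List[int], b: List[int]) -> List[int]:
    """Event numbers present in both sorted indices, again in one merge pass."""
    result: List[int] = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical example: the user asks for five events, three files' worth
# of which are partly on the desktop already.
requested = [1, 3, 5, 7, 9]
already_local = [3, 4, 7]
to_ship = sorted_difference(requested, already_local)
overlap = sorted_intersection(requested, already_local)
```

Each operation costs time proportional to the combined length of the two indices, which is what keeps repeated "replicate some more objects" requests cheap.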

6 Use of Globus middleware
- Globus components currently used:
  - Communication with server: GSI-authenticated TCP/IP connection (re-using code from GDMP)
  - Shipping files: GSI-authenticated FTP
- Will directly benefit from future improvements in the FTP space
- Good experience using Globus components so far
- Intend to use additional Globus components in future, when extending the prototype
[Diagram: Object Replication Client and Object Replication Server, linked by a GSI-authenticated connection and GSI FTP]

7 Some performance results
- Replicating 1900 objects of 100 KB each
- With both client and server on my desktop Linux machine: 5.8 MB/s
- Replicating to another building: 0.72 MB/s (network bound)
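To put these rates in perspective, the implied transfer times can be worked out as below. The slides give only the object count, object size, and throughput; the elapsed times are derived here, assuming decimal units (1 MB = 1000 KB) and that the quoted rates hold for the whole transfer.

```python
OBJECT_COUNT = 1900
OBJECT_SIZE_KB = 100
KB_PER_MB = 1000          # assumption: decimal units

LOCAL_RATE_MB_S = 5.8     # client and server on the same desktop machine
REMOTE_RATE_MB_S = 0.72   # replication to another building (network bound)


def total_megabytes(count: int, size_kb: int) -> float:
    """Total payload in MB for `count` objects of `size_kb` each."""
    return count * size_kb / KB_PER_MB


def transfer_seconds(total_mb: float, rate_mb_s: float) -> float:
    """Elapsed time at a constant throughput."""
    return total_mb / rate_mb_s


total_mb = total_megabytes(OBJECT_COUNT, OBJECT_SIZE_KB)    # 190 MB in total
local_s = transfer_seconds(total_mb, LOCAL_RATE_MB_S)       # roughly 33 s
remote_s = transfer_seconds(total_mb, REMOTE_RATE_MB_S)     # roughly 264 s
slowdown = LOCAL_RATE_MB_S / REMOTE_RATE_MB_S               # roughly 8x

print(f"{total_mb:.0f} MB: {local_s:.0f} s locally, {remote_s:.0f} s remotely")
```

Under these assumptions the same 190 MB that moves in about half a minute locally takes over four minutes across buildings, consistent with the remote case being network bound.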

8 Conclusion
- Current status: the prototype object replication tool is working
- Future plans: integration with GDMP and with CMS ORCA
- Whether this will move beyond the prototype stage will depend on CMS production needs
- The prototype will function as input to architecture discussions in Grid efforts (files vs. objects)
- Prototype development will move from a client-server to an N-party scenario