POOL/RLS Experience Current CMS Data Challenges shows clear problems wrt to the use of RLS Partially due to the normal “learning curve” on all sides in.

Slides:



Advertisements
Similar presentations
Data Management Expert Panel. RLS Globus-EDG Replica Location Service u Joint Design in the form of the Giggle architecture u Reference Implementation.
Advertisements

Database System Concepts and Architecture
Database Architectures and the Web
Transaction.
The POOL Persistency Framework POOL Summary and Plans.
RLS Production Services Maria Girone PPARC-LCG, CERN LCG-POOL and IT-DB Physics Services 10 th GridPP Meeting, CERN, 3 rd June What is the RLS -
Technical Architectures
Oxford Jan 2005 RAL Computing 1 RAL Computing Implementing the computing model: SAM and the Grid Nick West.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide 1- 1.
MIS 710 Module 0 Database fundamentals Arijit Sengupta.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide 1- 1 Chapter 1 - Introduction: Databases and Database Users - Outline Types of Databases and.
1 Dr. Markus Hillenbrand, ICSY Lab, University of Kaiserslautern, Germany A Generic Database Web Service for the Venice Service Grid Michael Koch, Markus.
Database Design – Lecture 16
Robotics Simulation (Skynet) Andrew Townsend Advisor: Professor Grant Braught.
Scalable Web Server on Heterogeneous Cluster CHEN Ge.
Session-8 Data Management for Decision Support
Database Systems: Design, Implementation, and Management Ninth Edition Chapter 12 Distributed Database Management Systems.
Elmasri and Navathe, Fundamentals of Database Systems, Fourth Edition Copyright © 2004 Pearson Education, Inc. Slide 2-1 Data Models Data Model: A set.
Distributed Information Systems. Motivation ● To understand the problems that Web services try to solve it is helpful to understand how distributed information.
Replica Management Services in the European DataGrid Project Work Package 2 European DataGrid.
POOL Status and Plans Dirk Düllmann IT-DB & LCG-POOL Application Area Meeting 10 th March 2004.
Metadata Mòrag Burgon-Lyon University of Glasgow.
Topic Distributed DBMS Database Management Systems Fall 2012 Presented by: Osama Ben Omran.
Distributed File Systems 11.2Process SaiRaj Bharath Yalamanchili.
D. Duellmann - IT/DB LCG - POOL Project1 The LCG Pool Project and ROOT I/O Dirk Duellmann What is Pool? Component Breakdown Status and Plans.
Object storage and object interoperability
ESG-CET Meeting, Boulder, CO, April 2008 Gateway Implementation 4/30/2008.
EJB Enterprise Java Beans JAVA Enterprise Edition
1 Information Retrieval and Use De-normalisation and Distributed database systems Geoff Leese September 2008, revised October 2009.
POOL & ARDA / EGEE POOL Plans for 2004 ARDA / EGEE integration Dirk Düllmann, IT-DB & LCG-POOL LCG workshop, 24 March 2004.
D. Duellmann - IT/DB LCG - POOL Project1 Internal Pool Release V0.2 Dirk Duellmann.
LHCC Referees Meeting – 28 June LCG-2 Data Management Planning Ian Bird LHCC Referees Meeting 28 th June 2004.
A quick summary and some ideas for the 2005 work plan Dirk Düllmann, CERN IT More details at
Introduction: Databases and Database Systems Lecture # 1 June 19,2012 National University of Computer and Emerging Sciences.
System Software Laboratory Databases and the Grid by Paul Watson University of Newcastle Grid Computing: Making the Global Infrastructure a Reality June.
Binary Change Set Composition Tijs van der Storm Centrum voor Wiskunde en Infromatica
Chapter 1 Database and Database Users
CS 540 Database Management Systems
CS4222 Principles of Database System
Introduction To DBMS.
Database Management.
EGEE Middleware Activities Overview
(on behalf of the POOL team)
Outline Types of Databases and Database Applications Basic Definitions
N-Tier Architecture.
POOL persistency framework for LHC
LCG Distributed Deployment of Databases A Project Proposal
Dirk Düllmann CERN Openlab storage workshop 17th March 2003
Chapter 19: Distributed Databases
LCG middleware and LHC experiments ARDA project
Francesco Giacomini – INFN JRA1 All-Hands Nikhef, February 2008
#01 Client/Server Computing
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 2 Database System Concepts and Architecture.
Ministry of Higher Education
Chapter 2: Database System Concepts and Architecture
Data, Databases, and DBMSs
Database management concepts
Grid Data Integration In the CMS Experiment
Dr. Awad Khalil Computer Science Department AUC
Introduction to Databases Transparencies
Lecture 1: Multi-tier Architecture Overview
Outline Introduction Background Distributed DBMS Architecture
Database management concepts
What is a Database? Textbook: Chapter 1 © 1999 UW CSE 2/23/2019.
POOL Status & Release Plan for V0.4
What is a Database? © 1997 UW CSE 4/11/2019.
Database System Concepts and Architecture
Dr. Awad Khalil Computer Science Department AUC
#01 Client/Server Computing
Database management systems
Presentation transcript:

POOL/RLS Experience Current CMS Data Challenges shows clear problems wrt to the use of RLS Partially due to the normal “learning curve” on all sides in using a new systems Some reasons are Not yet fully optimised service Inefficient use of the query facilities POOL and RLS service people works closely with production teams to understand their issues Which queries are needed? How to structure the meta data? Which catalog interface? Which indices? POOL & ARDA / EGEE D.Duellmann

More POOL/RLS Experience But poor performance also due to known RLS design problems! File names and related meta data are used for queries Current RLS split of mapping data from file meta data (LRC vs. RMC) results in rather poor performance for combined queries Forces the applications (eg POOL) to perform large joins on the client side rather than fully exploit the database backend Many catalog operations are bulk operations Current RLS interface is very low level and results in large overheads on bulk operations (too many network round-trips) Transaction support would greatly simplify the deployment A partially successful bulk insert/update requires recovery “by hand” These are not really special requirements imposed by POOL Still acceptable performance and scalability needs a catalog design which keeps the data which is used in one query close to each other Try to work around some of this know issues on the POOL side POOL & ARDA / EGEE D.Duellmann

Summary POOL Focus for 2004 POOL will be a major ARDA/EGEE client Consolidation and Optimisation RDBMS vendor independence Common model for distributed, heterogeneous meta data catalogs ConditionsDB production release and integration with POOL POOL will be a major ARDA/EGEE client Needs to stay aligned with ARDA concepts and EGEE services Provider of persistent object storage and collections Joint work package between POOL and ARDA in particular in the Collections area Need more active experiment involvement Gaining valuable real life (data challenge) experience with POOL/RLS as input for next round Produces concrete experiment requirements as input to ARDA/EGEE POOL may be able to workaround some of the RLS design problems A real solution will be required from ARDA/EGGE to achieve the performance and scalability goals POOL & ARDA / EGEE D.Duellmann

Input for a next software generation Catalogs of “things” annotated with their meta data exist all over the system These catalogs services could/should share the implementation and distribution mechanism Separation of catalog mapping data from associated meta data makes meta data almost useless Efficient queries require that mapping and meta data are handled by (in!) one same database backend Higher level interface for bulk insert and bulk query is required The current use of SOAP RPC call for each individual data entry will not scale to larger productions Transaction concept is required for a maintainable stable production environment User transactions may span span several services! POOL & ARDA / EGEE D.Duellmann