On the Verge of One Petabyte – the Story Behind the BaBar Database System
Jacek Becla, Stanford Linear Accelerator Center
For the BaBar Computing Group
CHEP'03

Slide 2 of 18 – Outline
- Talk will cover:
  - Our experience with running a large-scale DB system
    - Achievements and issues
    - New development and what drives it
  - Main focus on the period since the last CHEP

Slide 3 of 18 – Providing Persistency for BaBar
- Growing complexity and demands
- Changing requirements
- Hitting unforeseen limits in many places
- Non-trivial maintenance
  - Most problems are independent of the persistence technology
  - System becoming more and more distributed
- Very lively environment
  - Production not as stable as one would imagine

Slide 4 of 18 – Some Numbers
- 750+ TB of data
- 0.5+ million DB files
- Several billion events
- 60+ million collections
- Simultaneous analysis jobs accessing the DB are common

Slide 5 of 18 – Data Availability is Essential
- Prompt Calibration
  - Rapid feedback, keeping up with the detector
- Event Reco (ER)
  - Data available for analysis within a week
- Reprocessing
  - All data reprocessed before conferences
- Analysis
  - Outages < 4%
    - Driven mostly by power outages and hardware failures

Slide 6 of 18 – What Changed Since Sep'01 (Last CHEP)?
- Event Reconstruction
  - 4 output physics streams → 20
  - 20 output streams → pointer collections
  - Rolling calibrations now separated
  - Runs now processed in parallel
  - Raw and rec no longer persisted
  - Planning to run skim production separately
- (continued…)

Slide 7 of 18 – What Changed Since Sep'01 (Last CHEP)? (continued)
- Simulation Production
  - 1.5 → 3 MC events per real event
  - ~8 → ~24 production sites
- Analysis
  - Bridge federations now fully functional
  - Significant system growth
    - 29 data servers, 34 lock/journal servers
    - 66 TB of disk space, 101 slave federations

Slide 8 of 18 – Some Challenges
- Setting up ER/REP in Padova
  - All Linux based
- Recovery from Linux-client crashes that leave connections open on the server side
- Three data corruptions:
  1. Understood and fixed – a race condition in which file descriptors were closed and reopened incorrectly (see the sketch below)
  2. Never understood; went away after a power outage (Dec'02)
     - Not sure what was at fault: Objectivity? The Linux kernel?
  3. Problems with B-tree index updates in the temporal database
- Imposed by our own software:
  - Lock collisions
  - Large number of skim collections
    - Overflowing containers
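The descriptor-reuse failure mode behind corruption #1 can be shown in a few lines. This is a minimal, hypothetical Python sketch of the general mechanism, not BaBar's actual Objectivity/AMS code: once a descriptor is closed, the kernel hands the same number to the next open(), so a stale writer silently lands in the wrong file.

```python
import os, tempfile

# Minimal sketch of the failure mode only; the files and flow are invented.
d = tempfile.mkdtemp()
a_path, b_path = os.path.join(d, "a.dat"), os.path.join(d, "b.dat")

fd_a = os.open(a_path, os.O_CREAT | os.O_WRONLY)
os.close(fd_a)                                    # a cleanup path closes the descriptor...
fd_b = os.open(b_path, os.O_CREAT | os.O_WRONLY)  # ...an unrelated open reuses the number
assert fd_b == fd_a                               # same integer, now a different file

os.write(fd_a, b"payload meant for a.dat")        # the stale writer corrupts b.dat
os.close(fd_b)
print(open(b_path, "rb").read())                  # b'payload meant for a.dat'
```

Here the reuse is forced sequentially to keep the demo deterministic; the production bug hit the same mechanism under concurrency, which is why it surfaced as sporadic corruption.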

Slide 9 of 18 – Some New Features
- Bridge federations – all 3 phases deployed
- Data compression
- New Conditions DB (CDB)
- Automatic load balancing

Slide 10 of 18 – Conditions DB
- Main features
  - New conceptual model for metadata: a 2-d space of validity time and insertion time, plus revisions, persistent configurations, types of conditions, and a hierarchical namespace for conditions (toy sketch below)
  - Flexible user data clustering
  - Support for distributed updates and use
  - State ID
  - Scalability problems solved
  - Significant speedup for critical use cases
- Status
  - In production since Fall'02
  - Data converted to the new format
  - Working on distributed management tools
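To make the 2-d validity/insertion-time model concrete, here is a toy Python sketch with invented names; it is not the CDB API, only the lookup idea: return the payload valid at time t as the system knew it at a given insertion time.

```python
from collections import namedtuple

# Toy bi-temporal conditions folder (names invented; not the CDB API).
Entry = namedtuple("Entry", "valid_from valid_to inserted_at payload_ref")

class ConditionsFolder:
    def __init__(self):
        self.entries = []                         # kept in insertion order

    def insert(self, valid_from, valid_to, inserted_at, payload_ref):
        self.entries.append(Entry(valid_from, valid_to, inserted_at, payload_ref))

    def lookup(self, t, as_of):
        """Payload valid at time t, as known at insertion time `as_of`."""
        matches = [e for e in self.entries
                   if e.valid_from <= t < e.valid_to and e.inserted_at <= as_of]
        return max(matches, key=lambda e: e.inserted_at, default=None)

# Usage: a calibration later superseded by a re-derived revision.
folder = ConditionsFolder()
folder.insert(100, 200, inserted_at=1, payload_ref="drift-calib-v1")
folder.insert(100, 200, inserted_at=5, payload_ref="drift-calib-v2")
print(folder.lookup(150, as_of=3).payload_ref)    # drift-calib-v1 (as known back then)
print(folder.lookup(150, as_of=9).payload_ref)    # drift-calib-v2 (latest revision)
```

Pinning the insertion time is what keeps old results reproducible while new revisions keep arriving; in the real CDB, revisions, persistent configurations and the hierarchical namespace presumably sit on top of this kind of lookup.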

Slide 11 of 18 – AMS Load Balancing
- Dynamically stages in / replicates files
  - Based on configurable parameters and host load (toy sketch below)
- Increases fault tolerance
  - Data servers can be taken offline transparently
- Scalable
  - Hierarchical
- Currently being tested
[Figure: dynamic selection among data servers coordinated by a distinguished AMS]
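A toy sketch of the selection idea (invented names and policy, not the real AMS protocol): a distinguished AMS redirects each open to the least-loaded data server that already holds the file, staging in a replica when none does.

```python
# Toy sketch only; the real AMS has its own protocol and configurable policies.
class DataServer:
    def __init__(self, name, load):
        self.name, self.load, self.files = name, load, set()

class DistinguishedAMS:
    def __init__(self, servers):
        self.servers = servers

    def open(self, path):
        holders = [s for s in self.servers if path in s.files]
        if holders:                                    # prefer servers already holding the file
            return min(holders, key=lambda s: s.load)
        target = min(self.servers, key=lambda s: s.load)
        target.files.add(path)                         # stage in / replicate on demand
        return target

servers = [DataServer("srv%02d" % i, load=0.1 * i) for i in range(4)]
ams = DistinguishedAMS(servers)
print(ams.open("/store/events/run1234.db").name)       # srv00: least loaded, stages the file
print(ams.open("/store/events/run1234.db").name)       # srv00 again: replica already present
```

Because any server can simply drop out of the candidate list, a data server can also be taken offline transparently, which is the fault-tolerance point made above.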

Slide 12 of 18 – Size
- Raw/rec not persisted
  - Event: ~200 kB → ~20 kB (rough arithmetic below)
- Continues to grow fast
  - Higher luminosity
  - 115 skims
  - Reprocessing of all data every year
  - More MC events (1.5:1 → 3:1)
- Reducing size
  - Event store redesign (see Yemi's talk tomorrow)
  - Data compression (achieving ~2:1 compression)
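A rough back-of-the-envelope in Python to make the per-event numbers above concrete; the yearly event count is a made-up round figure for illustration, not a BaBar number.

```python
# Illustrative arithmetic only; events_per_year is a hypothetical round number.
events_per_year = 1e9
kB, TB = 1e3, 1e12

with_raw_rec = events_per_year * 200 * kB        # ~200 kB/event persisted
micro_only   = events_per_year *  20 * kB        # ~20 kB/event once raw/rec are dropped
compressed   = micro_only / 2.0                  # ~2:1 compression on top of that

print("with raw/rec: %3.0f TB" % (with_raw_rec / TB))   # 200 TB
print("micro only:   %3.0f TB" % (micro_only / TB))     #  20 TB
print("compressed:   %3.0f TB" % (compressed / TB))     #  10 TB
```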

Slide 13 of 18 – Media Attention
- World's largest database
  - 500 TB – see the SLAC press release (Apr'02)
- Many ideas/problems/solutions common to any large-scale database system
- Newspaper and local TV coverage
  - Non-HEP attention

Slide 14 of 18 – Size Matters in the Data World (press headlines)
- "Mountains Of Data: 500 Terabytes And Counting"
- "A firm grip, or gagging on gigabytes?"
- "Stanford claims world's largest database"
- "500,000 gigabytes and growing: SLAC houses world's largest database"
- "University database breaks world record"
- "Stanford Linear Accelerator Database Reaches 500,000 Gigabytes"
- "Stanford researchers may have world's largest database"

Slide 15 of 18 – New Computing Model
- Discussed in Fall'02
- Main decisions
  - Two-stage approach:
    - Develop a ROOT-based "new micro" as an alternative to nTuples
    - Develop a full ROOT-based event store
  - Deprecate ROOT-based conditions
    - Use the existing Objectivity-based conditions
- Main reasons to change
  - To follow the general HEP trend
  - To allow interactive analysis in ROOT

Slide 16 of 18 – Summary
- The DB system keeps up with the excellent B-Factory performance
  - No major problems or showstoppers
  - Coping with growing size, complexity and demands
- Event store technology based on Objectivity
  - A good, working model, proven in production
  - Not well proven in analysis
    - Most users extract data to nTuples
  - Likely to be deprecated soon
- May'99 – Mar'03
  - Undoubtedly a successful chapter for the BaBar DB

Slide 17 of 18 – Acknowledgements
- Development Team
  - Andy Hanushevsky
  - Andy Salnikov (online databases)
  - Daniel Wang (started Sep'02)
  - David Quarrie (left Oct'01)
  - Igor Gaponenko
  - Simon Patton (left March'02)
  - Yemi Adesanya
- Operations Team
  - Adil Hasan
  - Artem Trunov
  - Wilko Kroeger
  - Tofigh Azemoon

Slide 18 of 18 – Some Related BaBar Talks
- Operation Aspects of Dealing with the Large BaBar Data Set – Category 8, Tuesday 3:30pm
- The Redesigned BaBar Event Store – Believe the Hype – Category 8, Tuesday 4:50pm
- BdbServer++: A User Instigated Data Location and Retrieval Tool – Category 2
- Distributing BaBar Data Using SRB – Category 2
- Distributed Offline Data Reconstruction in BaBar – Category 3, Tuesday 6:10pm