CMS Conditions Data Access using FroNTier
Lee Lueking, CMS Offline Software and Computing
5 September 2007
CHEP 2007 – Distributed Data Analysis and Information Management

Outline
– Motivation
– Implementation Details
– Deployment Overview
– Performance
– Acknowledgements
The FroNTier team: Barry Blumenfeld (JHU), David Dykstra (FNAL), Eric Wicklund (FNAL)

Motivation
CMS conditions data includes the calibration, alignment, and configuration information used for offline processing of detector event data. Conditions data is keyed by time (run number) and defined to be immutable: new entries require new tags (versions). A given object may be used by thousands of jobs, so caching this information close to the processing activity provides significant performance gains. Readily deployable, highly reliable, and easily maintainable web proxy/caching servers are a logical solution.

CMS Software Stack
POOL-ORA (Object Relational Access) is used to map C++ objects to a relational schema. A CORAL Frontier plugin provides read-only access to the POOL database objects via FroNTier.
(Stack diagram: User Application → EDM Framework / EDM EventSetup → POOL-ORA → CORAL → Frontier or Oracle plugin → Frontier/cache or database.)

Implementation
POOL and CORAL generate SQL queries from the CMSSW C++ object requests. The FroNTier client converts the SQL into an HTTP GET and sends it over the network to the FroNTier server. The FroNTier server, a servlet running under Tomcat, unpacks the SQL request, sends it to the database server, and retrieves the needed data. The data is optionally compressed and then packed into an HTTP-formatted stream sent back to the client. Squid proxy/caching servers between the FroNTier server and the client cache the requested objects, significantly improving performance and reducing load on the database.
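As a rough illustration of this request path (this is not the actual FroNTier wire format; the server URL, query encoding, and proxy port below are assumptions), a client could encode a SQL query into an HTTP GET, route it through a local Squid, and unpack a zipped payload:

```python
# Illustrative sketch only: the real FroNTier client/servlet protocol differs.
# Hostnames, ports, and the URL parameter layout are assumed for the example.
import base64
import urllib.parse
import urllib.request
import zlib

FRONTIER_SERVER = "http://frontier-launchpad.example.org:8000/Frontier"  # hypothetical
LOCAL_SQUID = "http://localhost:3128"                                     # hypothetical site Squid

def frontier_get(sql: str) -> bytes:
    """Pack a SQL query into an HTTP GET and fetch it through the local Squid."""
    query = urllib.parse.urlencode({
        "encoding": "BLOBzip",                                    # ask for a zipped payload
        "p1": base64.urlsafe_b64encode(sql.encode()).decode(),    # the SQL, URL-safe encoded
    })
    url = f"{FRONTIER_SERVER}?{query}"
    # Going through Squid means identical queries are answered from cache
    # instead of reaching the central Tomcat servlet and database.
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": LOCAL_SQUID}))
    with opener.open(url, timeout=30) as response:
        return zlib.decompress(response.read())                   # undo the optional compression

if __name__ == "__main__":
    payload = frontier_get("SELECT data FROM conditions_payload WHERE iov_since <= 200")
    print(f"received {len(payload)} bytes of conditions data")
```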

Advantages
The system uses standard tools:
– Highly reliable: Tomcat and Squid are well proven
– Easy installation: tarball and script
– Highly configurable: customize to the site environment, security, etc.
– Well documented: books and web sites
– Readily monitored: SNMP, MRTG, awstats
Easy administration at Tier-N centers:
– No DBAs needed beyond the central site
– Caches are loaded on demand and are self-managing

Overview
(Deployment diagram: the ORCON online and ORCOFF offline databases feed the offline FroNTier Launchpad (Squid 1 … Squid 4) and the HLT FroNTier server(s); a Squid hierarchy serves the HLT filter farm at the detector site (Point 5), while the Tier-0 farm and offline sites are reached over the wide area network.)

FroNTier “Launchpad”
Squid caching proxy:
– Load shared with round-robin DNS
– Configured in “accelerator mode”
– “Wide open frontier”*
– Peer caching removed because it was incompatible with collapsed forwarding
Tomcat, running the standard FroNTier servlet:
– Distributed as a WAR (Web ARchive): unpack in the Tomcat webapps directory, change two files if the name is different
– One XML file describes the DB connection
(Diagram: round-robin DNS in front of the launchpad servers, each running a Squid and a Tomcat FroNTier servlet, backed by the database.)
* The Squids in the launchpad talk ONLY to the FroNTier Tomcat servers, so no registration or ACLs are required.
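A simple way to see the round-robin load sharing described above is to resolve the launchpad alias and list the A records that clients rotate over; the alias name and port here are placeholders, not the real launchpad address:

```python
# Resolve a (placeholder) round-robin DNS alias and list the server addresses
# behind it; successive client connections are spread across these records.
import socket

def servers_behind(alias: str, port: int = 8000) -> list[str]:
    """Return the distinct IP addresses published for a DNS alias."""
    infos = socket.getaddrinfo(alias, port, proto=socket.IPPROTO_TCP)
    return sorted({sockaddr[0] for _, _, _, _, sockaddr in infos})

if __name__ == "__main__":
    for address in servers_behind("frontier-launchpad.example.org"):
        print(address)
```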

Cache Coherency (1/2)
So-called “metadata” for conditions data: names or “tags” that refer to a specific set of IOVs (Intervals Of Validity) and their associated payloads in the POOL-ORA repository. By decree, data within a conditions IOV can NOT change, so caching it is OK. BUT new IOVs and payloads can be appended to extend the IOV range, and that is NOT OK for caching.

Metadata (tag → POOL token):
  “Test”    → container A
  “Online”  → container B
  “prod”    → container C
  “stuff”   → container D

Conditions IOV for container B (run = lower end of run range → POOL token):
  100 → payload 1
  200 → payload 2
  300 → payload 3
  500 → payload 4   (new run / new payload appended)

Cache Coherency (2/2)
After considering many ideas, the following solution was adopted:
– All cached objects have expiration times
– Short-lived: metadata objects, including the pointers to payload objects, expire after a short period
– Long-lived: payload objects have a long expiration time
The short and long expiration values are adjusted according to where the data is being used, for example:
– Online: calibrations change quickly as new data is added for upcoming runs
– Tier 0: calibrations change on the order of a few hours as new runs appear for reconstruction
– Tier 1 and beyond: conditions data may be stable for weeks
The short and long expirations are set at the central FroNTier server, so they can easily be tuned as needed.
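The short/long expiration scheme might look like the following sketch; the two TTL values and the object classification are illustrative assumptions and would be tuned per deployment (online, Tier 0, Tier 1+) as described above:

```python
# Sketch of the two-class expiration policy: short-lived metadata (tags and
# IOV pointers, which can be appended to) versus long-lived immutable payloads.
# The TTL values are placeholders; the real values are set centrally and tuned.
import time

TTL_SECONDS = {
    "metadata": 10 * 60,        # short-lived: tag -> IOV pointers may grow
    "payload": 7 * 24 * 3600,   # long-lived: payloads never change once written
}

class ConditionsCache:
    def __init__(self):
        self._entries = {}      # key -> (value, expiry timestamp)

    def put(self, key, value, kind):
        self._entries[key] = (value, time.time() + TTL_SECONDS[kind])

    def get(self, key):
        """Return a cached value, or None if it is missing or has expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() > expires_at:
            del self._entries[key]   # expired: caller must refetch from the server
            return None
        return value

cache = ConditionsCache()
cache.put("tag:Online", {"iov_since": [100, 200, 300, 500]}, kind="metadata")
cache.put("payload:container-B:run-200", b"calibration blob", kind="payload")
```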

Squid Deployment Status
– Late 2005: 10 centers used for testing
– Additional installations from May through October were used for CSA06
– Additional sites are possible for CSA07
– Very few problems with the installation procedures CMS provides
(Chart: number of sites deployed over time.)

Service Availability Monitoring (SAM) – Eric Wicklund / Stefano Belforte
– Checks the integrity of the local Squid, the overall FroNTier system, and the connection to CERN
– Heartbeat monitoring tests are run every 2 hours
– 40 sites included
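A site heartbeat test of the kind run by SAM could be as simple as fetching a small known object both through the local Squid and directly from the central launchpad; the probe URL and proxy address here are placeholders, not the actual SAM test:

```python
# Minimal heartbeat sketch: check the local Squid path and the direct path to
# the central launchpad. The URL and proxy address are illustrative only.
import urllib.request

LOCAL_SQUID = "http://localhost:3128"                                   # assumed site Squid
PROBE_URL = "http://frontier-launchpad.example.org:8000/Frontier/ping"  # hypothetical probe

def path_ok(via_local_squid: bool) -> bool:
    """Return True if the probe object can be fetched over the chosen path."""
    proxies = {"http": LOCAL_SQUID} if via_local_squid else {}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    try:
        with opener.open(PROBE_URL, timeout=20) as response:
            return response.status == 200
    except OSError:
        return False

if __name__ == "__main__":
    print("local Squid path OK:", path_ok(via_local_squid=True))
    print("direct launchpad path OK:", path_ok(via_local_squid=False))
```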

Launchpad Operation
– The number of daily objects varies widely; the peak day (10^8 objects) was a fail-over from the local Squid at Bari, a good test of the infrastructure
– Object size depends on the type of activity occurring
– All activity on the Launchpad: production, development, and testing
(Plots: objects per day and GB per day served by the Launchpad, December 2006 to August 2007, logarithmic scale.)

Specific Challenges

Parameter        HLT                     Tier0
# Nodes
# Processes      ~16k                    ~3k
Startup          <10 sec, all clients    <100 sec per client
Client access    Simultaneous            Staggered
Cache load       <1 min                  N/A
Tot obj size     100 MB*                 150 MB*
New objects      100% / run*
# Squids         1 per node              Scalable (2–8)
* Worst-case scenario

HLT (High Level Trigger):
– Startup time for calibration/alignment < 10 seconds
– Simultaneous access
– Uses a hierarchy of Squid caches
Tier0 (prompt reconstruction):
– Startup time for conditions load < 1% of total job time
– Usually staggered
– DNS round robin should scale to 8 Squids

Starting Many Jobs Simultaneously
Online HLT problem:
– All nodes start the same application at the same time
– Pre-loading data must take < 1 minute
– Loading data to jobs must take < 10 seconds
– Estimating 100 MB of data, 2000 nodes, 8 jobs/node: 100 MB × 2000 × 8 = 1.6 TB
– Asymmetrical network: nodes are organized in 50 racks of 40 nodes each, with non-blocking gigabit intra-rack and gigabit inter-rack links

Starting Many Jobs Simultaneously
Online HLT solution:
– Configured to pre-load simultaneously in tiers
– Each Squid feeding 4 others means 6 tiers for 2000 nodes: the 50 racks are reached in 3 tiers, with 3 more tiers inside each rack
– Measurements on a test cluster indicate the requirements can be met; the bottleneck becomes the conversion from DB to HTTP in the FroNTier server
– The 10-second loading always reads from the pre-filled local Squid
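A back-of-the-envelope check of this fan-out scheme (payload size and gigabit links are taken from the previous slide; the timing is a rough estimate under those assumptions, not a measurement from the test cluster):

```python
# Estimate how many tiers a fan-out of 4 needs to cover N nodes, and how long
# pre-loading a ~100 MB payload takes if each tier-to-tier hop runs at ~1 Gbit/s.
# These are rough, illustrative numbers, not measurements.
import math

def tiers_needed(nodes: int, fanout: int = 4) -> int:
    """Smallest number of tiers t such that fanout**t >= nodes."""
    return math.ceil(math.log(nodes, fanout))

def preload_seconds(payload_mb: float, tiers: int, gbit_per_s: float = 1.0) -> float:
    """Time to copy the payload serially across the tiers at the given link speed."""
    seconds_per_hop = payload_mb * 8 / (gbit_per_s * 1000)   # MB -> Mbit, then / (Mbit/s)
    return tiers * seconds_per_hop

if __name__ == "__main__":
    tiers = tiers_needed(2000, fanout=4)
    print("tiers needed for 2000 nodes:", tiers)              # 6, matching the slide
    print(f"rough pre-load time: {preload_seconds(100, tiers):.1f} s (budget: 60 s)")
```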

Testing with Multiple NICs: Potential Tier-0 and Tier-1 Augmentation
On multi-processor/multi-core machines the CPU resources are under-utilized; using multiple network interfaces can resolve this. Two approaches were tried:
– Multi-homed (2 IP addresses): the machine looks like multiple nodes
– Bonded interfaces (1 IP address): multiple NICs used together to increase throughput
Using the bonded approach at FNAL:
– Works best with a specific load-balancing approach
– Able to improve throughput by a factor of two over a single unbonded NIC
– Requires 2 Squids on the same server machine (Squid is single-threaded), ~200 MB/s total
Many sites, for now, prefer just adding more machines to keep network management simpler. This may change.
(Diagram: Squid 1 and Squid 2 on one server, each bound to its own NIC, both connected to the switch.)

Summary
FroNTier is used by CMS for all access to conditions data. The ease of deployment, stability of operation, and high performance make the FroNTier approach well suited to the Grid environment used for CMS offline computing, as well as to the online environment used by the CMS High Level Trigger (HLT). The use of standard software, such as Squid and various monitoring tools, makes the system reliable, highly configurable, and easy to maintain. We have gained significant operational experience over the last year in CMS; 40 Squid sites are currently being monitored and many more are expected.

Finish

Example Calibration and Alignment Object Sizes (Monte Carlo)

Detector sub-system          Data size (MB)
(not all systems included)   Non-compressed    Compressed (zipped)
HCAL                         1                 0.4
ECAL                         7                 3.2
Drift Tubes                  12                ?
Si Track                     20                ?
Pixel Track                  130               ?
Current total in MC          280               ?

The table shows examples of the size of conditions data for a few sub-detectors. Zipping the data can significantly reduce the size of network transfers, at the cost of some server and client performance:
– For online use and transfers local to CERN it does not help
– For sites remote from CERN it is useful
Object sizes and zipping factors for real detector data may be quite different than for MC; the compression factors are unrealistic for MC data.
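To get a feel for the zipping trade-off, a quick experiment with zlib on a synthetic blob (the data is artificial and not representative of real CMS conditions, so the ratio and timings are only illustrative):

```python
# Compress and decompress a synthetic "calibration" blob with zlib to compare
# transfer size against CPU cost. The payload is artificial; real conditions
# data compresses differently (MC data in particular compresses unrealistically well).
import random
import struct
import time
import zlib

random.seed(42)
# One million floats rounded to 3 decimals, so there is some redundancy to compress.
blob = b"".join(struct.pack("f", round(random.gauss(1.0, 0.05), 3))
                for _ in range(1_000_000))

start = time.perf_counter()
packed = zlib.compress(blob, 6)
middle = time.perf_counter()
zlib.decompress(packed)
end = time.perf_counter()

print(f"raw: {len(blob) / 1e6:.1f} MB  zipped: {len(packed) / 1e6:.1f} MB")
print(f"compress: {middle - start:.2f} s  decompress: {end - middle:.2f} s")
```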