Networking, Remote Access, SQL, PROOF and GRID
Fons Rademakers, René Brun
CMS/ROOT Workshop, 10 Oct. 2001

Slide 1: Networking, Remote Access, SQL, PROOF and GRID
Fons Rademakers, René Brun

Slide 2: Networking

Slide 3: Networking Classes
- Provide a simple but complete set of networking classes: TSocket, TServerSocket, TMonitor, TInetAddress, TMessage
- Optimized for large messages over WANs: TPSocket, TPServerSocket
- Operating system independent, via the abstract OS interface TSystem

Slide 4: Setting Up a Connection
Server side:
   // Open server socket waiting for connections on the specified port
   TServerSocket *ss = new TServerSocket(9090, kTRUE);
   // Accept a connection and return a full-duplex communication socket
   TSocket *sock = ss->Accept();
   // Close the server socket (unless we will use it later to wait for
   // another connection)
   ss->Close();
Client side:
   // Open connection to server
   TSocket *sock = new TSocket("pcrdm.cern.ch", 9090);

Slide 5: Sending Objects
Client side:
   TH1 *hpx = new TH1F("hpx", "This is the px distribution", 100, -4, 4);
   TMessage mess(kMESS_OBJECT);
   mess.WriteObject(hpx);
   sock->Send(mess);
Server side:
   TMessage *mess;
   while (sock->Recv(mess)) {
      if (mess->What() == kMESS_OBJECT) {
         if (mess->GetClass()->InheritsFrom("TH1")) {
            TH1 *h = (TH1 *) mess->ReadObject(mess->GetClass());
            ...
         }
      } else if (mess->What() == kMESS_STRING)
         ...
      delete mess;
   }

Slide 6: Long Fat Pipes
- Long fat pipes are WAN links with a large bandwidth*delay product
- For optimal performance the pipe must be kept full
- By default this is not the case: the maximum TCP buffer size is 64 KB, so for a pipe with a 192 KB bandwidth*delay product the pipe is empty 60% of the time
(Figure: data flowing from source to destination, with ACKs returning.)
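
For reference, the bandwidth*delay product is simply the link bandwidth multiplied by the round-trip time; the numbers below are an illustrative example, not taken from the slide:

   BDP = bandwidth × RTT, e.g. 155 Mbit/s × 10 ms = 1.55 Mbit, roughly 190 KB

With the default 64 KB TCP window such a pipe can be kept at most about one third full.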

Slide 7: TCP Window Scaling (RFC 1323)
- A solution is to use a TCP buffer size equal to the bandwidth*delay product
- Support for large TCP buffers (window scaling) is described in RFC 1323
- Problem: a system administrator is needed to change the maximum TCP buffer size on both the source and destination machines, e.g. on Linux:
   echo <bufsize> > /proc/sys/net/core/rmem_max

Slide 8: Parallel Sockets
- The buffer is striped over multiple sockets in equal parts
- The ideal number of parallel sockets depends on the bandwidth*delay product (assuming the default 64 KB TCP buffer size)
- No system manager is needed to tune the network
- Same performance as with large buffers

Slide 9: Setting Up a Parallel Socket Connection
Server side:
   // Open server socket waiting for connections on the specified port
   TServerSocket *ss = new TPServerSocket(9090, kTRUE);
   // Accept a connection and return a full-duplex communication socket
   TSocket *sock = ss->Accept();
   // Close the server socket (unless we will use it later to wait for
   // another connection)
   ss->Close();
Client side:
   // Open connection to server
   TSocket *sock = new TPSocket("pcrdm.cern.ch", 9090);

Slide 10: Main Networking Features
- Objects can be passed by value over a network connection
- Easy multiplexing on multiple sockets via TMonitor or via the main event loop
- Support for non-blocking sockets
- Support for long fat pipes (Grid environment)
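
A minimal sketch of the TMonitor multiplexing mentioned above, on the server side; sock1 and sock2 stand for sockets obtained from TServerSocket::Accept() as on slide 4, and the dispatch step refers to the code of slide 5:

   // Watch several client sockets and serve whichever becomes active first
   TMonitor *mon = new TMonitor;
   mon->Add(sock1);
   mon->Add(sock2);
   while (mon->GetActive() > 0) {
      TSocket *s = mon->Select();       // blocks until one socket is ready
      TMessage *mess;
      if (s->Recv(mess) <= 0) {         // client closed the connection
         mon->Remove(s);
         continue;
      }
      // ... dispatch on mess->What() as in the "Sending Objects" example ...
      delete mess;
   }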

Slide 11: Remote Access

Slide 12: Remote Access
Several classes deriving from TFile provide remote file access:
- TNetFile performs remote access via the special rootd daemon
- TWebFile performs remote read-only access via an Apache httpd daemon
- TRFIOFile performs remote access via the rfiod daemon (RFIO is part of the CERN SHIFT software); RFIO is the interface to several mass storage systems: Castor, HPSS

Slide 13: Access Transparency
   // Local file, uses TFile
   TFile *f1 = TFile::Open("local.root");
   // Remote file, uses TNetFile and rootd
   TFile *f2 = TFile::Open("root://cdfsga.fnal.gov/bigfile.root");
   // Remote file, uses TRFIOFile and rfiod
   TFile *f3 = TFile::Open("rfio:/alice/run678.root");
   // Remote file, uses TWebFile and httpd
   TFile *f4 = TFile::Open("http://host.domain/file.root");   // placeholder URL
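
Whichever protocol is used, the returned TFile behaves the same way; a minimal sketch of reading an object back, assuming the file contains a histogram saved under the name "hpx" (an illustrative name):

   // Identical code works for local, rootd, rfiod and httpd access
   TFile *f = TFile::Open("root://cdfsga.fnal.gov/bigfile.root");
   TH1 *h = (TH1 *) f->Get("hpx");   // retrieve the object by name
   h->Draw();
   f->Close();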

Slide 14: The rootd Daemon
- rootd is a "simple" byte server
- Listens on port 1094 (allocated by IANA)
- Each connection gets its own rootd instance
- The link stays open for the duration of the connection
- Secure authentication (via the SRP protocol)
- Supports parallel sockets
- Also supports a basic FTP command set
- Performance better than NFS or RFIO

Slide 15: Parallel FTP
- Parallel FTP is implemented via the TFTP class and the rootd daemon
- Uses the TPSocket class
- Supports all standard ftp commands
- Anonymous ftp
- Performance, CERN - GSI: wu-ftp 1.4 MB/s, TFTP 2.8 MB/s

Slide 16: SQL Interface

Slide 17: SQL Interface
- RDBMS access via a set of abstract base classes: TSQLServer, TSQLResult and TSQLRow
- Concrete implementations exist for MySQL, Oracle, PostgreSQL, SapDB and ODBC
- DB-specific modules (plug-ins) are dynamically loaded only when needed

Slide 18: SQL Interface Usage
   {
      TSQLServer *db = TSQLServer::Connect("mysql://localhost/test", "nobody", "");
      TSQLResult *res = db->Query("select count(*) from runcatalog "
                                  "where tag&(1<<2)");
      int nrows = res->GetRowCount();
      int nfields = res->GetFieldCount();
      for (int i = 0; i < nrows; i++) {
         TSQLRow *row = res->Next();
         for (int j = 0; j < nfields; j++) {
            printf("%s\n", row->GetField(j));
         }
         delete row;
      }
      delete res;
      delete db;
   }

Slide 19: Performance Comparison
SQL:
   CREATE TABLE runcatalog (
      dataset     VARCHAR(32) NOT NULL,
      run         INT NOT NULL,
      firstevent  INT,
      events      INT,
      tag         INT,
      energy      FLOAT,
      runtype     ENUM('physics','cosmics','test'),
      target      VARCHAR(10),
      timef       TIMESTAMP NOT NULL,
      timel       TIMESTAMP NOT NULL,
      rawfilepath VARCHAR(128),
      comments    VARCHAR(80)
   )
C++/ROOT:
   class RunCatalog : public TObject {
   public:
      enum ERunType { kPhysics, kCosmics, kTest };
      char     fDataSet[32];
      Int_t    fRun;
      Int_t    fFirstEvent;
      Int_t    fEvents;
      Int_t    fTag;
      Float_t  fEnergy;
      ERunType fRunType;
      char     fTarget[10];
      UInt_t   fTimeFirst;
      UInt_t   fTimeLast;
      char     fRawFilePath[128];
      char     fComments[80];
   };

Slide 20: Performance Comparison: Filling Entries
- MySQL: 177 s real time, 0.3 MB/s, 54.6 MB DB file
- TTree: 42 s real time (43 s via rootd), 3.4 MB/s, 11.5 MB DB file (compression level 1)
- All results on a PII 366, 256 MB RAM, RH 6.1
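
A minimal sketch of how the TTree side of such a filling test can be set up, assuming a rootcint dictionary exists for the RunCatalog class of the previous slide; the file name and the entry count are illustrative:

   // Create a ROOT file with compression level 1 and fill a TTree
   // with RunCatalog objects
   TFile *f = new TFile("runcatalog.root", "RECREATE", "Run catalog", 1);
   TTree *T = new TTree("T", "run catalog");
   RunCatalog *run = new RunCatalog;
   T->Branch("runcatalog", "RunCatalog", &run);   // split object branch
   for (Int_t i = 0; i < 500000; i++) {           // illustrative entry count
      // ... set the data members of *run for entry i ...
      T->Fill();
   }
   T->Write();
   f->Close();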

Slide 21: Performance Comparison: Select of 2 Columns
   SELECT dataset, rawfilepath FROM runcatalog
   WHERE tag&7 AND (run= OR run=300122)
- MySQL: 4.5 s real time, 12.1 MB/s
- TTree: 2.8 s real time (2.9 s via rootd), 4.1 MB/s (19.5 MB/s)
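
For comparison, the equivalent selection on the TTree side can be written directly on the tree; a sketch using TTree::Scan, where T is assumed to be the tree of RunCatalog objects from slide 19:

   // Print dataset and raw file path for the selected runs
   T->Scan("fDataSet:fRawFilePath", "fTag&7 && fRun==300122");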

Slide 22: Performance Comparison: Summary
- The ROOT TTree is in this case a factor 4 smaller
- Filling the TTree is 4.2 times faster
- Querying the TTree is 2 times faster
- However, MySQL and especially Oracle have the typical RDBMS advantages: locking, transactions, roll-back, SQL, etc.

Slide 23: Very Large Databases
- We propose to use a combination of the ROOT object store and an RDBMS to create VLDBs
- The RDBMS is typically used as a catalogue to keep track of the many ROOT object stores
- Used in the ALICE Data Challenges and by all experiments using ROOT I/O
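
A minimal sketch of this catalogue idea, combining the SQL classes of slide 18 with the remote access of slide 13; the query reuses the runcatalog table of slide 19 and the run number of slide 21:

   // Look up the location of a run in the RDBMS catalogue,
   // then open the corresponding ROOT object store
   TSQLServer *db  = TSQLServer::Connect("mysql://localhost/test", "nobody", "");
   TSQLResult *res = db->Query("select rawfilepath from runcatalog where run=300122");
   TSQLRow *row = res->Next();
   if (row) {
      TFile *f = TFile::Open(row->GetField(0));   // e.g. a root:// or rfio: URL
      // ... analyse the objects stored in f ...
      delete row;
   }
   delete res;
   delete db;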

10 Oct. 2001CMS/ROOT Workshop24 Regional TIER 1 TIER 2 ALICE Data challenges DAQ ROOT I/O CERN TIER 0 TIER 1 Raw Data Simulated Data GEANT3 GEANT4 FLUKA AliRoot CASTOR GRID Performance Monitoring ROOT File Catalogue

Slide 25: ALICE Data Challenge III
(Diagram: event building (DATE) → recording (ROOT+RFIO) → mass storage (CASTOR).)

Slide 26: ADC III Highlights
- Stable and high performance:
  - max throughput in DATE+ROOT: 240 MB/s
  - max throughput in DATE+ROOT+CASTOR: 120 MB/s
  - average during several days: 85 MB/s (>50 TB/week)
- Total amount of data over the complete chain (up to tape):
  - CASTOR file system: >10^5 files of 1 GB (110 TB)
  - file catalogue with 10^5 entries

Slide 27: PROOF and GRID

Slide 28: Parallel ROOT Facility
The PROOF system allows:
- parallel execution of scripts
- parallel analysis of chains of trees on clusters of heterogeneous machines
Its design goals are transparency, scalability and adaptability.
A prototype was developed in 1997 as a proof of concept (only for simple queries resulting in 1D histograms).

Slide 29: Parallel Script Execution
(Diagram: a local PC running root connects via TNetFile to a remote PROOF cluster; a master proof server distributes work to slave proof servers on node1-node4, each reading its local *.root files via TFile, and stdout/objects are returned to the client.)
Session on the local PC:
   $ root
   root [0] .x ana.C
   root [1] gROOT->Proof("remote")
   root [2] gProof->Exec(".x ana.C")
Cluster configuration (proof.conf):
   #proof.conf
   slave node1
   slave node2
   slave node3
   slave node4

Slide 30: Parallel Tree Analysis
   root [0] .! ls -l run846_tree.root
   -rw-r--r--  1 rdm  cr  Feb  1 16:20 run846_tree.root
   root [1] TFile f("run846_tree.root")
   root [2] gROOT->Time()
   root [3] T49->Draw("fPx")
   Real time 0:0:11, CP time
   root [4] gROOT->Proof()
   *** Proof slave server : pcna49a.cern.ch started ***
   *** Proof slave server : pcna49b.cern.ch started ***
   *** Proof slave server : pcna49c.cern.ch started ***
   *** Proof slave server : pcna49d.cern.ch started ***
   *** Proof slave server : pcna49e.cern.ch started ***
   Real time 0:0:4, CP time
   root [5] T49->Draw("fPx")
   Real time 0:0:3, CP time 0.240

Slide 31: Workflow for Tree Analysis
(Diagram: on Tree->Draw() the master runs a packet generator; each slave initializes, then repeatedly calls GetNextPacket() on the master, receives a packet given as a (first entry, number of entries) pair such as 440,50 or 590,60, processes it, and returns its output with SendObject(histo); the master adds the histograms, displays them and waits for the next command.)

Slide 32: PROOF Session Statistics
   root [6] T49->Print("p")
   Total events processed:
   Total number of packets: 147
   Default packet size: 100
   Smallest packet size: 20
   Average packet size:
   Total time (s): 2.78
   Average time between packets (ms):
   Shortest time for packet (ms): 99
   Number of active slaves: 5
   Number of events processed by slave 0: 1890
   Number of events processed by slave 1: 2168
   Number of events processed by slave 2: 2184
   Number of events processed by slave 3: 2667
   Number of events processed by slave 4: 1676

Slide 33: PROOF Transparency
- On demand, make available to the PROOF servers any objects created in the client
- Return to the client all objects created on the PROOF slaves; the master server will try to add the "partial" objects coming from the different slaves before sending them to the client
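
Conceptually, adding the "partial" objects amounts to summing them on the master; a minimal sketch for histograms, where h1, h2 and h3 stand for the per-slave copies of the same histogram (illustrative names):

   // Combine the partial histograms returned by the slaves
   h1->Add(h2);   // add slave 2's partial result
   h1->Add(h3);   // add slave 3's partial result
   // h1 now holds the combined histogram that is sent back to the client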

Slide 34: PROOF Scalability
- Scalability in parallel systems is determined by the amount of communication overhead (Amdahl's law)
- Varying the packet size allows one to tune the system: the larger the packets, the less communication is needed and the better the scalability
- Disadvantage: less adaptive to varying conditions on the slaves
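
For reference, Amdahl's law bounds the speedup with N slaves as

   S(N) = 1 / ((1 - p) + p/N)

where p is the fraction of the work that can be parallelized; the per-packet communication overhead effectively adds to the serial fraction (1 - p), which is why larger packets improve scalability.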

Slide 35: PROOF Adaptability
- Adaptability means being able to adapt to varying conditions (load, disk activity) on the slaves
- By using a "pull" architecture the slaves determine their own processing rate, while the master controls the amount of work handed out
- Disadvantage: too fine-grained packet size tuning hurts scalability
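
A minimal sketch of the pull pattern, written with the plain socket classes from the networking part; this illustrates the idea only and is not the actual PROOF protocol. Here master is a TSocket connected to the master server and ProcessEntries is a hypothetical analysis routine:

   // Slave side: ask for work only when ready, process it, repeat
   while (1) {
      master->Send("GetNextPacket");       // pull request to the master
      TMessage *mess;
      if (master->Recv(mess) <= 0) break;  // master closed the connection
      Int_t first, nentries;
      *mess >> first >> nentries;          // packet = range of tree entries
      delete mess;
      if (nentries <= 0) break;            // no more work to hand out
      ProcessEntries(first, nentries);     // hypothetical analysis routine
   }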

Slide 36: PROOF Error Handling
- Handling the death of PROOF servers:
  - death of the master is fatal, the client needs to reconnect
  - on the death of a slave the master will resubmit the dead slave's packets to the other slaves
- Handling of ctrl-c:
  - an OOB message is sent to the master and forwarded to the slaves, causing a soft/hard interrupt

Slide 37: PROOF Authentication
- PROOF supports secure and insecure authentication mechanisms:
  - insecure: mangled password sent over the network
  - secure: SRP, the Secure Remote Password protocol (Stanford Univ.), public key technology
- Soon: Globus authentication

Slide 38: PROOF Grid Interface
- PROOF can use a Grid Resource Broker to detect which nodes in a cluster can be used in the parallel session
- PROOF can use the Grid File Catalogue and Replication Manager to map LFNs to chains of PFNs
- PROOF can use Grid Monitoring Services
- Access will be via an abstract GRID interface