Data Management at T3s
Hironori Ito, Brookhaven National Laboratory

Types of T3 Sites

T3GS – just like a T2
  Pros:
  – Can use all available production software
  – Sites are monitored 24/7
  Cons:
  – Large overhead
  – Must be reliable

T3G – not like other ATLAS sites
  Pros:
  – Minimal overhead
  – Reliability is not required
  Cons:
  – T3s are (sometimes) on their own when problems occur

US T3GS

GS sites are listed in the Tiers of ATLAS.
  – They are operated via the BNL T3 DQ2 Site Services (SS).
Requirements:
  – SRM with space tokens.
  – Must accept the ATLAS production proxy as-is; no special, manual registration at the site.
  – Must pass a few tests:
    » SAM tests (lcg-cr, lcg-cp and lcg-del; see the sketch below)
    » ATLAS DDM functional tests
  – Must register in OSG OIM.
  – Must publish to the OSG BDII and to the CERN BDII via the OSG inter-op BDII.
    » BNL publishes all US T3s (SE only) to the OSG BDII; a T3 must request this from BNL (via the DDM queue in RT) by providing its SE information.
  – Must be able to respond to any ATLAS ticket within a reasonable time.
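
A minimal sketch of the kind of write/read/delete cycle those lcg-* SAM tests exercise; the SE host, paths and LFN are placeholders, not the actual SAM test names:

  # copy a local file to the SE and register it under an LFN
  lcg-cr --vo atlas -d srm://se.example.edu/atlas/scratch/sam_testfile \
         -l lfn:/grid/atlas/sam/sam_testfile file:/tmp/sam_testfile
  # read it back
  lcg-cp --vo atlas lfn:/grid/atlas/sam/sam_testfile file:/tmp/sam_testfile.copy
  # delete all replicas and the catalog entry
  lcg-del --vo atlas -a lfn:/grid/atlas/sam/sam_testfile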

T3GS Management

Use regular DQ2 tools:
  – Subscription: DaTRI
  – Deletion: central deletion, dq2-delete-replicas (see the sketch below)
  – Consistency:
    » The LFC for all T3s is located at BNL.
    » The LFC content for a specific T3 is delivered to the corresponding T3 DDM site via DDM:
      – SQLite format (provides fast searching)
      – Central catalog information (the LFC has no dataset info)
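
A minimal sketch of a site-level deletion with the regular DQ2 client; the dataset and site names are placeholders, and the exact argument form may differ between dq2-client versions (check dq2-delete-replicas -h):

  # remove the replica of one dataset from one site; data at other sites is untouched
  dq2-delete-replicas user.someuser.test.dataset MYT3_LOCALGROUPDISK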

T3GS Management (continued I)

Use regular DQ2 tools:
  – Consistency: storageManagement.py
    » Works with the LFC files described above.
    » Scans the local storage.
    » Finds SE and LFC dark files (see the queries sketched below):
      – SE dark files: exist in the SE but not in the LFC
        (select * from files where pfn_se is not null and pfn_lfc is null)
      – LFC dark files: exist in the LFC but not in the SE
        (select * from files where pfn_se is null and pfn_lfc is not null)
    » Deletes dark files.
    » Creates logs: a log is always created automatically, and all actions are stored in it.
    » Obtain via SVN checkout, or download via browser at
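
A sketch of running those dark-file queries by hand against the delivered SQLite catalog; the database file name is a placeholder, while the files table and the pfn_se/pfn_lfc columns follow the queries shown on the slide:

  # files present in the SE but unknown to the LFC (SE dark files)
  sqlite3 t3_lfc_content.db \
    "SELECT * FROM files WHERE pfn_se IS NOT NULL AND pfn_lfc IS NULL;"
  # files known to the LFC but missing from the SE (LFC dark files)
  sqlite3 t3_lfc_content.db \
    "SELECT * FROM files WHERE pfn_se IS NULL AND pfn_lfc IS NOT NULL;"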

US T3G

Not in the Tiers of ATLAS.
  – Cannot use DQ2 Site Services.
Requirements:
  – A grid-enabled SE: SRM or a plain GridFTP server.
  – Still register in OSG OIM.
Differences from GS:
  – No need to accept the ATLAS production proxy.
  – No tests to pass.

Data Tools in US T3G

Use existing tools as much as possible, and extend them for future use: dq2-get and dq2-ls.
  – dq2-get: plugins to use transfer tools other than lcg-cp.
    » FTS plugin:
      – Allows third-party transfers between two remote SEs; supports SRM as well as GridFTP (see the sketch below).
      – Allows queuing.
      – Avoids chaotic lcg-cp traffic.
      – The new dq2-client package will include this plugin by default; the newest version is available via SVN checkout or browser download (pname=dq2plugin).
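
A sketch of the kind of third-party SRM-to-SRM transfer the FTS plugin manages for dq2-get; this is the plain FTS command line, not the plugin's own interface, and the FTS endpoint and SURLs are placeholders:

  # submit a third-party transfer between two remote SEs to an FTS service
  glite-transfer-submit -s https://fts.example.org:8443/glite-data-transfer-fts/services/FileTransfer \
    srm://source-se.example.org/atlas/dq2/some/file.root \
    srm://dest-se.example.edu/atlas/localgroupdisk/some/file.root
  # the returned job ID can then be polled with glite-transfer-status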

Data Tools in US T3G (continued I)

dq2-get and dq2-ls
  – dq2-get: global name space.
    » The DQ2 client developers are currently working on this change.
    » Store files under the global name space (the LFC name space), the same name space used in ATLAS production.
    » Example:
      DSN: data10_7TeV physics_JetTauEtmiss.merge.NTUP_JETMET.f293_p209_tid172219_00
      LFN: NTUP_JETMET _ root.1
      LFC LFN (global name space): /grid/atlas/dq2/data10_7TeV/NTUP_JETMET/f293_p209/data10_7TeV physics_JetTauEtmiss.merge.NTUP_JETMET.f293_p209_tid172219_00/NTUP_JETMET _ root.1

Data Tools in US T3G (continued II)

dq2-get and dq2-ls
  – dq2-get: global name space.
    » Use the SE itself as the file catalog (a T3G has no LFC).
    » Easy to extend to other file-transfer mechanisms:
      – xrootd FRM: find and transfer files automatically with remote FRMs.
      – http(s): many SEs (dCache, BeStMan, DPM) support http/https now or will in the future; write a new http plugin.

Data Tools in US T3G (continued III)

dq2-get and dq2-ls (basic usage sketched below)
  – dq2-ls: global name space.
    » dq2-ls currently requires an LFC to find physical files, but a T3G has no LFC.
    » dq2-ls will instead find physical files on the local (or remote?) SE according to the global name space.
    » The DQ2 developers are currently working on this change.
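
A minimal usage sketch of the two client tools discussed above; the dataset pattern and RUNNUMBER are placeholders:

  # list datasets matching a pattern
  dq2-ls "data10_7TeV.*.NTUP_JETMET.*"
  # fetch the files of a dataset to the local disk
  dq2-get data10_7TeV.RUNNUMBER.physics_JetTauEtmiss.merge.NTUP_JETMET.f293_p209_tid172219_00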

Data Management at T3G

T3 space must be managed by the T3 administrators with minimal help from the T2s/T1.
  – No central replica catalog, no semi-central LFC, no need to synchronize.
  – Just delete files from the SE as needed.
  – All files in a given dataset are stored in one particular directory according to the global name space, so (as sketched below):
    » delete-replica DSN  ->  rm -rf /A/B/…/DSN
    » list-datasets-site SITE  ->  ls -R /base-data-directory
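
A minimal sketch of those two operations, assuming a hypothetical base directory /data/atlas/dq2 laid out according to the global name space; real paths are site-specific:

  BASE=/data/atlas/dq2
  # "list-datasets-site SITE" becomes a recursive listing of the base directory
  ls -R "$BASE"
  # "delete-replica DSN" becomes removing the dataset's single directory,
  # where PROJECT/DATATYPE/TAGS/DSN follow the global name space layout
  rm -rf "$BASE/PROJECT/DATATYPE/TAGS/DSN"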

Thoughts on the Global Name Space

A great way to avoid a local catalog.
  – Con: a possible performance issue when listing files on the SE?

Expand the methods used to access files:
  – xrootd FRM
  – http/https: much easier and has wide standard support; everyone knows how to use a browser.
    » Many clients:
      – wget works everywhere.
      – aria2:
        » Segmented downloads
        » Stop and restart a transfer in the middle
        » Use of multiple source sites for a single file
        » Use of multiple streams from a single source host per file
        » Use of multiple simultaneous downloads
    » Casual test: wget at 4 MB/s, aria2 at 60 MB/s (see the sketch below).
  – LFC+dCache+http demos at
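
An illustrative sketch of that comparison; the hosts and file path are placeholders, not the sites used in the casual test:

  # plain single-stream download
  wget https://se1.example.org/atlas/dq2/SOME_DSN/file.root
  # aria2: resumable, segmented download pulling the same file from two
  # source sites with several connections per server
  aria2c --continue=true --split=8 --max-connection-per-server=4 \
    https://se1.example.org/atlas/dq2/SOME_DSN/file.root \
    https://se2.example.org/atlas/dq2/SOME_DSN/file.root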