High Throughput Data Program at Fermilab R&D
Parag Mhashilkar, Grid and Cloud Computing Department, Computing Sector, Fermilab
Network Planning for ESnet/Internet2/OSG, August 23, 2012

High Throughput Data Program (HTDP) at Fermilab
Mission: investigate the impact of, and provide solutions for, the scientific computing challenges in Big Data for the Computing Sector at Fermilab and its stakeholders.
Three major thrusts drive the project:
- Grid middleware evaluation on 100GE, to prepare the lab and its stakeholders for the transition
- Foster the process of making the network a manageable resource; actively participate in the OSG Networking activity
- Integration of network intelligence with data and job management middleware

Grid Middleware evaluation on 100GE
2011: ANI Long Island MAN (LIMAN) testbed
- Tested GridFTP and Globus Online for the data-movement use cases of HEP over 3x10GE
2011: Super Computing demo
- Demonstrated fast access to ~30 TB of CMS data from NERSC to ANL using GridFTP
- Achieved 70 Gbps
Currently: ANI 100GE testbed
- Evaluated xrootd, GridFTP, Globus Online, SRM, SQUID
- Large files (2 GB-8 GB) perform well
- Small (8 KB-4 MB) and medium (8 MB-1 GB) files suffer from the "lots of small files" (LOSF) problem; medium is the new small
- Evaluate other middleware based on stakeholder needs: CVMFS, iRODS, dCache, etc.
Fall 2012: 100GE endpoint at Fermilab
- Continue testing of middleware technologies defined by stakeholders, subject to extension of the availability of the ANI 100GE testbed
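As a deliberately simplified illustration of this kind of file-size scan, the Python sketch below drives globus-url-copy over three size bins and reports aggregate throughput per bin. The endpoints, directory layout, staged data volumes, and parallelism settings are assumptions for illustration, not the actual ANI testbed configuration.

```python
#!/usr/bin/env python3
"""Rough sketch of a GridFTP file-size scan (hypothetical endpoints and paths).

Runs globus-url-copy over a set of file-size bins and prints the aggregate
throughput per bin, mimicking the small/medium/large comparison above.
"""
import subprocess
import time

# Hypothetical source/destination; the real tests used the ANI testbed hosts.
SRC = "gsiftp://source.example.org/data"
DST = "gsiftp://dest.example.org/data"

# (label, subdirectory, total bytes staged in that subdirectory) -- assumptions
BINS = [
    ("small (8K-4M)",  "small",  2 * 1024**3),
    ("medium (8M-1G)", "medium", 20 * 1024**3),
    ("large (2G-8G)",  "large",  80 * 1024**3),
]

def transfer(subdir: str) -> float:
    """Copy one directory tree and return elapsed wall-clock seconds."""
    cmd = [
        "globus-url-copy",
        "-r",            # recurse into the directory
        "-p", "8",       # parallel TCP streams per file
        "-cc", "4",      # files in flight concurrently (helps with LOSF)
        "-fast",         # reuse data channels between files
        f"{SRC}/{subdir}/", f"{DST}/{subdir}/",
    ]
    start = time.time()
    subprocess.run(cmd, check=True)
    return time.time() - start

if __name__ == "__main__":
    for label, subdir, nbytes in BINS:
        elapsed = transfer(subdir)
        gbps = nbytes * 8 / elapsed / 1e9
        print(f"{label:15s} {gbps:6.2f} Gb/s")
```

Concurrency across files (together with pipelining) is the usual knob for softening the LOSF penalty, which is why the small and medium bins are the interesting cases at 100GE.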

Experience on the ANI LIMAN & 100GE Testbed

Collaboration with OSG
Actively participate in the OSG networking area for Network-as-a-Service.
- The team brings 10+ years of experience and expertise in Grid and Cloud middleware in the context of international collaborations (OSG, US CMS, XSEDE, …)
- OSG Networking Dashboard: a repository of network information
  - Secure information gathering
  - Integration layer
  - Secure presentation layer: information consumption by humans and middleware
- We are very interested in contributing to the development of the OSG Networking Dashboard (internal stakeholder: US CMS)
  - Add GridFTP network data to the repository
  - Dashboard as a source of information for network intelligence
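To make the "add GridFTP network data to the repository" item concrete, here is a minimal sketch of a collector that parses GridFTP transfer-log entries and posts them to a dashboard ingest endpoint. The URL, the JSON-over-HTTP interface, the log path, and the key=value log format are all placeholders, not the actual OSG Networking Dashboard API.

```python
#!/usr/bin/env python3
"""Sketch: ship GridFTP transfer records to a (hypothetical) dashboard endpoint."""
import json
import urllib.request

DASHBOARD_URL = "https://dashboard.example.org/api/gridftp"  # hypothetical

def parse_record(line: str) -> dict:
    """Turn one 'KEY=value KEY=value ...' log line into a dict."""
    fields = {}
    for token in line.split():
        if "=" in token:
            key, _, value = token.partition("=")
            fields[key] = value
    return fields

def post_record(record: dict) -> None:
    """POST one transfer record as JSON to the dashboard repository."""
    req = urllib.request.Request(
        DASHBOARD_URL,
        data=json.dumps(record).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()

if __name__ == "__main__":
    with open("gridftp-transfer.log") as log:   # log path is an assumption
        for line in log:
            rec = parse_record(line)
            if "NBYTES" in rec:   # keep only lines that look like transfer records (assumed field name)
                post_record(rec)
```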

Integration of network intelligence with data management services
- Feed network knowledge into decisions on data and job placement (e.g., replica selection)
- Consume information from the Dashboard
- Present information in a format that can be easily consumed by data management middleware
- Seeking collaborators to complement funding from external sources (targeting ASCR next)
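A toy example of what feeding network knowledge into replica selection could look like: rank the replicas of a dataset by a recent throughput estimate obtained from the dashboard, with a conservative default for unmeasured sites. The site names, the numbers, and the dictionary standing in for a dashboard query are all illustrative.

```python
"""Toy replica selection driven by network measurements (all inputs are stand-ins)."""
from typing import Dict, List

def best_replica(replicas: List[str],
                 measured_gbps: Dict[str, float],
                 default_gbps: float = 1.0) -> str:
    """Pick the replica whose source site shows the best recent throughput.

    `measured_gbps` stands in for a dashboard query keyed by site name;
    unmeasured sites get a conservative default so they are not excluded.
    """
    return max(replicas, key=lambda site: measured_gbps.get(site, default_gbps))

if __name__ == "__main__":
    replicas = ["fnal.gov", "cern.ch", "nersc.gov"]
    # Pretend these numbers came from the OSG Networking Dashboard.
    measurements = {"fnal.gov": 8.5, "cern.ch": 2.1}
    print(best_replica(replicas, measurements))   # -> fnal.gov
```

In a real data-management system this ranking would sit behind the existing replica-catalogue lookup, so the middleware keeps its interface and only the cost function changes.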

Summary
Program of work for HTDP:
- Continue evaluating middleware performance on 100GE
- Actively participate in the development of the OSG Networking Dashboard
- Integration of network intelligence with data management middleware