IT in the 12 GeV Era
Roy Whitney, CIO
May 31, 2013
Jefferson Lab User Group Annual Meeting

“Day One” Science
• Working towards “Day One” Science
  – The lab is organizing to enhance support of our users
  – Annual reviews help us check how we are doing and how we can improve
  – Operations support for data challenges and more
• Offline Computing Resources Growing Fast
  – In all areas, capacity will grow to meet the 12 GeV requirements

Computing Capacity Growth
Today:
• 1K cores in the farm (3 racks, 4-16 cores per node, 2 GB/core)
• 9K LQCD cores (24 racks, 8-16 cores per node, 2-3 GB/core)
• 180 nodes with 720 GPUs + Xeon Phi as LQCD compute accelerators
2016:
• ~20K cores in the farm (10 racks, more cores per node, 2 GB/core)
• Accelerated nodes for Partial Wave Analysis? Even 1st pass?
• LQCD: some mix of conventional and accelerated nodes, tbd (20 racks)
Total footprint, power and cooling will grow only slightly. Capacity for detector simulation will be deployed in 2014 and 2015, with additional capacity for analysis in 2015 and 2016.
Today Experimental Physics has < 5% of the compute capacity of LQCD. In 2016 it will be closer to 50% in dollar terms and number of racks (still small in terms of flops).
News flash: 117 TF Linpack achieved 5/28/2013.

Computing Change – ARE YOU READY?
Today, most codes and jobs are serial. Each job uses one core, and we try to run enough jobs to keep all cores busy, without overusing memory or I/O bandwidth.
Current weakness: if we have 16 cores per box and run 24 jobs to keep them all busy, there are 24 input and 24 output file I/O streams running just for this one box => lots of “head thrashing” in the disk system. This model does not scale.
Future (12 GeV era): most data analysis will be event parallel (“trivially parallel”). Each thread will process one event, so each box runs a single job with many events in flight, with 1 input and 1 output stream => much less head thrashing, higher I/O rates.
Possibility: the farm will include GPU or Xeon Phi accelerated nodes! As software becomes ready, we will deploy it!
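As an illustration of the event-parallel model described above, here is a minimal sketch (not Jefferson Lab code; the Event type, process_event() routine, and file names are hypothetical placeholders): one input stream feeds a pool of worker threads, one event per thread at a time, and all results funnel into one output stream.

```cpp
// Sketch of event-parallel processing: one input stream, N worker threads
// (one event per thread at a time), one output stream per box.
// Hypothetical example -- Event, process_event(), and the file names are
// placeholders, not actual Jefferson Lab analysis code.
#include <fstream>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

struct Event { std::string raw; };   // stand-in for a real event record

static std::string process_event(const Event& ev) {
    // Placeholder "reconstruction": in reality this is the expensive
    // per-event physics code that keeps a core busy.
    return "processed: " + ev.raw;
}

int main() {
    std::ifstream in("events.dat");     // single input stream for the box
    std::ofstream out("results.dat");   // single output stream for the box
    std::mutex in_mtx, out_mtx;

    auto worker = [&]() {
        for (;;) {
            Event ev;
            {   // only the short read is serialized, not the processing
                std::lock_guard<std::mutex> lock(in_mtx);
                if (!std::getline(in, ev.raw)) return;   // no more events
            }
            std::string result = process_event(ev);      // runs in parallel
            {
                std::lock_guard<std::mutex> lock(out_mtx);
                out << result << '\n';
            }
        }
    };

    unsigned n = std::thread::hardware_concurrency();    // one thread per core
    if (n == 0) n = 4;                                    // fallback
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < n; ++i) pool.emplace_back(worker);
    for (auto& t : pool) t.join();
    return 0;
}
```

The point the slide makes carries over directly: however many cores the box has, the disk system sees exactly one input and one output stream per box instead of dozens of competing streams.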

Disk and Tape Systems
• Today, Physics uses about 400 TeraBytes of disk, with all compute nodes and file servers on InfiniBand. Much of this is in Lustre, a parallel file system spread over 30+ servers (shared with LQCD, total bandwidth 6 GB/s).
• Future: Physics will have 3-4 PB of disk, with single-file I/O streams of 0.5 to 1.0 GB/s; total ~30 GB/s.
• Today the tape library can read/write at 1.5 GB/s and we store 1 PB/year.
• Future: increase I/O as needed, store several PB/year, adding a 2nd library in FY15 or FY16, and possibly a 3rd later in this decade (tbd).
For storage, we will deploy what is necessary to make off-line computing efficient.
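As a rough back-of-envelope check using only the numbers above, today's tape bandwidth has ample headroom over today's archive rate, so the planned growth is mostly about capacity (the 2nd and possibly 3rd library) rather than drive speed:

\[
\frac{1\ \mathrm{PB}}{1.5\ \mathrm{GB/s}}
  = \frac{10^{6}\ \mathrm{GB}}{1.5\ \mathrm{GB/s}}
  \approx 6.7\times 10^{5}\ \mathrm{s}
  \approx 7.7\ \text{days of continuous writing per petabyte}.
\]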

Networking
If there are issues, talk to us, we have the bandwidth!
Mobile computing trends:
–Expanding WiFi on campus
–Cell phone coverage in major buildings
–Virtual Desktop Infrastructure (VDI) for Windows desktops
–Increasing support for Bring Your Own Device (BYOD)
Internet today:
–10 Gbit WAN – ESnet; Globus Online file transfers of over 3 Gb/s and rising as we tune it up (chart: 24 hours of WAN traffic, 3 Gb/s peak)
–45 Mbit WAN backup – Cox; will increase or find a second path to ESnet
Future:
–Second 10 Gig link to ESnet from ELITE within 1 year
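For scale, a rough conversion (assuming, hypothetically, that the 3 Gb/s peak were sustained around the clock) shows what that WAN rate means in daily data volume:

\[
3\ \mathrm{Gb/s} = 0.375\ \mathrm{GB/s}, \qquad
0.375\ \mathrm{GB/s} \times 86{,}400\ \mathrm{s/day} \approx 32\ \mathrm{TB/day},
\]

still leaving headroom on the 10 Gbit ESnet link, with a second 10 Gig path planned within a year.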

Computing Environment
Cyber security – the threat is increasing!
–Two-factor remote access to the Halls and Accelerator via a gateway; the gateway is accessible directly from the Internet
–Border firewall upgraded: next-generation technology does packet/protocol inspection and supports 5-10 Gigabit depending on configuration
Telecommunications – now VoIP
–Phones are now a site-wide PA system – important for safety!
–UPSs for more network segments (30 min)
Collaboration
–SeeVogh and ESnet H.323 for video conferencing
–ReadyTalk (ESnet) for web conferencing

Collaborating for Science
Regular meetings held between the Scientific Computing Group (SciComp) and Physics Computing Coordinators:
–Overall: Graham Heyes
–Hall A: Ole Hansen
–Hall B: Dennis Weygand
–Hall C: Brad Sawatzky
–Hall D: Mark Ito
These meetings:
–ensure requirements are met, set priorities
–address issues as they arise
–coordinate upgrades for computing, storage, network

Collaborating (2)
User meetings to engage end-users of the compute and storage facilities
–Most recent in February
–During this UG meeting, discussion of the data analysis cluster future
New Physics Software Committee
–Graham Heyes, chair
–Better documentation and support for ROOT, CERNLib, Geant4, CLHEP, EVIO
–Website:
Goals:
1) Better support for students and new postdocs
2) Enhanced ease of use and change management for everyone

Data Preservation and Provenance
DPHEP (Data Preservation in High Energy Physics) working group participation.
Two physicists, Graham Heyes and Dennis Weygand, are actively working to ensure the preservation of Jefferson Lab’s 6 GeV scientific data, ensuring that its format, accompanying software and configurations, and documentation remain up-to-date and available over time.
Wikipedia.org: “Scientific research is generally held to be of good provenance when it is documented in detail sufficient to allow reproducibility. Scientific workflows assist scientists and programmers with tracking their data through all transformations, analyses, and interpretations. Data sets are reliable when the process used to create them are reproducible and analyzable for defects. Current initiatives to effectively manage, share, and reuse ecological data are indicative of the increasing importance of data provenance.”

Seeking External Advice
IT Steering Committee semi-annual meetings
–Includes Physics Computing Coordinator, UGBoD Computing contact, several physicists
Annual reviews for 12 GeV Software and Computing
–(Some) June 2012 recommendations:
  “Presentations in future reviews should address end user utilization of and experience with the software in more detail. Talks from end users on usage experience with the software and analysis infrastructure would be beneficial.”
  “An explicitly planned program of data challenges is recommended.”
–Next review scheduled for September 2013

Summary
Jefferson Lab is working hard to ensure that “Day One” science will emerge from the 12 GeV research program.
Plans are in place to deploy the significantly expanded IT resources needed for off-line computing.
Success and progress are coming from the combined efforts of the Users, Physics Division, Theory Group and IT Division.