Research Data Transfer Zones

Presentation transcript:

Research Data Transfer Zones. Professor Tony Hey, Chief Data Scientist, STFC Rutherford Appleton Laboratory, Didcot, OX11 0QX, UK

e-Infrastructure and Research Networks

NSF Task Force on ‘Campus Bridging’ (2011): The goal of ‘campus bridging’ is to enable seamlessly integrated use of a researcher’s personal cyberinfrastructure, cyberinfrastructure at other campuses, and cyberinfrastructure at the regional, national and international levels, so that they all function as if they were proximate to the scientist.

Need for European adoption of ‘Science DMZ’ end-to-end network architecture. Science DMZs implemented at over 100 US universities. NSF invested more than $80M in DMZ campus cyberinfrastructure.

The UK Met Office UPSCALE campaign. [Diagram labels: HERMIT @ HLRS; controller; data transfer & compression; 5 TB per day; JASMIN; 2.5 TB; data successfully transferred and validated]
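As a rough sense of scale, the 5 TB per day figure on this slide can be converted into a sustained network rate. The sketch below is illustrative only: it assumes decimal units (1 TB = 10^12 bytes) and traffic spread evenly over 24 hours, neither of which is stated on the slide, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def sustained_rate_mbps(terabytes_per_day: float) -> float:
    """Average rate in Mbit/s needed to move a given daily data volume.

    Assumes decimal units (1 TB = 10**12 bytes) and an even spread
    of traffic across the whole day; both are assumptions.
    """
    bits_per_day = terabytes_per_day * 1e12 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


# 5 TB per day is the figure quoted on the slide.
print(round(sustained_rate_mbps(5.0)))  # 463
```

Under these assumptions the campaign needs a little under 0.5 Gbit/s sustained around the clock, before allowing for any bursts, retries or protocol overhead.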

JASMIN Research Data Transfer Zone Architecture. [Figure panels: Simple Data Transfer Zone (DTZ); Supercomputer Center DTZ. Source: http://fasterdata.es.net/science-dmz-architecture]

Pacific Research Platform: NSF funding. $5M award to UC San Diego and UC Berkeley to establish a science-driven high-capacity data-centric “freeway system” on a large regional scale. This network infrastructure will give the research institutions the ability to move data 1,000 times faster compared to speeds on today’s Internet. August 2015. “PRP will enable researchers to use standard tools to move data to and from their labs and their collaborators’ sites, supercomputer centers and data repositories distant from their campus IT infrastructure, at speeds comparable to accessing local disks,” said co-PI Tom DeFanti.

e-Infrastructure Annex – High Performance Research Data Networking. Andrew Sansum, Tony Hey (SCD); Bob Day, Tim Chown, Jeremy Sharp (Jisc)

Rationale: This Annex requests funding for high performance research data networking between the UK’s major research facilities and university user sites. It will provide high bandwidth research data connectivity between generators of high volume datasets at the national facilities and remote users at universities. Research data sets are rapidly increasing in size, and there is a need for high bandwidth end-to-end performance of the connecting research network. This proposal builds on planned enhancements to the Janet network and is supported by the Jisc team.

The proposal has three components:
A programme to establish a UK Research Data Transfer Zone (RDTZ) connecting major university user sites and facilities.
The production of a Data Transfer Toolkit (DTT) to assist users in exploiting the UK RDTZ infrastructure.
Upgrading the core research data networking capacity between key NeI sites and the Janet backbone.

Description of Need (1): The UK academic research community has several existing as well as numerous emerging data-intensive research disciplines. These communities have an increasing need to rapidly transfer research data between organisations, institutes and facilities both within the UK and internationally. High bandwidth end-to-end network performance is crucial to meet this need. The Janet network currently offers backbone throughput of 200Gbps, rising to 600Gbps by 2017/18, but ‘last mile’ limitations within connected campuses often restrict the performance of data-intensive applications. Campus network architectures are designed to accommodate day-to-day applications and traffic, with cybersecurity models that need to accommodate a wide range of threats. Such campus architectures typically lack the necessary secure ‘fast paths’ to support the data-intensive flows of research data. Without investment in appropriate local network and systems engineering within the campuses, UK researchers will be hindered by these limitations and will not be able to take maximum advantage of the research data generated by the NeI facilities. Implementation of such research data network enhancements is essential for the efficient exploitation of research data to deliver more and better results.
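The ‘last mile’ argument can be illustrated with simple arithmetic: the end-to-end rate of a transfer is bounded by the slowest link on the path. The sketch below assumes a hypothetical 100 TB dataset and a hypothetical 1 Gbit/s campus link; only the 200 Gbit/s backbone figure comes from the slide. Protocol overhead and competing traffic are ignored, and the function name is illustrative.

```python
def transfer_hours(dataset_tb: float, link_rates_gbps: list[float],
                   efficiency: float = 1.0) -> float:
    """Hours needed to move a dataset across a path of links.

    The slowest link sets the end-to-end rate. `efficiency` is an
    assumed fraction of that rate actually achieved (1.0 = ideal).
    Uses decimal units: 1 TB = 10**12 bytes, 1 Gbit/s = 10**9 bit/s.
    """
    bottleneck_bps = min(link_rates_gbps) * efficiency * 1e9
    seconds = dataset_tb * 1e12 * 8 / bottleneck_bps
    return seconds / 3600


# Backbone at 200 Gbit/s (from the slide), campus link at an assumed 1 Gbit/s.
print(f"{transfer_hours(100, [200, 1]):.1f}")    # 222.2
# Same dataset if every link on the path ran at the backbone rate.
print(f"{transfer_hours(100, [200, 200]):.1f}")  # 1.1
```

With these assumed numbers the same dataset takes over nine days through the campus bottleneck and about an hour at the backbone rate, which is the gap the proposal targets.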

Description of Need (2): In the US, NSF’s ‘Campus Cyberinfrastructure’ program has addressed this problem by providing funding for over 100 universities to improve their local network and systems infrastructure in support of data-intensive science applications. This has been achieved through implementations of Berkeley Lab’s ‘Science DMZ’ model, which describes design patterns for appropriate local network architectures, data transfer node design and tailored cybersecurity policies, and incorporates network performance measurement. We propose a similar investment strategy for UK research organisations by providing funding for improvements in the ‘last mile’ infrastructure between data sources such as those at the UK NeI facilities and university campuses. Implementation of these research data network improvements will enhance UK research output by ensuring that its researchers can exploit the state-of-the-art Janet backbone network to its maximum potential.
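One reason data transfer node design and performance measurement matter is the bandwidth-delay product: to keep a long path full, a single TCP stream needs roughly bandwidth × round-trip time of data in flight. The sketch below works through that formula. The 10 Gbit/s rate and 100 ms round-trip time are assumed example values, not figures from the proposal, and the function name is hypothetical.

```python
def bdp_bytes(rate_gbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product in bytes.

    rate_gbps: path rate in Gbit/s (decimal, 10**9 bit/s).
    rtt_ms:    round-trip time in milliseconds.
    """
    bits_in_flight = rate_gbps * 1e9 * rtt_ms / 1000
    return bits_in_flight / 8


# Assumed example: a 10 Gbit/s path with a 100 ms round-trip time.
print(bdp_bytes(10, 100) / 1e6)  # 125.0
```

In this assumed case about 125 MB must be in flight per stream, so a host whose TCP buffers are limited to a few megabytes cannot fill the path however fast the backbone is.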

Description of Need (3): There is also an urgent need to provide users with easy-to-use software tools for initiating large data transfers at high bandwidth. This proposal includes development of an open source software toolkit for high throughput data transfers. It will build on prior experience of data services such as FTS and Globus and will be designed to integrate with the UK NeI AAAI infrastructure. The resulting Data Transfer Toolkit (DTT) will allow last mile data network improvements to be exploited to maximum effect by a wide range of research disciplines.

Outputs of the proposal:
Enhancements to network infrastructure at a range of university campus locations to support high throughput data transfers to and from those sites.
An open source software toolkit, DTT, supporting high throughput data transfers, fully integrated with UK NeI AAAI infrastructures, and allowing the research network infrastructure enhancements to be exploited for maximum benefit.
Enhancements to network infrastructure at the key UK NeI facilities, with the goal of connecting these sites to Janet at 100Gbit/s in FY18, including deployment of appropriate internal infrastructure upgrades.

Key Benefits:
Ensure the UK academic community is on a competitive trajectory to conduct world-leading data-intensive research with collaborators in the UK and internationally.
Optimise the capability for UK research communities to exploit the growth in capacity of the Janet network as it moves towards Terabit networking.
Address the challenge of linking increasingly affordable but high data volume networked scientific equipment, such as electron microscopes and gene sequencers, to national centres for processing and long-term storage of data.
Enhance the ability of remote scientists to carry out ‘real-time’ research activities such as remote experiment control at national and international experimental facilities such as the Diamond Light Source and remote telescopes and observatories such as SKA.
Improve the exploitation of the NeI by enabling real-time access to co-located processing capability and data caches such as that exemplified by NERC’s JASMIN Super Data Cluster.
Provide a UK RDTZ capability for universities, build expertise and identify best practice in campus network engineering for data-intensive science. This will increase the potential for cross-fertilisation of data science methodologies used by different scientific communities. It will also reach out to disciplines currently unaware of the potential of the Janet network to increase their research output.
Provide an open source high-capacity data transfer toolkit that will ensure that users can easily exploit the UK RDTZ infrastructure to its full potential.
Make UK researchers internationally competitive in data-intensive research applications by provision of world-leading end-to-end research data network performance.

Project Cost Breakdown. Columns: FY17, FY18, FY19, FY20. RDTZ: £3M, £2M. DTT: £0.5M. Core NeI: £6M. Total: £3.5M, £9.5M, £2.5M.

[Diagram labels: Users; CSP1, CSP2, CSP3, CSP4, CSP5, CSP6; Research Infrastructure Zone (RIZ); Janet; GEANT; Rest of the World; Authentication, Authorization & Accounting Interface (AAAI); DSP1: Regulated Data, e.g. Genomics, ADC, NHS; DSP2: Unregulated Data, e.g. LHC, ESA, Environment, Experiments, Telescopes; GridPP UK T0; ADRN; Medical Projects; NHS; DIRAC; ARCHER; HEI; Regional; Business; FARR; Content Service Providers (AWS, MS….); connections marked Secure]