Massive Data Transfers
George McLaughlin, Mark Prior
AARNet
Copyright AARNet 2005


1 Massive Data Transfers. George McLaughlin, Mark Prior, AARNet

2 “Big Science” projects driving networks
Large Hadron Collider
– Coming on-stream in 2007
– Particle collisions generating terabytes/second of “raw” data at a single, central, well-connected site
– Need to transfer data to global “tier 1” sites. A tier 1 site must have a 10 Gbps path to CERN
– Tier 1 sites need to ensure gigabit capacity to the Tier 2 sites they serve
Square Kilometre Array
– Coming on-stream in 2010?
– Greater data generator than LHC
– Up to 125 sites at remote locations; data need to be brought together for correlation
– Can’t determine “noise” prior to correlation
– Many logistic issues to be addressed
Billion dollar globally funded projects
Massive data transfer needs

3 From very small to very big

4 Scientists and Network Engineers coming together
The HEP community and the R&E network community have figured out mechanisms for interaction, probably because HEP is pushing network boundaries.
E.g. the ICFA workshops on HEP, Grid and the Global Digital Divide bring together scientists, network engineers and decision makers, and achieve results.
http://agenda.cern.ch/List.php

5 What’s been achieved so far
• A new generation of real-time Grid systems is emerging to support worldwide data analysis by the physics community
• Leading role of HEP in developing new systems and paradigms for data intensive science
• Transformed view and theoretical understanding of TCP as an efficient, scalable protocol with a wide field of use
• Efficient standalone and shared use of 10 Gbps paths of virtually unlimited length; progress towards 100 Gbps networking
• Emergence of a new generation of “hybrid” packet- and circuit-switched networks

6 LHC data (simplified)
Experiments: CMS, LHCb, ATLAS, ALICE
Per experiment:
– 40 million collisions per second
– After filtering, 100 collisions of interest per second
– A Megabyte of digitised information for each collision = recording rate of 100 Megabytes/sec
– 1 billion collisions recorded = 1 Petabyte/year
Scale of the units:
1 Megabyte (1 MB): a digital photo
1 Gigabyte (1 GB) = 1000 MB: a DVD movie
1 Terabyte (1 TB) = 1000 GB: world annual book production
1 Petabyte (1 PB) = 1000 TB: 10% of the annual production by LHC experiments
1 Exabyte (1 EB) = 1000 PB: world annual information production
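The arithmetic on this slide can be checked with a short Python sketch. The input figures are the slide's own; the function names are illustrative only. The last calculation, the running time needed to record one billion collisions at 100 per second, is an inference from those figures and is not stated on the slide.

```python
# Decimal units, as on the slide (1 GB = 1000 MB, and so on).
MB = 10**6
PB = 10**15

# Figures taken from the slide (per experiment).
COLLISIONS_PER_SECOND = 40_000_000    # before filtering
RECORDED_PER_SECOND = 100             # collisions of interest after filtering
BYTES_PER_COLLISION = 1 * MB          # digitised information per collision
COLLISIONS_PER_YEAR = 1_000_000_000   # collisions recorded per year


def recording_rate(recorded_per_second, bytes_per_collision):
    """Bytes written per second after filtering."""
    return recorded_per_second * bytes_per_collision


def yearly_volume(collisions_recorded, bytes_per_collision):
    """Bytes stored for a given number of recorded collisions."""
    return collisions_recorded * bytes_per_collision


def live_seconds(collisions_recorded, recorded_per_second):
    """Running time needed to record that many collisions (an inference)."""
    return collisions_recorded // recorded_per_second


if __name__ == "__main__":
    rate = recording_rate(RECORDED_PER_SECOND, BYTES_PER_COLLISION)
    volume = yearly_volume(COLLISIONS_PER_YEAR, BYTES_PER_COLLISION)
    seconds = live_seconds(COLLISIONS_PER_YEAR, RECORDED_PER_SECOND)
    print(f"Filter keeps 1 in {COLLISIONS_PER_SECOND // RECORDED_PER_SECOND} collisions")
    print(f"Recording rate: {rate // MB} MB/s")
    print(f"Yearly volume:  {volume // PB} PB")
    print(f"Running time:   {seconds} s (about {seconds / 86400:.0f} days)")
```

Under these figures one petabyte per year corresponds to about ten million seconds of data taking, so the 100 MB/s rate is sustained for only part of the calendar year.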

7 LHC Computing Hierarchy
[Figure: tiered data-distribution diagram. Recoverable labels:]
Experiment to Online System: ~PByte/sec
Online System to Tier 0+1 (CERN Center; PBs of disk, tape robot): ~100-1500 MBytes/sec
Tier 1: FNAL Center, IN2P3 Center, INFN Center, RAL Center
Tier 2: Tier2 Centers
Tier 3: Institutes (physics data cache)
Tier 4: Workstations
Link labels between the lower tiers: ~2.5-10 Gbps (shown three times); 0.1 to 10 Gbps
Tens of Petabytes by 2007-8. An Exabyte ~5-7 years later.
CERN/Outside Resource Ratio ~1:2
Tier0/(Σ Tier1)/(Σ Tier2) ~1:1:1

8 Lightpaths for Massive data transfers
A small number of users with large data transfer needs can use more bandwidth than all other users (from CANARIE)

9 Why?
Cees de Laat classifies network users into 3 broad groups:
1. Lightweight users: browsing, mailing, home use. They need full Internet routing, one to many;
2. Business applications: multicast, streaming, VPNs, mostly LAN. They need VPN services and full Internet routing, several to several + uplink; and
3. Scientific applications: distributed data processing, all sorts of grids. They need very fat pipes, limited multiple Virtual Organizations, few to few, peer to peer.
Type 3 users:
– High Energy Physics
– Astronomers, eVLBI
– High Definition multimedia over IP
– Massive data transfers from experiments running 24x7

10 What is the GLIF?
Global Lambda Integrated Facility - www.glif.is
International virtual organization that supports persistent data-intensive scientific research and middleware development
Provides the ability to create dedicated international point-to-point Gigabit Ethernet circuits for “fixed term” experiments

11 Huygens Space Probe – a practical example
Cassini spacecraft left Earth in October 1997 to travel to Saturn
On Christmas Day 2004, the Huygens probe separated from Cassini
Started its descent through the dense atmosphere of Titan on 14 Jan 2005
Very Long Baseline Interferometry (VLBI) is a technique where widely separated radio telescopes observe the same region of the sky simultaneously to generate images of cosmic radio sources
Using this technique, 17 telescopes in Australia, China, Japan and the US were able to accurately position the probe to within a kilometre (Titan is ~1.5 billion kilometres from Earth)
Need to transfer Terabytes of data between Australia and the Netherlands

12 AARNet - CSIRO ATNF contribution
Created “dedicated” circuit
The data from two of the Australian telescopes (Parkes [The Dish] & Mopra) was transferred via light plane to CSIRO Marsfield (Sydney)
CeNTIE based fibre from CSIRO Marsfield to AARNet3 GigaPOP
SXTransPORT 10G to Seattle
“Lightpath” to Joint Institute for VLBI in Europe (JIVE) across CA*net4 and SURFnet optical infrastructure

13 But...
Although the time from concept to undertaking the scientific experiment was only 3 weeks:
– 9 organisations in 4 countries were involved in “making it happen”
– Required extensive human-human interaction (mainly emails... lots of them)
– Although a 1 Gbps path was available, maximum throughput was around 400 Mbps
– Issues with protocols, stack tuning, disk-to-disk transfer, firewalls, different formats, etc.
– Currently scientists and engineers need to test thoroughly before important experiments; not yet “turn up and use”
– Ultimate goal is for the control plane issues to be transparent to the end-user, who simply presses the “make it happen” icon
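One common reason a long-distance TCP transfer falls short of the path capacity is that the sender's window is smaller than the bandwidth-delay product of the path, which is why stack tuning matters. The Python sketch below illustrates that calculation and the effect of the achieved rate on transfer time. The 300 ms round-trip time and the 64 KiB window are assumptions chosen for illustration, not measurements from the Huygens transfer, and the function names are hypothetical; the slide does not say which factor limited the actual throughput.

```python
# All values below marked "assumed" are illustrative, not from the slides.
PATH_RATE_BPS = 10**9            # 1 Gbps path (from the slide)
ACHIEVED_RATE_BPS = 400 * 10**6  # achieved rate of 400 Mbps on that path
ASSUMED_RTT_MS = 300             # assumed Australia-Europe round-trip time
ASSUMED_WINDOW_BYTES = 64 * 1024 # assumed untuned TCP window (no scaling)
TERABYTE = 10**12


def bdp_bytes(rate_bps, rtt_ms):
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return rate_bps * rtt_ms // 8000


def window_limited_bps(window_bytes, rtt_ms):
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8 * 1000 / rtt_ms


def transfer_time_s(size_bytes, rate_bps):
    """Seconds needed to move size_bytes at a sustained rate."""
    return size_bytes * 8 / rate_bps


if __name__ == "__main__":
    print(f"Window needed to fill the path: {bdp_bytes(PATH_RATE_BPS, ASSUMED_RTT_MS) / 1e6:.1f} MB")
    limit = window_limited_bps(ASSUMED_WINDOW_BYTES, ASSUMED_RTT_MS)
    print(f"Ceiling with a 64 KiB window:   {limit / 1e6:.2f} Mbps")
    for label, rate in (("400 Mbps", ACHIEVED_RATE_BPS), ("1 Gbps", PATH_RATE_BPS)):
        hours = transfer_time_s(TERABYTE, rate) / 3600
        print(f"1 TB at {label}: {hours:.1f} hours")
```

Under these assumptions a single untuned stream would be limited to a few megabits per second, so reaching even 400 Mbps implies larger windows, parallel streams, or both.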

14 International path for Huygens transfer

15 EXPReS and Square Kilometre Array
SKA is a bigger data generator than LHC
But in a remote location
Australia is one of the countries bidding for SKA: significant infrastructure challenges
Also, the EU Commission funded the EXPReS project to link 16 radio telescopes around the world at gigabit speeds

16 In Conclusion
Scientists and network engineers working together can exploit the new opportunities that high capacity networking opens up for “big science”
Need to solve issues associated with scalability, control plane, ease of use
QUESTIONS?

