Hiroyuki Matsunaga (some materials were provided by Go Iwai), Computing Research Center, KEK. Lyon, March 2012.

Outline
Introduction
- Network and Grid
- Belle II experiment
- Collaboration among Asian computing centers
Network monitoring
- perfSONAR deployment
Data sharing testbed
- Data storage system as a PoP (Point of Presence)
- Data transfer/sharing between PoPs
- Implementation
Summary

Background
The network is becoming more and more important in high energy and nuclear physics:
- A smaller number of larger experiments worldwide
- More collaborators worldwide
- Distributed data analysis using remote data centers
- Remote operation of detectors and accelerators
- New communication tools: e-mail, web, phone and video conferencing, etc.
Increasing data volume:
- Higher energies and/or higher luminosities
- More sensors in a detector for higher granularity/precision

Data Grid
The Grid is used by large HEP experiments:
- In particular, the WLCG (Worldwide LHC Computing Grid) for the LHC experiments at CERN is the largest one
- The Belle II experiment, hosted by KEK, is also going to use the Grid
- Many Asian institutes are involved in the LHC or Belle II experiments
Stable network operation is vital for the Grid:
- This Grid is a "Data Grid" that has to handle a large amount of data
- High throughput over the WAN is needed for data transfers between sites
- Do not forget local data analysis, which needs more bandwidth in the LAN than data transfer does over the WAN (see the illustrative estimate below)
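As a rough illustration of that last point, the sketch below compares the average LAN rate needed to re-read a dataset for daily analysis with the WAN rate needed to import the same dataset over a month. All numbers (dataset size, time windows) are assumptions chosen for illustration, not figures from this talk.

```python
# Illustrative only: all sizes and time windows are assumed numbers,
# not figures from the talk.
DATASET_TB = 100            # assumed dataset size kept at the site
ANALYSIS_TURNAROUND_H = 24  # assumed: one full pass over the data per day
IMPORT_WINDOW_DAYS = 30     # assumed: the same data imported over a month

def gbps(terabytes, seconds):
    """Average rate in Gb/s needed to move `terabytes` within `seconds`."""
    return terabytes * 1e12 * 8 / seconds / 1e9

lan_rate = gbps(DATASET_TB, ANALYSIS_TURNAROUND_H * 3600)
wan_rate = gbps(DATASET_TB, IMPORT_WINDOW_DAYS * 24 * 3600)

print(f"LAN read rate for a daily analysis pass: {lan_rate:.1f} Gb/s")
print(f"WAN import rate spread over a month:     {wan_rate:.2f} Gb/s")
```

Under these assumed numbers the local analysis traffic is an order of magnitude larger than the WAN import traffic, which is the point the slide makes.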

Network Monitoring
A network monitoring system is necessary for stable operation of the Grid and other network-intensive activities:
- Also useful for general network use
- For administrators as well as users
It makes it easier and faster to troubleshoot network problems:
- Many parties (sites, network providers) are involved
- Under these circumstances it is difficult to spot problems, and fixing them occasionally takes a long time
The system can be established with little effort and at low cost:
- Only a few servers are needed
- Once set up, the operational cost is very low

Deploying perfSONAR
perfSONAR is a network performance monitoring infrastructure:
- Developed by major academic network providers: ESnet, GEANT, Internet2, …
Deployed at many WLCG sites:
- Tier-0 and Tier-1 sites, and large Tier-2 sites (those involved in LHCOPN or LHCONE)

perfSONAR
perfSONAR includes a collection of tools to perform various network tests bi-directionally (see the command sketch below):
- Bandwidth: BWCTL (BandWidth ConTroL), using iperf etc.
- Latency: OWAMP (One-Way Active Measurement Protocol), …
- Traceroute
- Packet loss
perfSONAR also provides graphical tools (Cacti)
Each site should have dedicated machines:
- Better to have two small servers than one powerful machine: one server for bandwidth tests, the other for latency tests
- With network connectivity of 1 Gbps or more, preferably 10 Gbps if the site is connected to the internet with a 10 G or faster link
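A minimal sketch of the kinds of tests listed above, run ad hoc from the command line against a remote measurement host. It assumes the bwctl and owamp client tools are installed and that the remote host (a placeholder name here) runs the corresponding daemons; the options shown are the commonly used ones and may differ between tool versions.

```python
#!/usr/bin/env python
"""Minimal sketch of ad-hoc perfSONAR-style tests against a remote
measurement host. Assumes the bwctl and owamp client tools are installed
locally and that the remote host runs bwctld/owampd; the host name below
is a placeholder and the options are the commonly used ones (they may
differ between tool versions)."""
import subprocess

REMOTE = "ps-latency.example.org"   # hypothetical remote perfSONAR host

def run(cmd):
    print("$ " + " ".join(cmd))
    return subprocess.run(cmd, capture_output=True, text=True).stdout

# Throughput test: bwctl negotiates a time slot and runs iperf for 20 s
# with the remote host acting as the receiver ("catcher").
print(run(["bwctl", "-c", REMOTE, "-T", "iperf", "-t", "20"]))

# One-way latency/loss test: owping sends a stream of test packets in
# both directions and reports delay, jitter and loss.
print(run(["owping", "-c", "100", REMOTE]))

# Path check.
print(run(["traceroute", REMOTE]))
```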

perfSONAR at KEK
KEK has been running perfSONAR on a test machine for the last few years:
- Set up by Prof. Soh Suzuki of the KEK network group
- pS-Performance Toolkit 3.2 was installed
- perfsonar-test1.kek.jp (Pentium 4, 3.0 GHz), Scientific Linux 5.4 (32 bit), 1 Gbps NIC
- Most of the services are running
Some of the perfSONAR tools have already helped to solve network problems:
- Network throughput was checked with BWCTL for the network path from KEK to Pacific Northwest National Laboratory (PNNL) in the U.S. last year

New Servers at KEK
Setting up two new machines:
- Primarily for Belle II Grid operations
- Based on the perfSONAR hardware recommendations, also with reference to LHC documents
Two Dell R310 servers:
- CPU: single Xeon X3470
- Memory: 4 GB (a 32-bit OS is only available via net install)
- HDD: 500 GB
- 10 G NICs
The new servers will be located in the same subnet as the Belle II computer system:
- The same firewall is used

Deployment Issues
The firewall is a matter of concern:
- Many ports (not only TCP but also UDP) have to be opened to all other collaborating sites (see the connectivity check below)
- Negotiation/coordination is needed for deployment and configuration
- The main obstacle may be site policy rather than a technical issue?
- Most of the LHCOPN/LHCONE servers are inaccessible to outsiders
It is better to have a homogeneous environment (hardware and configuration) among the sites:
- In order to obtain "absolute", comparable results…
- In line with the LHC sites?
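A small sketch for checking from one site whether another site's perfSONAR control ports are reachable through the firewalls on the path. TCP 861 (OWAMP control) and TCP 4823 (BWCTL control) are the usual defaults, but the UDP/TCP test-port ranges are configured per site and are not probed here; the host name is a placeholder.

```python
#!/usr/bin/env python
"""Quick check that a remote site's perfSONAR control ports are reachable
through the firewalls on the path. TCP 861 (OWAMP control) and TCP 4823
(BWCTL control) are the usual defaults; the *test* port ranges are
configured per site and are not probed here. The host is a placeholder."""
import socket

REMOTE = "ps-bandwidth.example.org"          # hypothetical remote server
CONTROL_PORTS = {861: "owamp-control", 4823: "bwctl-control"}

def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for port, name in CONTROL_PORTS.items():
    status = "open" if tcp_reachable(REMOTE, port) else "blocked/closed"
    print(f"{REMOTE}:{port} ({name}): {status}")
```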

Belle II Experiment
The Belle II Grid plans to deploy perfSONAR at all participating sites and to monitor network conditions continuously in a full-mesh topology, i.e. between every pair of sites (see the pairing sketch below)
We will officially propose this deployment plan at the Belle II Grid site meeting in Munich (co-located with the EGI Community Forum) later this month
Participating sites so far include:
- Asia Pacific: KEK (Japan), TIFR (India), KISTI (Korea), Melbourne (Australia), IHEP (China), ASGC (Taiwan)
- Europe: GridKa (Germany), SiGNet (Slovenia), Cyfronet (Poland), Prague (Czech Republic)
- America: PNNL (USA), Virginia Tech (USA)
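To make the full-mesh idea concrete, the sketch below enumerates the measurement pairs implied by the site list above; the site names are taken from the slide and nothing else is assumed.

```python
"""Enumerate the measurement pairs implied by full-mesh monitoring
between the participating sites listed on the slide."""
from itertools import combinations

SITES = ["KEK", "TIFR", "KISTI", "Melbourne", "IHEP", "ASGC",
         "GridKa", "SiGNet", "Cyfronet", "Prague", "PNNL", "Virginia Tech"]

pairs = list(combinations(SITES, 2))
print(f"{len(SITES)} sites -> {len(pairs)} site pairs "
      f"(each measured in both directions: {2 * len(pairs)} test streams)")
for a, b in pairs[:3]:
    print(f"  e.g. {a} <-> {b}")
```

With 12 sites this already means 66 pairs (132 directed measurements), which is why a central dashboard, discussed below, becomes useful.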

Deployment in the Asian Region
We recently proposed the establishment of a network monitoring infrastructure in the Asian region:
- At the AFAD (Asian Forum for Accelerators and Detectors) 2012 meeting in Kolkata, India, in February 2012
- This could be (partly) shared with the Belle II perfSONAR infrastructure
Understanding network conditions in Asia would be interesting:
- (Academic) network connectivity within Asia is not as good as that between Asia and the U.S./Europe
The launch of perfSONAR could be a first step toward a future collaboration among Asian computer centers

Asia-Pacific Network (map)

Central Dashboard
For WLCG, BNL has set up a central monitoring server for the perfSONAR results in LHCOPN, LHCONE, …
- Helpful for administrators and experiments when checking network and site problems
KEK will set up a similar system for Belle II:
- Also covering the Asian institutes

perfSONAR Dashboard at BNL

Data Sharing Testbed
A data sharing testbed was also proposed (by Iwai-san) at the AFAD 2012 meeting last month:
- The aim is to provide a storage service for a distributed environment
- The service should be easy to use and manage, and should perform well for data transfer and access
We will build it as a testbed and do not intend to operate it in production (for the time being):
- The testbed could be used for replication (for emergencies) or as temporary space (in case of migration, etc.)
This testbed can also be used for actual network performance tests:
- More realistic network tests compared with perfSONAR
- Should be employed after the perfSONAR deployment

RENKEI-PoP (Point of Presence)
RENKEI-PoP is a good model for the data sharing testbed:
- It is an appliance for e-Science data federation
Originally proposed in the RENKEI project:
- RENKEI means "federation" in Japanese
- The RENKEI project aims to develop middleware for federating or sharing distributed computing resources
RENKEI-PoP targets the development and evaluation of middleware and provides a means of collaboration between users:
- Installed in each computer center as a gateway server (PoP)

RENKEI-PoP is proposed in a sub-theme of the RENKEI project (figure courtesy of K. Aida)

RENKEI-PoP (cont.)
A RENKEI-PoP is essentially a storage server which has:
- a large amount of data storage
- a high speed network interface
- support for running VMs (see the libvirt sketch below)
Built with open source software, such as the Linux kernel, KVM, libvirt and Grid middleware
Connected to a high speed (10 Gbps) R&E network (SINET)
Realizes a distributed filesystem and fast data sharing/transfer
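As a hedged illustration of the VM-hosting role only (this is not RENKEI-PoP code), the libvirt Python binding mentioned above can be used to list the guests running on such a server:

```python
"""List the virtual machines running on a KVM/libvirt host, as a small
illustration of the VM-hosting role of a PoP server. This is not RENKEI-PoP
code; it only assumes the libvirt-python binding and a local qemu/KVM
hypervisor are available."""
import libvirt

conn = libvirt.open("qemu:///system")   # connect to the local hypervisor
try:
    for dom in conn.listAllDomains(0):
        state = "running" if dom.isActive() else "shut off"
        print(f"{dom.name():20s} {state}")
finally:
    conn.close()
```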

RENKEI-PoP Deployment (map of PoP sites connected by the 10 Gbps network)

Services by RENKEI-PoP
File transfer/sharing between PoPs and data access services:
- Distributed filesystem (Gfarm); pNFS is expected to be employed in the testbed
- GridFTP, openssh, gsissh, gfarm-client, … (see the GridFTP example below)
Virtual machine hosting:
- A cloud-like hosting service based on OpenNebula
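As a hedged illustration of the PoP-to-PoP transfer service, a GridFTP copy between two PoP endpoints could be driven with globus-url-copy. The host names and paths are placeholders, a valid Grid proxy is assumed to already exist, and the options shown (-p parallel streams, -vb progress) are standard globus-url-copy options.

```python
"""Sketch of a PoP-to-PoP file copy over GridFTP using globus-url-copy.
Host names and paths are placeholders; a valid Grid proxy (from
grid-proxy-init or voms-proxy-init) is assumed to exist already."""
import subprocess

SRC = "gsiftp://pop1.example.jp/data/run001/file.dat"   # hypothetical source PoP
DST = "gsiftp://pop2.example.jp/data/run001/file.dat"   # hypothetical destination PoP

cmd = [
    "globus-url-copy",
    "-vb",          # print transfer progress and throughput
    "-p", "8",      # 8 parallel TCP streams to fill a long fat pipe
    SRC, DST,
]
subprocess.run(cmd, check=True)
```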

Gfarm: a network shared filesystem (figure)

Typical Use Case
- User A writes data to the nearest PoP
- User B reads data from the nearest PoP
- Files are shared/transferred between PoPs

Hardware Design
The peak performance of inter-PoP disk-to-disk data transfer is designed to be 1 GB/s (gigabytes per second; see the estimate below)
Each PoP server is equipped with:
- a high throughput storage device that uses 8 to 16 SSDs (or HDDs) behind an HBA
- a 10 GbE NIC for remote connections
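A quick sanity check of the 1 GB/s design target, using only the figures quoted on this slide; the per-drive rates are derived numbers, not measurements.

```python
"""Back-of-the-envelope check of the 1 GB/s disk-to-disk design target,
using only the figures quoted on the slide."""
TARGET_GBYTES_PER_S = 1.0       # design target: 1 GB/s disk-to-disk
NIC_GBPS = 10.0                 # 10 GbE NIC

# Network side: 1 GB/s of payload is 8 Gb/s, i.e. ~80% of a 10 GbE link,
# so a single NIC can just carry it (ignoring protocol overhead).
print(f"Link utilisation: {TARGET_GBYTES_PER_S * 8 / NIC_GBPS:.0%}")

# Storage side: spread over 8-16 drives, each drive must sustain
# roughly 60-125 MB/s, which is feasible for SSDs (or striped HDDs).
for n_drives in (8, 16):
    per_drive = TARGET_GBYTES_PER_S * 1000 / n_drives
    print(f"{n_drives:2d} drives -> ~{per_drive:.0f} MB/s per drive")
```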

Data Transfer Performance
A 14 GB astronomy data set was sent from various PoPs to the one at Tokyo Tech (titech)
Applied network tuning techniques include (see the sketch below):
- kernel TCP and flow-control parameter tuning
- configuration of some device-specific properties (such as TSO and the interrupt interval)
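The slide does not list the exact parameters used; the sketch below only shows the kind of kernel TCP and NIC settings it refers to, sized from the bandwidth-delay product. The RTT value, buffer sizes and device name are assumptions for illustration, and applying them requires root.

```python
#!/usr/bin/env python
"""Illustrative TCP/NIC tuning of the kind referred to on the slide.
The RTT, buffer sizes and interface name are assumed values, not the ones
used in the RENKEI-PoP tests; run as root to actually apply them."""
import subprocess

LINK_GBPS = 10.0      # 10 GbE PoP interface
RTT_S = 0.2           # assumed intercontinental round-trip time (200 ms)
IFACE = "eth0"        # placeholder interface name

# Bandwidth-delay product: the socket buffer needed to keep a 10 Gb/s,
# 200 ms path full is 10e9/8 * 0.2 = 250 MB.
bdp_bytes = int(LINK_GBPS * 1e9 / 8 * RTT_S)
print(f"Bandwidth-delay product: {bdp_bytes / 1e6:.0f} MB")

sysctl_settings = {
    "net.core.rmem_max": bdp_bytes,
    "net.core.wmem_max": bdp_bytes,
    "net.ipv4.tcp_rmem": f"4096 87380 {bdp_bytes}",
    "net.ipv4.tcp_wmem": f"4096 65536 {bdp_bytes}",
}
for key, value in sysctl_settings.items():
    subprocess.run(["sysctl", "-w", f"{key}={value}"], check=False)

# Device-specific properties mentioned on the slide: TCP segmentation
# offload (TSO) and the interrupt coalescing interval.
subprocess.run(["ethtool", "-K", IFACE, "tso", "on"], check=False)
subprocess.run(["ethtool", "-C", IFACE, "rx-usecs", "100"], check=False)
```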

Belle II and Asia-PoP
We intend to build such an infrastructure for Belle II and for the Asian computer centers, based on our experience with RENKEI-PoP:
- This will be proposed, in addition to perfSONAR, at the Belle II meeting in Munich
RENKEI-PoP connects to a high speed network in Japan:
- Whether the new testbed also works well under worse network conditions should be explored

Summary
The network is vital in accelerator science nowadays:
- For Grid operations in particular
We propose deploying perfSONAR at Belle II sites and at Asian computer centers to monitor network conditions in a full-mesh topology
We also propose the deployment of a data sharing testbed:
- Good experience from RENKEI-PoP
- The software to install can be chosen on demand; Gfarm (from NAREGI/RENKEI) is not widely used outside Japan
These could be potential topics under FJPPL?