Testing and evaluating storage technology to build a distributed Tier1 for SuperB in Italy
Silvio Pardi, on behalf of the SuperB Collaboration
INFN-Napoli, Campus di M.S. Angelo, Via Cinthia, 80126 Napoli, Italy
CHEP12 – New York – USA – May 21-25, 2012

INTRODUCTION
The data analysis of the SuperB collider [a] will be done in several centers of the International Collaboration. A new distributed supercomputing center, dedicated to SuperB, is under construction within the Italian Cloud. This infrastructure will provide Tier1-class services in terms of available resources and processing activities. The designed infrastructure is based on 4 centers in southern Italy connected by high-speed network links. The goal of the present study is to investigate data-storage models and technologies in order to obtain a reliable and consistent framework to share data between the sites and to support both the central analysis (skimming, Monte Carlo) and the chaotic end-user analysis.
[Figure: map of the Italian sites; legend: 10 Gbps future links, 10 Gbps current links]

Technology Under Investigation
Exploiting dynamic data placement and automatic data distribution through the native features of distributed file systems:
- HDFS, the Hadoop Distributed File System, accessed through the FUSE module.
- GlusterFS, configured as a distributed file system with a replica policy.
The interest is to explore these technologies both in a Local Area Network farm and on a Wide Area Network testbed.

We tested the main characteristics of GlusterFS using SuperB analysis jobs. Tests on network usage show that GlusterFS can scale beyond 10 Gbit/s in a small/medium cluster of nodes. Analysis of the CPU usage of the Gluster daemon suggests dedicating one core per node in order not to drastically affect the performance in terms of jobs/s rate.

Testbed hardware (Hadoop cluster in Bari, GlusterFS cluster in Napoli):
- 12 PowerEdge R510 servers
- 2 quad-core Intel® Xeon CPUs per server
- 32 Gigabyte RAM, 1066 MHz
- 4 HDs in RAID0, 500 GB, 7200 rpm
- Broadcom NetXtreme I GbE NIC
- O.S.: Scientific Linux

[Figure: HDFS/GlusterFS network usage, 8 jobs on a single node, max rate 333 MB/s]
[Figure: network usage when all the nodes are involved, 7 jobs per node, max rate 333 MB/s]

The table below shows the time spent for the computation of 1, 2, 4, 8, 16 and 24 jobs on the same machine.

File System | 1 Proc. | 2 Proc. | 4 Proc. | 8 Proc. | 16 Proc. | 24 Proc.
GlusterFS   |         |         |         |         |          |

Conclusions and Next Steps
First tests done with GlusterFS and Hadoop showed interesting features for the SuperB purposes, in terms of functionality, scalability and reliability. Both file systems provide a multi-cluster configuration extensible over WAN. A first geographic implementation of HDFS was deployed between Napoli and Bari. The next step is to set up GlusterFS over WAN and perform stress tests in order to compare the two technologies.
The tests on HADOOP carried out at INFN-Bari are focused on understanding the capability of the Hadoop file system to cope with any kind of hardware and software failure. We have verified that each node (both the name node and the data nodes) can fail with little or no service interruption.
At INFN-Bari we are modifying the HADOOP replication policy in order to obtain a broader data distribution among the different racks within a farm, and to provide the capability of understanding the rack topology. This added feature could provide the capability of understanding the rack distribution among different farms or, in general, among different failure domains. At the end of this work it is foreseen that this policy will allow the file system to survive a complete farm failure.
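For context, a measurement of the kind summarized in the table above can be driven by a small script that launches a given number of identical analysis jobs against the mounted file system and records the wall-clock time. The sketch below is only illustrative: the analysis executable and the GlusterFS mount point are placeholders, not the actual SuperB analysis code.

#!/usr/bin/env python3
# Minimal driver for a concurrent-job measurement like the one in the table:
# it launches N copies of an analysis job reading from the GlusterFS mount
# and reports the total wall-clock time per batch.
# JOB_CMD and MOUNT_POINT are hypothetical placeholders.
import subprocess
import time

MOUNT_POINT = "/gluster/superb"                           # hypothetical GlusterFS mount
JOB_CMD = ["./superb_analysis", "--input", MOUNT_POINT]   # placeholder job command

def run_concurrent(n_jobs):
    """Start n_jobs identical processes and return the elapsed wall-clock time."""
    start = time.time()
    procs = [subprocess.Popen(JOB_CMD) for _ in range(n_jobs)]
    for p in procs:
        p.wait()
    return time.time() - start

if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16, 24):
        print("%2d jobs: %.1f s" % (n, run_concurrent(n)))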
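As an illustration of the rack awareness mentioned above, the sketch below shows a topology script of the kind Hadoop can be pointed at through its standard rack-awareness mechanism (the topology-script property of the site configuration): it maps data-node addresses to /<farm>/<rack> paths so that different farms appear as distinct failure domains to the block placement policy. The subnets, farm names and default rack are invented for illustration; the actual INFN-Bari work modifies the replication policy inside Hadoop itself rather than relying only on such a script.

#!/usr/bin/env python3
# Hypothetical Hadoop rack-topology script (illustrative sketch only).
# Hadoop calls it with one or more data-node IPs/hostnames as arguments and
# expects one network location per argument on stdout. Encoding the farm as
# the first path component lets a placement policy treat each farm (Napoli,
# Bari) as a separate failure domain. Subnets and names are invented.
import sys

SUBNET_TO_RACK = {
    "192.168.10.": "/napoli/rack1",
    "192.168.11.": "/napoli/rack2",
    "10.20.30.":   "/bari/rack1",
}
DEFAULT_RACK = "/default-rack"

def rack_for(host):
    for prefix, rack in SUBNET_TO_RACK.items():
        if host.startswith(prefix):
            return rack
    return DEFAULT_RACK

if __name__ == "__main__":
    # One rack path per requested host, in the same order as the arguments.
    print(" ".join(rack_for(h) for h in sys.argv[1:]))

With a two-level path such as /napoli/rack1, a placement policy that spreads replicas across distinct top-level components would keep at least one copy outside a failing farm, which is the behaviour the modified policy aims for.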
500 GB of SuperB simulated data have been replicated from Pisa to INFN-Napoli and INFN-Bari in order to test analysis jobs with HADOOP. The data can be shared with the INFN-Bari Hadoop rack through the distributed HDFS.
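One simple way to check that the replicated dataset is visible and carries the expected replication factor is to query the namenode through the WebHDFS REST interface, if it is enabled on the cluster. In the sketch below the namenode address, port and dataset path are placeholders, not the real testbed values.

#!/usr/bin/env python3
# Illustrative check of a replicated dataset through the WebHDFS REST API.
# NAMENODE and DATASET are hypothetical placeholders; WebHDFS must be
# enabled on the namenode for the LISTSTATUS call to work.
import json
from urllib.request import urlopen

NAMENODE = "http://namenode.example.infn.it:50070"  # hypothetical namenode address
DATASET = "/superb/simulated"                       # hypothetical HDFS path

def list_status(path):
    """Return the FileStatus entries for every object under `path`."""
    url = "{0}/webhdfs/v1{1}?op=LISTSTATUS".format(NAMENODE, path)
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))["FileStatuses"]["FileStatus"]

total_bytes = 0
for entry in list_status(DATASET):
    total_bytes += entry["length"]
    print("{0:<40} {1:>14d} bytes  replication={2}".format(
        entry["pathSuffix"], entry["length"], entry["replication"]))

print("Total: {0:.1f} GB".format(total_bytes / 1e9))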