The Virtual Data Toolkit distributed by the Open Science Grid
Richard Jones, University of Connecticut
CAT project meeting, June 24, 2008

the UConn Grendl cluster
- 62 dual-processor nodes
- mix of old and newer CPUs
- 7 TB of shared storage
- Condor job management
- heavy reliance on NFS
- home-built processing workflow package called "openShop"

the UConn Grendl cluster (continued)
- efficient MPI job scheduling using the Condor "parallel universe"
- large datasets staged on distributed Parallel Virtual File System (PVFS) volumes
  - high throughput
  - low cost: no dedicated file servers
  - reduced coupling between CPU and data location
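As an illustration of what the Condor "parallel universe" asks of the scheduler, the sketch below generates a submit description that co-allocates a fixed number of machines before the MPI ranks start. The wrapper script, executable, and file names are hypothetical and not taken from the Grendl configuration; real sites usually adapt one of the MPI wrapper scripts shipped with Condor.

```python
# Sketch: a parallel-universe submit description asking Condor to
# co-allocate 8 machines for one MPI job. "mpi_wrapper.sh" and the
# PWA executable name are hypothetical placeholders.
import textwrap

N_MACHINES = 8

parallel_job = textwrap.dedent(f"""\
    universe      = parallel
    executable    = mpi_wrapper.sh
    arguments     = pwa_fit input.dat
    machine_count = {N_MACHINES}
    output        = fit.$(Node).out
    error         = fit.$(Node).err
    log           = fit.log
    queue
""")

with open("pwa_fit.sub", "w") as f:
    f.write(parallel_job)
# Submitted with "condor_submit pwa_fit.sub"; the job waits until all
# 8 machines can be claimed at once.
```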

Obstacles to scaling
- NFS: servers x clients = N² problem
  - one server down hangs or drags down all N clients
  - starts to be a problem with 62 nodes
  - cross-site NFS is an admin nightmare!
- PVFS
  - one server down hangs the entire volume
  - poor recovery compared to NFS
  - invasive installation procedure
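The "servers x clients = N²" point can be made concrete with a little arithmetic: when every node cross-mounts every other node, one down server stalls all N clients, so the probability that the whole mesh is healthy shrinks rapidly with N. A back-of-the-envelope sketch, using an assumed per-node availability rather than any measured Grendl figure:

```python
# Sketch: probability that an all-to-all NFS mesh is fully usable,
# assuming independent node failures and a made-up 99.9% per-node
# availability (not a measurement).

def mesh_ok_probability(n_nodes: int, node_availability: float) -> float:
    """All n_nodes servers must be up for no client to hang."""
    return node_availability ** n_nodes

for n in (8, 32, 62):
    p = mesh_ok_probability(n, node_availability=0.999)
    print(f"{n:3d} nodes: P(no client hung on a dead server) = {p:.3f}")

# At 62 nodes even 99.9% per-node availability leaves the full mesh
# healthy only about 94% of the time.
```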

CAT project scaling
- data
  - large base-input datasets
    - non-volatile
    - non-replicated
  - relatively compact PWA event lists
    - volatile
    - replicated
  - complex workflow pattern
  - a global management scheme is needed
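Since base-input datasets and PWA event lists need opposite treatment (non-volatile and single-copy versus volatile and replicated), a global management scheme has to track at least a few attributes per dataset. A minimal sketch of that bookkeeping, with hypothetical dataset and site names; this is not the actual VDT or OSG catalog interface:

```python
# Sketch: per-dataset metadata a global management scheme would need.
# Dataset names, site names, and the two-copy policy are illustrative.
from dataclasses import dataclass, field

@dataclass
class Dataset:
    name: str
    volatile: bool                                # PWA event lists change between fits
    replicas: list = field(default_factory=list)  # sites holding a copy

    def needs_replication(self, min_copies: int = 2) -> bool:
        # Volatile event lists should keep several live copies;
        # large base inputs may deliberately stay single-copy.
        return self.volatile and len(self.replicas) < min_copies

catalog = [
    Dataset("base_input_2008", volatile=False, replicas=["UConn"]),
    Dataset("pwa_event_list_v3", volatile=True, replicas=["UConn"]),
]

for ds in catalog:
    if ds.needs_replication():
        print(f"{ds.name}: schedule another replica")
```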

CAT project scaling (continued)
- processor
  - co-scheduling of CPU resource allocation in clusters
    - network latency
    - allocation persistence
  - not tied to client location
  - access independent of local userid
  - global resource monitoring required
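Put together, these requirements boil down to asking each cluster whether it can co-allocate enough CPUs, for long enough, at acceptable latency, regardless of where the client sits. The toy selection function below illustrates that check; all site names and figures are invented, and a real system would pull them from a global monitoring service rather than a hard-coded list:

```python
# Sketch: choose a site that can co-schedule a job under three
# constraints (free CPUs, allocation lifetime, latency to the data).
# Site names and numbers are invented for illustration only.
from dataclasses import dataclass

@dataclass
class SiteStatus:
    name: str
    free_cpus: int
    latency_ms: float         # network latency between CPUs and data
    max_alloc_hours: float    # how long an allocation persists

def pick_site(sites, cpus_needed, hours_needed, max_latency_ms):
    candidates = [
        s for s in sites
        if s.free_cpus >= cpus_needed
        and s.max_alloc_hours >= hours_needed
        and s.latency_ms <= max_latency_ms
    ]
    # Among qualifying sites, prefer the one closest to the data.
    return min(candidates, key=lambda s: s.latency_ms, default=None)

sites = [
    SiteStatus("UConn", free_cpus=40, latency_ms=1.0, max_alloc_hours=48.0),
    SiteStatus("RemoteSite", free_cpus=200, latency_ms=35.0, max_alloc_hours=12.0),
]

best = pick_site(sites, cpus_needed=32, hours_needed=24, max_latency_ms=50)
print(best.name if best else "no site can co-schedule this job")
```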