
Collaborative Analysis Toolkit for Partial Wave Analysis
Reported by Richard Jones
GlueX collaboration meeting, Newport News, May 13, 2009

Overview

GOAL: Develop a framework for amplitude analysis that
 is modular and independent of experiment,
 scales to very large data sets,
 accommodates increased computational demands,
 allows amplitudes to be written by the physicist, and
 encourages a closer theory-experiment collaboration.

[GRID diagram courtesy of Ryan Mitchell, Munich PWA workshop]

Overview

COLLABORATION:
 Funded by the National Science Foundation (NSF) Physics at the Information Frontier (PIF) program
 Indiana University: R. Mitchell, M. Shepherd, A. Szczepaniak
 Carnegie Mellon University: C. Meyer, M. Williams
 University of Connecticut: R. Jones, J. Yang
 Plus: 2 more postdocs

courtesy of Ryan Mitchell, Munich PWA workshop

Overview: work flow

1. Data event generation and description (4-vector files)
2. Amplitude generation (amp files)
3. Kinematic variable generation (kinvar files)
4. Normalization integral generation (normalization files)
5. Build a fit (rules, methods)
6. Run a fit

The 4-vector files are kept in 3 directories: data, acc, raw.

[Five slides of figures courtesy of Ryan Mitchell, Munich PWA workshop]

Ongoing work on the 3π system: Peng et al.

[Three slides of figures courtesy of Hrayr Matevosyan, Indiana University]

Ongoing work at Connecticut

1. Porting the Ruby-PWA toolkit to the grid
   1.1 Install the Ruby-PWA software bundle on the UConn site
   1.2 Benchmark and scaling studies
   1.3 Parallel Ruby-PWA
2. Integration of grid storage into the Ruby-PWA framework
   2.1 Upgrade to the latest version of Red Hat Linux
   2.2 Monitoring data integrity of dCache pools
   2.3 Grid data storage platform benchmarks
3. Extending the UConn cluster for use as a grid testbed
   3.1 Installation
   3.2 Tuning

1.1 Install the Ruby-PWA software bundle on the UConn site

◦ Ruby-PWA
 A flexible framework for performing partial wave analysis fits
 Written in Ruby
 C++ classes wrapped as Ruby objects (see the sketch below)
◦ Installation
 Prerequisite: ruby-minuit
 Updated version of the crenlib root package
 A number of build errors had to be investigated and solved
 Ran the Ruby-PWA test fits
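For orientation, here is a minimal sketch of how a native routine is exposed as a Ruby object through the standard Ruby C API, the same mechanism Ruby-PWA uses for its C++ classes. The class, method, and routine names are hypothetical, not taken from Ruby-PWA itself.

    /* pwa_demo.c: build as a Ruby extension with the Ruby headers installed. */
    #include <ruby.h>

    /* Hypothetical native routine standing in for a wrapped C++ method. */
    static double native_log_likelihood(double par)
    {
        return -0.5 * par * par;   /* placeholder computation */
    }

    /* Ruby-callable wrapper: converts Ruby numerics to C and back. */
    static VALUE rb_log_likelihood(VALUE self, VALUE par)
    {
        return rb_float_new(native_log_likelihood(NUM2DBL(par)));
    }

    /* Entry point executed by `require "pwa_demo"`. */
    void Init_pwa_demo(void)
    {
        VALUE cFit = rb_define_class("PwaFit", rb_cObject);
        rb_define_method(cFit, "log_likelihood", rb_log_likelihood, 1);
    }

From Ruby, the wrapped method is then called like any other object method: PwaFit.new.log_likelihood(1.5).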

1.2 Benchmark and scaling studies

◦ Test data bundle from CMU
◦ Original test data: 2188 simulated data events
◦ Created test samples of 10x, 20x, 30x, and 50x the original size by duplicating the original test data
◦ More extensive scaling studies with statistically independent data will be carried out once the code is ported to a parallel platform

1.3 Parallel Ruby-PWA

◦ New features in MPI 2.0
 Dynamic process management
 "the ability of an MPI process to participate in the creation of new MPI processes or to establish communication with MPI processes that have been started separately." (from the MPI-2 specification)
 One-sided communication
 Three one-sided communication operations, Put, Get, and Accumulate: a write to remote memory, a read from remote memory, and a reduction operation on the same memory across a number of tasks
 Three different methods for synchronizing this communication: global, pairwise, and remote locks
 Collective extensions
 In MPI-2, most collective operations also apply to intercommunicators
 (A small one-sided example is sketched below.)
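A minimal sketch of MPI-2 one-sided communication with global (fence) synchronization: every rank deposits its rank number into a window exposed by rank 0. The buffer layout is illustrative only.

    /* one_sided.c: compile with mpicc, run with mpirun. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int table[64] = {0};   /* window memory on rank 0; assumes <= 64 ranks */
        MPI_Win win;
        MPI_Win_create(table, (rank == 0 ? size : 0) * sizeof(int),
                       sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);                /* open the access epoch  */
        MPI_Put(&rank, 1, MPI_INT, 0, rank, 1, MPI_INT, win);
        MPI_Win_fence(0, win);                /* close the access epoch */

        if (rank == 0) {
            int i;
            for (i = 0; i < size; i++)
                printf("slot %d = %d\n", i, table[i]);
        }

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }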

Parallel Ruby-PWA

◦ Comparison of Open MPI 1.3 and MPICH2
 Open MPI 1.3
 Open MPI is an open-source, production-quality MPI-2 implementation
 Developed and maintained by a consortium of academic, research, and industry partners
 The Open MPI design is centered around component concepts
 Network connection devices: shared memory, TCP, Myrinet, and InfiniBand
 Network connection devices are selected dynamically at run time (see the example below)
 We have tested this feature of Open MPI, along with other basic MPI functions, and obtained good performance
 We are now incorporating Open MPI into Condor
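The transport components can also be pinned explicitly on the mpirun command line, which is convenient when benchmarking one device at a time. The flags below are standard Open MPI MCA options of that era; the executable name is hypothetical.

    mpirun --mca btl tcp,self -np 8 ./pwa_fit    # force TCP sockets
    mpirun --mca btl sm,self  -np 8 ./pwa_fit    # force shared memory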

Parallel Ruby-PWA

◦ Comparison of Open MPI 1.3 and MPICH2 (cont.)
 MPICH2
 A high-performance implementation of MPI-1 and MPI-2 functionality
 MPICH2 separates communication from process management
 MPICH2 channel devices, such as sock, ssm, shm, etc., are specified at installation time
 We selected Open MPI for future development of Ruby-PWA because of its
 advanced component architecture,
 dynamic communication features, and
 broad support from academic and industry partners

Parallel Ruby-PWA

◦ Hybrid programming model: OpenMP + MPI
 OpenMP defines an API for writing multithreaded applications that run specifically on shared-memory architectures
 It greatly simplifies writing multithreaded programs in Fortran, C, and C++
 gcc on Linux supports OpenMP (from version 4.2 onward)
 The hybrid programming model, OpenMP + MPI, combines the shared-memory and distributed-memory programming models (sketched below)
 In our tests, the OpenMP implementation ran very efficiently on up to 8 threads
 The Open MPI implementation ran with essentially 100% scaling, provided that all of the communication channels were TCP/IP sockets
 Open MPI tests using a mixture of shared-memory and TCP/IP communication channels showed markedly lower performance; we are still investigating the cause
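A minimal sketch of the hybrid model: MPI distributes an event sum across ranks, and OpenMP threads share each rank's portion. The per-event likelihood function is a hypothetical stand-in for the Ruby-PWA amplitude code.

    /* hybrid.c: compile with mpicc -fopenmp. */
    #include <mpi.h>
    #include <stdio.h>

    static double event_log_likelihood(int i)   /* hypothetical stand-in */
    {
        return 1.0 / (double)(i + 1);
    }

    int main(int argc, char **argv)
    {
        int provided, rank, size, i;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int nevents = 1000000;
        double local = 0.0, total = 0.0;

        /* Each rank sums its stride of events with an OpenMP thread team. */
        #pragma omp parallel for reduction(+:local)
        for (i = rank; i < nevents; i += size)
            local += event_log_likelihood(i);

        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("-log L = %.6f\n", -total);

        MPI_Finalize();
        return 0;
    }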

Parallel Ruby-PWA

◦ GPU computing
 From graphics processing to parallel computing
 The GPU (Graphics Processing Unit) has evolved into a highly parallel, multithreaded, many-core processor with tremendous computational horsepower and very high memory bandwidth
 In November 2006, NVIDIA introduced CUDA™, a general-purpose parallel computing architecture with a new parallel programming model and instruction set architecture
 The initial CUDA SDK was made public on 15 February 2007; NVIDIA has released versions of the CUDA API for Microsoft Windows, Linux, and Mac OS X
 Our test platform
 GeForce 9800 GT (14 multiprocessors, 112 cores, 512 MB memory)
 Intel(R) Core(TM)2 CPU, 2.40 GHz
 8 GB memory

Parallel Ruby-PWA

◦ GPU computing (cont.)
 Benchmark: matrix multiplication (a host-side sketch follows below)
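For reference, a matrix product like the one benchmarked here is typically driven from host C code through NVIDIA's cuBLAS library. A minimal sketch follows, written against the current cuBLAS v2 API rather than the 2009-era one; the matrix size is hypothetical and error checking is omitted.

    /* matmul.c: C = A * B on the GPU via cuBLAS; compile with nvcc or
     * a host compiler linked against -lcublas -lcudart. */
    #include <cublas_v2.h>
    #include <cuda_runtime.h>
    #include <stdlib.h>

    int main(void)
    {
        const int n = 1024;                       /* hypothetical size */
        size_t bytes = (size_t)n * n * sizeof(float);
        float *hA = malloc(bytes), *hB = malloc(bytes), *hC = malloc(bytes);
        int i;
        for (i = 0; i < n * n; i++) { hA[i] = 1.0f; hB[i] = 2.0f; }

        float *dA, *dB, *dC;
        cudaMalloc((void**)&dA, bytes);
        cudaMalloc((void**)&dB, bytes);
        cudaMalloc((void**)&dC, bytes);
        cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

        cublasHandle_t handle;
        cublasCreate(&handle);
        const float alpha = 1.0f, beta = 0.0f;
        /* C = alpha * A * B + beta * C (column-major storage) */
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                    &alpha, dA, n, dB, n, &beta, dC, n);
        cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

        cublasDestroy(handle);
        cudaFree(dA); cudaFree(dB); cudaFree(dC);
        free(hA); free(hB); free(hC);
        return 0;
    }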

Parallel Ruby-PWA

◦ GPU computing (cont.)
 Constraints of GPU computing
 A GPU can only be used by one process at a time; it cannot be shared with other processes
 Developers must structure their programs carefully to parallelize their applications and hide the latency of global memory accesses
 CUDA uses a recursion-free, function-pointer-free subset of the C language
 The programming skills required for CUDA are new to developers, so the learning curve may be steep
 In some applications it is difficult for CUDA to achieve performance as high as it does for matrix multiplication
 Challenges
 How to incorporate GPU computing into the parallel Ruby-PWA
 How to discover, share, and manage GPU resources on the grid
 How to hide the complexity of GPU computing from developers

2.1 Upgrade to the latest version of Red Hat Linux

◦ Upgrade operating systems to the latest version of Red Hat Linux (the CentOS 5 distribution)
◦ Upgrade the Linux kernel
◦ The upgrades have been deployed to more than 80 nodes at the UConn site

2.2 Monitoring data integrity of dCache pools

◦ dCache
 Single rooted file-system namespace tree
 Supports multiple internal and external copies of a single file
 Data may be distributed among a large number of disk servers
 Automatic load balancing by cost metric and inter-pool transfers
 Distributed access points (doors)
 Automatic HSM migration and restore
 Widely used at Tier-2 and Tier-3 centers in the LHC data grid
◦ We have adopted dCache as the network data storage infrastructure at the UConn site
◦ We developed scripts to monitor the integrity of all files stored in the dCache pools (a sketch of the check is given below)
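A minimal sketch of the kind of integrity check such a script performs: recompute a file's Adler32 checksum (one of the checksum types dCache records for pool files) and compare it with the expected value. Here the expected value is taken from the command line; in practice it would come from the pool's metadata. The file names are hypothetical.

    /* poolcheck.c: compile with -lz (uses zlib's adler32). */
    #include <stdio.h>
    #include <stdlib.h>
    #include <zlib.h>

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s <poolfile> <expected-adler32-hex>\n", argv[0]);
            return 2;
        }
        FILE *fp = fopen(argv[1], "rb");
        if (!fp) { perror("fopen"); return 2; }

        unsigned char buf[1 << 16];
        uLong sum = adler32(0L, Z_NULL, 0);     /* initial checksum value */
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
            sum = adler32(sum, buf, (uInt)n);
        fclose(fp);

        uLong expected = strtoul(argv[2], NULL, 16);
        printf("%s: computed %08lx, expected %08lx -> %s\n",
               argv[1], sum, expected, sum == expected ? "OK" : "CORRUPT");
        return sum == expected ? 0 : 1;
    }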

2.3 Grid data storage platform benchmarks

◦ Mechanisms for reading data stored in dCache pools
 Methods similar to Unix streams: open, read blocks of data, and close
 File transfer programs such as rft
◦ Protocols
 dCap: the dCap API provides an easy way to access dCache files as if they were traditional Unix stream files
 gsidCap: the secure version of dCap
 GridFTP: the standard protocol for transferring files on the grid platform
◦ Performance test (sketched below)
 Open a file, read one block of data, and close the file
 600 files tested
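A sketch of the timed access pattern using the dCap client library (libdcap), whose dc_open/dc_read/dc_close calls mirror their POSIX counterparts. The pool URL and block size are hypothetical.

    /* dcap_bench.c: compile and link against libdcap. */
    #include <dcap.h>          /* dc_open, dc_read, dc_close */
    #include <stdio.h>
    #include <sys/time.h>
    #include <fcntl.h>

    int main(void)
    {
        /* hypothetical dCap URL of one test file */
        const char *url = "dcap://dcache.example.edu:22125/pnfs/test/file0001";
        char block[65536];
        struct timeval t0, t1;

        gettimeofday(&t0, NULL);
        int fd = dc_open(url, O_RDONLY);
        if (fd < 0) { perror("dc_open"); return 1; }
        ssize_t n = dc_read(fd, block, sizeof block);
        dc_close(fd);
        gettimeofday(&t1, NULL);

        double ms = (t1.tv_sec - t0.tv_sec) * 1e3
                  + (t1.tv_usec - t0.tv_usec) / 1e3;
        printf("read %zd bytes in %.1f ms\n", n, ms);
        return 0;
    }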

Grid data storage platform benchmarks: results

◦ dCap: avg. 690 ms to access one block of a file
◦ gsidCap: avg. 750 ms to access one block of a file
◦ GridFTP: avg. 1700 ms to access one block of a file

3.1 Installation

◦ A new cluster has been added to the UConn physics grid platform
 32 Dell PowerEdge 1435 servers
 Quad-core processors per node
 8 GB memory per node
 One 320 GB hard disk per node
 Two gigabit network interfaces per node

3.2 Tuning

◦ To reduce the cost of maintenance, we use an NFS root file system
◦ Tuned TCP settings to make the new cluster stable
