Performance of CMAQ on a Mac OS X System. Tracey Holloway, John Bachan, Scott Spak. Center for Sustainability and the Global Environment, University of Wisconsin-Madison. A presentation to the 3rd annual CMAS Models-3 conference, October 19, 2004.

Thinking different. Motivation, Methods, Performance, Hardware, Release, Ongoing Improvements.

Motivations. Simplified operation, easier development, easy clustering, improved performance.

Motivation: Operation. Single platform for all research and academic computing; user-friendly interface; UNIX OS; open source software, hardware support; today's cluster node = tomorrow's desktop.

Motivation: Development. Better developer tools: Xcode (Interface Builder), CHUD performance & debugging suite. Distribution tools: standardized profiles, PackageMaker, FAT binaries, automated installation.

Operation & Development.

Motivation: Performance. Unique hardware advantages: powerful PPC 970 vector chip, auto-vectorizing compilers, 2000 NASA Langley report. Populist parallelization: mix dedicated cluster nodes with free cycles on personal & lab machines, off-the-shelf solutions, simple GUI and command-line tools.
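The vector-chip advantage comes from the compiler recognizing simple, independent, unit-stride loops and mapping them onto the PPC 970's SIMD unit. Below is a minimal sketch in C of the kind of loop an auto-vectorizing compiler handles well; it is illustrative only and not taken from CMAQ:

    /* Illustrative only (not CMAQ source): an element-wise loop of the
     * kind an auto-vectorizing compiler can map onto the PPC 970's
     * vector unit. Unit stride, independent iterations, no aliasing. */
    #include <stddef.h>

    void scale_add(size_t n, float a, const float *restrict x, float *restrict y)
    {
        for (size_t i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

Inner loops over grid cells in chemistry and transport code often have this shape, which is what makes vector hardware plus an auto-vectorizing compiler attractive.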

Methods. IBM XL Fortran v8.1 compiler: auto-vectorization, equivalent to AIX. Modifications: flag conversion, build settings, array passing. More than 400 man-hours.

Performance. 2 test machines: a dual 2 GHz G5 (5 GB RAM, 1 GHz bus) and a stock dual 1 GHz G4 (1.5 GB RAM, 133 MHz bus), both running Mac OS X. Test run: the first day of the CMAQ 4.3 tutorial (1 day, 32 km x 32 km, 38 x 38, 6 layers, default EBI CB4 chemistry).

Benchmarks. Tutorial runtime by hardware and compiler, in seconds (shown as a chart). IFC = Intel Fortran Compiler 7.1; PGF = Portland Group compiler. Intel machines ran CMAQ 4.22 on 2 processors with mpich parallelization (source: Gail Tonnesen, “Benchmarks for CPUs and Compilers for the CMAQ release”). Macs ran CMAQ 4.3 on 1 processor (XLF) or 2 processors (XLF SMP) with OpenMP parallelization.
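For the XLF SMP runs above, OpenMP shared-memory parallelization splits loop iterations across the processors of a single machine, in contrast to the MPI (mpich) domain decomposition used on the Intel machines. A minimal, hypothetical C sketch of loop-level OpenMP; the array names and sizes are made up and this is not CMAQ source:

    #include <omp.h>
    #include <stdio.h>

    #define NCELLS (38 * 38 * 6)   /* roughly the tutorial domain: 38 x 38 columns, 6 layers */

    int main(void)
    {
        static double conc[NCELLS], rate[NCELLS];

        /* Iterations are independent, so OpenMP can spread them across
         * the two processors of a dual G5 (or however many are present). */
        #pragma omp parallel for
        for (int i = 0; i < NCELLS; i++)
            conc[i] += rate[i];      /* toy per-cell update, not real chemistry */

        printf("updated %d cells using up to %d threads\n",
               NCELLS, omp_get_max_threads());
        return 0;
    }

With IBM's XL compilers, OpenMP directives like this are typically enabled with the -qsmp=omp option.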

Chemistry. Differences from the reference output, by species:

    Species        Mean abs. difference   Max abs. difference   % of cells > 1 ppb
    O3             … ppb                  4.52 ppb              0.43
    NO             … ppb                  0.72 ppb              0
    NO             … ppb                  2.05 ppb              0.02
    NH             … ppb                  1.67 ppb              0.0002
    SO4 (I + J)    … µg/m3                … µg/m3

Source: ACONC.nc output from Day 1 of the CMAQ 4.3 tutorial; dual 2 GHz G5 running CMAQ 4.3 on 1 processor.
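The three comparison metrics in this table (mean absolute difference, maximum absolute difference, and the percentage of cells differing by more than 1 ppb) reduce to one pass over paired concentration fields. A hedged, illustrative sketch in C, assuming one species' concentrations from both runs have already been read into flat arrays of equal length:

    /* Illustrative sketch: compute mean |diff|, max |diff|, and the share of
     * cells differing by more than 1 ppb between a test run and a reference.
     * Assumes both fields are already in memory as flat arrays (same order). */
    #include <math.h>
    #include <stdio.h>
    #include <stddef.h>

    void diff_stats(const float *test, const float *ref, size_t ncells)
    {
        double sum = 0.0, maxd = 0.0;
        size_t over1 = 0;

        for (size_t i = 0; i < ncells; i++) {
            double d = fabs((double)test[i] - (double)ref[i]);
            sum += d;
            if (d > maxd) maxd = d;
            if (d > 1.0)  over1++;                 /* cells differing by > 1 ppb */
        }

        printf("mean |diff| = %.4f ppb, max |diff| = %.2f ppb, cells > 1 ppb = %.4f%%\n",
               sum / (double)ncells, maxd, 100.0 * (double)over1 / (double)ncells);
    }

The same arithmetic applies to the aerosol sulfate row, with µg/m3 in place of ppb.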

Good Chemistry. The differences from the reference set are small, though greater than the differences among Intel machines and compilers. They behave like noise from floating-point calculations and initialization: greatest at the surface level and early in the run, present only at ambient concentrations, randomly distributed with no bias, not propagating in time or space, and not correlated with high or low concentrations. G4 and G5 results are consistent: same chemistry modules, same compiler flags.

Better Chemistry. Tutorial runtime by chemistry module, in seconds (shown as a chart). Dual 2 GHz G5 running CMAQ 4.3 on 1 processor.

Models-3 on Mac, 10/04. Core platform: MM5 (Fovell), MCIP v2.2, SMOKE v2.1, CMAQ v4.3. Libraries & add-ons: netCDF v3.5.1, mpich v…, I/O API v2.2, MCPL. Currently no PAVE, but Vis5d, VisAD, GrADS, NCL, and …
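Because the components in this stack exchange data through netCDF (and I/O API) files, a quick sanity check of a new build is simply opening a model output file with the netCDF library listed above. A minimal, hedged sketch using the standard netCDF 3 C API; the variable name "O3" is an example, not a guaranteed CMAQ convention:

    /* Minimal check that a model output file (e.g. the tutorial's ACONC.nc)
     * is readable through the netCDF C library. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <netcdf.h>

    int main(void)
    {
        int ncid, varid, status;

        status = nc_open("ACONC.nc", NC_NOWRITE, &ncid);
        if (status != NC_NOERR) {
            fprintf(stderr, "open failed: %s\n", nc_strerror(status));
            return EXIT_FAILURE;
        }

        status = nc_inq_varid(ncid, "O3", &varid);   /* look up one species by name */
        if (status != NC_NOERR)
            fprintf(stderr, "no O3 variable: %s\n", nc_strerror(status));
        else
            printf("found O3 variable (id %d)\n", varid);

        nc_close(ncid);
        return 0;
    }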

Hardware.

Dedicated cluster: Xserve G5 (dual 2 GHz, 2 GB RAM), Xserve RAID (3.5 TB), and 8 Power Mac G5s (dual 2 GHz, 5 GB RAM). Distributed capacity: student lab eMacs, personal G4 desktops. In total: a 60-processor vector cluster (18 G5 processors, 42 G4 processors) and 0 full-time sysadmins.

Cost Competitive. Apple: Xserve dual G5 2 GHz under $3,500; RAID storage at $3 per GB; G5 desktop, $… Compare to: Dell PowerVault RAID at $5 per GB; Dell Precision dual Xeon 2.8 GHz, $…; sysadmin costs.

John & Scott.

Release. Following input from the CMAS Center: alpha code to CMAS by November 2004; CMAS testing; potential support. Following CMAS testing: preliminary code, scripts, binaries, and instructions available for download at … Scott Spak will answer questions for early users: …

Ongoing improvements. Our planned activities: g95 (GNU compilation); parallel implementations (Condor, Xgrid, Pooch/Appleseed); further optimization; dual 2.5 GHz benchmarks; CMAQ-MADRID. A community effort? CMAQ Unified, MIMS, PAVE.

Acknowledgements. Mary Sternitzky, UW; Seth Price, UW; Hans Vahlenkamp and NOAA GFDL; Zac Adelman and the CMAS Help Desk; Dr. Gail Tonnesen and Glen Kaukola, UCR; the Models-3 Listserv. All funding provided by the University of Wisconsin-Madison.