Nek5000 preliminary discussion for petaflops apps project.

General facts about nek5000
- Research variant of a commercial code developed by Fischer, Ho, and Ronquist in the late 1980s, subsequently modified by Fischer and Tufo.
- Solves the incompressible Navier-Stokes equations using the spectral element method.
- Used by several external research groups (Duke, Brown).

nek5000 language issues
- About 80,000 lines of mostly pure, old f77 with a little C.
- C is called from Fortran with decent portability strategies.
- Shell scripts are provided to simplify job management; these are mostly Jazz-specific.

nek5000 portability issues
- Has been run on a wide range of architectures: Power3, Pentiums, Alpha, SGI, etc.
- Focus is on the PGF compiler, but portability looks pretty good: I ran on SGI and Fujitsu-i386 pretty easily, as well as on Jazz with PGF.

Portability, cont.
- Relies on hacking a somewhat generalized makefile.
- No configure script and no pre-existing machine-specific makefiles.
- Compiler must be able to promote real to double precision.
- Some non-standard f77 (e.g., common blocks resized).

Software process
- No repository; thus no versioning, no release schedule, no bug tracking, etc.
- Test problems exist, but no auto-verification-type test suite.
- Good quick how-to guide, but very light on documentation.
- Not directly downloadable from, e.g., a web server.

Performance
- Exhaustively studied and optimized.
- Gordon Bell Prize winner.
- Serial part:
  - Dominated by matrix-matrix products with smallish vector lengths.
  - Homemade routine makes much better use of cache and does much better than BLAS, with very high flop rates on non-vector machines.
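The slides do not show the homemade routine, so the following is only a minimal sketch of the kind of kernel being described: a dense product of small matrices stored column by column, as Fortran would store them. The name `mxm` and its argument order are assumptions made for illustration, and the loop order is simply one that keeps the innermost sweep at unit stride.

```python
def mxm(a, n1, b, n2, n3):
    """Return c = a * b for column-major flat lists: a is n1 x n2, b is n2 x n3."""
    c = [0.0] * (n1 * n3)
    for j in range(n3):              # column of b and of c
        for k in range(n2):          # short inner dimension
            bkj = b[k + n2 * j]
            for i in range(n1):      # unit-stride sweep down a column of a and c
                c[i + n1 * j] += a[i + n1 * k] * bkj
    return c


# A 2 x 3 matrix times a 3 x 2 matrix, both stored column by column.
a = [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]       # rows: [1, 2, 3] and [4, 5, 6]
b = [7.0, 9.0, 11.0, 8.0, 10.0, 12.0]    # rows: [7, 8], [9, 10], [11, 12]
c = mxm(a, 2, b, 3, 2)                   # rows: [58, 64] and [139, 154]
```

With matrices this small, the whole working set fits in cache, which is presumably why a routine specialized for short inner dimensions can beat a general-purpose library call.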

Performance, cont.
- Communication patterns:
  - Nearest neighbor (~10%)
  - Vector reduction (~10%)
  - Coarse grid solve (small)
- Not communication bound (yet).
- Has scaled nicely to 1000s of processors on ASCI Red and Seaborg (SP3).
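To make the nearest-neighbor pattern concrete, here is a small sketch that simulates it in a single process. It assumes a 1D chain of ranks in which the last node of one rank coincides with the first node of the next, and it sums the two copies of each shared node. The function name `exchange_and_sum` and the data layout are hypothetical; the slides do not describe how nek5000 implements this exchange.

```python
def exchange_and_sum(local):
    """Sum values on nodes shared by neighboring ranks in a 1D chain.

    local[r] holds the nodal values of simulated rank r (at least two nodes);
    its last node coincides with the first node of rank r + 1.
    """
    # Work on a copy so every rank reads its neighbor's pre-exchange value.
    result = [vals[:] for vals in local]
    for r in range(len(local) - 1):
        shared = local[r][-1] + local[r + 1][0]
        result[r][-1] = shared
        result[r + 1][0] = shared
    return result


parts = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0], [100.0, 200.0, 300.0]]
summed = exchange_and_sum(parts)
```

Each rank talks only to its immediate neighbors, so the cost per exchange does not grow with the total number of ranks, only with the amount of shared data.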

Performance issues
Outstanding performance questions:
- Serial:
  - Efficient use of cache for different parameter regimes (different vector sizes).
  - How will it perform on new vector hardware?
  - No "spike" in the performance histogram, so it is hard to optimize further.
- Parallel:
  - Nearest-neighbor communication could become a bottleneck for slow-converging Helmholtz solves.
  - Scaling at 100,000 processors possibly depends on an improved vector reduction implementation.
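One way to see why the reduction matters at very large processor counts is to count communication rounds. The sketch below simulates a recursive-doubling all-reduce in a single process, summing one scalar per rank for brevity. It is an illustration of a standard technique, not the reduction nek5000 actually uses, and the name `tree_allreduce` is an assumption.

```python
def tree_allreduce(values):
    """Sum one value per simulated rank with recursive doubling.

    Returns (per-rank results, number of communication rounds).
    Assumes the number of ranks is a power of two.
    """
    p = len(values)
    assert p > 0 and p & (p - 1) == 0, "rank count must be a power of two"
    vals = list(values)
    rounds = 0
    stride = 1
    while stride < p:
        # Rank r pairs with rank r XOR stride; both keep the combined sum.
        vals = [vals[r] + vals[r ^ stride] for r in range(p)]
        stride *= 2
        rounds += 1
    return vals, rounds


results, rounds = tree_allreduce([float(r) for r in range(1024)])
```

For 1024 ranks this takes 10 rounds, and for 131,072 ranks (the nearest power of two above 100,000) it takes 17. The round count grows only logarithmically, but every round pays a message latency, which is why the reduction implementation becomes important at that scale.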

What am I doing now?
- Software process:
  - Creating a CVS repository.
  - Establishing a license agreement.
  - Creating a simple web page with info and releases.
  - Creating some self-testing scripts.
  - Convincing Paul to add some documentation.
  - Posting a page of benchmarks.
  - Creating a release script.
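A self-testing script of the kind mentioned above could be as simple as comparing a few numbers from a run against stored reference values. The sketch below assumes that results are available as name/value pairs; the quantity names, the tolerance, and the function `check_against_reference` are all hypothetical and do not correspond to any existing nek5000 output.

```python
def check_against_reference(computed, reference, rel_tol=1e-8):
    """Return names of quantities that are missing or disagree with the reference."""
    failures = []
    for name, expected in reference.items():
        actual = computed.get(name)
        if actual is None:
            failures.append(name)
            continue
        scale = max(abs(expected), 1e-300)   # avoid dividing by zero
        if abs(actual - expected) / scale > rel_tol:
            failures.append(name)
    return failures


# Hypothetical reference values and two hypothetical runs.
reference = {"max_velocity": 1.2345678, "pressure_iters": 42.0}
good_run = {"max_velocity": 1.2345678000001, "pressure_iters": 42.0}
bad_run = {"max_velocity": 1.25, "pressure_iters": 42.0}

good_failures = check_against_reference(good_run, reference)
bad_failures = check_against_reference(bad_run, reference)
```

A wrapper script could run each test problem, call a check like this, and exit with a nonzero status when the failure list is not empty.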

What am I doing, cont.
- Performance:
  - Collecting some of my own numbers.
    - PAPI is installed locally but is needed on Jazz!
    - PGF tools to access hardware counters?
    - Adding some superior instrumentation techniques to the code to make this easier in the future.
- Petaflops apps meetings:
  - Posting minutes/notes from each meeting on the local web site.
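The instrumentation idea can be sketched without hardware counters at all: wrap each region of interest in a timer that accumulates wall-clock time and call counts. This is only an illustration of the approach using the Python standard library; the region name "mxm" and the helper `region` are assumptions, and a real version in the Fortran code would look different.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)   # accumulated seconds per region
calls = defaultdict(int)       # number of times each region was entered


@contextmanager
def region(name):
    """Accumulate wall-clock time and a call count for a named code region."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] += time.perf_counter() - start
        calls[name] += 1


# Two passes through the same hypothetical region.
with region("mxm"):
    total = sum(i * i for i in range(10000))
with region("mxm"):
    total += 1
```

After a run, `timings` and `calls` give a per-region summary that can be printed or written to a file, so collecting numbers does not require editing the code each time.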