N Tropy: A Framework for Analyzing Massive Astrophysical Datasets
Harnessing the Power of Parallel Grid Resources for Astrophysical Data Analysis
Jeffrey P. Gardner, Andrew Connolly, Cameron McBride
Pittsburgh Supercomputing Center, University of Pittsburgh, Carnegie Mellon University

How to turn observational data into scientific knowledge:
Step 1: Collect data.
Step 2: Analyze the data on a workstation.
Step 3: Extract meaningful scientific knowledge (happy astronomer).

The Era of Sky Surveys
Paradigm shift in astronomy: sky surveys. The available data is growing at a much faster rate than computational power.

Mining the Universe can be (Computationally) Expensive
In the future, there will be many problems that are impossible to solve without multiprocessor resources, and many more for which throughput can be substantially enhanced by them.

Mining the Universe can be (Computationally) Expensive
In 5 years, even your workstation may have 80 cores! Intel already ships 4-core CPUs.

Good News for “Data Parallel” Operations
Data Parallel (or “Embarrassingly Parallel”) example: 1,000,000 QSO spectra. Each spectrum takes ~1 hour to reduce, and each spectrum is computationally independent of the others. There are many workflow management tools that will distribute such computations across many machines.
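To make the contrast with the tightly coupled case concrete, here is a minimal sketch of how such an embarrassingly parallel reduction could be farmed out with plain MPI. It is illustrative only: reduce_spectrum() and the file-naming scheme are placeholders, and in practice a workflow manager would handle the distribution.

```c
/* Minimal sketch of an embarrassingly parallel reduction with MPI.
 * reduce_spectrum() and the file-naming scheme are placeholders. */
#include <mpi.h>
#include <stdio.h>

#define N_SPECTRA 1000000

void reduce_spectrum(const char *filename) {
    /* Placeholder: read one QSO spectrum and reduce it (~1 hour of work). */
}

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each process handles its own disjoint subset of spectra;
     * no communication is needed because the tasks are independent. */
    for (int i = rank; i < N_SPECTRA; i += size) {
        char filename[64];
        snprintf(filename, sizeof(filename), "spectrum_%07d.fits", i);
        reduce_spectrum(filename);
    }

    MPI_Finalize();
    return 0;
}
```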

Tightly-Coupled Parallelism (what this talk is about)
Data and computational domains overlap, so computational elements must communicate with one another. Examples: group finding, N-point correlation functions, new object classification, density estimation.
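As an illustration of why these operations are tightly coupled (this is not N tropy code), consider a naive two-point pair count: every particle must be compared against every other particle, so once the particles are distributed across processors, each processor inevitably needs data owned by the others.

```c
/* Illustrative only: a naive two-point pair count. When the particle array
 * is split across processors, pairs straddle the split, so the computation
 * cannot proceed without inter-processor communication. */
#include <math.h>
#include <stddef.h>

typedef struct { double x, y, z; } Particle;

size_t count_pairs(const Particle *p, size_t n, double r_min, double r_max) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            double dx = p[i].x - p[j].x;
            double dy = p[i].y - p[j].y;
            double dz = p[i].z - p[j].z;
            double r = sqrt(dx*dx + dy*dy + dz*dz);
            if (r >= r_min && r < r_max)
                count++;
        }
    }
    return count;
}
```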

The Challenge of Data Analysis in a Multiprocessor Universe
Parallel programs are difficult to write: there is a steep learning curve to parallel programming. Parallel programs are also expensive to write, with lengthy development times. The parallel world is dominated by simulations: simulation code is often reused for many years by many people, so you can afford to invest lots of time writing it. Example: GASOLINE (a cosmology N-body code) required 10 FTE-years of development.

The Challenge of Data Analysis in a Multiprocessor Universe
Data analysis does not work this way: scientific inquiries change rapidly, and there is less code reuse. Simulation groups do not even write their analysis code in parallel! The data mining paradigm mandates rapid software development.

The Challenge of Data Analysis in a Multiprocessor Universe
Build a framework that is sophisticated enough to take care of all of the nasty parallel bits for you, yet flexible enough to be used for your own particular data analysis application.

N tropy: A Framework for Multiprocessor Development
GOAL: Minimize development time for parallel applications.
GOAL: Enable scientists with no parallel programming background (or time to learn) to still implement their algorithms in parallel by writing only serial code.
GOAL: Provide seamless scalability from single-processor machines to massively parallel resources.
GOAL: Do not restrict the inquiry space.

N tropy Methodology
Limited data structures: astronomy deals with point-like data in an N-dimensional parameter space, and the most efficient methods on these kinds of data use space-partitioning trees.
Limited methods: analysis methods perform a limited number of fundamental operations on these data structures.
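The sketch below shows one plausible shape for such a space-partitioning tree node and a fundamental operation on it; the field names and layout are assumptions for illustration, not N tropy's actual internals.

```c
/* A minimal sketch of a space-partitioning (k-d) tree node over point-like
 * data; illustrative only, not N tropy's data structures. */
#include <stddef.h>

#define NDIM 3  /* spatial coordinates here, but could be any parameter space */

typedef struct KDNode {
    double bnd_min[NDIM];     /* bounding box of this cell            */
    double bnd_max[NDIM];
    size_t first, count;      /* range of particles owned by the cell */
    struct KDNode *left;      /* children; NULL for a leaf            */
    struct KDNode *right;
} KDNode;

/* Analysis methods then reduce to a small set of operations on the tree,
 * e.g. "visit every node whose bounding box intersects a search ball". */
int node_intersects_ball(const KDNode *n, const double c[NDIM], double r) {
    double d2 = 0.0;
    for (int k = 0; k < NDIM; k++) {
        double d = 0.0;
        if (c[k] < n->bnd_min[k])       d = n->bnd_min[k] - c[k];
        else if (c[k] > n->bnd_max[k])  d = c[k] - n->bnd_max[k];
        d2 += d * d;
    }
    return d2 <= r * r;
}
```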

N tropy Conceptual Schematic
[Diagram] The user's computational "agenda" (C, C++, Python, Fortran?) drives the framework ("black box"), which provides domain decomposition / tree building, tree traversal, parallel I/O, dynamic workload management, and a web service layer (at least from Python; VO, WSDL?, SOAP?). The user supplies serial collective, staging, and processing routines, serial I/O routines, tree traversal routines, and the tree and particle data. Key: framework components, tree services, user-supplied collectives, dynamic workload management.

A Simple N tropy Example
ntropy_ReadParticles(…, (*myReadFunction));
[Diagram] The master thread executes the computational agenda on the N tropy master layer; the call fans out to the N tropy thread service layer on each processor (Proc. 0 through Proc. 3), where the user-supplied myReadFunction() reads its portion of the particle data in parallel.
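A hedged sketch of the user-side code implied by this slide: the scientist writes only a serial read routine and hands it to the framework, which invokes it on each processor for that processor's share of the data. Only the name ntropy_ReadParticles appears on the slide; the callback signature and the Particle struct below are assumptions for illustration.

```c
/* Hypothetical user-supplied serial read routine; the signature is an
 * assumption, not N tropy's documented API. */
#include <stdio.h>
#include <stdlib.h>

typedef struct { double pos[3]; double mass; } Particle;

void myReadFunction(const char *filename, long offset, long count,
                    Particle *out) {
    FILE *fp = fopen(filename, "rb");
    if (!fp) { perror(filename); exit(1); }
    fseek(fp, offset * (long)sizeof(Particle), SEEK_SET);
    fread(out, sizeof(Particle), (size_t)count, fp);
    fclose(fp);
}

/* Conceptually, the computational agenda then just calls something like
 *     ntropy_ReadParticles(..., myReadFunction);
 * and the framework decides which (offset, count) slice each processor
 * reads, so the user never writes any parallel code. */
```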

N tropy Performance
[Scaling plot] 10 million particles, spatial 3-point correlation, 3 to 4 Mpc (SDSS DR1 takes less than 1 minute with perfect load balancing).

N tropy Performance
[Scaling plot] 10 million particles, projected 3-point correlation, 0 to 3 Mpc.

Serial Performance
N tropy vs. an existing serial n-point correlation function calculator: N tropy is 6 to 30 times faster in serial! In short: (1) not only does it take much less time to write an application using N tropy, (2) your application may also run faster than if you wrote it from scratch.

N tropy “Meaningful” Benchmarks
The purpose of this framework is to minimize development time! Development time for:
1. Parallel N-point correlation function calculator: 3 months
2. Parallel friends-of-friends group finder: 3 weeks
3. Parallel N-body gravity code: 1 day!*
*(OK, I cheated a bit and used existing serial N-body code fragments)

Conclusions
Astrophysics, like many sciences, is facing a deluge of data that we must rely upon multiprocessor compute systems to analyze. With N tropy, you can develop a tree-based algorithm in less time than it would take you to write one from scratch:
1. The implementation may be even faster than a “from scratch” effort.
2. It will scale across many distributed processors.
More information: go to Wikipedia and search for “Ntropy”.