Priority Research Direction Key challenges Fault oblivious, Error tolerant software Hybrid and hierarchical based algorithms (eg linear algebra split across.

Slides:



Advertisements
Similar presentations
Technology Drivers Traditional HPC application drivers – OS noise, resource monitoring and management, memory footprint – Complexity of resources to be.
Advertisements

Christian Delbe1 Christian Delbé OASIS Team INRIA -- CNRS - I3S -- Univ. of Nice Sophia-Antipolis November Automatic Fault Tolerance in ProActive.
Priority Research Direction Key challenges General Evaluation of current algorithms Evaluation of use of algorithms in Applications Application of “standard”
System Software Environments Breakout Report June 27, 2002.
Priority Research Direction (I/O Models, Abstractions and Software) Key challenges What will you do to address the challenges? – Develop newer I/O models.
Parallel Research at Illinois Parallel Everywhere
Priority Research Direction: Portable de facto standard software frameworks Key challenges Establish forums for multi-institutional discussions. Define.
4.1.5 System Management Background What is in System Management Resource control and scheduling Booting, reconfiguration, defining limits for resource.
Cache Coherent Distributed Shared Memory. Motivations Small processor count –SMP machines –Single shared memory with multiple processors interconnected.
Scaling Distributed Machine Learning with the BASED ON THE PAPER AND PRESENTATION: SCALING DISTRIBUTED MACHINE LEARNING WITH THE PARAMETER SERVER – GOOGLE,
A 100,000 Ways to Fa Al Geist Computer Science and Mathematics Division Oak Ridge National Laboratory July 9, 2002 Fast-OS Workshop Advanced Scientific.
OCIN Workshop Wrapup Bill Dally. Thanks To Funding –NSF - Timothy Pinkston, Federica Darema, Mike Foster –UC Discovery Program Organization –Jane Klickman,
1 Aug 7, 2004 GPU Req GPU Requirements for Large Scale Scientific Applications “Begin with the end in mind…” Dr. Mark Seager Asst DH for Advanced Technology.
1 Dr. Frederica Darema Senior Science and Technology Advisor NSF Future Parallel Computing Systems – what to remember from the past RAMP Workshop FCRC.
DIVES Alur, Lee, Kumar, Pappas: University of Pennsylvania  Charon: high-level modeling language and a design environment reflecting the current state.
5 th Biennial Ptolemy Miniconference Berkeley, CA, May 9, 2003 MESCAL Application Modeling and Mapping: Warpath Andrew Mihal and the MESCAL team UC Berkeley.
1 FM Overview of Adaptation. 2 FM RAPIDware: Component-Based Design of Adaptive and Dependable Middleware Project Investigators: Philip McKinley, Kurt.
Power is Leading Design Constraint Direct Impacts of Power Management – IDC: Server 2% of US energy consumption and growing exponentially HPC cluster market.
© 2005, it - instituto de telecomunicações. Todos os direitos reservados. System Level Resource Discovery and Management for Multi Core Environment Javad.
GPU-Qin: A Methodology For Evaluating Error Resilience of GPGPU Applications Bo Fang , Karthik Pattabiraman, Matei Ripeanu, The University of British.
Computer System Architectures Computer System Software
PicsouGrid Viet-Dung DOAN. Agenda Motivation PicsouGrid’s architecture –Pricing scenarios PicsouGrid’s properties –Load balancing –Fault tolerance Perspectives.
ET E.T. International, Inc. X-Stack: Programming Challenges, Runtime Systems, and Tools Brandywine Team May2013.
4.x Performance Technology drivers – Exascale systems will consist of complex configurations with a huge number of potentially heterogeneous components.
1 Discussions on the next PAAP workshop, RIKEN. 2 Collaborations toward PAAP Several potential topics : 1.Applications (Wave Propagation, Climate, Reactor.
CASTNESS‘11 Computer Architectures and Software Tools for Numerical Embedded Scalable Systems Workshop & School: Roma January 17-18th 2011 Frédéric ROUSSEAU.
Version 4.0. Objectives Describe how networks impact our daily lives. Describe the role of data networking in the human network. Identify the key components.
Priority Research Direction (use one slide for each) Key challenges -Fault understanding (RAS), modeling, prediction -Fault isolation/confinement + local.
CCA Common Component Architecture Manoj Krishnan Pacific Northwest National Laboratory MCMD Programming and Implementation Issues.
4.2.1 Programming Models Technology drivers – Node count, scale of parallelism within the node – Heterogeneity – Complex memory hierarchies – Failure rates.
Heterogeneous Multikernel OS Yauhen Klimiankou BSUIR
BG/Q vs BG/P—Applications Perspective from Early Science Program Timothy J. Williams Argonne Leadership Computing Facility 2013 MiraCon Workshop Monday.
Back-end (foundation) Working group X-stack PI Kickoff Meeting Sept 19, 2012.
Advanced Computer Networks Topic 2: Characterization of Distributed Systems.
Polymorphous Computing Architectures Run-time Environment And Design Application for Polymorphous Technology Verification & Validation (READAPT V&V) Lockheed.
Issues Autonomic operation (fault tolerance) Minimize interference to applications Hardware support for new operating systems Resource management (global.
Investigating Survivability Strategies for Ultra-Large Scale (ULS) Systems Vanderbilt University Nashville, Tennessee Institute for Software Integrated.
Software evolution l Software evolution is the term used in software engineering (specifically software maintenance) to refer to the process of developing.
Numerical Libraries Project Microsoft Incubation Group Mary Beth Hribar Microsoft Corporation CSCAPES Workshop June 10, 2008 Copyright Microsoft Corporation,
© 2007 Cisco Systems, Inc. All rights reserved.Cisco Public 1 Version 4.0 Living in a Network Centric World Network Fundamentals – Chapter 1.
Summary Background –Why do we need parallel processing? Moore’s law. Applications. Introduction in algorithms and applications –Methodology to develop.
Introduction to Research 2011 Introduction to Research 2011 Ashok Srinivasan Florida State University Images from ORNL, IBM, NVIDIA.
© 2009 IBM Corporation Parallel Programming with X10/APGAS IBM UPC and X10 teams  Through languages –Asynchronous Co-Array Fortran –extension of CAF with.
FP7 Support Action - European Exascale Software Initiative DG Information Society and the unit e-Infrastructures EESI Final Conference Exascale key hardware.
Breakout Group: Debugging David E. Skinner and Wolfgang E. Nagel IESP Workshop 3, October, Tsukuba, Japan.
Linear Algebra Libraries: BLAS, LAPACK, ScaLAPACK, PLASMA, MAGMA
Data Center & Large-Scale Systems (updated) Luis Ceze, Bill Feiereisen, Krishna Kant, Richard Murphy, Onur Mutlu, Anand Sivasubramanian, Christos Kozyrakis.
Programmability Hiroshi Nakashima Thomas Sterling.
What Programming Paradigms and algorithms for Petascale Scientific Computing, a Hierarchical Programming Methodology Tentative Serge G. Petiton June 23rd,
B5: Exascale Hardware. Capability Requirements Several different requirements –Exaflops/Exascale single application –Ensembles of Petaflop apps requiring.
MSc in High Performance Computing Computational Chemistry Module Parallel Molecular Dynamics (i) Bill Smith CCLRC Daresbury Laboratory
Computing Systems: Next Call for Proposals Dr. Panagiotis Tsarchopoulos Computing Systems ICT Programme European Commission.
C OMPUTATIONAL R ESEARCH D IVISION 1 Defining Software Requirements for Scientific Computing Phillip Colella Applied Numerical Algorithms Group Lawrence.
Priority Research Direction (use one slide for each) Key challenges What will you do to address the challenges?Brief overview of the barriers and gaps.
EU-Russia Call Dr. Panagiotis Tsarchopoulos Computing Systems ICT Programme European Commission.
Linear Algebra Libraries: BLAS, LAPACK, ScaLAPACK, PLASMA, MAGMA Shirley Moore CPS5401 Fall 2013 svmoore.pbworks.com November 12, 2012.
Is MPI still part of the solution ? George Bosilca Innovative Computing Laboratory Electrical Engineering and Computer Science Department University of.
Heterogeneous Processing KYLE ADAMSKI. Overview What is heterogeneous processing? Why it is necessary Issues with heterogeneity CPU’s vs. GPU’s Heterogeneous.
Defining the Competencies for Leadership- Class Computing Education and Training Steven I. Gordon and Judith D. Gardiner August 3, 2010.
Towards a High Performance Extensible Grid Architecture Klaus Krauter Muthucumaru Maheswaran {krauter,
Medium Access Control. MAC layer covers three functional areas: reliable data delivery access control security.
Introduction to Distributed Platforms
Welcome: Intel Multicore Research Conference
Programming Models for SimMillennium
Summary Background Introduction in algorithms and applications
Power is Leading Design Constraint
Priority Research Direction (use one slide for each)
Architectures of distributed systems
Priority Research Direction (use one slide for each)
Presentation transcript:

Priority Research Direction Key challenges Fault oblivious, Error tolerant software Hybrid and hierarchical based algorithms (eg linear algebra split across multi-core and gpu, self-adapting) Mixed arithmetic Energy efficient algorithms Algorithms that minimize communications Autotuning based software Architectural aware algorithms/libraries Standardization activities Async methods Overlap data and computation Adaptivity for architectural environment Scalability : need algorithms with minimal amount of communication Increasing the level of asynchronous behavior Fault resistant software– bit flipping and loosing data (due to failures). Algorithms that detect and carry on or detect and correct and carry on (for one or more) Heterogeneous architectures Languages Accumulation of round-off errors Efficient libraries of numerical routines Agnostic of platforms Self adapting to the environment Libraries will be impacted by compilers, OS, runtime, prog env etc Standards: FT, Power Management, Hybrid Programming, arch characteristics Make systems more usable by a wider group of applications Enhance programmability Summary of research direction Potential impact on software component Potential impact on usability, capability, and breadth of community

4.2.4 Numerical Libraries Energy aware Fault tolerant Heterogeneous sw Self adapting for precision Scaling to billion way Complexity of system Architectural transparency Self Adapting for performance Numerical Libraries Structured grids Unstructured grids FFTs Dense LA Sparse LA Monte Carlo Optimization Language issues Std: Fault tolerant Std: Energy aware Std: Arch characteristics Std: Hybrid Progm

4.2.4 Numerical Libraries Technology drivers – Hybrid architectures – Programming models/languages – Precision – Fault detection – Energy budget – Memory hierarchy – Standards Alternative R&D strategies – Message passing – Global address space – Message-driven work-queue Recommended research agenda – Hybrid and hierarchical based software (eg linear algebra split across multi-core / accelerator) – Autotuning – Fault oblivious sw, Error tolerant sw – Mixed arithmetic – Architectural aware libraries – Energy efficient implementation – Algorithms that minimize communications Crosscutting considerations – Performance – Fault tolerance – Power management – Arch characteristics