Dagstuhl Seminar on Dark Silicon: From Embedded to HPC Feb 3, 2016

HPC Group

What is Dark Silicon in HPC?
  Shortage of power, strict constraints
  Overprovisioning of hardware resources with respect to power
  Chip-level: we already have dark silicon
  Job-level, cluster-level: workload dependent, dim silicon
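A minimal sketch of the overprovisioning arithmetic implied above, assuming a fixed site power budget and a uniform per-node power draw. The function names and all numbers are hypothetical and serve only to illustrate the idea: at full per-node power some installed nodes must stay dark, while at a lower per-node cap every node can run, but dimmed.

```python
def nodes_powered(budget_watts: float, per_node_watts: float, installed_nodes: int) -> int:
    """Nodes that can run at once under the budget at a given per-node draw."""
    if per_node_watts <= 0:
        raise ValueError("per_node_watts must be positive")
    # Cannot power more nodes than are installed, nor more than the budget allows.
    return min(installed_nodes, int(budget_watts // per_node_watts))


def dark_fraction(budget_watts: float, per_node_watts: float, installed_nodes: int) -> float:
    """Fraction of installed nodes that must stay powered off ("dark")."""
    active = nodes_powered(budget_watts, per_node_watts, installed_nodes)
    return 1 - active / installed_nodes


if __name__ == "__main__":
    BUDGET_W = 1_000_000   # hypothetical site budget
    INSTALLED = 4000       # hypothetical node count (overprovisioned)
    for cap_w in (400, 250):  # hypothetical per-node draw: full power vs. capped
        print(cap_w, nodes_powered(BUDGET_W, cap_w, INSTALLED),
              dark_fraction(BUDGET_W, cap_w, INSTALLED))
```

In this toy setting, the choice is between fewer nodes at full power and more nodes under a cap. Which one gives better throughput is workload dependent, as the slide notes.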

Main challenges arising from DS?
  Focus in HPC is performance: we care about FLOPS, time to solution, accuracy of science. This is different from the embedded community.
  Stay in a power band: fluctuations of a few megawatts may arise (power stability)
  Performance variability, reproducibility
  System throughput and power utilization: how do we manage resources?
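As an illustration of the "stay in a power band" constraint, here is a small sketch that flags samples of a site power trace falling outside an agreed band. The band limits, the trace values, and the function name are assumptions made for the example and do not describe any particular site.

```python
from typing import List, Sequence, Tuple


def band_violations(samples_mw: Sequence[float], low_mw: float, high_mw: float) -> List[Tuple[int, float]]:
    """Return (index, value) for every sample outside the band [low_mw, high_mw]."""
    if low_mw > high_mw:
        raise ValueError("low_mw must not exceed high_mw")
    return [(i, p) for i, p in enumerate(samples_mw) if not (low_mw <= p <= high_mw)]


def max_swing(samples_mw: Sequence[float]) -> float:
    """Largest change between consecutive samples, in megawatts."""
    return max((abs(b - a) for a, b in zip(samples_mw, samples_mw[1:])), default=0.0)


if __name__ == "__main__":
    trace = [10.0, 11.5, 9.0, 12.5]  # hypothetical site power samples in MW
    print(band_violations(trace, low_mw=9.5, high_mw=12.0))
    print(max_swing(trace))
```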

Challenges…
  Performance variability:
    Embrace it and build runtime systems to address this
    At the cost of performance?
    Use it to our advantage: need more control?
    Minimize in hardware?
  How do we tolerate variability?
  Current approaches for programming the machines and performance tuning will no longer work.
  We need to redefine the metrics on how we measure things.
  We assume homogeneity at the moment.
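One common way to put a number on run-to-run performance variability is the coefficient of variation of repeated runtimes. The sketch below is offered only as an example of such a metric, not as the metric the group proposes. The runtimes are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def coefficient_of_variation(runtimes_s: Sequence[float]) -> float:
    """Population standard deviation divided by the mean of repeated runtimes."""
    if len(runtimes_s) < 2:
        raise ValueError("need at least two runs to measure variability")
    m = mean(runtimes_s)
    if m <= 0:
        raise ValueError("mean runtime must be positive")
    return pstdev(runtimes_s) / m


if __name__ == "__main__":
    runs = [90.0, 110.0]  # hypothetical runtimes of the same job, in seconds
    print(coefficient_of_variation(runs))
```

A value near zero means the same job behaves the same way on every run. Under power caps on nominally identical nodes, that homogeneity assumption may stop holding.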

Fundamental Techniques
  Monitoring at large scale (TUM, LRZ, LLNL, ETH, Dresden, JSC …)
    Granularity, what to gather, how to use it
  Resource Management: RMAP/Flux (LLNL, TUM, ETH, UniBo…)
  Runtime Systems: GEO PM (Intel), Conductor (LLNL), SDSC, PowSched (UOregon)
  Tracing, Visualization (LLNL, JSC, Dresden, Oregon, SDSC, …)
  Deeper connection with workflows and applications. Also, refactoring of applications (work stealing, oversubscription, auto tuning)

Opportunities and Impacting the Embedded community?
  Measurement and Analysis Tools
  Learning from the operating systems community about hardware management
  Real-time boundaries: can we introduce them in HPC, have a QoS definition?
  Programming models tailored to code (right techniques for programming)
  Portability, development cycle
  Hardware-software co-design
  Redesign the stack: hardware, job, cluster, site

Going Forward: Big Data and HPC
  HPC applications are handling more data, which creates a new set of challenges…
  Data-intensive computing, data-aware computing
  Another Dagstuhl?