Tracing and Performance Analysis Tools for Heterogeneous Multicore System by Soon Thean Siew.


Goals
- Collect profiles and traces on a heterogeneous multicore platform, targeting the Cell BE architecture.
- Performance analysis and simulation based on the profiles and traces.
- Performance tuning based on the results of the analysis and simulation.
- Performance visualization.
- System architecture evaluation based on trace-driven simulation.
  - A trace-driven simulation is driven by a recorded trace of memory references (for data and instructions).
  - The trace can be used to model memory system performance, to model an instruction pipeline, and to evaluate the performance of a specific multiprocessor for a given workload, provided traces for that workload are available.
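To make the trace-driven simulation goal concrete, here is a minimal sketch in Python. It is not the tool described in these slides: the function name, the direct-mapped cache model, and the cache parameters are all assumptions chosen for illustration. It replays a list of memory addresses through a small cache model and counts hits and misses.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    hits: int = 0
    misses: int = 0


def simulate_direct_mapped(trace, num_lines=4, line_size=16):
    """Replay a memory-reference trace through a direct-mapped cache model.

    All parameters are hypothetical; a real study would use the geometry
    of the memory system being evaluated.
    """
    tags = [None] * num_lines          # one stored tag per cache line
    stats = CacheStats()
    for address in trace:
        block = address // line_size   # which memory block is referenced
        index = block % num_lines      # cache line the block maps to
        tag = block // num_lines       # identifies the block within that line
        if tags[index] == tag:
            stats.hits += 1
        else:
            stats.misses += 1
            tags[index] = tag          # replace the line on a miss
    return stats


# Hypothetical trace of byte addresses, as a collected trace might contain.
trace = [0x00, 0x04, 0x10, 0x00, 0x40, 0x00]
stats = simulate_direct_mapped(trace)
print(stats)  # CacheStats(hits=2, misses=4)
```

The same trace can be replayed against different cache geometries, which is what makes a trace useful for architecture evaluation: the workload is recorded once and the modeled hardware varies.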

Tool Overview
[Diagram. Stages: Preprocessing; Runtime Trace Collection; Post-processing and Visualization. Labeled elements: Source Code, Instrumentation Tool, Instrumented, Compiler, Executable, Trace Collecting Handler Library, Trace API, Profile, Converter, Visualization Tools, Report, Architecture Dependent Trace-driven Simulation.]

Components
- Trace API: serves as a protocol between runtime trace collection and post-processing.
- Trace Collecting Handler Library: a set of analysis routines that place the collected data into predefined structures.
- Converter: transforms the collected trace into a format suitable for feeding into visualization tools.
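The division of labor among the three components can be sketched as follows. This is an illustration under assumptions, not the actual API from the presentation: the `TraceEvent` fields, the `TraceHandler` class, and the JSON-lines output format are all hypothetical names and choices.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TraceEvent:
    """Hypothetical predefined record shared by collection and post-processing."""
    timestamp: int
    core: str      # e.g. "PPE" or "SPE0"
    kind: str      # "enter" or "exit"
    name: str      # instrumented routine


class TraceHandler:
    """Stand-in for the handler library: stores events in the predefined structure."""

    def __init__(self):
        self.events = []

    def record(self, timestamp, core, kind, name):
        self.events.append(TraceEvent(timestamp, core, kind, name))


def convert_to_json_lines(events):
    """Stand-in for the converter: time-ordered JSON lines (an assumed target format)."""
    ordered = sorted(events, key=lambda event: event.timestamp)
    return "\n".join(json.dumps(asdict(event), sort_keys=True) for event in ordered)


# Instrumented code would call record() at routine entry and exit.
handler = TraceHandler()
handler.record(30, "SPE0", "exit", "dma_get")
handler.record(10, "PPE", "enter", "main")
handler.record(20, "SPE0", "enter", "dma_get")
output = convert_to_json_lines(handler.events)
print(output)
```

Keeping the event record as the only shared definition means the collection side and the converter can change independently, which is the sense in which the Trace API acts as a protocol.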

Progress I
- Literature survey of currently available performance analysis tools for Cell BE:
  - No robust tool is capable of collecting computation and communication traces simultaneously.
  - PDT mainly focuses on SPE- and communication-specific problems.
  - OProfile supports profiling of PPU events and SPU time profiling (Fedora 7 only).

Progress II
- Literature survey on solving the code size problem in SPE local storage:
  - Partition Manager from the University of Delaware (overlay + software cache)
  - Dynamic code loading
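The general idea behind overlays and dynamic code loading can be sketched as below. This is a generic illustration, not the design of the Partition Manager or any Cell SDK facility: the `OverlayManager` class, the least-recently-used eviction policy, and the partition sizes are assumptions. Code is split into partitions, and a partition is loaded into the limited local store only when it is called, evicting older partitions if space runs out.

```python
from collections import OrderedDict


class OverlayManager:
    """Toy model of on-demand code loading into a fixed-size local store."""

    def __init__(self, capacity_bytes, partition_sizes):
        self.capacity = capacity_bytes
        self.sizes = partition_sizes
        self.resident = OrderedDict()   # partition -> size, least recently used first
        self.loads = 0                  # number of (costly) loads performed

    def call(self, partition):
        """Ensure the partition is resident; return True if it had to be loaded."""
        if partition in self.resident:
            self.resident.move_to_end(partition)   # mark as most recently used
            return False
        size = self.sizes[partition]
        if size > self.capacity:
            raise ValueError(f"partition {partition!r} does not fit in the local store")
        # Evict least recently used partitions until the new one fits.
        while sum(self.resident.values()) + size > self.capacity:
            self.resident.popitem(last=False)
        self.resident[partition] = size
        self.loads += 1
        return True


# Hypothetical sizes: three 40-byte partitions competing for a 100-byte store.
manager = OverlayManager(100, {"A": 40, "B": 40, "C": 40})
for name in ["A", "B", "A", "C"]:
    manager.call(name)
print(list(manager.resident), manager.loads)  # ['A', 'C'] 3
```

Counting loads this way gives a rough measure of how much extra traffic a given partitioning would cause for a particular call sequence.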

Progress III
- Set up and became familiar with the instrumentation tools:
  - CIL
  - TAU (tau_instrumentor + Program Database Toolkit)