Breakout Session 3 Alex, Mirco, Vojtech, Juraj, Christoph


Correct Measurement

Our Dream
- Given a certain usage profile, we want to predict the optimal number of cores for a multi-threaded application from measurements taken on a single core!
- What kind of measurements do we need?
  - Service demands?
  - Response time?
  - CPU utilization?
  - Memory access pattern?
  - …?
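To make the role of service demands and CPU utilization concrete, here is a minimal sketch based on the Service Demand Law from operational analysis (D = U / X). It is not the method proposed in the session: the function names, the 70% utilization ceiling, and the measurement values are hypothetical, and the resulting core count is only an idealized lower bound because it ignores the cache, context-switching, and synchronization overheads discussed in the following slides.

```python
import math


def service_demand(utilization, throughput):
    """Service Demand Law: D = U / X, in CPU-seconds per request.

    utilization: measured CPU busy fraction on one core (0..1)
    throughput:  measured completed requests per second
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return utilization / throughput


def min_cores(demand, target_throughput, max_utilization=0.7):
    """Idealized lower bound on the number of cores.

    Assumes perfect scaling (no cache or lock overhead), which a real
    multi-core system will not deliver; max_utilization is an assumed ceiling.
    """
    return math.ceil(demand * target_throughput / max_utilization)


if __name__ == "__main__":
    # Hypothetical single-core measurement: 60% busy at 30 requests/s.
    demand = service_demand(0.60, 30.0)
    # Cores needed for 100 requests/s at no more than 70% utilization per core.
    print(min_cores(demand, 100.0))  # prints 3
```

The gap between this bound and what a real multi-core run needs is exactly the overhead that single-core measurements would have to capture.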

Questions
- Difference between multi-core and single-core multi-threaded systems?
- Difference between multi-core and distributed systems?

Overhead (not exhaustive)
Assumption for our discussion: 2 threads on 1 core vs. 4 threads on 2 cores
Single-core multi-threaded vs. multi-core multi-threaded:
- Cache penalties
- Cache misses
- Context switching
- Synchronization (high-level locks)
- …
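Of the overheads listed, context switching is one that the operating system already counts per process. The sketch below, assuming a Unix-like system where Python's standard `resource` module is available, reads the voluntary and involuntary context-switch counters before and after a small two-thread workload that contends on a lock. The workload and helper names are hypothetical. Note that CPython threads do not execute bytecode in parallel, so this illustrates how to take the measurement rather than a realistic multi-core scenario.

```python
import resource
import threading


def context_switches():
    """Return (voluntary, involuntary) context switches of this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def measure(workload):
    """Run workload() and return the context-switch deltas it caused."""
    v0, i0 = context_switches()
    workload()
    v1, i1 = context_switches()
    return v1 - v0, i1 - i0


def two_thread_workload(iterations=100_000):
    """Hypothetical workload: two threads incrementing a shared counter."""
    lock = threading.Lock()
    counter = [0]

    def worker():
        for _ in range(iterations):
            with lock:  # high-level lock, as named on the slide
                counter[0] += 1

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]


if __name__ == "__main__":
    voluntary, involuntary = measure(two_thread_workload)
    print("voluntary:", voluntary, "involuntary:", involuntary)
```

Pinning the process to one core and then to two (for example with `taskset` on Linux) and comparing the deltas would approximate the 1-core vs. 2-core comparison assumed above.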

What to measure on multi-core?
- Memory reads/writes [e.g. bytes] (can we?)
- Energy consumption (CPU utilization)
- -> Within the discussion, it all comes down to memory access and caches
- Memory behaviour of an application?
- Scheduling overhead?
- Not easy to point out what is really important

Conclusion
- We have to take more care in evaluating memory behaviour than in a general distributed system
- What has already been done in systems research w.r.t. our question?
- No general solution; everything has to be tailored to the goal and application
- Usage profile matters most
- (Use a Linux OS for measurements: more transparency)
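As one illustration of the transparency Linux offers, the kernel exposes cumulative CPU time counters in `/proc/stat`. The sketch below computes CPU utilization from two samples of the aggregate `cpu` line. The function names are hypothetical, and the parsing is kept separate from file access so it can be tried on sample strings on any system.

```python
def parse_cpu_line(line):
    """Parse a /proc/stat 'cpu' line into a list of tick counters.

    Field order: user, nice, system, idle, iowait, irq, softirq, steal.
    Guest fields are skipped because guest time is already counted in user.
    """
    fields = line.split()
    if not fields or not fields[0].startswith("cpu"):
        raise ValueError("not a cpu line: %r" % line)
    return [int(value) for value in fields[1:9]]


def utilization(before, after):
    """Busy fraction between two samples of the same cpu line."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    idle = deltas[3] + (deltas[4] if len(deltas) > 4 else 0)  # idle + iowait
    return (total - idle) / total


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate cpu line (Linux only)."""
    with open(path) as handle:
        return handle.readline()


if __name__ == "__main__":
    # Made-up sample values, only to show the arithmetic.
    first = "cpu  100 0 50 800 50 0 0 0 0 0"
    second = "cpu  160 0 70 900 70 0 0 0 0 0"
    print(utilization(first, second))
```

On a Linux machine, calling `read_cpu_line()` twice with a pause in between and passing both results to `utilization` gives the system-wide busy fraction for that interval. The per-core lines (`cpu0`, `cpu1`, ...) in the same file can be parsed the same way.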