INF5063: Programming heterogeneous multi-core processors


INF5063: Programming heterogeneous multi-core processors … because the OS-course is just too easy! Home Exam 1: Motion JPEG Encoding on Intel x86 using Streaming SIMD Extensions (SSE) 17/09 - 2010 Håkon Kvale Stensland

Motion JPEG Encoding Pål wants to encode some videos to Motion JPEG on his computer… Pål has spent his entire computer budget on PowerPoint 2010 licenses, so it is your task to make the encoder as efficient as possible on a single core.

Motion JPEG MJPEG is a series of JPEG frames, without the motion compensation found in more advanced codecs. This allows easy parallelization strategies, since there are few data dependencies. Unfortunately, it is not a standardized format, but it is used extensively on cheap embedded devices such as web cameras and mobile phones.
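Because every frame is a complete JPEG image, a raw MJPEG stream can in principle be split at frame boundaries and each frame handled on its own. The sketch below is only an illustration of that property: `count_jpeg_frames` is a hypothetical helper, not part of the precode, and it assumes the stream is a plain concatenation of JPEG images with no container format and no embedded thumbnails (which carry their own Start Of Image marker).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the precode): counts the JPEG frames in a
 * raw MJPEG byte stream by looking for the Start Of Image marker 0xFF 0xD8.
 * Assumes frames are simply concatenated, with no container around them. */
size_t count_jpeg_frames(const uint8_t *stream, size_t len)
{
    assert(stream != NULL || len == 0);

    size_t frames = 0;
    for (size_t i = 0; i + 1 < len; ++i) {
        if (stream[i] == 0xFF && stream[i + 1] == 0xD8) {
            ++frames;
            ++i; /* skip the second byte of the marker */
        }
    }
    return frames;
}
```

No frame refers to data in another frame, which is the lack of data dependencies the slide points to.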

Precode The precode is a basic single-threaded MJPEG encoder written in C. You are free to use and modify it as you see fit, but you are not allowed to paste code from other projects or encoders into it. You can use your own machines for this assignment, but you have to make sure that the code works on bush or clinton on the Simula network. Your implementation is supposed to be single-threaded, and optimized to use the instruction-level parallelism available in a single x86 core.
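As a rough illustration of what instruction-level parallelism means in single-threaded C, the sketch below sums pixel values in two ways. The function names are made up for this example and do not come from the precode. In the first version every addition depends on the previous one; in the second, four independent accumulators give the core several additions it can work on at the same time. An optimizing compiler may already perform a similar transformation by itself, so any gain has to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One accumulator: each addition must wait for the previous one. */
uint32_t sum_pixels_serial(const uint8_t *px, size_t n)
{
    assert(px != NULL || n == 0);

    uint32_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += px[i];
    return sum;
}

/* Four independent accumulators: the four additions in the loop body do not
 * depend on each other, so the core can overlap them. */
uint32_t sum_pixels_ilp(const uint8_t *px, size_t n)
{
    assert(px != NULL || n == 0);

    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += px[i];
        s1 += px[i + 1];
        s2 += px[i + 2];
        s3 += px[i + 3];
    }
    for (; i < n; ++i) /* leftover elements when n is not a multiple of 4 */
        s0 += px[i];
    return s0 + s1 + s2 + s3;
}
```

Integer addition is used here so that both versions return exactly the same result; with floating-point data, reordering the additions can change the rounding.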

Your task Utilize the instruction-level parallelism (ILP) and the CPU vector unit to get the most performance out of a single core. Start by profiling the encoder to see which parts are the bottlenecks. Remember that after optimizing one part of the code, more profiling might be needed to find new bottlenecks. Write a report detailing which parts of the encoding process benefited from your optimizations. The report should also explain how your code works. Other tools, such as objdump, are also available to inspect the assembly instructions.
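To give an idea of what using the vector unit looks like from C, here is a minimal sketch with SSE2 intrinsics. It performs the JPEG level shift (subtracting 128 from each 8-bit sample before the DCT) on eight pixels per iteration. The function names are hypothetical, and whether the precode performs this step as a separate pass is an assumption; profiling should decide which loops are actually worth vectorizing. A scalar version is included as a reference and as a fallback when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Scalar reference: one pixel per iteration. */
void level_shift_scalar(const uint8_t *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = (int16_t)(in[i] - 128);
}

/* Vector version: eight pixels per iteration. n must be a multiple of 8,
 * which holds for 8x8 JPEG blocks. */
void level_shift_simd(const uint8_t *in, int16_t *out, size_t n)
{
    assert(n % 8 == 0);
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();
    const __m128i bias = _mm_set1_epi16(128);
    for (size_t i = 0; i < n; i += 8) {
        /* Load 8 bytes into the low half of the register. */
        __m128i bytes = _mm_loadl_epi64((const __m128i *)(in + i));
        /* Interleave with zeros to widen each byte to a 16-bit word. */
        __m128i words = _mm_unpacklo_epi8(bytes, zero);
        /* Subtract 128 from all eight words and store them (unaligned). */
        _mm_storeu_si128((__m128i *)(out + i), _mm_sub_epi16(words, bias));
    }
#else
    level_shift_scalar(in, out, n); /* fallback without SSE2 */
#endif
}
```

Comparing the output of the vector version against the scalar one on the same input is a simple way to check that an optimization has not changed the encoder's results.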

Formal Information Deadline: Wednesday 6th October, 23:59:59.99. The assignment will be graded and counts for 33% of the final grade. Deliver your code and report to: https://devilry.ifi.uio.no/ Prepare a short (5 - 10 minutes) presentation for the class on Thursday 7th October.

Good Luck!