Graphical Asymmetric Processing: Prototype Presentation, December 13, 2004

Presentation transcript:

Graphical Asymmetric Processing (G.A.P.) Prototype Presentation

Team Organization

Problem Statement
Computationally intensive environments underutilize graphics processing units (GPUs).

Background Information
Discussed since 1996, but never implemented.
GPU performance
  – Has multiplied at a rate of 2.8 times per year since 1993
  – Expected to increase at this rate for another 5 years
The more performance increases, the more helpful our product becomes.
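As a rough illustration of the growth figure above, the compounding can be worked out in a few lines. This is only a sketch: the function name `projected_speedup` is made up for this example, and it assumes the 2.8x yearly multiplier holds exactly in every year.

```python
def projected_speedup(rate_per_year: float, years: int) -> float:
    """Compound a constant yearly performance multiplier over several years."""
    return rate_per_year ** years


if __name__ == "__main__":
    # 2.8x per year is the figure quoted on the slide; 5 years is its forecast window.
    for years in range(1, 6):
        print(f"after {years} year(s): about {projected_speedup(2.8, years):.1f}x")
```

Under that assumption, five more years at this rate would compound to roughly a 172-fold increase.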

Solution: G.A.P.
Create a usable, extendable, and maintainable API to leverage the unused computing power of graphics processors, resulting in increased performance for scientific, database, and other processor-intensive applications.

Solution: G.A.P.
Utilizing existing hardware:
  – Improve computing power
  – Improve computing time
  – Improve computing responsiveness

Solution Implementation
  – Create an SDK to utilize the GPU
  – Sell that SDK to NVIDIA

What is an SDK?
S.D.K.: Software Development Kit
A set of programs that allows software developers to create products that run on a particular platform or work with an API.
Typically includes: manual, examples, libraries
Other examples, both free and commercial:
  – Java, OS/2, AW, Windows, DirectX

Phase 1 Product Goals
Demonstrate the amount of power in current GPUs
  – Also: the ability to utilize that power
Secure funding to continue development
Secure interested parties: universities and research labs
Take first steps towards an NVIDIA partnership

Phase 1 Product Objectives
Leverage the GPU for additional power
Improve throughput on workstation machines
Ease the programming difficulty of utilizing the GPU
Maintain current program compatibility
Preserve system stability

Product Risks & Mitigations
Vendor support
  – NVIDIA sets aside $1 billion to use on acquisitions and R&D
Writing the software
  – Time-intensive product
  – “Build first and optimize later”

Product Functional Diagram

Product Dataflow Diagram

Prototype

Navier-Stokes Equations
A full and general set of differential equations governing the motion of a fluid.
The name is also used to refer to the incompressible form of the momentum equation.
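For reference, the incompressible form is commonly written as shown below. The symbols are not taken from the slides and are assumed here: u is the velocity field, p the pressure, ρ the (constant) density, ν the kinematic viscosity, and f a body force per unit mass.

```latex
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\,\mathbf{u}
  = -\frac{1}{\rho}\,\nabla p + \nu\,\nabla^{2}\mathbf{u} + \mathbf{f},
\qquad
\nabla \cdot \mathbf{u} = 0
```

The first equation is the momentum equation; the second is the incompressibility (zero divergence) constraint.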

Navier-Stokes Equations
Simulation of fluid-like behavior
  – An example of the applications used within computationally intensive environments
  – The thesis topics of multiple Old Dominion PhD candidates focus on Navier-Stokes
Will serve as a basis application to prove the efficiency of the GPU over the CPU
  – Shows an average 60% gain in efficiency
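To give a feel for why grid-based fluid simulation suits a GPU, here is a minimal sketch of one explicit diffusion step on a 2D grid, which corresponds to just the viscosity term of such a solver. It is not the prototype's code; the function name `diffuse_step` and the `rate` parameter are assumptions made for this illustration.

```python
def diffuse_step(field, rate):
    """Return a new grid after one explicit diffusion step.

    Each interior cell moves toward the average of its four neighbours.
    Boundary cells are copied unchanged. Every new value depends only on
    the previous grid, so all cells could be computed in parallel.
    """
    rows, cols = len(field), len(field[0])
    new = [row[:] for row in field]  # copy so the input grid is not modified
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            neighbours = (field[i - 1][j] + field[i + 1][j]
                          + field[i][j - 1] + field[i][j + 1])
            new[i][j] = field[i][j] + rate * (neighbours - 4.0 * field[i][j])
    return new


if __name__ == "__main__":
    grid = [[0.0] * 5 for _ in range(5)]
    grid[2][2] = 1.0  # a single disturbed cell in the middle
    for row in diffuse_step(grid, 0.1):
        print(" ".join(f"{value:.2f}" for value in row))
```

For this explicit scheme, `rate` should stay at or below 0.25 to remain stable. The point of the sketch is the access pattern: each cell reads only its neighbours from the previous step, which is the kind of independent per-cell work a graphics processor handles well.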

Prototype Functional Diagram

Dataflow Diagram for Prototype

Demonstration
Two versions of an executable
  – CPU vs. GPU
Navier-Stokes on a vector field with four jets
  – The demonstration will consist of firing the jets for different lengths of time and observing performance
  – Observe the CPU alone
  – Observe the GPU alone
  – Observe both simultaneously
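A comparison like the one in this demonstration can be timed with a small harness along the following lines. This is a sketch only: `time_frames`, `compare`, and the stand-in workloads are hypothetical names, and in a real run the two step functions would call the CPU and GPU versions of the solver.

```python
import time


def time_frames(step, frames):
    """Run `step` once per frame and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(frames):
        step()
    return time.perf_counter() - start


def compare(cpu_step, gpu_step, frames):
    """Time both step functions over the same number of frames."""
    return {
        "cpu_seconds": time_frames(cpu_step, frames),
        "gpu_seconds": time_frames(gpu_step, frames),
    }


if __name__ == "__main__":
    # Stand-in workloads (assumptions); they only mimic a slow and a fast path.
    def slow_step():
        sum(i * i for i in range(20000))

    def fast_step():
        sum(i * i for i in range(2000))

    print(compare(slow_step, fast_step, frames=20))
```

This times each version on its own; observing both simultaneously, as the demonstration does, would need the two runs to execute concurrently.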

On the CPU

With GAP on the GPU

Risks
Main research issues include the quality of floating point
  – The numbers are ‘single precision’, not double.
Works best when ‘batched’, which requires a relatively ‘parallel’ system
  – Already a multithreading issue. Solutions exist in both programmer practice and compiler design.
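The floating-point concern can be illustrated without a GPU by rounding values to single precision on the CPU. The sketch below uses Python's standard `struct` module; the helper names `to_single` and `accumulate` are made up for this example.

```python
import struct


def to_single(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate(step: float, count: int, single: bool) -> float:
    """Add `step` to a running total `count` times.

    With single=True, the step and every partial sum are rounded to
    single precision, imitating arithmetic done entirely in 32-bit floats.
    """
    if single:
        step = to_single(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single:
            total = to_single(total)
    return total


if __name__ == "__main__":
    exact = 10000.0
    print("double error:", abs(accumulate(0.1, 100000, single=False) - exact))
    print("single error:", abs(accumulate(0.1, 100000, single=True) - exact))
```

Long running sums are where the gap shows most clearly, which is why accumulated quantities in a simulation are the first place to check when moving work to single-precision hardware.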

Risks Mitigated (Prototype)
Floating-point quality:
  – Distributed the field thickly enough that floating point was accurate.
Batching:
  – Used a “Stream” operator that ensured the command size was sufficient before it flushed the results.

Risk Mitigation (Product)
Floating point
  – NVIDIA says cards will include double precision upon demand
  – An NVIDIA partnership will expedite this.
Batching
  – The Context system has an internal, self-optimizing queue, with the “flush” instruction for programmer flexibility.
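The batching idea described above (queue commands, submit them once a batch is large enough, and let the programmer force a flush) might look roughly like this. The class `BatchingQueue` and its parameters are hypothetical and are not the G.A.P. Context API; the self-optimizing behavior mentioned on the slide is not modeled.

```python
class BatchingQueue:
    """Collect commands and hand them to `submit` in batches.

    Hypothetical illustration only; not the G.A.P. Context API.
    """

    def __init__(self, submit, min_batch):
        self._submit = submit        # callable that receives a list of commands
        self._min_batch = min_batch  # flush automatically at this many commands
        self._pending = []

    def push(self, command):
        """Queue a command; flush automatically once the batch is large enough."""
        self._pending.append(command)
        if len(self._pending) >= self._min_batch:
            self.flush()

    def flush(self):
        """Submit whatever is pending, even if the batch is still small."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._submit(batch)


if __name__ == "__main__":
    queue = BatchingQueue(submit=print, min_batch=3)
    for name in ["advect", "diffuse", "project", "render"]:
        queue.push(name)
    queue.flush()  # force out the final, undersized batch
```

The explicit `flush` is the escape hatch: the queue normally waits for a worthwhile batch, but the caller can force submission when results are needed immediately.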

Testing and Evaluation
20 frames to 1 “real world second”
  – Translates: speed on GPU
  – Faster than a “real world second”! speed on CPU
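One way to read the frame figure above is that 20 simulation frames represent one second of simulated time, so a run is faster than real time when 20 frames take less than one wall-clock second. The sketch below works under that interpretation; the function name `real_time_factor` and the sample timings are assumptions, not measurements from the presentation.

```python
FRAMES_PER_SIMULATED_SECOND = 20  # from the slide: 20 frames to 1 "real world second"


def real_time_factor(frames: int, wall_seconds: float) -> float:
    """Simulated seconds produced per wall-clock second.

    A value above 1.0 means the simulation runs faster than real time.
    """
    simulated_seconds = frames / FRAMES_PER_SIMULATED_SECOND
    return simulated_seconds / wall_seconds


if __name__ == "__main__":
    # Made-up timings purely to show the calculation.
    print(real_time_factor(frames=20, wall_seconds=0.5))
    print(real_time_factor(frames=20, wall_seconds=2.0))
```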

Suitability
What does this prove?
  – Gives the magnitude of the performance increase
  – Efficiency gain with no new hardware
  – “Real world” problem solved
  – Standard interface any program could use

Degree of Completeness: Similarities
Prototype
  – General access functions
  – “Context”-based input
  – Demonstrated performance gain
  – Utilizes GPU for as much work as possible
Release
  – General access functions
  – “Context”-based input
  – Demonstrated performance gain
  – Utilizes GPU for as much work as possible

Degree of Completeness: Differences
Prototype
  – Specific to GF5 platform
  – Limited GAP commands
  – “All or Nothing” GPU use
Release
  – General platform
  – Wide array of GAP commands
  – Dynamic GPU use based on capabilities

Budget Reports

Phase I Funding
Phase I SBIR
  – Completed at the end of Phase 0

Phase I Budget Staff

Phase I Budget

Major Milestones: Phase I
Organize Project Group
Produce Project Descriptive Paper
Develop Contracts
Produce Budget White Paper
Produce Project User Manual
Develop Prototype
Produce SBIR Phase II Proposal
Produce Project Website

Phase I Schedule

Phase II Funding
Phase II SBIR
  – Completed at the end of Phase I

Phase II Budget Staff
*4 programmers needed

Patent Acquisition

Phase II Budget
* Purchased in Phase 1

Major Milestones: Phase II
Production
Marketing
Legal Negotiation
Final Preproduction Alterations

Phase II Schedule

Phase III
We plan to sell the product to NVIDIA at the end of Phase II.
Doing so would mitigate all responsibilities and risk factors that may arise on the market
  – While we increase the company’s profit by over $6.5 million

Profit Margin/Break Even
Immediate profit
$70 million average profit for acquisitions
  – If we obtain 1/10 of the average
  – We would still make a $6.5 million gain

Profit Margin/Break Even
Phase 1 Budget
Phase 2 Budget
Total
GAP Acquisition by NVIDIA: $7,000,000
Net Profit: $6,592,000
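The break-even arithmetic can be checked from the figures that do survive in this transcript. The individual Phase 1 and Phase 2 budget amounts are not present here, so the sketch below only derives the total they would imply, assuming net profit is simply the acquisition price minus the combined budget. The function names are made up for this example.

```python
AVERAGE_ACQUISITION = 70_000_000  # average acquisition figure quoted in the slides
ACQUISITION_FRACTION = 10         # the slides assume one tenth of that average
NET_PROFIT = 6_592_000            # net profit quoted in the slides


def expected_sale_price() -> int:
    """One tenth of the average acquisition figure."""
    return AVERAGE_ACQUISITION // ACQUISITION_FRACTION


def implied_total_budget() -> int:
    """Combined Phase 1 and Phase 2 cost implied by sale price minus net profit."""
    return expected_sale_price() - NET_PROFIT


if __name__ == "__main__":
    print(f"expected sale price:  ${expected_sale_price():,}")
    print(f"implied total budget: ${implied_total_budget():,}")
```

Under that assumption the sale price matches the $7,000,000 acquisition line, and the two phase budgets together would come to $408,000.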

Conclusion
Through our prototype we have achieved “proof of concept”.
The overall efficiency gain obtained within computationally intensive environments proves a need for GAP.

Graphical Asymmetric Processing