Framework support for Accelerators Sami Kama. Introduction Current Status Future Accelerator use modes Symmetric resource Asymmetric resource 09/11/2015.

Presentation transcript:

Framework support for Accelerators Sami Kama

Outline
- Introduction
- Current Status
- Future Accelerator use modes
  - Symmetric resource
  - Asymmetric resource
09/11/2015 TIM Meeting Berkeley

Offloading now
- Accelerators are exposed to the framework through OffloadSvc and the APE server.
- Athena algorithms prepare the necessary data and send an offload request to OffloadSvc.
- OffloadSvc sends the request to the APE server. The server selects the appropriate module, executes the request on the data, and returns the results to OffloadSvc.
- OffloadSvc passes the response to the algorithm, which converts the results back to the Athena EDM.
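The round trip described on this slide can be sketched as follows. This is a minimal illustration only: `ToyServer`, `ToyOffloadSvc`, `OffloadRequest`, `OffloadResponse`, and the integer `module_id` are hypothetical stand-ins and are not the real OffloadSvc or APE interfaces. The "accelerator" work is replaced by a plain sort.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class OffloadRequest:
    module_id: int   # hypothetical: identifies the server-side module to run
    payload: bytes   # data prepared by the algorithm


@dataclass
class OffloadResponse:
    payload: bytes


class ToyServer:
    """Stand-in for the server: routes each request to a registered module."""

    def __init__(self) -> None:
        self._modules: Dict[int, Callable[[bytes], bytes]] = {}

    def register(self, module_id: int, handler: Callable[[bytes], bytes]) -> None:
        self._modules[module_id] = handler

    def execute(self, request: OffloadRequest) -> OffloadResponse:
        handler = self._modules[request.module_id]   # select the module
        return OffloadResponse(handler(request.payload))


class ToyOffloadSvc:
    """Stand-in for the offload service: forwards requests, returns responses."""

    def __init__(self, server: ToyServer) -> None:
        self._server = server

    def offload(self, request: OffloadRequest) -> OffloadResponse:
        return self._server.execute(request)


def run_algorithm(svc: ToyOffloadSvc, values: List[int]) -> List[int]:
    payload = bytes(values)                                   # prepare the data
    response = svc.offload(OffloadRequest(1, payload))        # offload request
    return list(response.payload)                             # convert back


server = ToyServer()
server.register(1, lambda data: bytes(sorted(data)))  # toy "accelerator" module
svc = ToyOffloadSvc(server)
result = run_algorithm(svc, [3, 1, 2])
```

In this sketch the service and server live in one process; in the setup on the slide they communicate across a process boundary, which is why the payload is kept as raw bytes.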

Example: HLT Demonstrator
1. The HLT algorithm asks TrigDetAccelSvc for an offload.
2. TrigDetAccelSvc converts the C++ classes for raw and reconstructed quantities from the Athena Event Data Model (EDM) to a GPU-optimized EDM through Data Export Tools.
3. It then adds metadata and requests the offload through OffloadSvc.
4. OffloadSvc manages multiple requests and the communication with the APE server.
5. Results are converted back to the Athena EDM by TrigDetAccelSvc and handed to the requesting algorithm.
Diagram labels: HLT Algorithm, TrigDetAccelSvc, Export Tool, Athena EDM, GPU EDM, OffloadSvc, Data+MetaData, APE Server, Module, Request, Result.
TrigDetAccelSvc: implemented for each detector, such as TrigInDetAccelSvc.
Export tools and server communication need to be fast (serial sections).
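As an illustration of steps 2 and 5, the sketch below flattens a list of objects into contiguous per-field buffers plus a small metadata entry, and converts them back. The slide does not describe the actual GPU-optimized layout; the structure-of-arrays layout, the `Hit` class, and the `export_hits` / `import_hits` helpers are all assumptions made for this example.

```python
from array import array
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hit:
    """Hypothetical object-style EDM class (one object per hit)."""
    x: float
    y: float
    z: float


def export_hits(hits: List[Hit]) -> Dict[str, object]:
    """Flatten objects into contiguous float32 buffers plus metadata."""
    return {
        "n_hits": len(hits),                   # metadata sent with the data
        "x": array("f", (h.x for h in hits)),
        "y": array("f", (h.y for h in hits)),
        "z": array("f", (h.z for h in hits)),
    }


def import_hits(buffers: Dict[str, object]) -> List[Hit]:
    """Rebuild the object-style representation from the flat buffers."""
    return [
        Hit(x, y, z)
        for x, y, z in zip(buffers["x"], buffers["y"], buffers["z"])
    ]


hits = [Hit(1.0, 2.5, -3.0), Hit(0.5, 0.25, 8.0)]
flat = export_hits(hits)
round_trip = import_hits(flat)
```

Both conversions run serially on the CPU for every request, which is one reason the slide stresses that the export tools need to be fast.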

Current status
- Multiple Athena instances and AthenaMP are supported.
- Some glitches occur after forking, possibly due to a race condition in Yampl.
- OffloadSvc can dump offload requests and responses for re-players. This is needed for quick development and validation of accelerator algorithms.
- Works fine for multi-process frameworks!
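The dump-and-replay idea can be sketched as below. The on-disk format used by OffloadSvc is not given in the slides; the JSON-lines format and the `dump_exchange` / `replay` helpers here are assumptions chosen only to show how recorded request/response pairs let an accelerator algorithm be validated without running the full framework.

```python
import io
import json
from typing import Callable, List, TextIO


def dump_exchange(stream: TextIO, request: list, response: list) -> None:
    """Append one request/response pair to the dump as a JSON line."""
    stream.write(json.dumps({"request": request, "response": response}) + "\n")


def replay(stream: TextIO, handler: Callable[[list], list]) -> List[int]:
    """Re-run every recorded request; return the line numbers that differ."""
    mismatches = []
    for line_no, line in enumerate(stream, start=1):
        record = json.loads(line)
        if handler(record["request"]) != record["response"]:
            mismatches.append(line_no)
    return mismatches


# Record two exchanges (an in-memory buffer stands in for a dump file).
log = io.StringIO()
dump_exchange(log, [3, 1, 2], [1, 2, 3])
dump_exchange(log, [5, 4], [4, 5])

# Replay against a correct implementation and against a broken one.
log.seek(0)
good = replay(log, sorted)
log.seek(0)
bad = replay(log, lambda request: request)
```

An empty mismatch list means the implementation under test reproduces every recorded response.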

Offloading in future
- The client-server approach is still valid if:
  - the accelerator is remote
  - the framework cannot include or work with the libraries, languages, compilers, or programming paradigms needed by the accelerator

Future Accelerator use modes
- Offloading should be integrated into the scheduler. It could be a scheduler extension or a plugin that modifies graphs.
- Two different approaches, depending on the accelerator:
  - Accelerator is a symmetric resource
  - Accelerator is an asymmetric resource

Symmetric resource
- Accelerator algorithms generate the same output containers, so the next algorithm can execute on either the CPU or the accelerator.
- Easy to handle in the current design.
- May need differentiation of algorithms if different implementations are required (e.g. CPU-GPU vs CPU-KNL).
Diagram labels: CPU (A, B, C, D), Accelerator (A, B, C, D).
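A toy version of the symmetric mode is sketched below. It is not the Athena scheduler: the `Algorithm` class, `run_chain`, and the dictionary used as an event store are hypothetical. The point it illustrates is that each algorithm writes the same output container whichever implementation runs, so the choice can be made per algorithm.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Store = Dict[str, int]   # toy event store: container name -> value


@dataclass
class Algorithm:
    name: str
    output: str                                   # container this step writes
    cpu: Callable[[Store], int]
    accel: Optional[Callable[[Store], int]] = None  # optional second implementation


def run_chain(chain: List[Algorithm], store: Store,
              accel_available: bool) -> List[Tuple[str, str]]:
    """Run each algorithm on the accelerator if possible, else on the CPU."""
    placement = []
    for alg in chain:
        use_accel = accel_available and alg.accel is not None
        impl = alg.accel if use_accel else alg.cpu
        store[alg.output] = impl(store)           # same container either way
        placement.append((alg.name, "accel" if use_accel else "cpu"))
    return placement


CHAIN = [
    Algorithm("A", "a", cpu=lambda s: s["in"] + 1),
    Algorithm("B", "b", cpu=lambda s: s["a"] * 2, accel=lambda s: s["a"] << 1),
    Algorithm("C", "c", cpu=lambda s: s["b"] - 3),
    Algorithm("D", "d", cpu=lambda s: s["c"] ** 2, accel=lambda s: s["c"] * s["c"]),
]

cpu_store: Store = {"in": 1}
cpu_placement = run_chain(CHAIN, cpu_store, accel_available=False)

mixed_store: Store = {"in": 1}
mixed_placement = run_chain(CHAIN, mixed_store, accel_available=True)
```

Because the containers are identical in both runs, downstream algorithms do not need to know where their inputs were produced.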

Asymmetric resource
- Accelerator algorithms execute several steps internally and create the final containers.
- There is currently no support for this mode.
- Probably the most efficient use of the accelerator.
- The scheduler has to have a separate graph segment for accelerator paths.
- The event processing graph changes depending on the availability of the resource.
- May need data conversion steps.
Diagram labels: CPU, A, B, C, D, A-D, Accelerator, X, Y, time.
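The graph change described here can be sketched as a rewrite of a linear list of steps. Real event processing graphs are not linear lists, and the step names `to_accel` and `from_accel` (standing for possible data conversion steps) are hypothetical; the sketch only shows a CPU segment being swapped for a separate accelerator segment when the resource is available.

```python
from typing import List


def rewrite_graph(graph: List[str], cpu_segment: List[str],
                  accel_segment: List[str],
                  accelerator_available: bool) -> List[str]:
    """Replace the CPU segment with the accelerator segment when possible."""
    if not accelerator_available:
        return list(graph)
    n = len(cpu_segment)
    for start in range(len(graph) - n + 1):
        if graph[start:start + n] == cpu_segment:
            return graph[:start] + accel_segment + graph[start + n:]
    return list(graph)   # segment not found: leave the graph unchanged


GRAPH = ["Pre", "A", "B", "C", "D", "Post"]
CPU_SEGMENT = ["A", "B", "C", "D"]
# One fused accelerator step, wrapped in hypothetical conversion steps.
ACCEL_SEGMENT = ["to_accel", "A-D", "from_accel"]

with_accel = rewrite_graph(GRAPH, CPU_SEGMENT, ACCEL_SEGMENT, True)
without_accel = rewrite_graph(GRAPH, CPU_SEGMENT, ACCEL_SEGMENT, False)
```

The intermediate containers of A, B, and C never appear in the rewritten graph, which is what distinguishes this mode from the symmetric one.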

Summary
- The choice of offloading model depends strongly on the availability of algorithms and accelerators.
- It will probably happen in an evolutionary way.
- A plugin system can help.
- Athena-APE will still work in the new framework.

BACKUP

Multi-Work resource management
Diagram labels: MODULE, GPU, CONSTANTS, LB N, LB M, Work1 (LB-N), Work2 (LB-N), WorkX (LB-M), LB N ptr, LB M ptr, ptr, addwork(), create.
- Scratch space for work to use.
- Multi-event data such as MF and noise values.
- Global constants such as geometry and hash tables.
- Allocated by the module at configure time, depending on configuration.
- Works are processed and deleted, returning resources to the Module.
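The ownership pattern on this slide can be sketched as a module that allocates everything up front and lends scratch buffers to works. The `Module` and `Work` classes, `add_work`, `release`, and the host-side `bytearray` buffers are hypothetical stand-ins for the real GPU allocations, which the slides do not detail.

```python
from typing import Dict, List, Optional


class Work:
    """One unit of work borrowing a scratch buffer from its module."""

    def __init__(self, module: "Module", scratch: bytearray, lumi_block: str) -> None:
        self._module = module
        self.scratch: Optional[bytearray] = scratch
        self.constants = module.constants            # shared global constants
        self.lb_data = module.lb_data[lumi_block]    # shared multi-event data

    def release(self) -> None:
        """Return the scratch buffer to the module (safe to call twice)."""
        if self.scratch is not None:
            self._module._free.append(self.scratch)
            self.scratch = None


class Module:
    """Allocates all buffers at configure time and hands them to works."""

    def __init__(self, n_works: int, scratch_bytes: int,
                 constants: Dict[str, str],
                 lb_data: Dict[str, List[float]]) -> None:
        self.constants = constants
        self.lb_data = lb_data
        self._free = [bytearray(scratch_bytes) for _ in range(n_works)]

    def free_slots(self) -> int:
        return len(self._free)

    def add_work(self, lumi_block: str) -> Optional[Work]:
        """Create a work if a scratch buffer is free, otherwise return None."""
        if not self._free:
            return None
        return Work(self, self._free.pop(), lumi_block)


module = Module(2, 16, {"geometry": "toy"}, {"LB-N": [0.1], "LB-M": [0.2]})
work1 = module.add_work("LB-N")
work2 = module.add_work("LB-M")
work3 = module.add_work("LB-N")   # no buffer left
work1.release()
work4 = module.add_work("LB-N")   # reuses the buffer work1 returned
```

Allocating once at configure time keeps allocation out of the per-event path; the cost is that the number of concurrent works is fixed by the configuration.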

Multi-GPU utilization
- The Module initializes multiple workspaces on each GPU.
- It queues them in a cyclic manner.
- It assigns the front of the queue to new Works.
- A workspace returns to the queue when the job is deleted.
Diagram labels: WORK, Work contexts, GPU1 areas, GPU2 areas, GPU3 areas, GPU4 areas, Front, Back.
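A minimal sketch of the cyclic workspace queue, using a double-ended queue. Workspaces are represented by hypothetical `(gpu, slot)` tuples, and returning a workspace to the back of the queue is an assumption based on the Front and Back labels in the diagram.

```python
from collections import deque
from typing import Deque, Tuple

Workspace = Tuple[int, int]   # (gpu index, slot index on that GPU)


def build_queue(n_gpus: int, slots_per_gpu: int) -> Deque[Workspace]:
    """Interleave workspaces so consecutive entries sit on different GPUs."""
    return deque(
        (gpu, slot)
        for slot in range(slots_per_gpu)
        for gpu in range(n_gpus)
    )


def acquire(queue: Deque[Workspace]) -> Workspace:
    """Give the workspace at the front of the queue to a new work."""
    return queue.popleft()


def give_back(queue: Deque[Workspace], workspace: Workspace) -> None:
    """Return a workspace to the back of the queue when its work is deleted."""
    queue.append(workspace)


queue = build_queue(n_gpus=2, slots_per_gpu=2)
initial_order = list(queue)
first = acquire(queue)
second = acquire(queue)
give_back(queue, first)
final_order = list(queue)
```

With this ordering, consecutive works land on different GPUs, so the load spreads across devices without any explicit balancing logic.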