Challenges for ProcControlAPI and DyninstAPI on BlueGene. Matthew LeGendre, Lawrence Livermore National Laboratory. LLNL-PRES-481072.


Challenges for ProcControlAPI and DyninstAPI on BlueGene
Matthew LeGendre
Lawrence Livermore National Laboratory
May 3, 2011

LLNL-PRES-481072. Lawrence Livermore National Laboratory, P.O. Box 808, Livermore, CA. This work performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

2. Disclaimer (Challenges for ProcControlAPI and Dyninst on BlueGene, Lawrence Livermore National Laboratory)

This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.

3. DyninstAPI Components at LLNL

- STAT (Stack Trace Analysis Tool): debugging at scale; uses StackwalkerAPI, ProcControlAPI, SymtabAPI
- Open|SpeedShop & CBTF: performance analysis tools; use DyninstAPI
- Research tools (Libra, AutomaDeD, CBI, PnMPI, TAU): most use StackwalkerAPI and SymtabAPI

4. BlueGene Machines at LLNL

- BGL (BlueGene/L): 208k cores
- Dawn (BlueGene/P): 144k cores
- Sequoia (BlueGene/Q): 1.6 million cores

We want DyninstAPI and its components on BlueGene:
- Static binary rewriting: working
- 1st-party tools: working
- Dynamic instrumentation and ProcControlAPI remain

5. BlueGene From a 3rd-Party Tool's Point of View

- Login node: tool front-end
- IO nodes: tool daemons (ProcControl or Dyninst); this talk's focus
- Compute nodes: applications
- The link between front-end and daemons is the traditional place for MRNet.

6. BlueGene From a Debugger's Point of View

- The tool daemon runs on the IO nodes.
- Debug operations are communicated to CIOD over a pipe (e.g., read requests and read results).
- The CIOD process forwards debug operations to the CNK (Compute Node Kernel) on the compute nodes.

7. BlueGene Debugging vs. Traditional OSs

Ratio of debug servers to debuggee processes:
- Linux: typically 1:1, exceptionally 1:8
- BGL (BG/L): 1:256
- Dawn (BG/P): 1:128
- Sequoia (BG/Q): 1:8192

Speed of debug interface operations:
- Linux: about half a microsecond (the speed of a syscall)
- BG/L or BG/P: hundreds of microseconds (the speed of the network)

8. Serial Operations Are Not Feasible

- It currently takes ~0.3 seconds to start one dynamically linked "Hello World" under a debugger.
- At that rate, we estimate 41 minutes to serially start 8192 "Hello Worlds".
- Debuggers traditionally operate this way: the debug server issues one attach request to CIOD over the pipe and waits for its attach result before issuing the next.

9. Need to Parallelize Debugger Operations

- The BlueGene debugger interface supports parallelization of operations: many attach requests can be in flight through the CIOD pipe before the first attach result returns.
- Tools must exploit this.

10. Parallelize Across the Software Stack

- We need to parallelize all layers: parallelizing ProcControlAPI is useless if StackwalkerAPI uses it serially.
- (Diagram: user code sits on top of StackwalkerAPI/DyninstAPI, which sit on top of ProcControlAPI.)

11. Two Parallelization Mechanisms

Group operations:
- Operate on many processes at once; conceptually similar to a "Proc++".
- Good for doing the same operation across all processes, e.g., attaching to processes.
- Built on top of asynchronous operations.

Asynchronous operations:
- Start the next operation while the original is still pending; conceptually similar to asynchronous I/O.
- Good for doing unique operations on each process, e.g., stack walking.
- Supported by the BlueGene debugger interface.

12. Asynchronous Interface

    {
        Process::registerEventCallback(EventType::Async, async_cb_func);
        proc->readMemoryAsync(addr, buffer, size, arg);
        Process::handleEvents();
        ...
    }

    cb_ret_t async_cb_func(Event::ptr orig) {
        EventAsyncMem::ptr ev = orig->getEventAsyncMem();
        bool result = ev->getReturnCode();
        char *mem = ev->getBuffer();
        void *arg = ev->getArg();
        ...
    }

- Register callbacks for async completion; the callback function receives the results when the operation completes.
- Async versions of readMemory, writeMemory, getRegister, and setRegister.
- Use handleEvents for synchronization.

13. Group Operations

- Create ProcessSet and ThreadSet classes, similar to the existing Process and Thread interfaces:
  - Dynamically add/remove processes from sets
  - Library/offset addressing
  - Sets of return results
- Eventually use a similar concept in the StackwalkerAPI and DyninstAPI interfaces.
- The interfaces are functional on non-BlueGene platforms, but not necessarily a performance benefit there.

14. Event Handling

- Group and asynchronous operations do not help with event handling, e.g., 8k processes hitting a breakpoint at once.
- Plan: thread ProcControlAPI.
  - Utilize the resources of multi-core IO nodes
  - Hide threading from the user
- The current model was designed for simplicity and correctness, not performance.

15. Threading for Parallelism

Current system (diagram: Generator feeds a queue read by the Handler and User threads):
- Generator: receives events from the OS.
- Handler: handles events from the generator.
- User: the 'main' thread; interacts with the user and also handles events from the generator.
- The User and Handler threads share a single lock.

16. Threading for Parallelism (continued)

- Use a pool of handler threads so events are handled in parallel (diagram: Generator feeds a queue drained by the handler pool and the User thread).
- Replace the single User/Handler lock with a per-process lock.
- Finer-grained locking requires significant work.
- May experiment with a pool of generator threads as well.

17. Memory Utilization

- ProcControlAPI/StackwalkerAPI: less of a concern; we estimate 31 MB of base heap utilization for 8,192 processes.
- DyninstAPI: keeps per-process data structures for tracking instrumentation, plus per-process parse data.
  - Use ProcessSets for instrumentation?
  - Re-engineer to reduce per-process data tracking?

18. Launching

- The LaunchMON component from LLNL can handle startup:
  - Attach tool daemons to an existing job
  - Launch tool daemons alongside a new job
- At the ProcControlAPI level there is no Process::createProcess; attach only. You can only launch processes on the login node, but you want your tool processes on the IO nodes.

19. Implementation: ProcControlAPI Internals

- ProcControlAPI's internal layers only support async operations.
- Handlers that are waiting for an async operation can suspend themselves; they will be re-invoked when the async operation completes, so handlers need to be idempotent.
- The async debug interface is modeled on Linux: define debug_async_simulate in proccontrol/src/linux.h. Much easier to debug than native BlueGene.

20. Implementation: Testsuite Changes

- test_driver is split into test_driver and testdriver_be:
  - LaunchMON is integrated into test_driver.
  - Process launch was completely rewritten.
- A test specification file controls where tests run (diagram: runTests and test_driver on the login node, testdriver_be on the IO nodes issuing RPCs, mutatees on the compute nodes).

21. Scaling Challenges on BlueGene

- Port code to use the group or async interfaces. Existing mechanisms should be functional but may not scale; the new interfaces are functional on other OSs but bring no performance benefit there.
- Be aware of memory/CPU demands at scale.
- Use LaunchMON to start tool processes.
- Try to manage subsets of processes.

22. Questions?

Matthew LeGendre
Lawrence Livermore National Laboratory