Extensible Networking Platform: Liquid Architecture
Slide 1: Cycle Accurate Performance Measurement
Richard Hough, Phillip Jones, Scott Friedman, Roger Chamberlain, Jason Fritts, John Lockwood, and Ron Cytron
Funded by NSF Grant ITR

Slide 2: Outline
Introduction
Motivation
Background
Architecture
Usage
Results
Future Work
Related Work
Conclusion

Slide 6: Introduction – What Are We Doing?
Creating a module for capturing cycle-accurate profiles of hardware events during the runtime of programs on real systems
[Diagram labels: Statistics Module; Program Runtime; Program Bottlenecks; Cache Hits; Memory Accesses; ISA Decoding]

Slide 8: Background – FPX
Designed and implemented on the FPX platform
The FPX platform:
– Is designed for developing pluggable network circuits
– Contains a Virtex 2000E FPGA for design deployment
– Possesses a smaller FPGA used as a network interface device
Can potentially operate at gigabit line rates

Slide 9: Background – LEON2
Developed by Gaisler Research
– SPARC V8
– Open-source VHDL
– Widely used (European Space Agency, etc.)
– Second in popularity only to the MicroBlaze

Slide 10: Motivation – Why Not Use Software?
Software profiling is:
– Inaccurate
    Many data points estimated
    Time slices not absolute
    Profiling affects results
– Inefficient
    Unreasonable for real-system deployment
– Ineffective
    Difficult to separate OS overhead

Slide 11: Motivation – Why Not Use Simulation?
Simulation is:
– Slow
    A simple simulation could require 100X more time than running the program
– Bound by the quality of the model
    The model used may be inaccurate
    Processors often tweaked without updating the documentation [Larus]

Slide 12: Motivation – Why Use FPGAs?
ASICs are expensive
– FPGAs provide a good blend of cost and accuracy
Software simulation of processors is incredibly slow
FPGAs allow for easy prototyping
– Test new caching methods, tweak the ISA, etc.

Slide 13: Motivation – Why Put StatsMod in an FPGA?
The Statistics Module allows you to:
– Pull event signals from anywhere
– Evaluate both software and hardware optimizations
    Tweak the architecture
    Integrate hardware-accelerated modules into software solutions
    Adjust the software algorithm
– Gather repeatable and reliable results

Slide 14: Architecture – Naïve Solution
Interested in 10 events and counters
– Naïve solution implements a counter for each possibility
    100 counters!
– Not scalable for large systems
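The 100-counter figure is consistent with pairing each of 10 events with each of 10 profiled code regions. The slide does not spell out the second factor, so treating it as a set of address ranges (methods) is an assumption. Under that assumption, a minimal Python sketch of how the naïve counter budget grows:

```python
def naive_counter_count(num_events: int, num_ranges: int) -> int:
    """Counters needed if every (event, address range) pair gets its own counter."""
    return num_events * num_ranges


# Assumed reading of the slide: 10 events x 10 address ranges (methods).
print(naive_counter_count(10, 10))

# The budget grows with the product, which is why the approach does not scale.
for n in (10, 20, 50):
    print(n, naive_counter_count(n, n))
```

The function name and the choice of sizes are illustrative only; they are not taken from the presentation.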

Slide 15: Architecture – Our Solution
Better approach
– Associate counters to events and methods at run time
– Covers the problem area, but uses less chip space
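The run-time association can be pictured with a small software model. This is only a sketch under stated assumptions: the class and method names, the event names, and the addresses are hypothetical, and the real module is hardware on the FPGA, not Python. The model keeps a small fixed pool of counters, and each counter is bound to one event and one address range chosen at run time.

```python
from dataclasses import dataclass


@dataclass
class CounterSlot:
    event: str      # name of the event signal this counter watches
    lo: int         # first address of the range (inclusive)
    hi: int         # last address of the range (inclusive)
    count: int = 0


class StatsModuleModel:
    """Hypothetical software model: a fixed pool of counters bound at run time."""

    def __init__(self, num_slots: int):
        self.num_slots = num_slots
        self.slots = []

    def bind(self, event: str, lo: int, hi: int) -> int:
        """Associate a free counter with (event, address range); return its index."""
        if len(self.slots) >= self.num_slots:
            raise RuntimeError("no free counter slots")
        self.slots.append(CounterSlot(event, lo, hi))
        return len(self.slots) - 1

    def observe(self, event: str, pc: int) -> None:
        """Record one occurrence of `event` while executing at address `pc`."""
        for slot in self.slots:
            if slot.event == event and slot.lo <= pc <= slot.hi:
                slot.count += 1

    def read(self, index: int) -> int:
        return self.slots[index].count


stats = StatsModuleModel(num_slots=4)
hits_in_f = stats.bind("cache_hit", 0x4000_0000, 0x4000_00FF)
mem_in_g = stats.bind("mem_access", 0x4000_0100, 0x4000_01FF)

# Made-up event trace: (event, program counter at the time of the event).
trace = [
    ("cache_hit", 0x4000_0010),
    ("cache_hit", 0x4000_0120),   # outside the first range, not counted
    ("mem_access", 0x4000_0104),
    ("cache_hit", 0x4000_00FF),
]
for event, pc in trace:
    stats.observe(event, pc)

print(stats.read(hits_in_f), stats.read(mem_in_g))
```

Only the pairs a user actually cares about consume a counter, which is the chip-space saving the slide refers to.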

Slide 16: Architecture – An In-Depth Look

Slide 17: Architecture – Scalability
[Chart labels: Address Range Registers; Counters; Events; Naïve Approach]

Slide 18: Usage

Slide 19: Results – What Do We Get?
The next few slides contain data from the Linpack benchmark running on the FPGA
– Linpack is an FPU-intensive benchmark
While the following slides focus on runtime, it is important to remember that the graphs could in principle be of *any* event

Slide 20: Results
323,686,726 clock cycles

Slide 21: Results

Slide 22: Results

Slide 23: Results

Slide 24: Future Work – Where Can We Go?
As of a week ago, the StatsMod was successfully integrated into a Linux OS running on LEON
– Changes have been made to allow a clear separation between process IDs
    OS, background tasks, threads
– A device driver allows any program, including the program being profiled, to gather the statistics

Slide 25: Future Work – Where Can We Go?
Programs could now potentially collect statistics on themselves and perform runtime introspection
– Adjust operation to conserve power, memory accesses, etc.
– Deeper integration could occur at the kernel level to affect scheduler decisions
Adds a new dimension for slicing resources
– Network activity, device activity, page faults, etc.
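A rough sketch of what such self-profiling could look like: a program tries several settings, reads a hardware event counter before and after each trial, and keeps the setting that produced the fewest events. Everything here is hypothetical. The presentation does not describe the driver interface, so the counter is simulated with a plain Python object, and the kernel, block sizes, and cost function are made up for illustration.

```python
from typing import Callable, Sequence


def choose_setting(read_counter: Callable[[], int],
                   run_trial: Callable[[int], None],
                   candidates: Sequence[int]) -> int:
    """Return the candidate whose trial run caused the smallest counter increase."""
    best_setting = candidates[0]
    best_delta = None
    for setting in candidates:
        before = read_counter()
        run_trial(setting)
        delta = read_counter() - before
        if best_delta is None or delta < best_delta:
            best_setting, best_delta = setting, delta
    return best_setting


class FakeCounter:
    """Stand-in for a memory-access counter read through a driver (assumption)."""

    def __init__(self):
        self.value = 0

    def read(self) -> int:
        return self.value


counter = FakeCounter()


def fake_kernel(block_size: int) -> None:
    # Made-up cost model: a block size of 64 causes the fewest memory accesses.
    counter.value += abs(block_size - 64) + 5


best = choose_setting(counter.read, fake_kernel, [16, 32, 64, 128])
print(best)
```

On real hardware, `read_counter` would be replaced by whatever call the device driver exposes; that interface is not given in the slides.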

Slide 26: Related Work
SnoopP
– Developed by Lesley Shannon and Paul Chow at the University of Toronto
– Collects timing characteristics of programs running on a MicroBlaze processor
    Focuses on clock cycles only
– Integrated into the EDK

Slide 27: Conclusion
In closing, I would like to thank:
– Phillip Jones for his hard work and support
– Ron Cytron for his mentoring and persistence
– Scott Friedman for his work on the web interface
– The rest of the Liquid Architecture team
– And WISA for the invitation to present

Slide 28: Questions?

Slide 29: Background – Liquid

Slide 30: Usage
1. Connect to a secure web server controlling the FPGA hardware
2. Upload the desired binary executable, associated map file, and desired programming bitfile
3. A Perl script parses the map file and provides a graphical interface for selecting the desired address ranges and events
4. Statistics are tabulated at the end of the program's execution
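Step 3 can be illustrated with a short sketch of turning a map file into selectable address ranges. The presentation uses a Perl script; the version below is Python and is not the authors' code. The map file format is an assumption (one `address type name` entry per line, similar to `nm` output), and the symbol names and addresses are made up.

```python
def parse_map(text: str) -> dict:
    """Map each code symbol to an (start, end) address range, both inclusive.

    A symbol's range is assumed to end just before the next code symbol starts,
    so the last symbol gets no range because its end is unknown.
    """
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        # Keep only text (code) symbols; skip data symbols and malformed lines.
        if len(parts) != 3 or parts[1] not in ("T", "t"):
            continue
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()

    ranges = {}
    for (start, name), (next_start, _) in zip(symbols, symbols[1:]):
        ranges[name] = (start, next_start - 1)
    return ranges


# Hypothetical map file contents.
MAP_TEXT = """\
40000000 T _start
40000040 T main
400001a0 T daxpy
40000300 T _etext
40001000 D data_buf
"""

ranges = parse_map(MAP_TEXT)
for name, (lo, hi) in ranges.items():
    print(f"{name}: {lo:#010x}-{hi:#010x}")
```

Each resulting range is the kind of value a user would pick in the graphical interface and pair with an event before the run starts.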