Introduction to LMbench


Introduction to LMbench
Speaker: 陳雋中

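[Riddle] Of the chicken, the duck, and the goose, which one has served in the army? Answer: the duck. (A pun: "役畢", "military service completed", is pronounced "yì bì", which echoes the children's song refrain "yi bi ya ya".)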

What is LMbench?
A suite of simple, portable benchmarks.
It measures two key properties, latency and bandwidth, of data transfer between processor, cache, memory, network, and disk.
It compares the performance of different systems.
Results are available for most major vendors (Sun, HP, IBM, DEC, SGI, PCs).
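To see why latency and bandwidth are reported as two separate numbers, a first-order cost model helps: the time to move n bytes is roughly a fixed latency plus n divided by the bandwidth. The sketch below is an illustration only. The function name `transfer_time` and the link figures are hypothetical and are not LMbench output.

```python
def transfer_time(nbytes, latency_s, bandwidth_bytes_per_s):
    """First-order model: fixed startup cost plus size divided by bandwidth."""
    return latency_s + nbytes / bandwidth_bytes_per_s


# Hypothetical link, chosen only for illustration:
# 100 microseconds of latency and 100 MB/s of bandwidth.
LATENCY = 100e-6
BANDWIDTH = 100e6

for size in (64, 64 * 1024, 64 * 1024 * 1024):
    t = transfer_time(size, LATENCY, BANDWIDTH)
    latency_share = LATENCY / t  # fraction of the total time spent on latency
    print(f"{size:>10} bytes: {t * 1e3:10.3f} ms, latency share {latency_share:6.1%}")
```

Under this model small transfers are dominated by latency and large transfers by bandwidth, which is why a benchmark suite needs to measure both.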

What is LMbench? (cont.)
Bandwidth benchmarks: cached file read, memory copy (bcopy), memory read, memory write, pipe, TCP.
Latency benchmarks: context switching; networking (connection establishment, pipe, TCP, UDP, and RPC hot potato); file system creates and deletes; process creation; signal handling; system call overhead; memory read latency.
Miscellaneous: processor clock rate calculation.
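As a rough illustration of what a memory copy bandwidth benchmark does, the sketch below times a bulk copy between two buffers and reports bytes per second. It is not LMbench code (LMbench is written in C). The function name `copy_bandwidth`, the buffer sizes, and the best-of-several-runs policy are assumptions made for this example.

```python
import time


def copy_bandwidth(nbytes, repeats=5):
    """Return the best observed copy rate in bytes per second."""
    src = bytearray(nbytes)
    dst = bytearray(nbytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # bulk copy of the whole buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Guard against a zero reading on timers with coarse resolution.
    return nbytes / max(best, 1e-9)


# Buffer sizes are arbitrary choices for illustration.
for size in (64 * 1024, 1024 * 1024, 16 * 1024 * 1024):
    rate = copy_bandwidth(size)
    print(f"{size:>10} bytes: {rate / 1e6:10.1f} MB/s")
```

Taking the fastest of several runs is one common way to reduce noise from other activity on the machine. Buffers that fit in cache will typically report a higher rate than buffers that do not.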

Memory latency results
The memory latency test shows the latency of every level of the system's data caches (level 1, 2, and 3, if present), as well as main memory latency and TLB miss latency. In addition, the sizes of the caches can be read off a properly plotted set of results. The hardware folks like this. This benchmark has found bugs in operating system page coloring schemes.

Context switching results
Everybody seems to love context switching numbers. This particular benchmark is careful not to quote only the "in cache" numbers. It varies both the number and the size of the processes and plots the results so that it is easy to see when the working set no longer fits in the cache. You can also see the real cost of a cold-cache context switch.
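The idea behind a memory read latency test can be sketched as a chain of dependent loads: each load's address comes from the previous load, so the loads cannot overlap, and the time per step approximates the latency of whichever level of the hierarchy holds the working set. The sketch below is an assumption-laden illustration, not LMbench's implementation. The random single-cycle layout (Sattolo's algorithm) and all function names are choices made for this example, and in Python the interpreter overhead largely hides the cache plateaus that a C version would show.

```python
import random
import time


def make_cycle(n, seed=0):
    """Build a random single-cycle permutation with Sattolo's algorithm."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i, so the result is one cycle of length n
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps):
    """Follow the chain; each lookup depends on the previous one."""
    i = 0
    for _ in range(steps):
        i = nxt[i]
    return i


def ns_per_load(n, steps=200_000):
    """Average time per dependent lookup, in nanoseconds."""
    nxt = make_cycle(n)
    start = time.perf_counter()
    chase(nxt, steps)
    elapsed = time.perf_counter() - start
    return elapsed / steps * 1e9


if __name__ == "__main__":
    # Working-set sizes are arbitrary choices for illustration.
    for n in (1 << 10, 1 << 14, 1 << 18):
        print(f"{n:>8} elements: {ns_per_load(n):8.1f} ns per dependent lookup")
```

Plotting time per lookup against working-set size is what lets cache sizes be read off the curve: the time steps up each time the working set outgrows a cache level.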

Start running
Vary the parameters to suit different workloads.
DEMO

Results(1/3) Basic system parameters

Results(2/3)

Results(3/3)

Figure(1/2)

Figure(2/2)

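Q: What crime do you commit by taking a rock home? A: Obstructing weathering... (A pun: "妨礙風化" is the legal term for an offense against public decency, and "風化" also means the weathering of rock.)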