A Measurement Based Memory Performance Evaluation of Streaming Media Servers Garba Isa Yau and Abdul Waheed Department of Computer Engineering King Fahd.

Presentation transcript:

A Measurement Based Memory Performance Evaluation of Streaming Media Servers
Garba Isa Yau and Abdul Waheed
Department of Computer Engineering, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia
Presented at the 10th Annual IEEE Technical Exchange Meeting, March 23-24, 2003

Outline
• Introduction
• Motivation
• Experiments
• Results and Discussion
• Operating system impact on performance
• Conclusions and Future Research

Introduction
Basic architecture
Unlike ordinary file downloads or Web applications, streaming media has:
• stringent timing requirements
• high bandwidth requirements
• CPU-intensive processing
• high memory requirements

Motivation
CPU-memory speed gap
• CPU speed doubles about every 18 months (Moore's Law)
• Memory access time improves by only one-third in 10 years
Hierarchical memory architecture was introduced to alleviate the CPU-memory speed gap
• It relies on locality of reference in the data:
  • temporal locality
  • spatial locality
Streaming media content is continuous data
• The working set is normally large and cannot fit into the cache
• It has very poor temporal locality (data reuse is poor)
As a result, the hierarchical memory architecture becomes ineffective
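
As a rough worked example of the speed gap described above, the two growth rates can be compounded over a decade. This is only a sketch under stated assumptions: it reads "improves by only one-third" as the access time shrinking to two-thirds of its earlier value, and the function names are illustrative rather than taken from the presentation.

```python
def cpu_speedup(years, doubling_months=18.0):
    """Relative CPU speed after `years`, doubling every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


def memory_speedup(years, improvement_per_decade=1.0 / 3.0):
    """Relative memory speed after `years`.

    Assumption: access time shrinks by `improvement_per_decade`
    (a fraction of its value) every 10 years.
    """
    remaining_time = (1.0 - improvement_per_decade) ** (years / 10.0)
    return 1.0 / remaining_time


years = 10
cpu = cpu_speedup(years)
mem = memory_speedup(years)
gap = cpu / mem  # how much faster the CPU has become relative to memory

print(f"CPU speedup over {years} years:    {cpu:.1f}x")
print(f"Memory speedup over {years} years: {mem:.1f}x")
print(f"Relative gap:                      {gap:.1f}x")
```

Under these assumptions the CPU becomes roughly 100 times faster over ten years while memory becomes only 1.5 times faster, so the relative gap grows by a factor of about 68.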

Experiments
Testbed
Metrics:
• cache misses (L1 and L2)
• page fault rate
• throughput
• server CPU utilization
Factors:
• number of streams
• media encoding rate (56 kbps and 300 kbps)
• stream distribution (unique or multiple)

Experiments cont.
Servers:
• Apple Darwin Streaming Server (DSS)
• Microsoft Windows Media Server (WMS)
Clients:
• DSS: Streaming Load Simulator
• WMS: Media Load Simulator
Tools:
• Intel VTune Performance Analyzer
• Windows Performance Monitor
• netstat, vmstat, sar, etc.

Results and Discussion
L1 Cache Performance
Figures: L1 cache misses (56 kbps); L1 cache misses (300 kbps)
• L1 cache misses are mostly influenced by the number of streams
• Worst-case performance occurs when the number of streams is high, the encoding rate is 300 kbps, and clients request multiple media contents

Results and Discussion cont.
L2 Cache Performance
Figure: L2 cache misses (300 kbps)
Comparison
• For both L1 and L2 caches, Windows Media Server has better cache performance than Darwin Streaming Server

Results and Discussion cont.
Memory Performance
Figure: Page fault rate (300 kbps)
• Requests for a unique media object do not incur many page faults, since the object can easily be served from memory
• Requests for multiple objects lead to a high page fault rate, since many data blocks have to be fetched from disk
• A high page fault rate leads to client timeouts because of the long delays
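
The contrast between unique and multiple stream distributions can be illustrated with simple arithmetic on how much distinct media data the server must bring into memory per second. This is a hedged sketch, not a model from the presentation: it assumes that "unique" means every client requests the same object at roughly the same playback position, that "multiple" means every stream requests a different object, and the stream count of 200 is hypothetical.

```python
def network_rate_kbps(streams, encoding_kbps):
    """Aggregate rate the server sends to clients, in kbps."""
    return streams * encoding_kbps


def distinct_media_rate_kbps(streams, encoding_kbps, distribution):
    """Rate of distinct media data the server must have in memory, in kbps.

    Assumptions (illustrative only):
      "unique"   -> all streams share one object, so its data is read once
      "multiple" -> each stream plays a different object
    """
    if distribution == "unique":
        distinct_objects = 1
    elif distribution == "multiple":
        distinct_objects = streams
    else:
        raise ValueError("distribution must be 'unique' or 'multiple'")
    return distinct_objects * encoding_kbps


streams = 200          # hypothetical number of concurrent streams
encoding_kbps = 300    # one of the two encoding rates used in the experiments

for distribution in ("unique", "multiple"):
    sent = network_rate_kbps(streams, encoding_kbps)
    distinct = distinct_media_rate_kbps(streams, encoding_kbps, distribution)
    print(f"{distribution:>8}: sends {sent} kbps, "
          f"needs {distinct} kbps of distinct media data")
```

The network load is identical in both cases, but in the multiple case the amount of distinct data grows with the number of streams, which is consistent with the higher page fault rate reported for multiple objects.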

Results and Discussion cont.
Throughput and CPU utilization
Figures: Throughput (300 kbps); CPU utilization (300 kbps)
• Windows Media Server has higher throughput than Darwin Streaming Server
• For unique streams, CPU utilization scales with the number of streams throughout, which is not the case with multiple streams

Memory Transfer Test
ECT (extended copy transfer)
• Characterizes memory performance to observe the possible impact of the OS on memory performance
Locality of reference:
• temporal locality: varying working set size (block size)
• spatial locality: varying access pattern (strides)
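
The effect of the two parameters varied in the memory transfer test can be shown with a toy cache model. The sketch below is not the ECT benchmark itself; it is a minimal direct-mapped cache simulator with a hypothetical 32 KiB capacity and 64-byte lines that counts misses while a block of a given size is swept repeatedly with a given stride.

```python
def miss_rate(working_set, stride, passes=4,
              cache_size=32 * 1024, line_size=64):
    """Miss rate of a direct-mapped cache for repeated strided sweeps.

    working_set: block size in bytes (varied to probe temporal locality)
    stride:      distance in bytes between accesses (spatial locality)
    cache_size and line_size are hypothetical values for illustration.
    """
    num_sets = cache_size // line_size
    tags = [None] * num_sets            # one resident line per set
    accesses = 0
    misses = 0
    for _ in range(passes):
        for addr in range(0, working_set, stride):
            line = addr // line_size
            index = line % num_sets     # set that this line maps to
            tag = line // num_sets      # identifies the line within the set
            accesses += 1
            if tags[index] != tag:      # line not resident: miss and fill
                misses += 1
                tags[index] = tag
    return misses / accesses


KIB = 1024
for working_set in (16 * KIB, 64 * KIB):
    for stride in (4, 64):
        rate = miss_rate(working_set, stride)
        print(f"block {working_set // KIB:>2} KiB, stride {stride:>2} B: "
              f"miss rate {rate:.4f}")
```

In this model a 16 KiB block fits in the cache, so only the first sweep misses. A 64 KiB block evicts its own lines before they are reused, and with a 64-byte stride every access touches a new line, so every access misses. This mirrors the streaming case described in the Motivation slide: a large working set with little reuse gains nothing from the cache.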

Conclusion
• Both media servers exhibit similar cache and memory behavior
• Worst cache and memory performance occurs at the 300 kbps encoding rate with multiple stream distribution
• High cache miss and page fault rates degrade performance because a significant number of CPU cycles are wasted
• For streaming media servers, the memory subsystem is a potential performance bottleneck in addition to the I/O bottleneck
Future research
• Media object pre-fetching and stream batching are techniques we are exploring to improve the memory performance of the servers