
INSTITUTE OF COMPUTING TECHNOLOGY
Benchmarking Datacenter and Big Data Systems
Wanling Gao, Zhen Jia, Lei Wang, Yuqing Zhu, Chunjie Luo, Yingjie Shi, Yongqiang He, Shiming Gong, Xiaona Li, Shujie Zhang, Bizhu Qiu, Lixin Zhang, Jianfeng Zhan

Acknowledgements
This work is supported by the Chinese 973 project (Grant No. 2011CB302502), the Hi-Tech Research and Development (863) Program of China (Grant No. 2011AA01A203, No. 2013AA01A213), the NSFC project, the BNSF project, and Huawei funding.

Executive Summary
ICTBench: an open-source project on datacenter and big data benchmarking.
Several case studies using ICTBench.

Question One
The gap between industry and academia grows longer and longer, on two fronts: code and data sets.

Question Two
Different communities have different benchmark requirements. Architecture communities: simulation is very slow, so they need small data and code sets. System communities: large-scale deployment is valuable, and users need real-world applications. As the saying goes: "There are three kinds of lies: lies, damn lies, and benchmarks."

State-of-Practice Benchmark Suites
SPEC CPU, SPEC Web, HPCC, PARSEC, TPC-C, YCSB, GridMix

Why a New Benchmark Suite for Datacenter Computing?
No existing benchmark suite covers the diversity of datacenter workloads. The state of the art, CloudSuite, includes only six applications, chosen according to their popularity.

Why a New Benchmark Suite? (Cont'd)
Memory-level parallelism (MLP): the number of simultaneously outstanding cache misses.
[Figure: MLP of CloudSuite vs. our benchmark suite, DCBench]
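For reference, a common way to quantify MLP from hardware counters (our reading of the standard definition; the talk does not spell out the exact measurement) is the average number of outstanding misses over the cycles in which at least one miss is outstanding:

    MLP = (sum over all cycles of outstanding cache misses) / (number of cycles with at least one outstanding miss)

On recent Intel cores, for example, the L1D_PEND_MISS.PENDING and L1D_PEND_MISS.PENDING_CYCLES events give approximately these two quantities for L1 data misses.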

Why a New Benchmark Suite? (Cont'd)
Scale-out performance.
[Figure: speedup vs. number of working nodes for the CloudSuite data analysis benchmark and DCBench]

Outline
Background and Motivation; Our ICTBench; Case Studies

ICTBench Project
ICTBench consists of three benchmark suites:
- DCBench: architecture research (application, OS, and VM execution)
- BigDataBench: system research (large-scale big data applications)
- CloudRank: cloud benchmarks (distributed management); not covered in this talk
The source code is available on the project homepage.

DCBench
DCBench covers typical datacenter workloads, which differ from FLOPS-centric scientific computing, spanning applications in important domains such as search engines and electronic commerce. Each benchmark = a single application.
Purpose: architecture and (small-to-medium) system research.

BigDataBench
BigDataBench characterizes big data applications (not including data-intensive supercomputing), with synthetic data sets varying from 10 GB to PB scale. Each benchmark = a single big data application.
Purpose: large-scale system and architecture research.

CloudRank
Cloud computing relies on elastic resource management and on consolidating different workloads. Each CloudRank benchmark = a group of consolidated datacenter workloads (services, data processing, desktop).
Purpose: capacity planning, system evaluation, and research. Users can customize their own benchmarks.

Benchmarking Methodology
1. Decide and rank the main application domains according to a publicly available metric, e.g., page views and daily visitors.
2. Single out the main applications from those domains.

Top Sites on the Web
[Figure: ranking of the top sites on the web]

Main Algorithms in Search Engines
PageRank, graph mining, segmentation, feature reduction, grep, statistical counting, vector calculation, sort, recommendation, ...

Main Algorithms in Search Engines (Nutch)
[Diagram: the Nutch pipeline, covering word grep, word count, segmentation, sort, classification, decision tree, BFS, scoring and sort, merge sort, vector calculation, and PageRank]

Main Algorithms in Social Networks
Recommendation, clustering, classification, graph mining, grep, feature reduction, statistical counting, vector calculation, sort, ...

Main Algorithms in Electronic Commerce
Recommendation, association rule mining, warehouse operations, clustering, classification, statistical counting, vector calculation, ...

Overview of DCBench

Category                 Workload                            Programming model  Language  Source
Basic operation          Sort                                MapReduce          Java      Hadoop
Basic operation          Wordcount                           MapReduce          Java      Hadoop
Basic operation          Grep                                MapReduce          Java      Hadoop
Classification           Naive Bayes                         MapReduce          Java      Mahout
Classification           Support Vector Machine              MapReduce          Java      Implemented by ourselves
Clustering               K-means                             MapReduce          Java      Mahout
Clustering               K-means                             MPI                C++       IBM PML
Clustering               Fuzzy k-means                       MapReduce          Java      Mahout
Clustering               Fuzzy k-means                       MPI                C++       IBM PML
Recommendation           Item-based collaborative filtering  MapReduce          Java      Mahout
Association rule mining  Frequent pattern growth             MapReduce          Java      Mahout
Segmentation             Hidden Markov model                 MapReduce          Java      Implemented by ourselves

Overview of DCBench (Cont'd)

Category             Workload                             Programming model  Language  Source
Warehouse operation  Database operations                  MapReduce          Java      Hive-bench
Feature reduction    Principal Component Analysis         MPI                C++       IBM PML
Feature reduction    Kernel Principal Component Analysis  MPI                C++       IBM PML
Vector calculation   Paper similarity analysis            All-Pairs          C/C++     Implemented by ourselves
Graph mining         Breadth-first search                 MPI                C++       Graph500
Graph mining         PageRank                             MapReduce          Java      Mahout
Service              Search engine                        C/S                Java      Nutch
Service              Auction                              C/S                Java      RUBiS
Service              Media streaming                      C/S                Java      CloudSuite
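The MapReduce entries above all follow the standard Hadoop programming model. As a minimal sketch of that model (the canonical Hadoop WordCount example, not DCBench's own code, which ships with the suite), a job pairs a mapper that emits (word, 1) with a reducer that sums the counts:

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {
      // Mapper: emit (word, 1) for every token in the input split.
      public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }
      // Reducer: sum the partial counts for each word.
      public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) sum += val.get();
          result.set(sum);
          context.write(key, result);
        }
      }
      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // combiner cuts shuffle traffic
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }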

Methodology of Generating Big Data
Goal: preserve the characteristics of real-world data. Start from small-scale real data, analyze its characteristics, then expand it into big data while preserving semantics and locality, both temporally and spatially: e.g., word frequency, word reuse distance, and word distribution in documents.
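A minimal sketch of this analyze-then-expand idea, assuming only the simplest characteristic (word frequency) is preserved; the class name, seed text, and scale factor are hypothetical, and the actual generator also preserves reuse distance and per-document distributions:

    import java.util.*;

    public class TextExpander {  // hypothetical illustration, not the BigDataBench generator
      public static void main(String[] args) {
        // Seed corpus standing in for the small-scale real data.
        String seed = "big data benchmark big data system data";
        // 1. Characteristic analysis: empirical word-frequency distribution.
        Map<String, Integer> freq = new HashMap<>();
        for (String w : seed.split("\\s+")) freq.merge(w, 1, Integer::sum);
        List<String> words = new ArrayList<>(freq.keySet());
        int total = seed.split("\\s+").length;
        // 2. Expand: sample words so the synthetic text follows the same distribution.
        Random rnd = new Random(42);
        StringBuilder big = new StringBuilder();
        for (int i = 0; i < 1000; i++) {        // scale factor: 1000 words
          int r = rnd.nextInt(total), acc = 0;
          for (String w : words) {
            acc += freq.get(w);
            if (r < acc) { big.append(w).append(' '); break; }
          }
        }
        System.out.println(big);
      }
    }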

Workloads in BigDataBench 1.0 Beta
Analysis workloads: simple but representative operations (Sort, Grep, Wordcount) and highly recognized algorithms (Naive Bayes, SVM).
Service workloads: widely deployed services (the Nutch search engine server).

A Variety of Workloads Is Included

Workload      Mode      Category          Resource characteristic
Sort          Off-line  Basic operation   I/O bound
Wordcount     Off-line  Basic operation   CPU bound
Grep          Off-line  Basic operation   Hybrid
Naive Bayes   Off-line  Machine learning  -
SVM           Off-line  Machine learning  -
Nutch Server  On-line   Service           -

Features of Workloads

Workload      Resource characteristic  Computing complexity                              Instructions
Sort          I/O bound                O(n*lg n)                                         Integer comparison dominated
Wordcount     CPU bound                O(n)                                              Integer comparison and calculation dominated
Grep          Hybrid                   O(n)                                              Integer comparison dominated
Naive Bayes   -                        O(m*n), m: length of the dictionary               Floating-point computation dominated
SVM           -                        O(M*n), M: number of support vectors * dimension  Floating-point computation dominated
Nutch Server  I/O & CPU bound          -                                                 Integer comparison dominated

Outline
Background and Motivation; Our ICTBench; Case Studies

Use Case 1: Microarchitecture Characterization Using DCBench
Five-node cluster: one master and four slaves (working nodes).
[Per-node hardware configuration table not captured in the transcript]

Instruction Execution Level
In DCBench, data analysis workloads execute more application-level instructions, while service workloads have higher percentages of kernel-level instructions.
[Figure: user/kernel instruction breakdown for service vs. data analysis workloads]

Pipeline Stalls
DC workloads suffer severe front-end stalls (i.e., instruction fetch stalls). Services show more RAT (Register Alias Table) stalls; data analysis workloads show more RS (Reservation Station) and ROB (ReOrder Buffer) full stalls.
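As a rough guide to reading these numbers, the usual counter-based accounting (our summary, not a formula from the talk) splits execution as:

    total cycles ~= productive cycles + front-end stall cycles (instruction fetch) + back-end stall cycles (RAT, RS full, ROB full, ...)

so high RS/ROB-full stalls suggest long-latency data accesses filling the out-of-order window, while high front-end and RAT stalls point at instruction supply and renaming.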

Architecture Block Diagram
[Figure: block diagram of the processor microarchitecture, showing where the front-end, RAT, RS, and ROB stalls arise]

Front-End Stall Reasons
For DC workloads, high instruction cache and instruction TLB miss rates make the front end inefficient.

MLC Behaviors
DC workloads have more mid-level cache (L2) misses than HPC; data analysis workloads have better locality (fewer L2 cache misses).
[Figure: L2 cache misses for data analysis, service, and HPCC workloads]

LLC Behaviors
The LLC is good enough for DC workloads: most L2 cache misses can be satisfied by the LLC.

DTLB Behaviors
DC workloads incur more DTLB misses than HPC, though most data analysis workloads have fewer DTLB misses than services.
[Figure: DTLB misses for data analysis, service, and HPCC workloads]

Branch Prediction
Data analysis workloads have quite predictable branches; service workloads' branches are hard to predict.
[Figure: branch misprediction rates for data analysis, service, and HPCC workloads]

DC Workload Characteristics
Data analysis applications share many inherent characteristics that place them in a different class from desktop, HPC, traditional server, and scale-out service workloads. More details can be found in our IISWC 2013 paper: Characterizing Data Analysis Workloads in Data Centers. Zhen Jia, et al. 2013 IEEE International Symposium on Workload Characterization (IISWC 2013).

Use Case 2: Architecture Research Using BigDataBench 1.0 Beta
Data scale: 10 GB – 2 TB. Hadoop configuration: 1 master, 14 slave nodes.

Use Case 2: Architecture Research (Cont'd)
Some microarchitectural events tend toward stability once the data volume grows beyond a certain point. Cache and TLB behaviors show different trends with increasing data volume for different workloads: e.g., L1I misses per 1000 instructions increase for Sort but decrease for Grep.

Search Engine Service Experiments
Index size: 2 GB – 8 GB; segment size: 4.4 GB – 17.6 GB. The same phenomenon is observed: microarchitectural events tend toward stability once the index size grows beyond a certain point. Big data poses challenges for architecture research, since large-scale simulation is time-consuming.

Use Case 3: System Evaluation Using BigDataBench 1.0 Beta
Data scale: 10 GB – 2 TB. Hadoop configuration: 1 master, 14 slave nodes.

System Evaluation
Each workload has a threshold (100 MB – 1 TB): the system is fully loaded once the data volume exceeds it. Sort is an exception: it has an inflexion point (10 GB – 1 TB) after which the data processing rate decreases, because its global data access requirements make I/O and the network the bottleneck. System performance depends on both the application and the data volume.

Conclusion
ICTBench (DCBench, BigDataBench, CloudRank): an open-source project on datacenter and big data benchmarking.

Publications
- BigDataBench: a Big Data Benchmark Suite from Web Search Engines. Wanling Gao, et al. The Third Workshop on Architectures and Systems for Big Data (ASBD 2013), in conjunction with ISCA 2013.
- Characterizing Data Analysis Workloads in Data Centers. Zhen Jia, et al. 2013 IEEE International Symposium on Workload Characterization (IISWC 2013).
- Characterizing OS Behavior of Scale-out Data Center Workloads. Chen Zheng, et al. Seventh Annual Workshop on the Interaction amongst Virtualization, Operating Systems and Computer Architecture (WIVOSCA 2013), in conjunction with ISCA 2013.
- Characterization of Real Workloads of Web Search Engines. Huafeng Xi, et al. 2011 IEEE International Symposium on Workload Characterization (IISWC 2011).
- The Implications of Diverse Applications and Scalable Data Sets in Benchmarking Big Data Systems. Zhen Jia, et al. Second Workshop on Big Data Benchmarking (WBDB 2012 India) & Lecture Notes in Computer Science (LNCS).
- CloudRank-D: Benchmarking and Ranking Cloud Computing Systems for Data Processing Applications. Chunjie Luo, et al. Frontiers of Computer Science 2012, 6(4): 347–362.

Thank you! Any questions?