INSTITUTE OF COMPUTING TECHNOLOGY
Benchmarking Datacenter and Big Data Systems
Wanling Gao, Zhen Jia, Lei Wang, Yuqing Zhu, Chunjie Luo, Yingjie Shi, Yongqiang He, Shiming Gong, Xiaona Li, Shujie Zhang, Bizhu Qiu, Lixin Zhang, Jianfeng Zhan
Big Data Benchmarking Workshop
Acknowledgements: This work is supported by the Chinese 973 project (Grant No. 2011CB302502), the Hi-Tech Research and Development (863) Program of China (Grant No. 2011AA01A203, No. 2013AA01A213), the NSFC project (Grant No , No ), the BNSF project (Grant No ), and Huawei funding.
Executive summary: ICTBench is an open-source project on datacenter and big data benchmarking. Several case studies using ICTBench are presented.
Question One: the gap between industry and academia. The distance is growing longer and longer, in both code and data sets.
Question Two: different benchmark requirements. Architecture communities: simulation is very slow, so small data and code sets are needed. System communities: large-scale deployment is valuable, and users need real-world applications. "There are three kinds of lies: lies, damn lies, and benchmarks."
State-of-practice benchmark suites: SPEC CPU, SPEC Web, HPCC, PARSEC, TPC-C, YCSB, Gridmix.
Why a new benchmark suite for datacenter computing: no benchmark suite covers the diversity of data center workloads. The state of the art, CloudSuite, includes only six applications, selected according to their popularity.
Why a new benchmark suite (cont'd): memory-level parallelism (MLP), i.e. simultaneously outstanding cache misses.
[Figure: MLP of CloudSuite compared with our benchmark suite, DCBench]
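The MLP definition above can be made concrete with a small sketch. It assumes one common way of quantifying MLP: the average number of outstanding misses over the cycles in which at least one miss is outstanding. The input format (a list of start and end cycles per miss) and the function name are hypothetical, chosen only for illustration; the slides do not say how MLP was measured.

```python
def average_mlp(miss_intervals):
    """Average number of outstanding misses over cycles with at least one miss.

    miss_intervals: iterable of (start_cycle, end_cycle) pairs, half-open,
    one pair per cache miss (hypothetical trace format).
    """
    events = []
    for start, end in miss_intervals:
        if end <= start:
            raise ValueError("each miss must end after it starts")
        events.append((start, 1))   # a miss becomes outstanding
        events.append((end, -1))    # a miss is satisfied
    events.sort()

    outstanding = 0      # misses currently in flight
    busy_cycles = 0      # cycles with at least one miss in flight
    miss_cycles = 0      # sum over busy cycles of misses in flight
    previous_cycle = 0
    for cycle, delta in events:
        if outstanding > 0:
            span = cycle - previous_cycle
            busy_cycles += span
            miss_cycles += span * outstanding
        outstanding += delta
        previous_cycle = cycle
    return miss_cycles / busy_cycles if busy_cycles else 0.0
```

Two fully overlapping misses give an MLP of 2, while two back-to-back misses give an MLP of 1, which is why a higher value indicates that the workload hides more memory latency.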
Why a new benchmark suite (cont'd): scale-out performance.
[Figure: speedup versus number of working nodes for the CloudSuite data analysis benchmark and DCBench]
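As a reminder of how a speedup curve of this kind is usually derived, here is a minimal sketch. The function names and the choice of baseline are assumptions made for illustration; the slides do not state which baseline was used.

```python
def speedup(baseline_seconds, scaled_seconds):
    """Speedup of a scaled-out run relative to the baseline run."""
    if baseline_seconds <= 0 or scaled_seconds <= 0:
        raise ValueError("run times must be positive")
    return baseline_seconds / scaled_seconds


def scaling_efficiency(baseline_seconds, scaled_seconds,
                       baseline_nodes, scaled_nodes):
    """Fraction of the ideal (linear) speedup that was actually achieved."""
    if baseline_nodes <= 0 or scaled_nodes <= 0:
        raise ValueError("node counts must be positive")
    ideal = scaled_nodes / baseline_nodes
    return speedup(baseline_seconds, scaled_seconds) / ideal
```

A workload that runs twice as fast on four nodes as on one has a speedup of 2 and a scaling efficiency of 0.5.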
Outline: background and motivation; our ICTBench; case studies.
ICTBench project: ICTBench consists of three benchmark suites. DCBench targets architecture research (application, OS, and VM execution). BigDataBench targets system research (large-scale big data applications). CloudRank provides cloud benchmarks (distributed management) and is not covered in this talk. The source code is available from the project homepage.
DCBench: typical data center workloads, which differ from scientific computing and its focus on FLOPS. The suite covers applications in important domains such as search engines and electronic commerce. Each benchmark is a single application. Purpose: architecture and small-to-medium-scale system research.
BigDataBench: characterizes big data applications, not including data-intensive supercomputing. Synthetic data sets vary from 10 GB up to PB scale. Each benchmark is a single big application. Purpose: large-scale system and architecture research.
CloudRank: cloud computing involves elastic resource management and the consolidation of different workloads. In the cloud benchmarks, each benchmark is a group of consolidated data center workloads (services, data processing, desktop). Purposes: capacity planning, system evaluation, and research. Users can customize their benchmarks.
Benchmarking methodology: first, decide and rank the main application domains according to a publicly available metric, e.g. page views and daily visitors. Second, single out the main applications from the main application domains.
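The first step of this methodology can be sketched as a simple aggregation. The record layout (site, domain, page views) and the function name are hypothetical, and any numbers fed to it would have to come from a real public ranking; none are taken from the talk.

```python
from collections import defaultdict


def rank_domains(sites):
    """Rank application domains by their total page views, highest first.

    sites: iterable of (site_name, domain, page_views) tuples
    (hypothetical record layout).
    """
    totals = defaultdict(int)
    for _site_name, domain, page_views in sites:
        totals[domain] += page_views
    # Sort by the aggregated metric in descending order.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

The same function works for any publicly available metric, such as daily visitors, as long as one number per site is supplied.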
[Figure: Top Sites on the Web]
Main algorithms in search engines: PageRank, graph mining, segmentation, feature reduction, grep, statistical counting, vector calculation, sort, recommendation, and others.
[Figure: Top Sites on the Web]
Main algorithms in search engines (Nutch).
[Figure labels: Word Grep, Word Count, Segmentation, Sort, Classification, Decision Tree, BFS, Segmentation, Scoring & Sort, Merge Sort, Vector Calculation, PageRank]
Main algorithms in social networks: recommendation, clustering, classification, graph mining, grep, feature reduction, statistical counting, vector calculation, sort, and others.
[Figure: Top Sites on the Web]
Main algorithms in electronic commerce: recommendation, association rule mining, warehouse operation, clustering, classification, statistical counting, vector calculation, and others.
[Figure: Top Sites on the Web]
Overview of DCBench
Category | Workload | Programming model | Language | Source
Basic operation | Sort | MapReduce | Java | Hadoop
Basic operation | Wordcount | MapReduce | Java | Hadoop
Basic operation | Grep | MapReduce | Java | Hadoop
Classification | Naïve Bayes | MapReduce | Java | Mahout
Classification | Support Vector Machine | MapReduce | Java | Implemented by ourselves
Clustering | K-means | MapReduce | Java | Mahout
Clustering | K-means | MPI | C++ | IBM PML
Clustering | Fuzzy k-means | MapReduce | Java | Mahout
Clustering | Fuzzy k-means | MPI | C++ | IBM PML
Recommendation | Item-based collaborative filtering | MapReduce | Java | Mahout
Association rule mining | Frequent pattern growth | MapReduce | Java | Mahout
Segmentation | Hidden Markov model | MapReduce | Java | Implemented by ourselves
Overview of DCBench (cont'd)
Category | Workload | Programming model | Language | Source
Warehouse operation | Database operations | MapReduce | Java | Hive-bench
Feature reduction | Principal Component Analysis | MPI | C++ | IBM PML
Feature reduction | Kernel Principal Component Analysis | MPI | C++ | IBM PML
Vector calculation | Paper similarity analysis | All-Pairs | C & C++ | Implemented by ourselves
Graph mining | Breadth-first search | MPI | C++ | Graph500
Graph mining | PageRank | MapReduce | Java | Mahout
Service | Search engine | C/S | Java | Nutch
Service | Auction | C/S | Java | RUBiS
Service | Media streaming | C/S | Java | CloudSuite
Methodology of generating big data: to preserve the characteristics of real-world data, small-scale data is first put through characteristic analysis and then expanded into big data. The characteristics analyzed are semantic (e.g. word frequency) and locality, both temporal (word reuse distance) and spatial (word distribution in documents).
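The analyze-then-expand idea can be illustrated with a deliberately simplified sketch. It preserves only one characteristic, word frequency, by sampling words in proportion to their counts in the seed text; reuse distance and per-document distribution are ignored. This is an illustration under those assumptions, not the generator used in BigDataBench, and the function names are hypothetical.

```python
import random
from collections import Counter


def word_frequencies(seed_text):
    """Characteristic analysis step: count how often each word occurs."""
    return Counter(seed_text.split())


def expand(seed_text, target_words, seed=0):
    """Expansion step: draw target_words words with the seed's frequencies."""
    counts = word_frequencies(seed_text)
    if not counts:
        raise ValueError("seed text contains no words")
    words = list(counts)
    weights = [counts[word] for word in words]
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    return rng.choices(words, weights=weights, k=target_words)
```

Because sampling is independent per word, the expanded data matches the seed's word frequencies only in expectation; a generator that also preserves locality would need a richer model of the seed data.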
Workloads in BigDataBench 1.0 Beta. Analysis workloads: simple but representative operations (Sort, Grep, Wordcount) and highly recognized algorithms (Naïve Bayes, SVM). Search engine service workloads: widely deployed services (Nutch Server).
A variety of workloads is included. Off-line workloads: base operations (Sort, I/O bound; Wordcount, CPU bound; Grep, hybrid) and machine learning (Naïve Bayes, SVM). On-line workloads: Nutch Server.
Features of workloads
Workload | Resource characteristic | Computing complexity | Instructions
Sort | I/O bound | O(n*lgn) | Integer comparison domination
Wordcount | CPU bound | O(n) | Integer comparison and calculation domination
Grep | Hybrid | O(n) | Integer comparison domination
Naïve Bayes | / | O(m*n) [m: the length of dictionary] | Floating-point computation domination
SVM | / | O(M*n) [M: the number of support vectors * dimension] | Floating-point computation domination
Nutch Server | I/O & CPU bound | | Integer comparison domination
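To make the complexity column concrete, the three base operations can be sketched in plain Python. This mimics the map and reduce phases in a single process and is not the Hadoop code used by the benchmark; the function names are hypothetical, and grep is simplified here to filtering the lines that match a pattern.

```python
import re
from collections import defaultdict


def map_wordcount(line):
    """Map phase: emit (word, 1) for every word in the line."""
    for word in line.split():
        yield word, 1


def reduce_counts(pairs):
    """Reduce phase: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def wordcount(lines):
    """O(n) in the number of words: one map emit and one add per word."""
    return reduce_counts(pair for line in lines for pair in map_wordcount(line))


def grep(lines, pattern):
    """One regular-expression search per input line."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


def sort_records(lines):
    """O(n*lgn) comparisons, dominated by comparing records."""
    return sorted(lines)
```

Wordcount and Grep touch each record once, while Sort must compare records against each other, which matches the O(n) versus O(n*lgn) entries in the table.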
Outline: background and motivation; our ICTBench; case studies.
Use Case 1: microarchitecture characterization using DCBench. The testbed is a five-node cluster with one master and four slaves (working nodes).
[Table: per-node configuration]
Instruction execution level: in DCBench, data analysis workloads have more application-level instructions, while service workloads have higher percentages of kernel-level instructions.
[Figure: instruction execution level for service and data analysis workloads]
Pipeline stalls: DC workloads have severe front-end stalls (i.e. instruction fetch stalls). Services have more RAT (Register Allocation Table) stalls. Data analysis workloads have more RS (Reservation Station) full and ROB (ReOrder Buffer) full stalls.
[Figure: architecture block diagram]
Front-end stall reasons: for DC workloads, high instruction cache miss and instruction TLB miss rates make the front end inefficient.
MLC behaviors: DC workloads have more MLC (mid-level cache) misses than HPC workloads. Data analysis workloads have better locality (fewer L2 cache misses).
[Figure: MLC misses for data analysis, service, and HPCC workloads]
LLC behaviors: the LLC (last-level cache) is good enough for DC workloads. Most L2 cache misses can be satisfied by the LLC.
DTLB behaviors: DC workloads have more DTLB misses than HPC workloads. Most data analysis workloads have fewer DTLB misses.
[Figure: DTLB misses for data analysis, service, and HPCC workloads]
Branch prediction: among DC workloads, data analysis workloads have pretty good branch behavior, while the branches of service workloads are hard to predict.
[Figure: branch prediction for data analysis, service, and HPCC workloads]
DC workload characteristics: data analysis applications share many inherent characteristics, which place them in a different class from desktop, HPC, traditional server, and scale-out service workloads. More details can be found in our IISWC 2013 paper: Characterizing Data Analysis Workloads in Data Centers. Zhen Jia, et al. IEEE International Symposium on Workload Characterization (IISWC 2013).
Use Case 2: architecture research using BigDataBench 1.0 Beta. Data scale: 10 GB to 2 TB. Hadoop configuration: 1 master and 14 slave nodes.
Use Case 2: architecture research. Some micro-architectural events tend towards stability once the data volume increases to a certain extent. Cache and TLB behaviors show different trends with increasing data volumes for different workloads. For example, L1I_miss/1000ins (L1 instruction cache misses per 1000 instructions) increases for Sort and decreases for Grep.
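The per-1000-instruction normalization and the notion of a metric "tending towards stability" can be sketched as follows. The 5% tolerance and the function names are assumptions made for illustration; the slides do not define a numeric stability criterion.

```python
def misses_per_kilo_instructions(miss_count, instruction_count):
    """Normalize a raw miss counter to misses per 1000 retired instructions."""
    if instruction_count <= 0:
        raise ValueError("instruction count must be positive")
    return 1000.0 * miss_count / instruction_count


def stabilizes_at(samples, tolerance=0.05):
    """Return the smallest data volume from which the metric stays stable.

    samples: list of (data_volume, metric_value), ordered by data volume.
    The metric counts as stable when every later step changes it by at most
    `tolerance` (relative change). Returns None if it never stabilizes.
    """
    stable_from = None
    # Walk backwards from the largest volume while the steps stay small.
    for index in range(len(samples) - 1, 0, -1):
        previous_value = samples[index - 1][1]
        current_value = samples[index][1]
        if previous_value == 0:
            break
        if abs(current_value - previous_value) / abs(previous_value) > tolerance:
            break
        stable_from = samples[index - 1][0]
    return stable_from
```

Normalizing by instruction count is what makes runs on different data volumes comparable, since larger inputs execute more instructions and would otherwise inflate every raw counter.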
Search engine service experiments: the same phenomenon is observed. Micro-architectural events tend towards stability once the index size increases to a certain extent. Big data imposes challenges on architecture research, since large-scale simulation is time-consuming. Index size: 2 GB to 8 GB. Segment size: 4.4 GB to 17.6 GB.
Use Case 3: system evaluation using BigDataBench 1.0 Beta. Data scale: 10 GB to 2 TB. Hadoop configuration: 1 master and 14 slave nodes.
System evaluation: there is a threshold for each workload (100 MB to 1 TB), and the system is fully loaded when the data volume exceeds the threshold. Sort is an exception: it has an inflexion point (10 GB to 1 TB), and the data processing rate decreases after this point because of global data access requirements and the I/O and network bottleneck. System performance depends on applications and data volumes.
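A minimal sketch of how the data processing rate and an inflexion point of this kind could be derived from run measurements. It assumes the rate is simply data volume divided by run time and takes the inflexion point to be the data volume with the highest rate; both the definition and the function names are assumptions, not the procedure used in the talk.

```python
def processing_rates(runs):
    """Compute the data processing rate for each run.

    runs: iterable of (data_gb, seconds) pairs (hypothetical record layout).
    Returns a list of (data_gb, gb_per_second), ordered by data volume.
    """
    rates = []
    for data_gb, seconds in sorted(runs):
        if seconds <= 0:
            raise ValueError("run time must be positive")
        rates.append((data_gb, data_gb / seconds))
    return rates


def inflexion_point(runs):
    """Data volume with the highest processing rate, or None without runs."""
    rates = processing_rates(runs)
    if not rates:
        return None
    return max(rates, key=lambda entry: entry[1])[0]
```

For a workload whose rate keeps rising until the system is fully loaded and then flattens, the same data can be used to locate the threshold instead, by looking for the volume after which the rate stops improving.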
Conclusion: ICTBench (DCBench, BigDataBench, CloudRank) is an open-source project on datacenter and big data benchmarking.
Publications
BigDataBench: a Big Data Benchmark Suite from Web Search Engines. Wanling Gao, et al. The Third Workshop on Architectures and Systems for Big Data (ASBD 2013), in conjunction with ISCA.
Characterizing Data Analysis Workloads in Data Centers. Zhen Jia, et al. IEEE International Symposium on Workload Characterization (IISWC 2013).
Characterizing OS behavior of Scale-out Data Center Workloads. Chen Zheng, et al. Seventh Annual Workshop on the Interaction amongst Virtualization, Operating Systems and Computer Architecture (WIVOSCA 2013), in conjunction with ISCA 2013.
Characterization of Real Workloads of Web Search Engines. Huafeng Xi, et al. IEEE International Symposium on Workload Characterization (IISWC 2011).
The Implications of Diverse Applications and Scalable Data Sets in Benchmarking Big Data Systems. Zhen Jia, et al. Second Workshop on Big Data Benchmarking (WBDB 2012 India) & Lecture Notes in Computer Science (LNCS).
CloudRank-D: Benchmarking and Ranking Cloud Computing Systems for Data Processing Applications. Chunjie Luo, et al. Front. Comput. Sci. (FCS) 2012, 6(4): 347–362.
Thank you! Any questions?