On Benchmarking Frequent Itemset Mining Algorithms
Balázs Rácz, Ferenc Bodon, Lars Schmidt-Thieme
Budapest University of Technology and Economics; Computer and Automation Research Institute of the Hungarian Academy of Sciences; Computer-Based New Media Group, Institute for Computer Science

Slide 2: History
- Over 100 papers on Frequent Itemset Mining.
- Many of them claim to be the 'best'
  - based on benchmarks run against some publicly available implementation on some datasets.
- The FIMI'03 and FIMI'04 workshops ran extensive benchmarks with many implementations and data sets
  - and have served as a guideline ever since.
- How 'fair' was the benchmark, and what did it measure?

Slide 3: On FIMI contests
- Problem 1: We are interested in the quality of algorithms, but we can only measure implementations.
  - There is no good theoretical data model yet for analytical comparison.
  - As we will see later, we would also need a good hardware model.
- Problem 2: If we gave our algorithms and ideas to a very talented and experienced low-level programmer, they could completely redraw the current FIMI rankings.
  - A FIMI contest is all about the 'constant factor'.

Slide 4: On FIMI contests (2)
- Problem 3: Seemingly unimportant implementation details can hide all algorithmic features when benchmarking.
  - These details often go unnoticed even by the author and are almost never published.

Slide 5: On FIMI contests (3)
- Problem 4: FIM implementations are complete 'suites' of a basic algorithm plus several algorithmic and implementational optimizations. Comparing such complete 'suites' tells us what is fast, but not why.
- Recommendation:
  - modular programming,
  - benchmarks on the individual features.

Slide 6: On FIMI contests (4)
- Problem 5: The run time of all 'dense' mining tasks is dominated by I/O.
- Problem 6: On 'dense' datasets, FIMI benchmarks measure the ability of submitters to code a fast integer-to-string conversion function (see the sketch below).
- Recommendation:
  - Have as much identical code as possible → a library of FIM functions.
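To make Problem 6 concrete, here is a minimal sketch of the kind of hand-rolled integer-to-string routine the slide alludes to; the function name and buffer handling are illustrative, not taken from any FIMI submission. Skipping sprintf's format parsing and locale machinery is exactly the kind of 'constant factor' that can dominate a dense benchmark.

```cpp
#include <cstdio>

// Hypothetical helper: render a non-negative item identifier into a
// buffer, producing digits right-to-left and returning a pointer to the
// first character. Avoids the overhead of sprintf("%d", ...).
inline char* render_uint(unsigned v, char* buf_end) {
    char* p = buf_end;
    do {
        *--p = char('0' + v % 10);
        v /= 10;
    } while (v != 0);
    return p;
}

int main() {
    char buf[16];
    char* end = buf + sizeof(buf);
    char* s = render_uint(12345, end);
    std::fwrite(s, 1, end - s, stdout);  // prints "12345"
    std::fputc('\n', stdout);
}
```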

Slide 7: On FIMI contests (5)
- Problem 7: Run time differences between implementations are small.
- Problem 8: Run time varies from run to run
  - for the very same executable on the very same input.
  - A bug or a feature of modern hardware?
  - What should we measure?
- Recommendation: a 'winner takes all' evaluation of a mining task is unfair.
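A minimal illustration of Problem 8: time the same workload repeatedly and report a range instead of a single number. The workload function here is a hypothetical stand-in for one full mining run; the point is only that a single measurement per task, as in a 'winner takes all' ranking, hides the run-to-run spread.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <vector>

// Dummy stand-in for one full mining run; a real benchmark would
// invoke the FIM implementation on a dataset here.
static void workload() {
    volatile unsigned long long acc = 0;
    for (unsigned long long i = 0; i < 50000000ULL; ++i) acc += i;
}

int main() {
    const int runs = 9;
    std::vector<double> samples;
    for (int i = 0; i < runs; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        workload();
        auto t1 = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(t1 - t0).count());
    }
    std::sort(samples.begin(), samples.end());
    std::printf("min %.3fs  median %.3fs  max %.3fs\n",
                samples.front(), samples[samples.size() / 2], samples.back());
}
```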

Slide 8: On FIMI contests (6)
- Problem 9: Traditional run-time (plus memory-need) benchmarks do not tell us whether an implementation is better than another in algorithmic aspects or in implementational (hardware-friendliness) aspects.
- Problem 10: Traditional benchmarks do not show whether the conclusions would still hold on a slightly different hardware architecture (such as AMD vs. Intel).
- Recommendation: extend the benchmarks.

Slide 9: Library and pluggability
- Code reuse, pluggable components and data structures, object-oriented design.
- Do not sacrifice efficiency:
  - no virtual method calls are allowed in the core.
  - Then how? C++ templates:
    - they allow pluggability with inlining;
    - plugging in a component requires a source code change, but several versions can coexist;
    - template code is sometimes tricky to write.
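A minimal sketch, not the authors' actual library code, of the template-based pluggability the slide describes: the output component is a template parameter, so the core calls it without virtual dispatch and the compiler can inline it. All names below are illustrative.

```cpp
#include <cstdio>

struct SimpleOutput {
    void itemset(const int* items, int n) {
        for (int i = 0; i < n; ++i) std::printf("%d ", items[i]);
        std::printf("\n");
    }
};

struct NullOutput {                 // e.g. for benchmarking without I/O cost
    void itemset(const int*, int) {}
};

template <class Output>
class Miner {
    Output out_;
public:
    // The core mining code would call emit(); Output::itemset is
    // resolved at compile time and can be inlined.
    void emit(const int* items, int n) { out_.itemset(items, n); }
};

int main() {
    int items[] = {1, 2, 3};
    Miner<SimpleOutput> m1;
    m1.emit(items, 3);              // prints "1 2 3"
    Miner<NullOutput> m2;           // swapping the component is a source
    m2.emit(items, 3);              // change, but both versions coexist
}
```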

Slide 10: I/O efficiency
Variations of the output routine:
- normal-simple: renders each itemset and each item separately to text;
- normal-cache: caches the string representation of item identifiers;
- df-buffered: (depth-first) reuses the string representation of the last line and appends the last item;
- df-cache: like df-buffered, but also caches the string representation of item identifiers.
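A sketch of the df-buffered idea as described above (the class and method names are ours, not the paper's): during a depth-first traversal, the text of the current itemset's prefix is kept in a buffer, so emitting a child itemset only appends the newly added item's string instead of re-rendering the whole line.

```cpp
#include <cstdio>
#include <string>
#include <vector>

class DfBufferedWriter {
    std::string line_;             // text of the current itemset prefix
    std::vector<size_t> marks_;    // line_ length before each item's text
public:
    void push_item(int item) {     // descend: extend the prefix by one item
        marks_.push_back(line_.size());
        char buf[16];
        line_.append(buf, std::snprintf(buf, sizeof(buf), "%d ", item));
    }
    void pop_item() {              // ascend: drop the last item's text
        line_.resize(marks_.back());
        marks_.pop_back();
    }
    void emit(FILE* f) {           // output the current itemset's line
        std::fwrite(line_.data(), 1, line_.size(), f);
        std::fputc('\n', f);
    }
};

int main() {
    DfBufferedWriter w;
    w.push_item(3); w.emit(stdout);   // "3"
    w.push_item(7); w.emit(stdout);   // "3 7" (only "7 " was rendered)
    w.pop_item();
    w.push_item(9); w.emit(stdout);   // "3 9"
}
```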

Slide 12: Benchmarking: desiderata
1. The benchmark should be stable and reproducible. Ideally it should have no variation, and certainly none on the same hardware.
2. The benchmark numbers should reflect actual performance; the benchmark should be a fairly accurate model of the actual hardware.
3. The benchmark should be hardware-independent, in the sense that it should be stable against slight variations of the underlying hardware architecture, such as changing the processor manufacturer or model.

Slide 13: Benchmarking: reality
- Different implementations stress different aspects of the hardware.
- Migrating to other hardware may be better in one aspect and worse in another:
  - a ranking cannot be migrated between hardware platforms.
- Complex benchmark results are necessary:
  - is a win due to an algorithmic reason or to hardware-friendliness?
- Performance is not as simple as 'run time in seconds'.

Slide 14: Benchmark platform
- Virtual machine
  - How to define it?
  - How to code the implementations?
  - What is the cost function?
- Instrumentation (simulation of the actual CPU)
  - Slow (100-fold slower than a plain run).
  - What about accuracy? What is the cost function?

Slide 15: Benchmark platform (2)
- Run-time measurement with performance counters:
  - present in all modern processors (since the i586);
  - count performance-related events in real time;
  - accessed via the PerfCtr kernel patch under Linux, and vendor-specific software under Windows.
- Problem: the measured numbers reflect the actual execution, and thus are subject to variation.
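For illustration, a minimal sketch of reading a hardware performance counter on Linux. The PerfCtr patch the slide mentions predates the perf_event interface that modern kernels ship; the sketch below uses perf_event_open, which is our substitution, not the tooling used in the paper.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Thin wrapper: glibc provides no perf_event_open() symbol.
static long perf_event_open(perf_event_attr* attr, pid_t pid, int cpu,
                            int group_fd, unsigned long flags) {
    return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main() {
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_INSTRUCTIONS;  // retired instructions
    attr.disabled = 1;
    attr.exclude_kernel = 1;

    // Count for this process (pid 0) on any CPU (-1).
    int fd = (int)perf_event_open(&attr, 0, -1, -1, 0);
    if (fd < 0) { std::perror("perf_event_open"); return 1; }

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

    // ... code under measurement goes here ...

    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    uint64_t count = 0;
    read(fd, &count, sizeof(count));
    std::printf("instructions: %llu\n", (unsigned long long)count);
    close(fd);
}
```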

[Benchmark result charts. Legend: three sets of bars per measurement.
- Wide, centered bars: total size shows total clockticks used, i.e. run time; the purple portion shows stall time (the CPU waiting for something).
- Narrow, centered bars: brown shows the number of instructions (µ-ops) executed, which is stable; cyan shows µ-ops wasted due to branch mispredictions.
- Narrow, right-hand bars: light brown shows ticks spent on memory reads/writes (mostly waiting); black shows read-ahead (prefetch).]

Slide 19: Conclusion
- We cannot measure algorithms, only implementations.
- Use modular implementations with pluggable features.
- Share code for the common functionality (like I/O):
  - a FIMI library built with C++ templates.
- Benchmark run time varies and depends on the hardware used:
  - complex benchmarks are needed;
  - do the conclusions concern algorithmic aspects or hardware friendliness?

Slide 20: Thank you for your attention
Big question: how does the choice of compiler influence the performance and the ranking?