A small excursion: Empirical Computer Science. Binary Search versus Linear Search.

Question: From a practical point of view, is there a significant advantage in using binary search rather than linear search?

What we will do: describe linear search (in Java); describe binary search (in Java); propose an experiment; present our results.

How we realise our experiments: given a string s and an array of strings sorted in lexicographic order, is the string s in the array?

Linear Search
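
The code shown on this slide is not preserved in the transcript. As a minimal sketch only, assuming the method takes the probe string and the array and returns a boolean, linSearch might look like the following. The class name and the signature are assumptions for illustration; only the name linSearch comes from the slides.

```java
// Hypothetical reconstruction: the original slide's code is not in the transcript.
class LinearSearchSketch {

    /** Returns true if s occurs in data, scanning from left to right. */
    static boolean linSearch(String s, String[] data) {
        for (String entry : data) {
            if (entry.equals(s)) {
                return true;
            }
        }
        return false; // every entry was inspected without a match
    }

    public static void main(String[] args) {
        String[] data = {"apple", "banana", "cherry"};
        System.out.println(linSearch("banana", data));
        System.out.println(linSearch("blueberry", data));
    }
}
```

This version does not exploit the fact that the array is sorted; a variant could stop as soon as it meets an entry greater than s.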

What is this?

Binary Search
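
Again the slide's code is missing from the transcript. A minimal sketch of binSearch, under the same assumed signature as above (probe string, sorted array, boolean result), could be written as follows. The class name is hypothetical.

```java
// Hypothetical reconstruction: the original slide's code is not in the transcript.
class BinarySearchSketch {

    /** Returns true if s occurs in data, which must be sorted in lexicographic order. */
    static boolean binSearch(String s, String[] data) {
        int lo = 0;
        int hi = data.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // written this way to avoid int overflow
            int c = s.compareTo(data[mid]);
            if (c == 0) {
                return true;
            } else if (c < 0) {
                hi = mid - 1; // s can only be in the left half
            } else {
                lo = mid + 1; // s can only be in the right half
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] data = {"apple", "banana", "cherry", "damson"};
        System.out.println(binSearch("cherry", data));
        System.out.println(binSearch("coconut", data));
    }
}
```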

Code for Experiments

Paranoia: prior to running the experiments, hundreds of thousands of calls were made to linSearch and binSearch to make sure that they agreed (and initially they did not!).

Data Sets Used: data sets were produced using the following program.
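
One way such an agreement check could be done is sketched below. This is not the authors' test code: the random word generator, the array sizes, the seed and the class name are all assumptions chosen for illustration, and the two search methods are the hypothetical versions assumed earlier.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical agreement check between two search implementations.
class AgreementCheck {

    static boolean linSearch(String s, String[] data) {
        for (String entry : data) {
            if (entry.equals(s)) return true;
        }
        return false;
    }

    static boolean binSearch(String s, String[] data) {
        int lo = 0, hi = data.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int c = s.compareTo(data[mid]);
            if (c == 0) return true;
            if (c < 0) hi = mid - 1; else lo = mid + 1;
        }
        return false;
    }

    /** Short word over a tiny alphabet, so that both hits and misses occur often. */
    static String randomWord(Random rnd) {
        int length = 1 + rnd.nextInt(3);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rnd.nextInt(4)));
        }
        return sb.toString();
    }

    /** Runs the given number of random probes and counts how often the two searches disagree. */
    static int countDisagreements(int trials, long seed) {
        Random rnd = new Random(seed);
        int disagreements = 0;
        for (int t = 0; t < trials; t++) {
            String[] data = new String[rnd.nextInt(20)]; // includes the empty array
            for (int i = 0; i < data.length; i++) data[i] = randomWord(rnd);
            Arrays.sort(data);
            String probe = randomWord(rnd);
            if (linSearch(probe, data) != binSearch(probe, data)) disagreements++;
        }
        return disagreements;
    }

    public static void main(String[] args) {
        System.out.println("disagreements: " + countDisagreements(100_000, 42L));
    }
}
```

Small arrays and a small alphabet are deliberate: they hit boundary cases (empty arrays, first and last positions, duplicates) far more often than large realistic data would.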

Confused?

Details: given a random sorted subset of the dictionary, called data, for each entry in the dictionary determine whether it is present in the sorted set data, and measure the total time for these probes. Repeat with varying sizes of data set.

Experiments were run on a low-spec Unix machine.
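
The experiment described above might be sketched roughly as follows. Several things here are assumptions rather than the authors' setup: the dictionary is a synthetic stand-in (the real word list is not in the transcript), the sizes and seed are arbitrary, and the timer is System.nanoTime(), which measures elapsed wall-clock time rather than the CPU time reported in the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical experiment harness; not the code used for the reported results.
class ProbeExperimentSketch {

    static boolean linSearch(String s, String[] data) {
        for (String entry : data) {
            if (entry.equals(s)) return true;
        }
        return false;
    }

    static boolean binSearch(String s, String[] data) {
        int lo = 0, hi = data.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int c = s.compareTo(data[mid]);
            if (c == 0) return true;
            if (c < 0) hi = mid - 1; else lo = mid + 1;
        }
        return false;
    }

    /** Stand-in dictionary of n distinct strings. */
    static List<String> syntheticDictionary(int n) {
        List<String> words = new ArrayList<>(n);
        for (int i = 0; i < n; i++) words.add(String.format("w%06d", i));
        return words;
    }

    /** Picks size distinct dictionary entries at random and returns them sorted. */
    static String[] randomSortedSubset(List<String> dictionary, int size, Random rnd) {
        List<String> shuffled = new ArrayList<>(dictionary);
        Collections.shuffle(shuffled, rnd);
        String[] data = shuffled.subList(0, size).toArray(new String[0]);
        Arrays.sort(data);
        return data;
    }

    /** Probes data once for every dictionary entry and returns the number of hits. */
    static int probeAll(List<String> dictionary, String[] data, boolean useBinary) {
        int hits = 0;
        for (String word : dictionary) {
            boolean found = useBinary ? binSearch(word, data) : linSearch(word, data);
            if (found) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> dictionary = syntheticDictionary(10_000);
        Random rnd = new Random(1);
        for (int size = 1_000; size <= 5_000; size += 1_000) {
            String[] data = randomSortedSubset(dictionary, size, rnd);
            long t0 = System.nanoTime();
            int binHits = probeAll(dictionary, data, true);
            long t1 = System.nanoTime();
            int linHits = probeAll(dictionary, data, false);
            long t2 = System.nanoTime();
            System.out.println(size + " entries: binary " + (t1 - t0) / 1_000_000
                    + " ms, linear " + (t2 - t1) / 1_000_000
                    + " ms (hits " + binHits + " and " + linHits + ")");
        }
    }
}
```

Returning and printing the hit counts gives a cheap sanity check (both should equal the data set size) and also makes it harder for the JIT compiler to discard the search calls as dead code.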

Results

Data set size probed into times

Results: CPU milliseconds to perform probes using binary search.

Results: CPU milliseconds to perform probes using linear search.

Conclusion: binary search is significantly faster than linear search. However, the data set must be sorted. Note also that linear search appears to scale linearly with problem size. ARE YOU SURE?

What would be the effect of using different computers? Would we get the same results?

Different platforms: what’s happening with simeulue?

What’s happening with vahanga?

Different platforms: what’s happening with Jeremy’s machine?

Any suggestions?

Different platforms: Jeremy now calls System.gc() prior to the experiment.

Different platforms: these regions look different. Why?

Different platforms: Jeremy now calls System.gc() prior to the experiment. These regions look different. Why? Could it be the cache?
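
A minimal sketch of what calling System.gc() before a timed run could look like is given below. The helper name timeAfterGc and the use of System.nanoTime() are assumptions for illustration. Note that System.gc() only asks the JVM to collect garbage; the JVM is free to ignore or delay the request, so this reduces but does not eliminate the chance of a collection landing inside the timed region.

```java
// Hypothetical helper: request a garbage collection, then time a piece of work.
class GcBeforeTiming {

    /** Returns the elapsed nanoseconds for one run of the experiment. */
    static long timeAfterGc(Runnable experiment) {
        System.gc(); // a request to the JVM, not a guarantee
        long start = System.nanoTime();
        experiment.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long nanos = timeAfterGc(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) sb.append(i);
        });
        System.out.println("elapsed: " + nanos / 1_000 + " microseconds");
    }
}
```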

Inline garbage collection

On simeulue

Question: how would we convince ourselves that binSearch scales as O(log2(n))? How about plotting y against log(x)? If y grows like log(x), we would hope to see a straight line.

Empirical CS, conclusion: … it ain’t easy. We need to be very sure about what we are actually measuring: we were measuring garbage collection, cache, CPU time (what’s that?) … Beware of small-scale statistics: our sample size was 1 (one data set at each size)! Was it right to be measuring CPU time only? We could have measured comparisons, or mems (memory accesses), … Be paranoid.
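
As a rough illustration of the idea, the sketch below uses the number of compareTo calls as y instead of time, since counts are repeatable and machine-independent. That substitution, the synthetic data and all names are assumptions, not the authors' method. It prints y next to log2(n) for array sizes that are powers of two; if y grows by about the same amount each time log2(n) grows by the same step, the points lie on a straight line.

```java
// Hypothetical sketch: comparisons made by binary search, tabulated against log2(n).
class LogScalingSketch {

    /** Binary search that returns how many compareTo calls it made. */
    static int binSearchComparisons(String s, String[] data) {
        int lo = 0, hi = data.length - 1, comparisons = 0;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            comparisons++;
            int c = s.compareTo(data[mid]);
            if (c == 0) break;
            if (c < 0) hi = mid - 1; else lo = mid + 1;
        }
        return comparisons;
    }

    /** Sorted array of n distinct zero-padded numeric strings. */
    static String[] sortedData(int n) {
        String[] data = new String[n];
        for (int i = 0; i < n; i++) data[i] = String.format("%08d", i);
        return data;
    }

    public static void main(String[] args) {
        String absent = "99999999"; // sorts after every entry, so the search always fails
        for (int k = 4; k <= 16; k += 2) {
            int n = 1 << k;
            int y = binSearchComparisons(absent, sortedData(n));
            System.out.println("log2(n) = " + k + "  comparisons = " + y);
        }
    }
}
```

The same tabulation could be done with measured times, but then the noise sources discussed in the surrounding slides (garbage collection, cache effects) would have to be controlled first.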