Published by Bryan Sims. Modified over 8 years ago.
1
A small excursion: Empirical Computer Science. Binary Search versus Linear Search
2
Question: From a practical point of view, is there a significant advantage in using binary search rather than linear search?
3
What we will do: describe linear search (in Java); describe binary search (in Java); propose an experiment; present our results.
4
How we realise our experiments: given a string s and an array of strings sorted in lexicographic order, is s in the array?
5
Linear Search
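The code shown on this slide did not survive in the transcript. As a minimal sketch of what a linear search for this problem could look like, assuming a signature of linSearch(String, String[]) returning a boolean (the method name comes from a later slide; the signature and the class name are assumptions):

```java
final class LinearSearchSketch {

    /** Returns true if s occurs in data, scanning from left to right. */
    static boolean linSearch(String s, String[] data) {
        for (String entry : data) {
            if (entry.equals(s)) {
                return true;   // found: stop at the first match
            }
        }
        return false;          // reached the end without a match
    }

    public static void main(String[] args) {
        String[] data = {"apple", "banana", "cherry"};
        System.out.println(linSearch("banana", data)); // prints true
        System.out.println(linSearch("durian", data)); // prints false
    }
}
```

This version does not exploit the sorted order; a variant could stop early once it passes the position where s would have to be.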
6
What is this?
7
Binary Search
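The code for this slide is also missing from the transcript. A minimal sketch of an iterative binary search over a lexicographically sorted array, assuming the same hypothetical signature as for linear search, might be:

```java
final class BinarySearchSketch {

    /** Returns true if s occurs in data; data must be sorted by String.compareTo. */
    static boolean binSearch(String s, String[] data) {
        int lo = 0;
        int hi = data.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;      // midpoint without int overflow
            int cmp = s.compareTo(data[mid]);
            if (cmp == 0) {
                return true;                   // exact match
            } else if (cmp < 0) {
                hi = mid - 1;                  // s can only be in the left half
            } else {
                lo = mid + 1;                  // s can only be in the right half
            }
        }
        return false;                          // search interval is empty
    }

    public static void main(String[] args) {
        String[] data = {"apple", "banana", "cherry", "damson"};
        System.out.println(binSearch("cherry", data)); // prints true
        System.out.println(binSearch("carrot", data)); // prints false
    }
}
```

Each iteration halves the interval [lo, hi], which is where the expected O(log2(n)) behaviour comes from.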
8
Code for Experiments
9
Paranoia: prior to executing the experiments, hundreds of thousands of calls were made to linSearch and binSearch to make sure that they agreed (and initially they did not!).
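One way such an agreement check could be set up is sketched below. The original test code is not in the transcript, so everything here (class name, random word generator, array sizes, seed) is an assumption; only the idea of comparing the two searches on many inputs comes from the slide.

```java
import java.util.Arrays;
import java.util.Random;

final class AgreementCheck {

    static boolean linSearch(String s, String[] data) {
        for (String entry : data) {
            if (entry.equals(s)) return true;
        }
        return false;
    }

    static boolean binSearch(String s, String[] data) {
        int lo = 0, hi = data.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int cmp = s.compareTo(data[mid]);
            if (cmp == 0) return true;
            if (cmp < 0) hi = mid - 1; else lo = mid + 1;
        }
        return false;
    }

    /** Hypothetical helper: short words over a tiny alphabet, so hits are common. */
    static String randomWord(Random rng) {
        int len = 1 + rng.nextInt(2);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < len; i++) {
            sb.append((char) ('a' + rng.nextInt(3)));
        }
        return sb.toString();
    }

    /** Counts how often the two searches disagree on random sorted arrays. */
    static int countDisagreements(int trials, long seed) {
        Random rng = new Random(seed);
        int disagreements = 0;
        for (int t = 0; t < trials; t++) {
            String[] data = new String[rng.nextInt(20)]; // length 0..19, includes empty
            for (int i = 0; i < data.length; i++) {
                data[i] = randomWord(rng);
            }
            Arrays.sort(data);                           // natural (lexicographic) order
            String probe = randomWord(rng);
            if (linSearch(probe, data) != binSearch(probe, data)) {
                disagreements++;
            }
        }
        return disagreements;
    }

    public static void main(String[] args) {
        System.out.println("disagreements: " + countDisagreements(200_000, 42L));
    }
}
```

Small arrays and a small alphabet are a deliberate choice in this sketch: they exercise boundary cases (empty array, first and last element, duplicates) far more often than large realistic inputs would.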
10
Data Sets Used Data sets were produced using the following program
12
Confused?
14
Details: given a random sorted subset of the dictionary, called data: for each entry in the dictionary, determine whether it is present in data; measure the total time for these 25104 probes. Repeat with varying sizes of data set.
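The procedure above can be sketched roughly as follows. This is not the original experiment code: the dictionary here is synthetic (25104 generated strings rather than real words), the library call Arrays.binarySearch stands in for the search under test, and wall-clock time from System.nanoTime() is used in place of the CPU time reported on the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class ProbeTiming {

    /** Picks `size` distinct dictionary entries at random and returns them sorted. */
    static String[] randomSortedSubset(String[] dictionary, int size, Random rng) {
        List<String> copy = new ArrayList<>(Arrays.asList(dictionary));
        Collections.shuffle(copy, rng);
        String[] data = copy.subList(0, size).toArray(new String[0]);
        Arrays.sort(data);
        return data;
    }

    /** Probes data once for every dictionary entry and returns the number of hits. */
    static int countHits(String[] dictionary, String[] data) {
        int hits = 0;
        for (String word : dictionary) {
            if (Arrays.binarySearch(data, word) >= 0) {  // stand-in for the search under test
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Synthetic stand-in for the 25104-word dictionary (an assumption).
        String[] dictionary = new String[25104];
        for (int i = 0; i < dictionary.length; i++) {
            dictionary[i] = String.format("w%05d", i);
        }
        Random rng = new Random(1L);
        for (int size = 1000; size <= 25000; size += 6000) {
            String[] data = randomSortedSubset(dictionary, size, rng);
            long start = System.nanoTime();
            int hits = countHits(dictionary, data);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("size=" + size + " hits=" + hits + " ms=" + elapsedMs);
        }
    }
}
```

Returning the hit count serves two purposes in this sketch: it is a sanity check (it should equal the subset size when dictionary entries are distinct), and it keeps the result of each probe in use so the loop cannot be optimised away.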
15
Experiments were run on a low-spec Unix machine
16
Results
17
Data set size probed into 25104 times
18
Results: CPU milliseconds to perform 25104 probes using binary search
19
Results: CPU milliseconds to perform 25104 probes using linear search
21
Conclusion: binary search is significantly faster than linear search. However, the data set must be sorted. Note also that linear search appears to scale linearly with problem size. ARE YOU SURE?
22
What would be the effect of using different computers? Would we get the same results?
23
Different platforms: what’s happening with simeulue?
24
What’s happening with vahanga?
25
Different platforms: what’s happening with Jeremy’s machine?
26
Any suggestions?
27
Different platforms: Jeremy now calls System.gc() prior to the experiment.
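A minimal sketch of requesting a collection before starting the clock is given below. The helper name timeMillis and the use of a Runnable are assumptions for illustration; note also that System.gc() is only a request, which the JVM is free to ignore, so it reduces but does not eliminate the chance of a collection landing inside the timed region.

```java
final class GcBeforeTiming {

    /** Requests a garbage collection, then times one run of the experiment in milliseconds. */
    static long timeMillis(Runnable experiment) {
        System.gc();                       // a hint only; the JVM may not collect
        long start = System.nanoTime();    // wall-clock time, not CPU time
        experiment.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long ms = timeMillis(() -> {
            // Placeholder workload that allocates many short-lived strings.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(Integer.toString(i));
            }
            if (sb.length() == 0) throw new IllegalStateException();
        });
        System.out.println("elapsed ms: " + ms);
    }
}
```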
28
Different platforms: these regions look different. Why?
29
Different platforms: Jeremy now calls System.gc() prior to the experiment. These regions look different. Why? Could it be the cache?
33
Inline garbage collection
35
On simeulue
36
Question: how would we convince ourselves that binSearch scales as O(log2(n))? How about plotting y against log(x)? If y grows in proportion to log(x), we would hope to see a straight line.
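Besides plotting, the same idea can be checked numerically by fitting a straight line to the points (log2(x), y). The sketch below is an ordinary least-squares fit; the class and method names are hypothetical, and the sample numbers are made up to lie exactly on y = 3 + 2·log2(x), not taken from the experiments.

```java
final class LogFit {

    /**
     * Least-squares fit of y ≈ a + b * log2(x).
     * Returns {a, b}. Needs at least two distinct positive x values.
     */
    static double[] fitAgainstLog2(double[] x, double[] y) {
        int n = x.length;
        double sumU = 0, sumY = 0, sumUU = 0, sumUY = 0;
        for (int i = 0; i < n; i++) {
            double u = Math.log(x[i]) / Math.log(2);   // u = log2(x)
            sumU += u;
            sumY += y[i];
            sumUU += u * u;
            sumUY += u * y[i];
        }
        double b = (n * sumUY - sumU * sumY) / (n * sumUU - sumU * sumU);
        double a = (sumY - b * sumU) / n;
        return new double[] {a, b};
    }

    public static void main(String[] args) {
        // Made-up points on y = 3 + 2*log2(x); real timings would be noisy.
        double[] sizes = {2, 4, 8, 16};
        double[] times = {5, 7, 9, 11};
        double[] fit = fitAgainstLog2(sizes, times);
        System.out.println("intercept=" + fit[0] + " slope=" + fit[1]);
    }
}
```

With real measurements, small residuals around the fitted line would support logarithmic scaling, while a systematic curve in the residuals would suggest something else is being measured.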
37
Conclusion: Empirical CS … it ain’t easy. We need to be very sure about what we are actually measuring: we were measuring garbage collection, cache, CPU time (what’s that?) … Beware of small-scale statistics: our sample size was 1! (one data set at each size). Was it right to be measuring CPU time only? We could have measured comparisons, or mems (memory accesses), … Be paranoid.
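As an illustration of the alternative measure mentioned above, the sketch below counts string comparisons instead of time. It is an assumption about how such instrumentation might look (a static counter and re-implemented searches), not something shown in the slides; unlike timings, the counts are identical on every machine and every run.

```java
final class ComparisonCounter {

    /** Number of string comparisons made since the counter was last reset. */
    static long comparisons = 0;

    static boolean linSearch(String s, String[] data) {
        for (String entry : data) {
            comparisons++;                       // one equals() per element visited
            if (entry.equals(s)) return true;
        }
        return false;
    }

    static boolean binSearch(String s, String[] data) {
        int lo = 0, hi = data.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            comparisons++;                       // one compareTo() per iteration
            int cmp = s.compareTo(data[mid]);
            if (cmp == 0) return true;
            if (cmp < 0) hi = mid - 1; else lo = mid + 1;
        }
        return false;
    }

    public static void main(String[] args) {
        // 1024 sorted four-digit strings; the probe is larger than all of them.
        String[] data = new String[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = String.format("%04d", i);
        }
        comparisons = 0;
        linSearch("zzzz", data);
        System.out.println("linear comparisons: " + comparisons);
        comparisons = 0;
        binSearch("zzzz", data);
        System.out.println("binary comparisons: " + comparisons);
    }
}
```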