
1 Module "Bit::Vector"
"Bit::Vector - more than the name suggests"
Steffen Beyer
YAPC::Europe, London, UK, ICA, September 22-24 2000
sd&m software design & management GmbH & Co. KG, Thomas-Dehler-Straße 27, 81737 München, Phone (0 89) 6 38 12-0, Fax (0 89) 6 38 12-150, http://www.sdm.de

2 Agenda
- What does it do?
- Purpose(s)
- Summary of available methods
- Characteristics
- Alternatives
- Some Applications
- Questions & Answers, Suggestions

3 What does it do?
- The Bit::Vector module implements bit arrays of arbitrary size.
- Not very sexy, you may think.
- But actually bit vectors are the basis of all computations performed by a computer! Your CPU calls them "processor registers"...
- By the way, is everybody familiar with two's complement binary representation and arithmetic?
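For readers who want a refresher on two's complement, here is a minimal sketch in Python. It is not part of the talk and does not use Bit::Vector; the function names are made up for illustration, and a "register" is simply a fixed-width string of bits.

```python
def to_twos_complement(value: int, bits: int) -> str:
    """Return the two's complement bit pattern of `value` in a `bits`-wide register."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit into {bits} bits")
    # Masking maps negative values onto the upper half of the unsigned range.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern: str) -> int:
    """Interpret a string of '0'/'1' characters as a two's complement number."""
    raw = int(pattern, 2)
    # A leading 1 marks a negative number: subtract 2**width.
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw
```

In an 8-bit register, -5 and 251 share the same bit pattern; only the interpretation differs.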

4 Purpose(s)
- Efficient storage and handling of bit arrays
- Extend your CPU to any desired number of bits
- Efficient set operations
- Efficient big integer arithmetic
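Why bit arrays make set operations efficient can be sketched as follows. The example uses a plain Python integer as the bit vector (bit i set means element i is in the set); the helper names are assumptions for illustration and are not Bit::Vector methods.

```python
def make_set(elements, universe_size: int) -> int:
    """Encode a set of small non-negative integers as a bit vector (a Python int)."""
    vec = 0
    for e in elements:
        if not 0 <= e < universe_size:
            raise ValueError(f"{e} is outside the universe 0..{universe_size - 1}")
        vec |= 1 << e
    return vec


def members(vec: int) -> list:
    """Decode a bit vector back into a sorted list of its elements."""
    return [i for i in range(vec.bit_length()) if (vec >> i) & 1]


UNIVERSE = 8
a = make_set({1, 3, 5}, UNIVERSE)
b = make_set({3, 4, 5}, UNIVERSE)

intersection = a & b                      # one AND per machine word
union = a | b                             # one OR per machine word
complement = ~a & ((1 << UNIVERSE) - 1)   # NOT, masked to the universe
```

Each operation touches every machine word once, regardless of how many elements the sets contain.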

5 Summary of available methods
(See file "BitVector.txt")
Especially interesting methods:
  – "Interval_Substitute()" (is to bit vectors what "splice" is to Perl arrays)
  – "Interval_Scan_...()" (finds contiguous blocks of set bits)
  – "Chunk_...()" (allows access to packets of bits at a time, of choosable size)
  – "...Reverse()" (same to bit vectors as Perl's "reverse" for strings)
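The four ideas named above can be sketched on a Python integer used as a bit vector. This only illustrates what such operations do; the function names and signatures are hypothetical and do not reproduce the module's actual methods (see "BitVector.txt" for those).

```python
def scan_runs(vec: int, size: int):
    """Yield (first, last) index pairs for each contiguous block of set bits."""
    i = 0
    while i < size:
        if (vec >> i) & 1:
            start = i
            while i < size and (vec >> i) & 1:
                i += 1
            yield (start, i - 1)
        else:
            i += 1


def read_chunk(vec: int, offset: int, width: int) -> int:
    """Read `width` bits starting at bit `offset` as an unsigned integer."""
    return (vec >> offset) & ((1 << width) - 1)


def reverse_bits(vec: int, size: int) -> int:
    """Mirror the bit order within a vector of `size` bits."""
    out = 0
    for i in range(size):
        if (vec >> i) & 1:
            out |= 1 << (size - 1 - i)
    return out


def splice(vec: int, size: int, offset: int, length: int, repl: int, repl_size: int):
    """Replace `length` bits at `offset` with the `repl_size` bits of `repl`.

    Returns the new vector and its new size, much like splice on an array.
    """
    if offset + length > size or repl >> repl_size:
        raise ValueError("interval or replacement out of range")
    low = vec & ((1 << offset) - 1)    # bits below the interval
    high = vec >> (offset + length)    # bits above the interval
    new_vec = low | (repl << offset) | (high << (offset + repl_size))
    return new_vec, size - length + repl_size
```

Note that the splice-style operation may change the length of the vector, which is what sets it apart from a plain interval copy.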

6 Characteristics (1/3)
- Internally written in C (thus fast)
- Relies on the CPU's machine word operations for maximum speed
- Auto-adapts to the size of the machine word at runtime
- Uses efficient algorithms (mostly "divide-and-conquer"); time complexity of many functions is O(1), O(n) or O(n ld n), where ld is the base-2 logarithm
- C library at the core can also be used stand-alone (without Perl)
- Free Software (GPL+Artistic), C library also LGPL

7 Characteristics (2/3) - Efficient Algorithms
- Example: Exponentiation (x^k)
  E.g. 27^13 (base 10), k = 13:
    = 27 * 27 * 27 * 27 * 27 * 27 * 27 * 27 * 27 * 27 * 27 * 27 * 27
  11011^1101 (base 2), n = int(ld k) = 3:
    = (11011^8)^1 * (11011^4)^1 * (11011^2)^0 * (11011^1)^1
  Worst case: 2n multiplications = O(n) = O(ld k) instead of k - 1 = O(k)
  (here: only 5 instead of 12)
- Example: Conversion to decimal representation
  Divides the bit vector by the largest power of 10 that fits into a machine word, then uses machine word math operations to break each remainder down further
- Example: Bit counting (number of set bits)
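The first two examples can be sketched in Python. This is an illustration of the general techniques (square-and-multiply, and chunked decimal conversion), not the module's C code; the function names, the multiplication counter and the 32-bit word size are assumptions made for the example.

```python
def power_by_squaring(base: int, exponent: int):
    """Compute base**exponent and count the multiplications actually performed."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = None      # no factor collected yet
    square = base      # holds base^(2^i) in round i
    multiplications = 0
    while exponent > 0:
        if exponent & 1:               # this binary digit of k is 1
            if result is None:
                result = square        # first factor: plain copy, no multiply
            else:
                result *= square
                multiplications += 1
        exponent >>= 1
        if exponent > 0:               # another digit follows: square once more
            square *= square
            multiplications += 1
    return (1 if result is None else result), multiplications


def to_decimal(value: int, word_bits: int = 32) -> str:
    """Convert a non-negative integer to decimal, one machine-word chunk at a time."""
    if value < 0:
        raise ValueError("sketch handles non-negative values only")
    chunk, digits = 1, 0
    while chunk * 10 < (1 << word_bits):   # largest power of 10 fitting into a word
        chunk *= 10
        digits += 1
    if value == 0:
        return "0"
    parts = []
    while value:
        value, remainder = divmod(value, chunk)   # one big division per chunk
        parts.append(remainder)                   # remainder fits into a word
    head = str(parts[-1])                         # most significant chunk, unpadded
    tail = "".join(f"{p:0{digits}d}" for p in reversed(parts[:-1]))
    return head + tail
```

For 27^13 the counter reports 5 multiplications (three squarings plus two products), matching the figure on the slide. With a 32-bit word the conversion peels off nine decimal digits per big division.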

8 Characteristics (3/3)
- Object-oriented interface, e.g. $vec1->Intersection($vec2,$vec3);
- Optionally (*) provides overloaded operators
  – one set of operators for set operations, e.g. $set1 = $set2 & $set3;
  – one set of operators for big integer math, e.g. $bigsum += $bigint;
(*) Will be optional in version 6.0 (for improved loading speed of the "plain" module); is always loaded now

9 Alternatives (1/2)
- vec()
  – confusing
  – insufficiently powerful for many applications
- PDL
  – complicated
  – designed primarily for astronomical data analysis and heavy duty number crunching (written in C, internally)
- Math::PARI
  – very powerful
  – requires separate C library "PARI"
- Math::BigInt (is in the Core of Perl 5.6)
  – slow (written entirely in Perl, stores digits in Perl arrays)
- Math::BigInteger
  – unmaintained, doesn't compile (uses XS and a C library)

10 Alternatives (2/2)
- Set::Bag - implements multisets
- Set::IntSpan - optimized for .newsrc file type sets (also supported by Bit::Vector, but needing more memory there)
- Set::Object - implements sets of arbitrary objects (can be simulated with Bit::Vector using a lookup table; set operations will then be faster)
- Set::Scalar - similar to Set::Object (?), but also allows recursion (sets of sets)
- Set::Window - optimized for intervals of integers (needs much less memory than Bit::Vector, but only of limited use since the whole interval is either in or out)

11 Simulating Set::Object using lookup table
See file "SetObject.pl"
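The file "SetObject.pl" is not reproduced in this transcript. As a rough idea of what a lookup-table approach can look like, here is a Python sketch under stated assumptions: every object gets a bit position the first time it is seen, and sets of objects then become bit vectors. The class and method names are hypothetical.

```python
class Universe:
    """Lookup table assigning one bit position to each distinct object."""

    def __init__(self):
        self._index = {}     # id(object) -> bit position
        self._objects = []   # bit position -> object (also keeps objects alive)

    def index_of(self, obj) -> int:
        """Return the bit position of `obj`, registering it on first use."""
        key = id(obj)        # identity-based, so unhashable objects work too
        if key not in self._index:
            self._index[key] = len(self._objects)
            self._objects.append(obj)
        return self._index[key]

    def encode(self, objects) -> int:
        """Turn an iterable of objects into a bit vector (a Python int)."""
        vec = 0
        for obj in objects:
            vec |= 1 << self.index_of(obj)
        return vec

    def decode(self, vec: int) -> list:
        """Turn a bit vector back into the list of objects it contains."""
        return [obj for i, obj in enumerate(self._objects) if (vec >> i) & 1]
```

Once encoded, union and intersection are single bitwise operations; the lookup table is only consulted when converting to and from objects.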

12 Some Applications
- Set::IntRange - sets of integers (universe = some interval)
- Math::MatrixBool - useful for graph algorithms (e.g. shortest paths / Kleene's Algorithm)
- Slice (multiple document version generator)
- Parse table generators for compiler-compilers à la "yacc" (calculating first, follow & lookahead character sets)
- Cryptography
- Easy manipulation of data (files), any number of bits at a time
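To show why boolean matrices built from bit vectors suit graph algorithms, here is a small Python sketch of reachability (transitive closure) using Warshall's algorithm, a boolean relative of the Kleene-style closure mentioned above. It does not use Math::MatrixBool; each matrix row is a Python integer, and the function name is an assumption.

```python
def transitive_closure(rows):
    """Warshall's algorithm on an adjacency matrix whose rows are bit vectors.

    Bit j of rows[i] is set when there is an edge from node i to node j.
    In the result, bit j of row i is set when j is reachable from i.
    """
    closure = list(rows)
    n = len(closure)
    for k in range(n):
        for i in range(n):
            if (closure[i] >> k) & 1:      # i reaches k ...
                closure[i] |= closure[k]   # ... so i reaches all that k reaches
    return closure


# Hypothetical example graph: 0 -> 1 -> 2, node 3 isolated.
graph = [0b0010, 0b0100, 0b0000, 0b0000]
reach = transitive_closure(graph)
```

The inner update ORs a whole row at once, so the work per step scales with the number of machine words per row rather than the number of nodes.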

13 Application "Slice"
See
  – homepage screenshot "Slice.bmp"
  – file "file.in"
  – file "Slice.txt"
  – file "file.html.en.OK"
  – file "file.html.de.OK"
  – URL http://www.engelschall.com/sw/slice/

14 Application "Date::Calc" v5.0 (coming soon)
- Stores years in bit vectors (one year = one bit vector, one day = one bit)
- Bit is "on" if the corresponding day is a holiday
- Performs calculations taking holidays into account
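The one-bit-per-day idea can be sketched in Python as follows. This is an illustration only and says nothing about how Date::Calc v5.0 is implemented; the helper names and the sample holidays are assumptions, and weekends are deliberately ignored to keep the sketch short.

```python
from datetime import date


def day_index(d: date) -> int:
    """Zero-based day of the year (January 1st is bit 0)."""
    return d.timetuple().tm_yday - 1


def holiday_vector(year: int, holidays) -> int:
    """Build one bit vector for `year`; a set bit marks a holiday."""
    vec = 0
    for d in holidays:
        if d.year != year:
            raise ValueError(f"{d} is not in {year}")
        vec |= 1 << day_index(d)
    return vec


def is_holiday(vec: int, d: date) -> bool:
    """Test the bit belonging to day `d`."""
    return bool((vec >> day_index(d)) & 1)


def count_holidays(vec: int, first: date, last: date) -> int:
    """Count holidays from `first` to `last` inclusive (both in the vector's year)."""
    lo, hi = day_index(first), day_index(last)
    if lo > hi:
        raise ValueError("first must not be after last")
    mask = ((1 << (hi - lo + 1)) - 1) << lo   # select the bits of the date range
    return bin(vec & mask).count("1")         # bit counting = number of holidays


# Hypothetical holiday calendar for the year 2000.
y2000 = holiday_vector(2000, [date(2000, 1, 1), date(2000, 12, 25), date(2000, 12, 26)])
```

Counting the holidays in a date range then reduces to masking an interval and counting set bits, two operations a bit vector library provides directly.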

15 Questions & Answers, Suggestions
- Please feel free to ask!
- Suggestions are welcome.

