1 Optimizing Squeak
Measuring the Speed of Squeak
  MessageTally and TimeProfileBrowser
Changes to improve speed
  Choose operations appropriately
  Choose collections to improve speed
    How collections work
  Build a primitive
    When building a primitive is useful/necessary
Coming soon: How the VM works and how to build primitives...
2/19/2019 Copyright 2000, Georgia Tech

2 MessageTally
MessageTally provides a variety of tools for analyzing your code.
time: returns the time in milliseconds that it took to do some operation.
  MessageTally time: [ timesRepeat: [4 * 4]] "44"
  MessageTally time: [ timesRepeat: [4.0 * 4.0]] "80"
  MessageTally time: [ timesRepeat: [4 * 4.0]] "76"
  MessageTally time: [ timesRepeat: [4.0 * 4]] "79"
Notice that floating-point operations take much more time than integer operations.
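The timing comparison above can be sketched as a workspace snippet. The repeat count in the slide did not survive, so the value of n below is an assumption chosen only for illustration; the variable names are also hypothetical, and absolute times will differ by machine and VM.

```smalltalk
"A minimal sketch: compare integer and float multiplication with MessageTally time:.
 n is an assumed repeat count, not the one used on the slide."
| n intTime floatTime |
n := 100000.
intTime := MessageTally time: [n timesRepeat: [4 * 4]].
floatTime := MessageTally time: [n timesRepeat: [4.0 * 4.0]].
Transcript
	show: 'integer: ', intTime printString, ' ms';
	show: '  float: ', floatTime printString, ' ms';
	cr.
```

Running the snippet several times and comparing the two numbers is more informative than any single run, since each result is a wall-clock measurement.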

3 What’s eating up the time?
Part of it: Floats are slower.
But the bigger part of it is the unpacking of Object -> Class (to figure out type) -> NativeFormat.
  MessageTally time: [10000 timesRepeat: [ * ]] "9"
  MessageTally time: [10000 timesRepeat: [ * ]] "8"
  MessageTally time: [10000 timesRepeat: [4 * ]] "9"
  MessageTally time: [10000 timesRepeat: [ * 4]] "7"
The multiply isn’t taking the most time. Most of the time is taken up by finding the type (like float) and getting it ready for the low-level operation. This is also called boxing and unboxing.

4 A Different Way to Look at Executing Code
At regular intervals, interrupt the executing process with a “spy” process.
Figure out which method is executing at that moment.
Reports:
  The “tree” of which methods called which other methods
  The percentage of time spent (over the whole tree) in each “leaf”
The percentages won’t be exact, since the spy doesn’t track exactly when each method started and finished.

5 MessageTally spyOn:
Does a “spy” on a process
Reports percentages of time
A “primitive” leaf is attributed to its method (one level up)
To trim the tree, <2% is not shown, but can be added into leaves
Example:
  MessageTally spyOn: [10000 timesRepeat: [ printString]]
By “primitive” we mean a machine language routine.

6 MessageTally spyOn: report (tree)
- 139 tallies, 2407 msec.
**Tree**
100.0 Float(Object)>>printString
74.1 Float(Number)>>printOn:
|74.1 Float>>printOn:base:
| Float>>absPrintOn:base:
| Character class>>digitValue:
| primitives
| False>>|
| Float(Number)>>ceiling
| |12.9 Float(Number)>>floor
| LimitedWriteStream(WriteStream)>>nextPut:
25.9 String class(SequenceableCollection class)>>streamContents:limitedTo:
17.3 LimitedWriteStream(WriteStream)>>contents
|15.1 String(SequenceableCollection)>>copyFrom:to:
| String(Object)>>species
7.2 LimitedWriteStream class(PositionableStream class)>>on:
5.8 LimitedWriteStream(WriteStream)>>on:
3.6 LimitedWriteStream(PositionableStream)>>on:
Each line gives the approximate percentage of time spent in a method, followed by the class, then >>, then the method name.

7 MessageTally spyOn: report (leaves)
**Leaves**
18.7 Character class>>digitValue:
18.0 False>>|
16.5 Float>>absPrintOn:base:
15.1 String(Object)>>species
12.9 Float(Number)>>floor
4.3 LimitedWriteStream(WriteStream)>>nextPut:
2.9 SmallInteger(Magnitude)>>max:

8 Where does the time go?
Notice in that example:
The biggest piece of the printString execution is the conversion of each individual digit to a character
  Character class>>digitValue:
But the second biggest is a logical Or.
  18.0 False>>|
Where is that happening?
Ask Mark what is happening here.

9 TimeProfileBrowser
TimeProfileBrowser also spies, like MessageTally
  TimeProfileBrowser onBlock: [10000 timesRepeat: [ printString]]
But it also acts as a code browser so that you can see each piece of code!

10 TimeProfileBrowser
There is a class False whose | (“or”) method just returns its argument. This is because the result of or-ing anything with false depends only on the other operand.

11 The Problem of Spying
Spying is inaccurate
  Run the same test several times: Different results each time!
  Have to run something often enough (e.g., 1000 timesRepeat:…) to catch the right methods
Alternative: accurate counts with tallySends:
  Uses the fact that Squeak’s VM is generated from a working simulation of the VM
  Actually simulates the VM to get perfectly accurate counts of how often each method is called
  Can also be useful for debugging: It’s a trace!
Run the test a few times and see that you get different results. Because spying interrupts the process and checks which method it is in, it won’t always return the same results.
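To make the contrast between sampling and exact counting concrete, here is a small workspace sketch that runs both tools on the same kind of expression. The receiver 3.14159 in the spyOn: block is an assumption (the slides' own receiver was lost); each call opens its own report window.

```smalltalk
"Sampling profile: a spy process interrupts the block at intervals,
 so the reported percentages vary from run to run.
 The receiver 3.14159 is an assumed example value."
MessageTally spyOn: [10000 timesRepeat: [3.14159 printString]].

"Exact send counts: the block is run in the simulator, which is far slower,
 so keep the block small (no timesRepeat: here)."
MessageTally tallySends: [3.14159 printString].
```

Evaluating the first expression a few times should show slightly different trees, while the second should report the same counts on every run.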

12 MessageTally tallySends: [3.14159 printString]
This simulation took 0.0 seconds.
**Tree**
2 Float(Object)>>printString
1 Float(Number)>>printOn:
|1 Float>>printOn:base:
| 1 Float>>absPrintOn:base:
| |7 SmallInteger>>*
| | |7 SmallInteger(Integer)>>*
| | | 7 Float>>adaptToInteger:andSend:
| |7 LimitedWriteStream(WriteStream)>>nextPut:
| |6 Character class>>digitValue:
The numbers show the number of times each method was called.

13 Measuring Squeak’s Speed
Now that we have tools for measuring Squeak, let’s start figuring out what’s slow and what’s fast.
What’s fast:
  Integer arithmetic is faster than floating point (expected)
  Special messages, coded into the bytecode:
    + - > < at: at:put: bitOr: bitAnd: class = == new value do: size
Each of the special messages above translates into a single bytecode.

14 The VM and Bytecodes
The VM (e.g., squeak.exe) interprets bytecodes
Bytecodes are the machine language of a virtual machine
  The “VM” is, strictly speaking, a “VM simulator” or “interpreter”
You can see the bytecodes for a method by doing “show bytecodes” from the code pane
Click on the “Source” button and choose “bytecodes” in the menu to see the bytecodes for the method.

15 Special Messages are fast lookups
Special messages, like +, actually map to a single bytecode
  One memory access, no lookup
Non-special messages involve passing a pointer to a memory location where the message selector is stored

16 A Word on Primitives
Primitives are the bottommost layer of the method hierarchy
  They are not defined in terms of bytecodes, but in terms of native code
  Think of them as subroutine calls into the VM
You can make up your own primitives!
  In the latest versions of Squeak, they can even be dynamically loaded

17 But much of speed is Squeak-level choices
Integers vs. floats and Squeak code vs. primitives are low-level VM decisions
Most of what determines fast or slow code is at the level of your Squeak code
  Choices in collections
  Algorithm coding

18 Brief review of Collections
Dictionary: Takes a key and a value, e.g., aDict at: 'dog' put: 'Rufus'.
Array: Just like any language
OrderedCollection: Like a Java Vector
Bag: You can add to it, and it remembers the number of identical elements
Set: You can add to it, and it remembers only the element
Look at the documentation for Bag in Squeak. It stores each distinct object once; if there are several identical elements, it stores one of them along with a count of how many there are.

19 Speed of Adding
Dictionaries are the most general indexed collection, but they’re also slow to add to.
  d := Dictionary new.
  MessageTally time: [1 to: do: [:i | d at: i put: i]]. "152"
  a := Array new:
  MessageTally time: [1 to: do: [:i | a at: i put: i]]. "2"
Arrays are much faster for inserting than Dictionaries.

20 OrderedCollections are only slow to grow
  oc := OrderedCollection new:
  MessageTally time: [1 to: do: [:i | oc add: i]]. "17"
  MessageTally time: [1 to: do: [:i | oc at: i put: i]]. "11"
Once an OrderedCollection is the right size, at:put: is within six times the speed of an Array (2 ms from the previous slide)
It’s slower because Array’s at:put: is a primitive, while OC’s checks bounds first

21 Why are OrderedCollections slow to grow?
add: newObject
  ^self addLast: newObject
addLast: newObject
  "Add newObject to the end of the receiver. Answer newObject."
  lastIndex = array size ifTrue: [self makeRoomAtLast].
  lastIndex := lastIndex + 1.
  array at: lastIndex put: newObject.
  ^ newObject
“makeRoomAtLast calls self grow…”

22 OrderedCollections double in size on each grow!
grow
  "Become larger. Typically, a subclass has to override this if the subclass adds instance variables."
  | newArray |
  newArray := Array new: self size + self growSize.
  newArray replaceFrom: 1 to: array size with: array startingAt: 1.
  array := newArray
growSize
  ^ array size max: 2 "returns the maximum of the array size or 2"
The growSize method returns the maximum of the array size and 2, so in grow the array will usually double in size.

23 Is that bad?
Think about the average case of adding to an OrderedCollection
  Most of the time it won’t need to grow
  Doubling in size means that you’ll not do it very often!
You’re trading off space for time, a classic tradeoff
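How rarely a doubling collection grows can be checked with a small model. The sketch below does not touch a real OrderedCollection; it only simulates a capacity that grows by (capacity max: 2), mirroring the growSize rule, starting from an assumed capacity of 2. All variable names are hypothetical.

```smalltalk
"Model of doubling growth: count how many grows and element copies
 are needed to add n elements one at a time.
 Starting capacity of 2 and n := 10000 are assumptions for illustration."
| n capacity grows copies |
n := 10000.
capacity := 2.
grows := 0.
copies := 0.
1 to: n do: [:i |
	i > capacity ifTrue: [
		copies := copies + capacity.               "existing elements are copied over"
		capacity := capacity + (capacity max: 2).  "same rule as growSize"
		grows := grows + 1]].
Transcript
	show: 'grows: ', grows printString;
	show: '  copies: ', copies printString;
	show: '  final capacity: ', capacity printString;
	cr.
```

In this model, 10000 adds trigger only 13 grows and 16382 element copies in total, fewer than two copies per added element, at the cost of a final capacity of 16384.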

24 Speed of Access
  MessageTally time: [1 to: do: [:i | d at: i]]. "Dictionary: 60"
  MessageTally time: [1 to: do: [:i | a at: i]]. "Array: 2"
  MessageTally time: [1 to: do: [:i | oc at: i]]. "OrderedCollection: 9"
When iterating by index, Arrays are the fastest, with OrderedCollections second and Dictionaries last.

25 SortedCollections are great but slow
SortedCollections keep their components sorted, but that comes at a cost (note that the runs below are an order of magnitude smaller than the previous ones)
  sc := SortedCollection new.
  MessageTally time: [1 to: 1000 do: [:i | sc add: i]]. "12"
  MessageTally time: [1 to: 1000 do: [:i | d at: i]]. "4"
SortedCollections are slower to add to than dictionaries.

26 Adding to Non-Sequenced Collections
  o := OrderedCollection new.
  MessageTally time: [1 to: do: [:i | o add: i]]. "14"
  s := Set new.
  MessageTally time: [1 to: do: [:i | s add: i]]. "113"
  b := Bag new.
  MessageTally time: [1 to: do: [:i | b add: i]]. "265"
OrderedCollections are the fastest, with Sets being much slower and Bags even slower.

27 Let’s find an element!
  MessageTally time: [10 timesRepeat: [o detect: [:n | n >= 5000]]]. "45"
  MessageTally time: [10 timesRepeat: [s detect: [:n | n >= 5000]]]. "48"
  MessageTally time: [10 timesRepeat: [b detect: [:n | n >= 5000]]]. "256"
Bags look unbearably slow! Why would you ever use one?
detect: walks through the contents and answers the first element for which the block returns true.

28 Iteration is the wrong way to find an element!
  MessageTally time: [100 timesRepeat: [o includes: 5000]]. "444"
  MessageTally time: [100 timesRepeat: [s includes: 5000]]. "0"
  MessageTally time: [100 timesRepeat: [b includes: 5000]]. "0"
Sets and bags are great for finding items.
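The lookup comparison can be reproduced end to end with a self-contained sketch. The collection size of 10000 is an assumption, since the slides' loop bounds were lost; the variable names are illustrative.

```smalltalk
"Compare includes: on a linear collection and on hashed collections.
 The size 10000 is an assumed value for illustration."
| o s b linearTime setTime bagTime |
o := OrderedCollection new.
s := Set new.
b := Bag new.
1 to: 10000 do: [:i | o add: i. s add: i. b add: i].
linearTime := MessageTally time: [100 timesRepeat: [o includes: 5000]].
setTime := MessageTally time: [100 timesRepeat: [s includes: 5000]].
bagTime := MessageTally time: [100 timesRepeat: [b includes: 5000]].
Transcript
	show: 'OrderedCollection: ', linearTime printString, ' ms';
	show: '  Set: ', setTime printString, ' ms';
	show: '  Bag: ', bagTime printString, ' ms';
	cr.
```

The OrderedCollection has to compare elements one by one until it reaches 5000, while the Set and Bag go straight to the slot chosen by the element's hash.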

29 How are Bags so fast? Dictionaries!
Bags are so fast because their implementation is actually a Dictionary (a hashtable)!
Dictionaries are not slow!
  They’re slow if you use them as arrays, and they’re slow to iterate across
  But for finding a specific element, they are blindingly fast!

30 Implementation of Bags
Bags have one instance variable, a Dictionary named contents
add: newObject
  ^self add: newObject withOccurrences: 1
add: newObject withOccurrences: anInteger
  "Add the element newObject to the receiver. Do so as though the element were added anInteger number of times. Answer newObject."
  contents at: newObject put: (contents at: newObject ifAbsent: [0]) + anInteger.
  ^ newObject
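A short usage sketch shows the effect of storing counts in a Dictionary: adding an equal element again only bumps its count. The example values are made up for illustration.

```smalltalk
"Bag keeps one entry per distinct element plus a count of occurrences."
| b |
b := Bag new.
b add: 'dog'; add: 'dog'; add: 'cat'.
b occurrencesOf: 'dog'.           "2"
b size.                           "3, the sum of all counts"
b add: 'cat' withOccurrences: 3.
b occurrencesOf: 'cat'.           "4"
b includes: 'bird'.               "false"
```

Because occurrencesOf: and includes: are dictionary lookups, their cost does not grow with the number of duplicates stored.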

31 Dictionaries are key to fast lookups
Dictionaries are used heavily in Squeak
  E.g., Smalltalk is a kind of Dictionary
Everything in Smalltalk (or Squeak) knows its own hash
Hash functions need to:
  Be fast
  Be unique for unique objects
  Capture how objects differ in actual practice

32 Some Sample Hash Functions
"Integer"
hash
  ^(self lastDigit bitShift: 8) + (self digitAt: 1)
"Float"
hash
  "Both words of the float are used; 8 bits are removed from each end to clear most of the exponent regardless of the byte ordering. (The bitAnd:'s ensure that the intermediate results do not become a large integer.) Slower than the original version in the ratios 12:5 to 2:1 depending on values. (DNS, 11 May, 1997)"
  ^ (((self basicAt: 1) bitAnd: 16r00FFFF00) + ((self basicAt: 2) bitAnd: 16r00FFFF00)) bitShift: -8

33 More hash functions
"Character"
hash
  ^value
"Point"
hash
  ^(x hash bitShift: 2) bitXor: y hash
"String"
hash
  | l m |
  (l _ m _ self size) <= 2
    ifTrue: [l = 2
      ifTrue: [m _ 3]
      ifFalse: [l = 1 ifTrue: [^((self at: 1) asciiValue bitAnd: 127) * 106].
        ^21845]].
  ^(self at: 1) asciiValue * 48 + ((self at: (m - 1)) asciiValue + l)

34 Summary
Lots of ways to time/trace in Squeak
  MessageTally and TimeProfileBrowser
Making things fast in Squeak
  Choose data types wisely
  Use primitives
  Code wisely
    Arrays vs. hashing: for iteration, arrays; for finding, hashing

