Overheads for Computers as Components, 2nd ed.

1 Introduction
What are embedded computing systems? Challenges in embedded computing system design. Design methodologies. © 2008 Wayne Wolf Overheads for Computers as Components, 2nd ed.

2 Definition
Embedded computing system: any device that includes a programmable computer but is not itself a general-purpose computer. Take advantage of application characteristics to optimize the design: don’t need all the general-purpose bells and whistles.

3 Embedding a computer
[Figure: an embedded computer containing a CPU and memory (mem), with analog input and analog output.]

4 Examples
Cell phone. Printer. Automobile: engine, brakes, dash, etc. Airplane: engine, flight controls, nav/comm. Digital television. Household appliances.

5 Early history
Late 1940’s: MIT Whirlwind computer was designed for real-time operations. Originally designed to control an aircraft simulator. First microprocessor was the Intel 4004 in the early 1970’s. HP-35 calculator used several chips to implement a microprocessor in 1972.

6 Early history, cont’d.
Automobiles used microprocessor-based engine controllers starting in the 1970’s. Control fuel/air mixture, engine timing, etc. Multiple modes of operation: warm-up, cruise, hill climbing, etc. Provides lower emissions, better fuel efficiency.

7 Microprocessor varieties
Microcontroller: includes I/O devices, on-board memory. Digital signal processor (DSP): microprocessor optimized for digital signal processing. Typical embedded word sizes: 8-bit, 16-bit, 32-bit.

8 Application examples
Simple control: front panel of microwave oven, etc. Canon EOS 3 has three microprocessors. 32-bit RISC CPU runs autofocus and eye control systems. Digital TV: programmable CPUs + hardwired logic for video/audio decode, menus, etc.

9 Automotive embedded systems
Today’s high-end automobile may have 100 microprocessors: 4-bit microcontroller checks seat belt; microcontrollers run dashboard devices; 16/32-bit microprocessor controls engine.

10 BMW 850i brake and stability control system
Anti-lock brake system (ABS): pumps brakes to reduce skidding. Automatic stability control (ASC+T): controls engine to improve stability. ABS and ASC+T communicate. ABS was introduced first, so ASC+T needed to interface to the existing ABS module.

11 BMW 850i, cont’d.
[Figure: ABS unit and hydraulic pump connected to four brakes, each brake with its own sensor.]

12 Characteristics of embedded systems
Sophisticated functionality. Real-time operation. Low manufacturing cost. Low power. Designed to tight deadlines by small teams.

13 Functional complexity
Often have to run sophisticated algorithms or multiple algorithms. Cell phone, laser printer. Often provide sophisticated user interfaces.

14 Real-time operation
Must finish operations by deadlines. Hard real time: missing deadline causes failure. Soft real time: missing deadline results in degraded performance. Many systems are multi-rate: must handle operations at widely varying rates.
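The hard/soft distinction can be made concrete with a small sketch. The names `Deadline` and `deadline_outcome`, and the use of milliseconds, are illustrative assumptions rather than anything defined in the slides.

```python
from enum import Enum


class Deadline(Enum):
    HARD = "hard"   # a miss counts as a system failure
    SOFT = "soft"   # a miss only degrades performance


def deadline_outcome(finish_ms, deadline_ms, kind):
    """Classify one completed operation against its deadline."""
    if finish_ms <= deadline_ms:
        return "ok"
    # Missing by even a little is a miss; only the consequence differs.
    return "failure" if kind is Deadline.HARD else "degraded"
```

For example, an operation that finishes at 10.5 ms against a 10 ms deadline is reported as `"failure"` for a hard deadline and `"degraded"` for a soft one.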

15 Non-functional requirements
Many embedded systems are mass-market items that must have low manufacturing costs. Limited memory, microprocessor power, etc. Power consumption is critical in battery-powered devices. Excessive power consumption increases system cost even in wall-powered devices.

16 Design teams
Often designed by a small team of designers. Often must meet tight deadlines. 6-month market window is common. Can’t miss back-to-school window for calculator.

17 Why use microprocessors?
Alternatives: field-programmable gate arrays (FPGAs), custom logic, etc. Microprocessors are often very efficient: can use same logic to perform many different functions. Microprocessors simplify the design of families of products.

18 The performance paradox
Microprocessors use much more logic to implement a function than does custom logic. But microprocessors are often at least as fast: heavily pipelined; large design teams; aggressive VLSI technology.

19 Power
Custom logic uses less power, but CPUs have advantages: Modern microprocessors offer features to help control power consumption. Software design techniques can help reduce power consumption. Heterogeneous systems: some custom logic for well-defined functions, CPUs+software for everything else.

20 Platforms
Embedded computing platform: hardware architecture + associated software. Many platforms are multiprocessors. Examples: Single-chip multiprocessors for cell phone baseband. Automotive network + processors.

21 The physics of software
Computing is a physical act. Software doesn’t do anything without hardware. Executing software consumes energy, requires time. To understand the dynamics of software (time, energy), we need to characterize the platform on which the software runs.

22 What does “performance” mean?
In general-purpose computing, performance often means average-case performance and may not be well-defined. In real-time systems, performance means meeting deadlines. Missing the deadline by even a little is bad. Finishing ahead of the deadline may not help.

23 Characterizing performance
We need to analyze the system at several levels of abstraction to understand performance: CPU. Platform. Program. Task. Multiprocessor.

24 Challenges in embedded system design
How much hardware do we need? How big is the CPU? Memory? How do we meet our deadlines? Faster hardware or cleverer software? How do we minimize power? Turn off unnecessary logic? Reduce memory accesses?

25 Challenges, etc.
Does it really work? Is the specification correct? Does the implementation meet the spec? How do we test for real-time characteristics? How do we test on real data? How do we work on the system? Observability, controllability? What is our development platform?

26 Design methodologies
A procedure for designing a system. Understanding your methodology helps you ensure you didn’t skip anything. Compilers, software engineering tools, computer-aided design (CAD) tools, etc., can be used to: help automate methodology steps; keep track of the methodology itself.

27 Design goals
Performance. Overall speed, deadlines. Functionality and user interface. Manufacturing cost. Power consumption. Other requirements (physical size, etc.)

28 Levels of abstraction
Requirements. Specification. Architecture. Component design. System integration.

29 Top-down vs. bottom-up
Top-down design: start from most abstract description; work to most detailed. Bottom-up design: work from small components to big system. Real design uses both techniques.

30 Stepwise refinement
At each level of abstraction, we must: analyze the design to determine characteristics of the current state of the design; refine the design to add detail.

31 Requirements
Plain language description of what the user wants and expects to get. May be developed in several ways: talking directly to customers; talking to marketing representatives; providing prototypes to users for comment.

32 Functional vs. non-functional requirements
Functional requirements: output as a function of input. Non-functional requirements: time required to compute output; size, weight, etc.; power consumption; reliability; etc.

33 Our requirements form
[Figure: requirements form.]

34 Example: GPS moving map requirements
Moving map obtains position from GPS, paints map from local database. [Figure: moving map display showing I-78 and Scotch Road with latitude (lat) and longitude (lon) readouts.]

35 GPS moving map needs
Functionality: For automotive use. Show major roads and landmarks. User interface: At least 400 x 600 pixel screen. Three buttons max. Pop-up menu. Performance: Map should scroll smoothly. No more than 1 sec power-up. Lock onto GPS within 15 seconds. Cost: $120 street price = approx. $30 cost of goods sold.

36 GPS moving map needs, cont’d.
Physical size/weight: Should fit in hand. Power consumption: Should run for 8 hours on four AA batteries.
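The cost and power requirements translate into rough budgets with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the AA cell figures (1.5 V, 2.0 Ah) are assumed typical values that are not given in the slides, and the factor of 4 is simply the ratio implied by the $120 and $30 figures in the requirements.

```python
def average_power_budget_w(cells, cell_voltage_v, cell_capacity_ah, runtime_h):
    """Average power the system may draw to last runtime_h on the batteries."""
    energy_wh = cells * cell_voltage_v * cell_capacity_ah  # total stored energy
    return energy_wh / runtime_h


def cost_of_goods_budget(street_price, price_to_cost_ratio=4.0):
    """Cost-of-goods target implied by a street price (ratio is assumed)."""
    return street_price / price_to_cost_ratio


# Assumed cells: four AA batteries at 1.5 V and 2.0 Ah each, 8-hour runtime.
power_w = average_power_budget_w(4, 1.5, 2.0, 8.0)
cogs = cost_of_goods_budget(120.0)
```

Under these assumptions the design has about 1.5 W on average to spend across the display, CPU, and GPS receiver, and about $30 for all parts.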

37 GPS moving map requirements form
[Figure: requirements form for the GPS moving map.]

38 Specification
A more precise description of the system: should not imply a particular architecture; provides input to the architecture design process. May include functional and non-functional elements. May be executable or may be in mathematical form for proofs.

39 GPS specification
Should include: what is received from GPS; map data; user interface; operations required to satisfy user requests; background operations needed to keep the system running.

40 Architecture design
What major components go into satisfying the specification? Hardware components: CPUs, peripherals, etc. Software components: major programs and their operations. Must take into account functional and non-functional specifications.

41 GPS moving map block diagram
[Figure: block diagram with GPS receiver, search engine, renderer, database, user interface, and display.]

42 GPS moving map hardware architecture
[Figure: hardware architecture with CPU, memory, GPS receiver, panel I/O, frame buffer, and display.]

43 GPS moving map software architecture
[Figure: software architecture relating position, database search, renderer, and pixels, with user interface and timer.]

44 Designing hardware and software components
Must spend time architecting the system before you start coding. Some components are ready-made, some can be modified from existing designs, others must be designed from scratch.

45 System integration
Put together the components. Many bugs appear only at this stage. Have a plan for integrating components to uncover bugs quickly, test as much functionality as early as possible.

46 Summary
Embedded computers are all around us. Many systems have complex embedded hardware and software. Embedded systems pose many design challenges: design time, deadlines, power, etc. Design methodologies help us manage the design process.

47 Introduction
Object-oriented design. Unified Modeling Language (UML).

48 System modeling
Need languages to describe systems: useful across several levels of abstraction; understandable within and between organizations. Block diagrams are a start, but don’t cover everything.

49 Object-oriented design
Object-oriented (OO) design: A generalization of object-oriented programming. Object = state + methods. State provides each object with its own identity. Methods provide an abstract interface to the object.

50 Objects and classes
Class: object type. Class defines the object’s state elements but state values may change over time. Class defines the methods used to interact with all objects of that type. Each object has its own state.

51 OO design principles
Some objects will closely correspond to real-world objects. Some objects may be useful only for description or implementation. Objects provide interfaces to read/write state, hiding the object’s implementation from the rest of the system.

52 UML
Developed by Booch et al. Goals: object-oriented; visual; useful at many levels of abstraction; usable for all aspects of design.

53 UML object
[Figure: UML object d1: Display (object name: class name) with attributes pixels: array[] of pixels, elements, and menu_items; a comment notes that pixels is a 2-D array.]

54 UML class
[Figure: UML class Display (class name) with attributes pixels, elements, menu_items and operations mouse_click(), draw_box.]

55 The class interface
The operations provide the abstract interface between the class’s implementation and other classes. Operations may have arguments, return values. An operation can examine and/or modify the object’s state.
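As a rough illustration of operations forming the interface, here is a minimal sketch of the Display class used in the UML slides. The attribute and operation names come from the slides; the `last_click` attribute and the meaning of the `draw_box` arguments (x, y, width, height, color) are assumptions made for the example.

```python
class Display:
    """Illustrative Display class; not the author's implementation."""

    def __init__(self, width, height):
        # State: the attributes named on the UML slides.
        self.pixels = [[0] * width for _ in range(height)]  # 2-D array
        self.elements = []
        self.menu_items = []
        self.last_click = None  # hypothetical extra state for this sketch

    def mouse_click(self, x, y, button):
        """Operation that modifies state: remember the most recent click."""
        self.last_click = (x, y, button)

    def draw_box(self, x, y, width, height, color):
        """Operation with arguments: fill a box (argument meaning assumed)."""
        for row in range(y, y + height):
            for col in range(x, x + width):
                self.pixels[row][col] = color

    def pixel(self, x, y):
        """Operation that only examines state and returns a value."""
        return self.pixels[y][x]
```

Callers use only `mouse_click`, `draw_box`, and `pixel`; how the pixels are stored stays hidden behind those operations.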

56 Choose your interface properly
If the interface is too small/specialized: object is hard to use for even one application; even harder to reuse. If the interface is too large: class becomes too cumbersome for designers to understand; implementation may be too slow; spec and implementation are probably buggy.

57 Relationships between objects and classes
Association: objects communicate but one does not own the other. Aggregation: a complex object is made of several smaller objects. Composition: aggregation in which owner does not allow access to its components. Generalization: define one class in terms of another.

58 Class derivation
May want to define one class in terms of another. Derived class inherits attributes, operations of base class. [Figure: UML generalization arrow from Derived_class to Base_class.]

59 Class derivation example
[Figure: base class Display (attributes pixels, elements, menu_items; operations pixel(), set_pixel(), mouse_click(), draw_box) with derived classes BW_display and Color_map_display.]
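The derivation in this example could look roughly like the sketch below. The class names follow the slide, adapted to Python naming (`BWDisplay`, `ColorMapDisplay`); the behavior given to each derived class, and the `color_map` and `color_at` names, are assumptions chosen only to show inheritance and overriding.

```python
class Display:
    """Base class: state and operations shared by all displays."""

    def __init__(self, width, height):
        self.pixels = [[0] * width for _ in range(height)]
        self.elements = []
        self.menu_items = []

    def pixel(self, x, y):
        return self.pixels[y][x]

    def set_pixel(self, x, y, value):
        self.pixels[y][x] = value


class BWDisplay(Display):
    """Derived class: inherits everything, restricts pixels to 0 or 1."""

    def set_pixel(self, x, y, value):
        super().set_pixel(x, y, 1 if value else 0)


class ColorMapDisplay(Display):
    """Derived class: pixels are indices into an added color map."""

    def __init__(self, width, height, color_map):
        super().__init__(width, height)
        self.color_map = color_map

    def color_at(self, x, y):
        return self.color_map[self.pixel(x, y)]
```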

60 Multiple inheritance
[Figure: derived class Multimedia_display inherits from base classes Speaker and Display.]
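A minimal sketch of the same structure in code, assuming hypothetical attributes (`volume`, `menu_items`) and a hypothetical `set_volume` operation for the two base classes:

```python
class Speaker:
    def __init__(self):
        self.volume = 0

    def set_volume(self, level):
        self.volume = level


class Display:
    def __init__(self):
        self.menu_items = []


class MultimediaDisplay(Speaker, Display):
    """Derived class that inherits from both base classes."""

    def __init__(self):
        # Initialize each base class explicitly so both sets of state exist.
        Speaker.__init__(self)
        Display.__init__(self)
```

A `MultimediaDisplay` object can be used wherever either a `Speaker` or a `Display` is expected.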

61 Links and associations
Link: describes relationships between objects. Association: describes relationship between classes.

62 Link example
Link defines the contains relationship. [Figure: a message set object (count = 2) linked to two message objects (msg = msg1, length = 1102; msg = msg2, length = 2114).]

63 Association example
[Figure: class message (msg: ADPCM_stream, length: integer) and class message set (count: integer) joined by the contains association; 0..* contained messages, 1 containing message set.]
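One way to picture the contains association in code is sketched below. The attribute names follow the slide; representing the ADPCM stream as `bytes`, deriving `count` from the list of contained messages, and the `add` method are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    msg: bytes   # stands in for the ADPCM_stream attribute
    length: int


@dataclass
class MessageSet:
    # One message set contains zero or more messages (0..*).
    messages: list = field(default_factory=list)

    @property
    def count(self):
        return len(self.messages)

    def add(self, message):
        self.messages.append(message)


# Objects and links matching the link example: two messages in one set.
message_set = MessageSet()
message_set.add(Message(msg=b"msg1", length=1102))
message_set.add(Message(msg=b"msg2", length=2114))
```

The association is the rule stated by the classes; the two `add` calls create the actual links between objects.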

64 Stereotypes
Stereotype: recurring combination of elements in an object or class. Example: <<foo>>

65 Behavioral description
Several ways to describe behavior: internal view; external view.

66 State machines
[Figure: two states with state names a and b, joined by a transition.]

67 Event-driven state machines
Behavioral descriptions are written as event-driven state machines. Machine changes state when receiving an input. An event may come from inside or outside of the system.

68 Types of events
Signal: asynchronous event. Call: synchronized communication. Timer: activated by time.

69 Signal event
[Figure: declaration of <<signal>> mouse_click with attributes leftorright: button and x, y: position; event description showing a transition from state a to state b on mouse_click(x,y,button).]

70 Call event
[Figure: transition from state c to state d on the call draw_box(10,5,3,2,blue).]

71 Timer event
[Figure: transition from state e to state f on tm(time-value).]

72 Example state machine
[Figure: state machine running from start to finish, with transitions labeled input/output. mouse_click(x,y,button)/find_region(region) leads to region found. From there, region = menu/which_menu(i) leads to got menu item, and call_menu(i) leads to called menu item; region = drawing/find_object(objid) leads to found object, and highlight(objid) leads to object highlighted.]
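An event-driven state machine like this one can be sketched as a transition table. The state and action names below are taken from the slide; the exact arrows are one reading of the flattened diagram, and the event strings, the `TRANSITIONS` table, and the `run` function are assumptions for illustration. A key with event `None` stands for a transition taken without waiting for input.

```python
# (state, event) -> (action, next_state)
TRANSITIONS = {
    ("start", "mouse_click"): ("find_region", "region found"),
    ("region found", "region=menu"): ("which_menu", "got menu item"),
    ("got menu item", None): ("call_menu", "called menu item"),
    ("called menu item", None): (None, "finish"),
    ("region found", "region=drawing"): ("find_object", "found object"),
    ("found object", None): ("highlight", "object highlighted"),
    ("object highlighted", None): (None, "finish"),
}


def run(events):
    """Feed events to the machine; return the final state and actions taken."""
    state, actions = "start", []
    pending = list(events)
    while state != "finish":
        if (state, None) in TRANSITIONS:
            key = (state, None)            # no input needed for this step
        elif pending:
            key = (state, pending.pop(0))  # consume the next event
        else:
            break                          # waiting for more input
        action, state = TRANSITIONS[key]
        if action:
            actions.append(action)
    return state, actions
```

For instance, `run(["mouse_click", "region=menu"])` walks the menu branch, while `run(["mouse_click"])` stops in `region found` because no further event has arrived.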

73 Sequence diagram
Shows sequence of operations over time. Relates behaviors of multiple objects.

74 Sequence diagram example
[Figure: sequence diagram with objects m: Mouse, d1: Display, and u: Menu; over time, the messages are mouse_click(x,y,button), which_menu(x,y,i), and call_menu(i).]

75 Summary
Object-oriented design helps us organize a design. UML is a transportable system design language. Provides structural and behavioral description primitives.

76 Introduction
Example: model train controller. © 2000 Morgan Kaufman Overheads for Computers as Components 2nd ed.

77 Purposes of example
Follow a design through several levels of abstraction. Gain experience with UML.

78 Model train setup
[Figure: a console with power supply, and a train with receiver (rcvr) and motor; the message format is header, address, command, ECC.]

79 Requirements
Console can control 8 trains on 1 track. Throttle has at least 63 levels. Inertia control adjusts responsiveness with at least 8 levels. Emergency stop button. Error detection scheme on messages.

80 Requirements form
[Figure: requirements form for the model train controller.]

81 Digital Command Control
DCC created by model railroad hobbyists, picked up by industry. Defines way in which model trains, controllers communicate. Leaves many system design aspects open, allowing competition. This is a simple example of a big trend: Cell phones, digital TV rely on standards.

82 DCC documents
Standard S-9.1, DCC Electrical Standard. Defines how bits are encoded on the rails. Standard S-9.2, DCC Communication Standard. Defines packet format and semantics.

83 DCC electrical standard
Voltage moves around the power supply voltage; adds no DC component. 1 is 58 µs, 0 is at least 100 µs. [Figure: waveform over time showing logic 1 (58 µs) and logic 0 (>= 100 µs).]
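A receiver can recover bits by measuring pulse durations. The sketch below takes the two DCC timings in microseconds (58 for a 1, at least 100 for a 0); the function names and the 6 µs tolerance around the nominal 1 duration are assumptions for illustration, not values quoted from the standard.

```python
ONE_US = 58         # nominal duration of a logic 1
ZERO_MIN_US = 100   # minimum duration of a logic 0


def decode_bit(duration_us, tolerance_us=6):
    """Return 1, 0, or None (invalid) for one measured pulse duration."""
    if abs(duration_us - ONE_US) <= tolerance_us:
        return 1
    if duration_us >= ZERO_MIN_US:
        return 0
    return None  # neither a valid 1 nor a valid 0


def decode_stream(durations_us):
    """Decode a list of measured durations into bits."""
    return [decode_bit(d) for d in durations_us]
```

Because a 0 only has a minimum length, any sufficiently long pulse decodes as 0, while durations between the two windows are rejected.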

84 DCC communication standard
Basic packet format: PSA(sD)+E. P: preamble. S: packet start bit = 0. A: address data byte. s: data byte start bit. D: data byte (data payload). E: packet end bit = 1.

85 DCC packet types
Baseline packet: minimum packet that must be accepted by all DCC implementations. Address data byte gives receiver address. Instruction data byte gives basic instruction. Error correction data byte gives ECC.
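The PSA(sD)+E format can be illustrated by assembling the bits of a baseline packet. The packet start bit (0) and packet end bit (1) are as given in the slides. Everything else here is an assumption based on a general understanding of DCC and should be checked against Standard S-9.2: a preamble of 1 bits (length 14 chosen arbitrarily), a data byte start bit of 0, most-significant-bit-first ordering, and a final byte computed as the XOR of the address and instruction bytes. The function name is hypothetical.

```python
def baseline_packet_bits(address, instruction, preamble_len=14):
    """Return a baseline packet as a list of bits, in PSA(sD)+E order."""
    check = address ^ instruction             # assumed: XOR check byte
    bits = [1] * preamble_len                 # P: preamble (assumed all 1s)
    bits.append(0)                            # S: packet start bit = 0
    for index, byte in enumerate((address, instruction, check)):
        if index > 0:
            bits.append(0)                    # s: data byte start bit (assumed 0)
        # A or D: eight data bits, most significant bit first (assumed)
        bits.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
    bits.append(1)                            # E: packet end bit = 1
    return bits
```

With a 14-bit preamble the packet is 42 bits long: 14 + 1 + 8 + 9 + 9 + 1.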

86 Conceptual specification
Before we create a detailed specification, we will make an initial, simplified specification. Gives us practice in specification and UML. Good idea in general to identify potential problems before investing too much effort in detail.

87 Basic system commands
[Table: basic system commands.]

88 Typical control sequence
[Figure: sequence diagram between :console and :train_rcvr; messages in order: set-inertia, set-speed, set-speed, estop, set-speed.]

89 Message classes
[Figure: base class command with derived classes set-speed (value: integer), set-inertia (value: unsigned-integer), and estop.]

90 Roles of message classes
Implemented message classes derived from message class. Attributes and operations will be filled in for detailed specification. Implemented message classes specify message type by their class. May have to add type as parameter to data structure in implementation.

91 Subsystem collaboration diagram
Shows relationship between console and receiver (ignores role of track). [Figure: :console sends 1..n: command to :receiver.]

92 System structure modeling
Some classes define non-computer components. Denote by *name. Choose important systems at this point to show basic relationships.

93 Major subsystem roles
Console: read state of front panel; format messages; transmit messages. Train: receive message; interpret message; control the train.

94 Console system classes
[Figure: UML class diagram relating panel, formatter, and transmitter through one-to-one associations; physical classes receiver* and sender* are also shown.]

95 Console class roles
panel: describes analog knobs and interface hardware. formatter: turns knob settings into bit streams. transmitter: sends data on track.

96 Train system classes
[Figure: UML class diagram in which a train set contains 1..t trains; each train has one motor interface, one receiver, and one controller; physical classes detector* and pulser* are also shown.]

97 Train class roles
receiver: digitizes signal from track. controller: interprets received commands and makes control decisions. motor interface: generates signals required by motor.

98 Detailed specification
We can now fill in the details of the conceptual specification: more classes; behaviors. Sketching out the spec first helps us understand the basic relationships in the system.

99 Train speed control
Motor controlled by pulse width modulation. [Figure: pulse waveform of voltage V with the duty cycle marked.]
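With pulse width modulation, the motor sees an average voltage proportional to the duty cycle. The sketch below shows that relationship and how a duty cycle might map to a timer setting; the function names, the 12 V supply, and the 256-tick period are illustrative assumptions.

```python
def average_voltage(duty_cycle, supply_v):
    """Average voltage delivered by a PWM signal (duty_cycle in 0.0..1.0)."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty cycle must be between 0 and 1")
    return duty_cycle * supply_v


def pulse_width_ticks(duty_cycle, period_ticks):
    """Number of timer ticks the output stays high in each period."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty cycle must be between 0 and 1")
    return round(duty_cycle * period_ticks)


# Assumed example values: 12 V supply, 256-tick PWM period.
quarter_speed_v = average_voltage(0.25, 12.0)
half_speed_ticks = pulse_width_ticks(0.5, 256)
```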

100 Console physical object classes
knobs*: train-knob: integer; speed-knob: integer; inertia-knob: unsigned-integer; emergency-stop: boolean. pulser*: pulse-width: unsigned-integer; direction: boolean. sender*: send-bit(). detector*: read-bit() : integer.

101 Panel and motor interface classes
panel: train-number() : integer; speed() : integer; inertia() : integer; estop() : boolean; new-settings(). motor-interface: speed: integer.

102 Class descriptions
panel class defines the controls. new-settings() behavior reads the controls. motor-interface class defines the motor speed held as state.

103 Transmitter and receiver classes
transmitter: send-speed(adrs: integer, speed: integer); send-inertia(adrs: integer, val: integer); set-estop(adrs: integer). receiver: current: command; new: boolean; read-cmd(); new-cmd() : boolean; rcv-type(msg-type: command); rcv-speed(val: integer); rcv-inertia(val: integer).

104 Class descriptions
transmitter class has one behavior for each type of message sent. receiver class provides methods to: detect a new message; determine its type; read its parameters (estop has no parameters).

105 Formatter class
formatter: current-train: integer; current-speed[ntrains]: integer; current-inertia[ntrains]: unsigned-integer; current-estop[ntrains]: boolean; send-command(); panel-active() : boolean; operate().

106 Formatter class description
Formatter class holds state for each train, setting for current train. The operate() operation performs the basic formatting task.

107 Control input cases
Use a soft panel to show current panel settings for each train. Changing train number: must change soft panel settings to reflect current train’s speed, etc. Controlling throttle/inertia/estop: read panel, check for changes, perform command.

108 Control input sequence diagram
[Figure: sequence diagram among :knobs, :panel, :formatter, and :transmitter. On a change in speed/inertia/estop: change in control settings, read panel, panel-active, panel settings, send-command, then send-speed, send-inertia, or send-estop. On a change in train number: read panel, train number, change in panel settings, new-settings, set-knobs.]

109 Formatter operate behavior
[Figure: state diagram: from idle, panel-active() leads to update-panel() on a new train number and to send-command() otherwise.]

110 Panel-active behavior
[Figure: flowchart: test panel*:read-train(); if true (T), current-train = train-knob, update-screen, changed = true. Then test panel*:read-speed(); if true, current-speed = throttle, changed = true. The remaining controls follow the same pattern.]

111 Controller class
controller: current-train: integer; current-speed[ntrains]: integer; current-direction[ntrains]: boolean; current-inertia[ntrains]: unsigned-integer; operate(); issue-command().

112 Setting the speed
Don’t want to change speed instantaneously. Controller should change speed gradually by sending several commands.
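Gradual speed change can be sketched as a sequence of intermediate settings, each of which would become one command to the motor interface. The function name `speed_steps` and the `max_step` parameter are assumptions; in a real controller the step size would presumably be derived from the inertia setting.

```python
def speed_steps(current, target, max_step):
    """Return intermediate speeds from current to target, max_step at a time."""
    if max_step <= 0:
        raise ValueError("max_step must be positive")
    steps = []
    while current != target:
        # Move toward the target, but never by more than max_step.
        delta = max(-max_step, min(max_step, target - current))
        current += delta
        steps.append(current)
    return steps
```

For example, going from speed 0 to 10 with a step limit of 4 produces three settings, and slowing down works the same way in the other direction.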

113 Sequence diagram for set-speed command
[Figure: sequence diagram among :receiver, :controller, :motor-interface, and :pulser*; new-cmd, cmd-type, and rcv-speed are followed by set-speed and then a series of set-pulse messages.]

114 Controller operate behavior
[Figure: state diagram: wait for a command from receiver, then receive-command(), then issue-command().]

115 Refined command classes
command: type: 3-bits; address: 3-bits; parity: 1-bit. set-speed: type=010; value: 7-bits. set-inertia: type=001; value: 3-bits. estop: type=000.
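The field widths and type codes above are enough to sketch a bit-level encoder. The widths and codes come from the slide; the field order (type, address, value, parity), the use of even parity over the preceding bits, and the names `encode_command`, `TYPE_CODES`, and `VALUE_BITS` are assumptions made for illustration.

```python
TYPE_CODES = {"set-speed": 0b010, "set-inertia": 0b001, "estop": 0b000}
VALUE_BITS = {"set-speed": 7, "set-inertia": 3, "estop": 0}


def encode_command(kind, address, value=0):
    """Pack a command into an integer; return (word, total_bits)."""
    nbits = VALUE_BITS[kind]
    if not 0 <= address < 8:
        raise ValueError("address must fit in 3 bits")
    if not 0 <= value < (1 << nbits):
        raise ValueError("value does not fit in this command's value field")
    word = (TYPE_CODES[kind] << 3) | address   # type (3 bits), address (3 bits)
    word = (word << nbits) | value             # value (0, 3, or 7 bits)
    parity = bin(word).count("1") & 1          # assumed: even parity bit
    return (word << 1) | parity, 3 + 3 + nbits + 1
```

A 3-bit address gives eight distinct train addresses, which matches the requirement that the console control 8 trains.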

116 Summary
Separate specification and programming. Small mistakes are easier to fix in the spec. Big mistakes in programming cost a lot of time. You can’t completely separate specification and architecture. Make a few tasteful assumptions.

117 Instruction sets
Computer architecture taxonomy. Assembly language.

118 von Neumann architecture
Memory holds data, instructions. Central processing unit (CPU) fetches instructions from memory. Separate CPU and memory distinguishes programmable computer. CPU registers help out: program counter (PC), instruction register (IR), general-purpose registers, etc. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

119 Overheads for Computers as Components 2nd ed.
CPU + memory memory address CPU PC 200 data 200 ADD r5,r1,r3 ADD r5,r1,r3 IR © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

120 Overheads for Computers as Components 2nd ed.
Harvard architecture address CPU data memory data PC address program memory data © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

121 Overheads for Computers as Components 2nd ed.
von Neumann vs. Harvard Harvard can’t use self-modifying code. Harvard allows two simultaneous memory fetches. Most DSPs use Harvard architecture for streaming data: greater memory bandwidth; more predictable bandwidth. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

122 Overheads for Computers as Components 2nd ed.
RISC vs. CISC Complex instruction set computer (CISC): many addressing modes; many operations. Reduced instruction set computer (RISC): load/store; pipelinable instructions. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

123 Instruction set characteristics
Fixed vs. variable length. Addressing modes. Number of operands. Types of operands. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

124 Overheads for Computers as Components 2nd ed.
Programming model Programming model: registers visible to the programmer. Some registers are not visible (IR). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

125 Multiple implementations
Successful architectures have several implementations: varying clock speeds; different bus widths; different cache sizes; etc. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

126 Overheads for Computers as Components 2nd ed.
Assembly language One-to-one with instructions (more or less). Basic features: One instruction per line. Labels provide names for addresses (usually in first column). Instructions often start in later columns. Comments run to end of line. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

127 ARM assembly language example
label1  ADR r4,c
        LDR r0,[r4]    ; a comment
        ADR r4,d
        LDR r1,[r4]
        SUB r0,r0,r1   ; comment
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

128 Overheads for Computers as Components 2nd ed.
Pseudo-ops Some assembler directives don’t correspond directly to instructions: Define current address. Reserve storage. Constants. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

129 Overheads for Computers as Components 2nd ed.
CPUs Input and output. Supervisor mode, exceptions, traps. Co-processors. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

130 Overheads for Computers as Components 2nd ed.
I/O devices Usually includes some non-digital component. Typical digital interface to CPU: status reg CPU mechanism data reg © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

131 Overheads for Computers as Components 2nd ed.
Application: 8251 UART Universal asynchronous receiver transmitter (UART) : provides serial communication. 8251 functions are integrated into standard PC interface chip. Allows many communication parameters to be programmed. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

132 Overheads for Computers as Components 2nd ed.
Serial communication Characters are transmitted separately: no char bit 0 bit 1 bit n-1 ... start stop time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

133 Serial communication parameters
Baud (bit) rate. Number of bits per character. Parity/no parity. Even/odd parity. Length of stop bit (1, 1.5, 2 bits). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
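As a rough sketch of how these parameters combine, the C functions below count the bits in one character frame and convert that to a line time. They assume one start bit (as in the serial communication diagram) and treat the baud rate as the bit rate; the function names and the use of half-bit units for the stop length are illustrative choices, not part of the 8251 interface.

```c
/* Sketch: line time for one asynchronous serial character.
   Stop-bit length is given in half-bit units (2 = 1 bit, 3 = 1.5 bits,
   4 = 2 bits) so that 1.5 stop bits can be represented with integers. */

/* Returns the frame length in half-bit units. */
unsigned frame_half_bits(unsigned data_bits, int parity_enabled,
                         unsigned stop_half_bits)
{
    unsigned half_bits = 2;            /* one start bit */
    half_bits += 2 * data_bits;        /* data bits */
    if (parity_enabled)
        half_bits += 2;                /* optional parity bit */
    half_bits += stop_half_bits;       /* stop bit(s) */
    return half_bits;
}

/* Returns the time to send one character, in microseconds. */
double char_time_us(unsigned baud, unsigned data_bits, int parity_enabled,
                    unsigned stop_half_bits)
{
    double bits = frame_half_bits(data_bits, parity_enabled, stop_half_bits) / 2.0;
    return bits * 1e6 / (double)baud;
}
```

For example, 8 data bits with no parity and 1 stop bit gives a 10-bit frame, so each character occupies the line for 10 bit times.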

134 Overheads for Computers as Components 2nd ed.
8251 CPU interface 8251 status (8 bit) CPU xmit/ rcv data (8 bit) serial port © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

135 Overheads for Computers as Components 2nd ed.
Programming I/O Two types of instructions can support I/O: special-purpose I/O instructions; memory-mapped load/store instructions. Intel x86 provides in, out instructions. Most other CPUs use memory-mapped I/O. I/O instructions do not preclude memory-mapped I/O. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

136 Overheads for Computers as Components 2nd ed.
ARM memory-mapped I/O Define location for device:
DEV1    EQU 0x1000
Read/write code:
        LDR r1,=DEV1   ; set up device adrs
        LDR r0,[r1]    ; read DEV1
        MOV r0,#8      ; set up value to write
        STR r0,[r1]    ; write value to device
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

137 Overheads for Computers as Components 2nd ed.
Peek and poke Traditional HLL interfaces: int peek(char *location) { return *location; } void poke(char *location, char newval) { (*location) = newval; } © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
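With an optimizing compiler, the accesses inside peek and poke may be cached in a register or removed entirely. One common remedy, sketched below as a variant rather than as the slide's own code, is to qualify the pointers with volatile so that every call performs a real memory access.

```c
/* Sketch: peek/poke with volatile-qualified pointers so that the compiler
   performs every device access instead of caching or eliminating it. */
int peek(volatile char *location)
{
    return *location;          /* read the device register */
}

void poke(volatile char *location, char newval)
{
    *location = newval;        /* write the device register */
}
```

On a host machine the functions can be exercised with an ordinary variable standing in for a device register; on a target, the argument would be the device's memory-mapped address.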

138 Overheads for Computers as Components 2nd ed.
Busy/wait output Simplest way to program device. Use instructions to test when device is ready. current_char = mystring; while (*current_char != '\0') { poke(OUT_CHAR,*current_char); while (peek(OUT_STATUS) != 0); current_char++; } © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

139 Simultaneous busy/wait input and output
while (TRUE) { /* read */ while (peek(IN_STATUS) == 0); achar = (char)peek(IN_DATA); /* write */ poke(OUT_DATA,achar); poke(OUT_STATUS,1); while (peek(OUT_STATUS) != 0); } © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

140 Overheads for Computers as Components 2nd ed.
Interrupt I/O Busy/wait is very inefficient. CPU can’t do other work while testing device. Hard to do simultaneous I/O. Interrupts allow a device to change the flow of control in the CPU. Causes subroutine call to handle device. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

141 Overheads for Computers as Components 2nd ed.
Interrupt interface intr request status reg CPU intr ack mechanism IR PC data/address data reg © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

142 Overheads for Computers as Components 2nd ed.
Interrupt behavior Based on subroutine call mechanism. Interrupt forces next instruction to be a subroutine call to a predetermined location. Return address is saved to resume executing foreground program. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

143 Interrupt physical interface
CPU and device are connected by CPU bus. CPU and device handshake: device asserts interrupt request; CPU asserts interrupt acknowledge when it can handle the interrupt. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

144 Example: character I/O handlers
void input_handler() { achar = peek(IN_DATA); gotchar = TRUE; poke(IN_STATUS,0); } void output_handler() { © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

145 Example: interrupt-driven main program
while (TRUE) { if (gotchar) { poke(OUT_DATA,achar); poke(OUT_STATUS,1); gotchar = FALSE; } } © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

146 Example: interrupt I/O with buffers
Queue for characters: head tail a head tail © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

147 Buffer-based input handler
void input_handler() { char achar; if (full_buffer()) error = 1; else { achar = peek(IN_DATA); add_char(achar); } poke(IN_STATUS,0); if (nchars == 1) { poke(OUT_DATA,remove_char()); poke(OUT_STATUS,1); } } © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
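The handler above relies on full_buffer(), add_char(), remove_char(), and nchars without showing them. A minimal sketch of one possible implementation as a circular queue follows; BUF_SIZE, the buffer array, and empty_buffer() are hypothetical additions for illustration.

```c
#define BUF_SIZE 8              /* hypothetical capacity */

static char buf[BUF_SIZE];
static int head = 0;            /* index of the oldest character */
static int tail = 0;            /* index of the next free slot */
static int nchars = 0;          /* number of characters in the queue */

int full_buffer(void)  { return nchars == BUF_SIZE; }
int empty_buffer(void) { return nchars == 0; }

/* Append a character; the caller checks full_buffer() first. */
void add_char(char c)
{
    buf[tail] = c;
    tail = (tail + 1) % BUF_SIZE;   /* wrap around at the end of the array */
    nchars++;
}

/* Remove and return the oldest character; the caller checks
   empty_buffer() first. */
char remove_char(void)
{
    char c = buf[head];
    head = (head + 1) % BUF_SIZE;
    nchars--;
    return c;
}
```

In a real system the queue state is shared between handlers, so updates to it would likely need to be protected, for example by running them with interrupts disabled.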

148 Overheads for Computers as Components 2nd ed.
I/O sequence diagram :foreground :input :output :queue empty a empty b bc c © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

149 Debugging interrupt code
What if the handler forgets to save and restore the registers it changes? Foreground program can exhibit mysterious bugs. Bugs will be hard to repeat---depend on interrupt timing. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

150 Priorities and vectors
Two mechanisms allow us to make interrupts more specific: Priorities determine what interrupt gets CPU first. Vectors determine what code is called for each type of interrupt. Mechanisms are orthogonal: most CPUs provide both. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

151 Prioritized interrupts
device 1 device 2 device n interrupt acknowledge L1 L2 .. Ln CPU © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

152 Interrupt prioritization
Masking: interrupt with priority lower than current priority is not recognized until pending interrupt is complete. Non-maskable interrupt (NMI): highest-priority, never masked. Often used for power-down. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

153 Example: Prioritized I/O
:interrupts :foreground :A :B :C B C A A,B © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

154 Overheads for Computers as Components 2nd ed.
Interrupt vectors Allow different devices to be handled by different code. Interrupt vector table: Interrupt vector table head handler 0 handler 1 handler 2 handler 3 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
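In software terms, an interrupt vector table behaves like an array of function pointers indexed by the vector number. The sketch below models that dispatch in C; the handler bodies, the last_handled variable, and the dispatch() function are hypothetical and stand in for what the hardware does on a real CPU.

```c
#define NUM_VECTORS 4                 /* matches handler 0..3 on the slide */

typedef void (*handler_t)(void);

static int last_handled = -1;         /* records which handler ran (illustration only) */

static void handler0(void) { last_handled = 0; }
static void handler1(void) { last_handled = 1; }
static void handler2(void) { last_handled = 2; }
static void handler3(void) { last_handled = 3; }

/* The table: entry i holds the address of the code for vector i. */
static handler_t vector_table[NUM_VECTORS] = {
    handler0, handler1, handler2, handler3
};

/* Call the handler selected by the vector.
   Returns 0 on success, -1 if the vector is out of range. */
int dispatch(unsigned vector)
{
    if (vector >= NUM_VECTORS)
        return -1;
    vector_table[vector]();
    return 0;
}
```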

155 Interrupt vector acquisition
:CPU :device receive request receive ack receive vector © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

156 Generic interrupt mechanism
continue execution intr? Assume priority selection is handled before this point. N Y intr priority > current priority? N ignore Y ack Y bus error Y N timeout? vector? Y call table[vector] © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

157 Overheads for Computers as Components 2nd ed.
Interrupt sequence CPU acknowledges request. Device sends vector. CPU calls handler. Software processes request. CPU restores state to foreground program. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

158 Sources of interrupt overhead
Handler execution time. Interrupt mechanism overhead. Register save/restore. Pipeline-related penalties. Cache-related penalties. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

159 Overheads for Computers as Components 2nd ed.
ARM interrupts ARM7 supports two types of interrupts: Fast interrupt requests (FIQs). Interrupt requests (IRQs). Interrupt table starts at location 0. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

160 ARM interrupt procedure
CPU actions: Save PC. Copy CPSR to SPSR. Force bits in CPSR to record interrupt. Force PC to vector. Handler responsibilities: Restore proper PC. Restore CPSR from SPSR. Clear interrupt disable flags. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

161 Overheads for Computers as Components 2nd ed.
ARM interrupt latency Worst-case latency to respond to interrupt is 27 cycles: Two cycles to synchronize external request. Up to 20 cycles to complete current instruction. Three cycles for data abort. Two cycles to enter interrupt handling state. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

162 Overheads for Computers as Components 2nd ed.
C55x interrupts Latency is between 7 and 13 cycles. Maskable interrupt sequence: Interrupt flag register is set. Interrupt enable register is checked. Interrupt mask register is checked. Interrupt flag register is cleared. Appropriate registers are saved. INTM set to 1, DBGM set to 1, EALLOW set to 0. Branch to ISR. Two styles of return: fast and slow. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

163 Overheads for Computers as Components 2nd ed.
Supervisor mode May want to provide protective barriers between programs. Avoid memory corruption. Need supervisor mode to manage the various programs. SHARC does not have a supervisor mode. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

164 Overheads for Computers as Components 2nd ed.
ARM supervisor mode Use SWI instruction to enter supervisor mode, similar to subroutine: SWI CODE_1 Sets PC to 0x08. Argument to SWI is passed to supervisor mode code. Saves CPSR in SPSR. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

165 Overheads for Computers as Components 2nd ed.
Exception Exception: internally detected error. Exceptions are synchronous with instructions but unpredictable. Build exception mechanism on top of interrupt mechanism. Exceptions are usually prioritized and vectorized. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

166 Overheads for Computers as Components 2nd ed.
Trap Trap (software interrupt): an exception generated by an instruction. Call supervisor mode. ARM uses SWI instruction for traps. SHARC offers three levels of software interrupts. Called by setting bits in IRPTL register. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

167 Overheads for Computers as Components 2nd ed.
Co-processor Co-processor: added function unit that is called by instruction. Floating-point units are often structured as co-processors. ARM allows up to 16 designer-selected co-processors. Floating-point co-processor uses units 1, 2. C55x uses co-processors as well. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

168 C55x image/video hardware extensions
Available in 5509 and 5510. Equivalent C-callable functions for other devices. Available extensions: DCT/IDCT. Pixel interpolation. Motion estimation. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

169 Overheads for Computers as Components 2nd ed.
DCT/IDCT 2-D DCT/IDCT is computed from two 1-D DCT/IDCT. Put data in different banks to maximize throughput. block Column DCT interim DCT Row DCT © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

170 C55 DCT/IDCT coprocessor extensions
Load, compute, transfer to accumulators: ACy=copr(k8,ACx,Xmem,Ymem) Compute, transfer, mem write: ACy=copr(k8,ACx,ACy), Lmem=ACz Special: ACy=copr(k8,ACx,ACy) © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

171 Software pipelined load/compute/store for DCT
Iteration i-1 Iteration i Dual_load 8 compute 4 Long_store 4 empty 3 empty Iteration i+1 Dual_load 8 compute 4 Long_store 4 empty 3 empty Dual_load op_i(0), load_i+1(0,1) op_i(1), store_i-1(0,1) op_i(2), store_i-1(2,3) op_i(2), store_i-1(4,5) op_i(2), store_i-1(6,7) op_i(2), load_i+1(2,3) 4 empty 3 Dual_load 8 compute empty 4 Long_store © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

172 Overheads for Computers as Components 2nd ed.
C55 motion estimation Search strategy: Full vs. non-full. Accuracy: Full-pixel vs. half-pixel. Number of returned motion vectors: 1 (one 16x16) vs. 4 (four 8x8). Algorithms: 3-step algorithm (distance 4,2,1). 4-step algorithm (distance 8,4,2,1). 4-step with half-pixel refinement. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

173 Four-step motion estimation breakdown
for (i=0; i<4; i++) { compute 3 upper differences for d[i]; compute 3 middle differences for d[i]; compute 3 lower differences for d[i]; compute minimum value; move to next d; } X © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

174 C55 motion estimation accelerator
Includes 3 16-bit pixel data paths, 3 16-bit absolute differences (ADs). Basic operation: [ACx,ACy] = copr(k8,ACx,ACy,Xmem,Ymem,Coeff) K8 = control bits (enable AD units, etc.) ACx, ACy = accumulated absolute differences Xmem, Ymem = pointers to odd, even lines of the search window Coeff = pointer to two adjacent pixels from reference window © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

175 Overheads for Computers as Components 2nd ed.
C55 pixel interpolation Given four pixels A, B, C, D, interpolate three half-pixels: A U M R B C D © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

176 Pixel interpolation coprocessor operations
Load pixels and compute: ACy=copr(k8,AC,Lmem) Load pixels, compute, and store: ACy=copr(k8,AACx,Lmem) || Lmem=ACz © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

177 Overheads for Computers as Components 2nd ed.
CPUs Caches. Memory management. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

178 Overheads for Computers as Components 2nd ed.
Caches and CPUs address data cache main memory CPU controller cache address data data © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

179 Overheads for Computers as Components 2nd ed.
Cache operation Many main memory locations are mapped onto one cache entry. May have caches for: instructions; data; data + instructions (unified). Memory access time is no longer deterministic. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

180 Overheads for Computers as Components 2nd ed.
Terms Cache hit: required location is in cache. Cache miss: required location is not in cache. Working set: set of locations used by program in a time interval. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

181 Overheads for Computers as Components 2nd ed.
Types of misses Compulsory (cold): location has never been accessed. Capacity: working set is too large. Conflict: multiple locations in working set map to same cache entry. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

182 Memory system performance
h = cache hit rate. $t_{cache}$ = cache access time, $t_{main}$ = main memory access time. Average memory access time: $t_{av} = h\,t_{cache} + (1-h)\,t_{main}$ © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
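The average access time formula translates directly into code. The function below is a minimal sketch; its name and the sample values in the usage note are illustrative, not measurements.

```c
/* Average memory access time for a single-level cache.
   h        = cache hit rate, between 0.0 and 1.0
   t_cache  = cache access time
   t_main   = main memory access time (same units as t_cache) */
double avg_access_time(double h, double t_cache, double t_main)
{
    return h * t_cache + (1.0 - h) * t_main;
}
```

With a hypothetical 90% hit rate, a 1-cycle cache, and a 10-cycle main memory, the average works out to about 1.9 cycles per access.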

183 Multiple levels of cache
L2 cache CPU L1 cache © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

184 Multi-level cache access time
h1 = L1 cache hit rate. h2 = rate for miss on L1, hit on L2. Average memory access time: $t_{av} = h_1 t_{L1} + h_2 t_{L2} + (1 - h_1 - h_2)\,t_{main}$ © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

185 Overheads for Computers as Components 2nd ed.
Replacement policies Replacement policy: strategy for choosing which cache entry to throw out to make room for a new memory location. Two popular strategies: Random. Least-recently used (LRU). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

186 Overheads for Computers as Components 2nd ed.
Cache organizations Fully-associative: any memory location can be stored anywhere in the cache (almost never implemented). Direct-mapped: each memory location maps onto exactly one cache entry. N-way set-associative: each memory location can go into one of n sets. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

187 Cache performance benefits
Keep frequently-accessed locations in fast cache. Cache retrieves more than one word at a time. Sequential accesses are faster after first access. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

188 Overheads for Computers as Components 2nd ed.
Direct-mapped cache valid tag data 1 0xabcd byte byte byte ... byte cache block tag index offset = hit value © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

189 Overheads for Computers as Components 2nd ed.
Write operations Write-through: immediately copy write to main memory. Write-back: write to main memory only when location is removed from cache. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

190 Direct-mapped cache locations
Many locations map onto the same cache block. Conflict misses are easy to generate: Array a[] uses locations 0, 1, 2, … Array b[] uses locations 1024, 1025, 1026, … Operation a[i] + b[i] generates conflict misses. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
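The conflict can be reproduced with a small simulation. The slide does not give the cache geometry, so the sketch below assumes a direct-mapped cache of 256 blocks with 4 locations per block (1024 locations in total); count_misses() and both constants are hypothetical names introduced for illustration.

```c
#define BLOCK_SIZE 4u     /* assumed: 4 locations per cache block */
#define NUM_BLOCKS 256u   /* assumed: 256 blocks, so the cache holds 1024 locations */

/* Count misses for the access pattern a[0], b[0], a[1], b[1], ...
   on an initially empty direct-mapped cache. */
unsigned count_misses(unsigned a_base, unsigned b_base, unsigned n)
{
    static unsigned tags[NUM_BLOCKS];
    static int valid[NUM_BLOCKS];
    unsigned misses = 0;

    for (unsigned i = 0; i < NUM_BLOCKS; i++)
        valid[i] = 0;                          /* start with an empty cache */

    for (unsigned i = 0; i < n; i++) {
        unsigned addrs[2] = { a_base + i, b_base + i };
        for (int k = 0; k < 2; k++) {
            unsigned block = addrs[k] / BLOCK_SIZE;
            unsigned index = block % NUM_BLOCKS;   /* which cache entry */
            unsigned tag   = block / NUM_BLOCKS;   /* which memory block is stored there */
            if (!valid[index] || tags[index] != tag) {
                misses++;                      /* miss: load the block, evicting the old one */
                valid[index] = 1;
                tags[index]  = tag;
            }
        }
    }
    return misses;
}
```

Under these assumptions, 64 iterations with a[] at location 0 and b[] at 1024 miss on all 128 accesses, because a[i] and b[i] keep evicting each other from the same entry. Moving b[] to 1028, one block further on, reduces this to 32 misses.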

191 Set-associative cache
A set of direct-mapped caches: Set 1 Set 2 Set n ... hit data © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

192 Example: direct-mapped vs. set-associative
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

193 Direct-mapped cache behavior
After 001 access: block tag data After 010 access: block tag data © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

194 Direct-mapped cache behavior, cont’d.
After 011 access: block tag data After 100 access: block tag data © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

195 Direct-mapped cache behavior, cont’d.
After 101 access: block tag data After 111 access: block tag data © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

196 2-way set-associative cache behavior
Final state of cache (twice as big as direct-mapped): set blk 0 tag blk 0 data blk 1 tag blk 1 data © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

197 2-way set-associative cache behavior
Final state of cache (same size as direct-mapped): set blk 0 tag blk 0 data blk 1 tag blk 1 data © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

198 Overheads for Computers as Components 2nd ed.
Example caches StrongARM: 16 Kbyte, 32-way, 32-byte block instruction cache. 16 Kbyte, 32-way, 32-byte block data cache (write-back). SHARC: 32-instruction, 2-way instruction cache. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

199 Memory management units
Memory management unit (MMU) translates addresses: main memory logical address memory management unit physical address CPU © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

200 Memory management tasks
Allows programs to move in physical memory during execution. Allows virtual memory: memory images kept in secondary storage; images returned to main memory on demand during execution. Page fault: request for location not resident in memory. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

201 Overheads for Computers as Components 2nd ed.
Address translation Requires some sort of register/table to allow arbitrary mappings of logical to physical addresses. Two basic schemes: segmented; paged. Segmentation and paging can be combined (x86). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

202 Overheads for Computers as Components 2nd ed.
Segments and pages memory page 1 segment 1 page 2 segment 2 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

203 Segment address translation
segment base address logical address + segment lower bound range error range check segment upper bound physical address © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
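The diagram's data flow (add the segment base to the logical address, then check the result against the segment bounds) can be sketched as follows. The segment_t structure and the function name are hypothetical; a real MMU does this in hardware and raises a fault instead of returning an error code.

```c
typedef struct {
    unsigned base;    /* segment base address */
    unsigned lower;   /* segment lower bound (lowest valid physical address) */
    unsigned upper;   /* segment upper bound (highest valid physical address) */
} segment_t;

/* Returns 0 and stores the physical address on success,
   or -1 on a range error. */
int segment_translate(const segment_t *seg, unsigned logical, unsigned *physical)
{
    unsigned addr = seg->base + logical;       /* base + logical address */
    if (addr < seg->lower || addr > seg->upper)
        return -1;                             /* range error */
    *physical = addr;
    return 0;
}
```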

204 Page address translation
offset page i base concatenate page offset © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
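Paged translation replaces the addition with a table lookup and a concatenation. The sketch below assumes 4-Kbyte pages (a 12-bit offset) and a flat page table holding page-aligned physical base addresses; the names are illustrative and no bounds or presence checks are modeled.

```c
#define OFFSET_BITS 12u                         /* assumed 4-Kbyte pages */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u)

/* page_table[i] holds the physical base address of page i, aligned to the
   page size. The caller must ensure the page number is within the table. */
unsigned page_translate(const unsigned *page_table, unsigned logical)
{
    unsigned page   = logical >> OFFSET_BITS;   /* upper bits select the page */
    unsigned offset = logical & OFFSET_MASK;    /* lower bits pass through unchanged */
    return page_table[page] | offset;           /* concatenate base and offset */
}
```

Because the base is page-aligned, its low 12 bits are zero, so a bitwise OR is enough to concatenate the two fields.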

205 Page table organizations
descriptor page descriptor flat tree © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

206 Caching address translations
Large translation tables require main memory access. TLB: cache for address translation. Typically small. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

207 Overheads for Computers as Components 2nd ed.
ARM memory management Memory region types: section: 1 Mbyte block; large page: 64 kbytes; small page: 4 kbytes. An address is marked as section-mapped or page-mapped. Two-level translation scheme. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

208 ARM address translation
Translation table base register 1st index 2nd index offset 1st level table descriptor concatenate concatenate 2nd level table descriptor physical address © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

209 Overheads for Computers as Components 2nd ed.
CPUs CPU performance. CPU power consumption. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

210 Elements of CPU performance
Cycle time. CPU pipeline. Memory system. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

211 Overheads for Computers as Components 2nd ed.
Pipelining Several instructions are executed simultaneously at different stages of completion. Various conditions can cause pipeline bubbles that reduce utilization: branches; memory system delays; etc. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

212 Overheads for Computers as Components 2nd ed.
Performance measures Latency: time it takes for an instruction to get through the pipeline. Throughput: number of instructions executed per time period. Pipelining increases throughput without reducing latency. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

213 Overheads for Computers as Components 2nd ed.
ARM7 pipeline ARM7 has 3-stage pipe: fetch instruction from memory; decode opcode and operands; execute. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

214 ARM pipeline execution
add r0,r1,#5 fetch decode fetch execute decode fetch execute decode sub r2,r3,r6 execute cmp r2,#3 time 1 2 3 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

215 Overheads for Computers as Components 2nd ed.
Pipeline stalls If every step cannot be completed in the same amount of time, pipeline stalls. Bubbles introduced by stall increase latency, reduce throughput. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

216 ARM multi-cycle LDMIA instruction
ldmia r0,{r2,r3} fetch decode ex ld r2 ex ld r3 sub r2,r3,r6 fetch decode ex sub cmp r2,#3 fetch decode ex cmp time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

217 Overheads for Computers as Components 2nd ed.
Control stalls Branches often introduce stalls (branch penalty). Stall time may depend on whether branch is taken. May have to squash instructions that already started executing. Don’t know what to fetch until condition is evaluated. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

218 Overheads for Computers as Components 2nd ed.
ARM pipelined branch fetch decode ex bne bne foo sub r2,r3,r6 foo add r0,r1,r2 ex bne fetch decode ex add time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

219 Overheads for Computers as Components 2nd ed.
Delayed branch To increase pipeline efficiency, the delayed branch mechanism requires that the n instructions after a branch always be executed, whether or not the branch is taken. SHARC supports delayed and non-delayed branches. Specified by bit in branch instruction. 2 instruction branch delay slot. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

220 Example: ARM execution time
Determine execution time of FIR filter: for (i=0; i<N; i++) f = f + c[i]*x[i]; Only branch in loop test may take more than one cycle. BLT loop takes 1 cycle best case, 3 worst case. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

221 Overheads for Computers as Components 2nd ed.
FIR filter ARM code
; loop initiation code
        MOV r0,#0        ; use r0 for i, set to 0
        MOV r8,#0        ; use a separate index for arrays
        ADR r2,N         ; get address for N
        LDR r1,[r2]      ; get value of N
        MOV r2,#0        ; use r2 for f, set to 0
        ADR r3,c         ; load r3 with address of base of c
        ADR r5,x         ; load r5 with address of base of x
; loop body
loop    LDR r4,[r3,r8]   ; get value of c[i]
        LDR r6,[r5,r8]   ; get value of x[i]
        MUL r4,r4,r6     ; compute c[i]*x[i]
        ADD r2,r2,r4     ; add into running sum
; update loop counter and array index
        ADD r8,r8,#4     ; add one word offset to array index
        ADD r0,r0,#1     ; add 1 to i
; test for exit
        CMP r0,r1
        BLT loop         ; if i < N, continue loop
loopend ...
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

222 FIR filter performance by block
Block (variable): # instructions, # cycles. Initialization (tinit): 7, 7. Body (tbody): 4, 4. Update (tupdate): 2, 2. Test (ttest): 2, [2,4]. $t_{loop} = t_{init} + N(t_{body} + t_{update}) + (N-1)\,t_{test,worst} + t_{test,best}$. The loop test succeeding (branch taken) is the worst case; the loop test failing is the best case. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
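Plugging the per-block cycle counts into the loop-time formula gives a quick estimate for any N. The sketch below uses the counts from this slide and assumes N is at least 1, since the loop test sits at the bottom of the loop; the function and constant names are illustrative.

```c
/* Cycle counts per block, taken from the slide. One cycle per instruction
   is assumed everywhere except the branch in the loop test. */
enum {
    T_INIT       = 7,
    T_BODY       = 4,
    T_UPDATE     = 2,
    T_TEST_BEST  = 2,   /* loop test fails: branch not taken */
    T_TEST_WORST = 4    /* loop test succeeds: branch taken */
};

/* Total cycles for the FIR loop with n taps, n >= 1.
   The test is worst case on the first n-1 iterations (branch taken)
   and best case on the last one (fall through). */
unsigned fir_loop_cycles(unsigned n)
{
    return T_INIT
         + n * (T_BODY + T_UPDATE)
         + (n - 1) * T_TEST_WORST
         + T_TEST_BEST;
}
```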

223 Overheads for Computers as Components 2nd ed.
C55x pipeline C55x has 7-stage pipe: fetch; decode; address: computes data/branch addresses; access 1: reads data; access 2: finishes data read; Read stage: puts operands on internal busses; execute. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

224 Overheads for Computers as Components 2nd ed.
C55x organization C, D busses Dual operand read D bus Single operand read B bus Dual-multiply coefficient 3 data read busses 16 Data read from memory 3 data read address busses 24 program address bus 24 Instruction fetch program read bus Instruction unit Program flow unit Address unit Data unit 32 Writes 2 data write busses 16 2 data write address busses 24 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

225 Overheads for Computers as Components 2nd ed.
C55x pipeline hazards Processor structure: Three computation units. 14 operators. Can perform two operations per instruction. Some combinations of operators are not legal. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

226 Overheads for Computers as Components 2nd ed.
C55x hazards A-unit ALU/A-unit ALU. A-unit swap/A-unit swap. D-unit ALU,shifter,MAC/D-unit ALU,shifter,MAC D-unit shifter/D-unit shift, store D-unit shift, store/D-unit shift, store D-unit swap/D-unit swap P-unit control/P-unit control © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

227 Memory system performance
Caches introduce indeterminacy in execution time. Depends on order of execution. Cache miss penalty: added time due to a cache miss. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

228 Overheads for Computers as Components 2nd ed.
Types of cache misses Compulsory miss: location has not been referenced before. Conflict miss: two locations are fighting for the same block. Capacity miss: working set is too large. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

229 Overheads for Computers as Components 2nd ed.
CPU power consumption Most modern CPUs are designed with power consumption in mind to some degree. Power vs. energy: heat depends on power consumption; battery life depends on energy consumption. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

230 CMOS power consumption
Voltage drops: power consumption proportional to $V^2$. Toggling: more activity means more power. Leakage: basic circuit characteristics; can be eliminated by disconnecting power. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

231 CPU power-saving strategies
Reduce power supply voltage. Run at lower clock frequency. Disable function units with control signals when not in use. Disconnect parts from power supply when not in use. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

232 Overheads for Computers as Components 2nd ed.
C55x low power features Parallel execution units---longer idle shutdown times. Multiple data widths: 16-bit ALU vs. 40-bit ALU. Instruction cache minimizes main memory accesses. Power management: Function unit idle detection. Memory idle detection. User-configurable IDLE domains allow programmer control of what hardware is shut down. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

233 Power management styles
Static power management: does not depend on CPU activity. Example: user-activated power-down mode. Dynamic power management: based on CPU activity. Example: disabling function units. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

234 Application: PowerPC 603 energy features
Provides doze, nap, sleep modes. Dynamic power management features: Uses static logic. Can shut down unused execution units. Cache organized into subarrays to minimize amount of active circuitry. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

235 Overheads for Computers as Components 2nd ed.
PowerPC 603 activity Percentage of time units are idle for SPEC integer/floating-point:
unit             Specint92   Specfp92
D cache          29%         28%
I cache          29%         17%
load/store       35%         17%
fixed-point      38%         76%
floating-point   99%         30%
system register  89%         97%
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

236 Overheads for Computers as Components 2nd ed.
Power-down costs Going into a power-down mode costs: time; energy. Must determine if going into mode is worthwhile. Can model CPU power states with power state machine. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
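One way to decide whether a power-down mode is worthwhile is to compare the energy spent staying in the current mode for an idle interval with the energy spent entering, occupying, and leaving the low-power mode. The sketch below makes that comparison under a simplified model in which the transition has a single lumped time and energy cost; the structure, field names, and all numbers in the usage note are hypothetical.

```c
typedef struct {
    double p_stay_mw;    /* power if the CPU stays in its current mode (mW) */
    double p_down_mw;    /* power in the power-down mode (mW) */
    double t_trans_ms;   /* total time to enter and leave the mode (ms) */
    double e_trans_uj;   /* total energy to enter and leave the mode (uJ) */
} power_mode_t;

/* Returns 1 if powering down for an idle interval of t_idle_ms saves
   energy, 0 otherwise. Note that mW * ms = uJ. */
int power_down_worthwhile(const power_mode_t *m, double t_idle_ms)
{
    if (t_idle_ms < m->t_trans_ms)
        return 0;                                /* not enough time to make the transition */
    double e_stay = m->p_stay_mw * t_idle_ms;
    double e_down = m->e_trans_uj
                  + m->p_down_mw * (t_idle_ms - m->t_trans_ms);
    return e_down < e_stay;
}
```

With made-up costs of 250 ms and 20,000 uJ per round trip, a 50 mW current mode, and a 0.16 mW power-down mode, a 300 ms idle interval is not worth the transition, while a 1000 ms interval is.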

237 Application: StrongARM SA-1100 power saving
Processor takes two supplies: VDD is main 3.3V supply. VDDX is 1.5V. Three power modes: Run: normal operation. Idle: stops CPU clock, with logic still powered. Sleep: shuts off most of chip activity; 3 steps, each about 30 ms; wakeup takes > 10 ms. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

238 SA-1100 power state machine
[Figure: power state machine with three states: run (Prun = 400 mW), idle (Pidle = 50 mW), sleep (Psleep = 0.16 mW). Transition time labels in the figure: 10 ms, 160 ms, 90 ms, 10 ms, 90 ms.]

239 Overheads for Computers as Components 2nd ed.
CPUs Example: data compressor. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

240 Overheads for Computers as Components 2nd ed.
Goals Compress data transmitted over serial line. Receives byte-size input symbols. Produces output symbols packed into bytes. Will build software module only here. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

241 Collaboration diagram for compressor
1..m: packed output symbols 1..n: input symbols :input :data compressor :output © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

242 Overheads for Computers as Components 2nd ed.
Huffman coding Early statistical text compression algorithm. Select non-uniform size codes. Use shorter codes for more common symbols. Use longer codes for less common symbols. To allow decoding, codes must have unique prefixes. No code can be a prefix of a longer valid code. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

243 Huffman example
character  P
a          .45
b          .24
c          .11
d          .08
e          .07
f          .05
Coding tree node probabilities: P=1, P=.55, P=.31, P=.19, P=.12.

244 Example Huffman code
Read code from root to leaves:
a  1
b  01
c  0000
d  0001
e  0010
f  0011
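
The code table above can be exercised with a short sketch. It assumes the six symbols are the characters 'a' through 'f' and writes the output as a string of '0'/'1' characters rather than packed bits, to keep the example readable. The function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Code table from the example slide: symbols 'a'..'f'. */
static const char *const huff_code[6] = {
    "1", "01", "0000", "0001", "0010", "0011"
};

/* Encode a string of symbols 'a'..'f' as a string of '0'/'1' characters.
   Returns the number of bits written, or -1 on an unknown symbol or if
   the output (including its terminating NUL) does not fit in outlen. */
int huff_encode(const char *in, char *out, size_t outlen)
{
    size_t used = 0;

    if (outlen == 0)
        return -1;
    for (; *in != '\0'; in++) {
        const char *code;
        size_t n;

        if (*in < 'a' || *in > 'f')
            return -1;
        code = huff_code[*in - 'a'];
        n = strlen(code);
        if (used + n + 1 > outlen)
            return -1;
        memcpy(out + used, code, n);
        used += n;
    }
    out[used] = '\0';
    return (int)used;
}

/* Return 1 if no code in the table is a prefix of another code. */
int huff_is_prefix_free(void)
{
    size_t i, j;

    for (i = 0; i < 6; i++)
        for (j = 0; j < 6; j++)
            if (i != j &&
                strncmp(huff_code[i], huff_code[j], strlen(huff_code[i])) == 0)
                return 0;
    return 1;
}
```

The prefix check is what makes decoding possible: a decoder can read bits one at a time and emit a symbol as soon as the bits seen so far match a code.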

245 Huffman coder requirements table

246 Building a specification
Collaboration diagram shows only steady-state input/output. A real system must: Accept an encoding table. Allow a system reset that flushes the compression buffer. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

247 data-compressor class
buffer: data-buffer table: symbol-table current-bit: integer encode(): boolean, data-buffer flush() new-symbol-table() © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

248 data-compressor behaviors
encode: Takes one-byte input, generates packed encoded symbols and a Boolean indicating whether the buffer is full. new-symbol-table: installs new symbol table in object, throws away old table. flush: returns current state of buffer, including number of valid bits in buffer. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

249 Overheads for Computers as Components 2nd ed.
Auxiliary classes data-buffer symbol-table databuf[databuflen] : character len : integer symbols[nsymbols] : data-buffer len : integer insert() length() : integer value() : symbol load() © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

250 Auxiliary class roles
data-buffer holds both packed and unpacked symbols. Longest Huffman code for 8-bit inputs is 256 bits. symbol-table indexes encoded version of each symbol. load() puts data in a new symbol table.

251 Overheads for Computers as Components 2nd ed.
Class relationships data-compressor 1 1 1 1 data-buffer symbol-table © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

252 Overheads for Computers as Components 2nd ed.
Encode behavior create new buffer add to buffers return true T input symbol encode buffer filled? F add to buffer return false © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

253 Overheads for Computers as Components 2nd ed.
Insert behavior pack into this buffer input symbol T update length fills buffer? F pack bottom bits into this buffer, top bits into overflow buffer © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
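
The insert behavior can be sketched as bit packing with an overflow buffer. This is an illustrative sketch, not the data_buffer class from the slides: the buffer size, the least-significant-bit-first order, and all names are assumptions. The caller is assumed to pass nbits of at most 32 and an overflow buffer with enough room.

```c
#include <stdint.h>
#include <string.h>

#define BUF_BYTES 4                 /* hypothetical buffer size */
#define BUF_BITS  (BUF_BYTES * 8)

struct bit_buffer {
    uint8_t data[BUF_BYTES];
    int len;                        /* number of valid bits */
};

void bit_buffer_init(struct bit_buffer *b)
{
    memset(b->data, 0, sizeof b->data);
    b->len = 0;
}

/* Append one bit; the caller guarantees there is room. */
static void put_bit(struct bit_buffer *b, int bit)
{
    if (bit)
        b->data[b->len / 8] |= (uint8_t)(1u << (b->len % 8));
    b->len++;
}

/* Insert the low nbits of code, least significant bit first. Bits that
   do not fit in buf go into overflow, so the bottom bits land in buf and
   the top bits land in overflow. Returns 1 if buf is now full. */
int bit_buffer_insert(struct bit_buffer *buf, struct bit_buffer *overflow,
                      uint32_t code, int nbits)
{
    int i;

    for (i = 0; i < nbits; i++) {
        int bit = (int)((code >> i) & 1u);
        if (buf->len < BUF_BITS)
            put_bit(buf, bit);
        else
            put_bit(overflow, bit);
    }
    return buf->len == BUF_BITS;
}
```

When the function returns 1, the caller can hand the full buffer to the output and continue with the overflow buffer as the new current buffer.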

254 Overheads for Computers as Components 2nd ed.
Program design In an object-oriented language, we can reflect the UML specification in the code more directly. In a non-object-oriented language, we must either: add code to provide object-oriented features; diverge from the specification structure. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

255 C++ classes
    class data_buffer {
        char databuf[databuflen];
        int len;
        int length_in_chars() { return len/bitsperbyte; }
    public:
        void insert(data_buffer, data_buffer&);
        int length() { return len; }
        int length_in_bytes() { return (int)ceil(len/8.0); }
        int initialize();
        ...

256 C++ classes, cont’d.
    class data_compressor {
        data_buffer buffer;
        int current_bit;
        symbol_table table;
    public:
        boolean encode(char, data_buffer&);
        void new_symbol_table(symbol_table);
        int flush(data_buffer&);
        data_compressor();
        ~data_compressor();
    };

257 C code
    struct data_compressor_struct {
        data_buffer buffer;
        int current_bit;
        sym_table table;
    };
    typedef struct data_compressor_struct data_compressor, *data_compressor_ptr;
    boolean data_compressor_encode(data_compressor_ptr mycmptrs, char isymbol, data_buffer *fullbuf) ...

258 Overheads for Computers as Components 2nd ed.
Testing Test by encoding, then decoding: symbol table result input symbols encoder decoder compare © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

259 Overheads for Computers as Components 2nd ed.
Code inspection tests Look at the code for potential problems: Can we run past end of symbol table? What happens when the next symbol does not fill the buffer? Does fill it? Do very long encoded symbols work properly? Very short symbols? Does flush() work properly? © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

260 Bus-Based Computer Systems
Busses. Memory devices. I/O devices: serial links timers and counters keyboards displays analog I/O © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

261 Overheads for Computers as Components 2nd ed.
The CPU bus Bus allows CPU, memory, devices to communicate. Shared communication medium. A bus is: A set of wires. A communications protocol. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

262 Overheads for Computers as Components 2nd ed.
Bus protocols Bus protocol determines how devices communicate. Devices on the bus go through sequences of states. Protocols are specified by state machines, one state machine per actor in the protocol. May contain asynchronous logic behavior. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

263 Overheads for Computers as Components 2nd ed.
Four-cycle handshake device 1 enq device 1 device 2 ack device 2 1 2 3 4 time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

264 Four-cycle handshake, cont’d.
Device 1 raises enq. Device 2 responds with ack. Device 2 lowers ack once it has finished. Device 1 lowers enq. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
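
The four steps can be sketched as two state machines, one per device, sharing the enq and ack signals. This is a simulation sketch under an assumption: the two devices are stepped alternately, as if on a common clock, so device 1 always observes ack while it is high. State and function names are hypothetical.

```c
#include <stdbool.h>

enum d1_state { D1_IDLE, D1_WAIT_ACK, D1_WAIT_DONE };
enum d2_state { D2_IDLE, D2_BUSY, D2_WAIT_ENQ_LOW };

struct link {
    bool enq;               /* driven only by device 1 */
    bool ack;               /* driven only by device 2 */
    enum d1_state s1;
    enum d2_state s2;
    int completed;          /* finished four-cycle handshakes */
};

void link_init(struct link *l)
{
    l->enq = l->ack = false;
    l->s1 = D1_IDLE;
    l->s2 = D2_IDLE;
    l->completed = 0;
}

/* Device 1 state machine: one step. */
void device1_step(struct link *l)
{
    switch (l->s1) {
    case D1_IDLE:                   /* step 1: raise enq */
        l->enq = true;
        l->s1 = D1_WAIT_ACK;
        break;
    case D1_WAIT_ACK:               /* wait until device 2 raises ack */
        if (l->ack)
            l->s1 = D1_WAIT_DONE;
        break;
    case D1_WAIT_DONE:              /* step 4: ack is low again, lower enq */
        if (!l->ack) {
            l->enq = false;
            l->completed++;
            l->s1 = D1_IDLE;
        }
        break;
    }
}

/* Device 2 state machine: one step. */
void device2_step(struct link *l)
{
    switch (l->s2) {
    case D2_IDLE:                   /* step 2: answer enq with ack */
        if (l->enq) {
            l->ack = true;
            l->s2 = D2_BUSY;
        }
        break;
    case D2_BUSY:                   /* step 3: finished, lower ack */
        l->ack = false;
        l->s2 = D2_WAIT_ENQ_LOW;
        break;
    case D2_WAIT_ENQ_LOW:           /* wait for device 1 to lower enq */
        if (!l->enq)
            l->s2 = D2_IDLE;
        break;
    }
}
```

Calling device1_step and device2_step in alternation walks both signals through raise enq, raise ack, lower ack, lower enq, and leaves both lines low for the next transfer.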

265 Microprocessor busses
Clock provides synchronization. R/W is true when reading (R/W’ is false when reading). Address is a-bit bundle of address lines. Data is n-bit bundle of data lines. Data ready signals when n-bit data is ready. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

266 Overheads for Computers as Components 2nd ed.
Timing diagrams © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

267 Overheads for Computers as Components 2nd ed.
Bus read © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

268 State diagrams for bus read
Get data Senddata Done Release ack See ack Ack Adrs Adrs Wait Wait device CPU start © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

269 Overheads for Computers as Components 2nd ed.
Bus wait state © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

270 Overheads for Computers as Components 2nd ed.
Bus burst read © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

271 Overheads for Computers as Components 2nd ed.
Bus multiplexing device data enable CPU data adrs adrs Adrs enable © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

272 Overheads for Computers as Components 2nd ed.
DMA Direct memory access (DMA) performs data transfers without executing instructions. CPU sets up transfer. DMA engine fetches, writes. DMA controller is a separate unit. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

273 Overheads for Computers as Components 2nd ed.
Bus mastership By default, CPU is bus master and initiates transfers. DMA must become bus master to perform its work. CPU can’t use bus while DMA operates. Bus mastership protocol: Bus request. Bus grant. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

274 Overheads for Computers as Components 2nd ed.
DMA operation CPU sets DMA registers for start address, length. DMA status register controls the unit. Once DMA is bus master, it transfers automatically. May run continuously until complete. May use every nth bus cycle. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

275 Bus transfer sequence diagram

276 System bus configurations
Multiple busses allow parallelism: Slow devices on one bus. Fast devices on separate bus. A bridge connects two busses. CPU slow device bridge memory slow device high-speed device © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

277 Overheads for Computers as Components 2nd ed.
Bridge state diagram © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

278 Overheads for Computers as Components 2nd ed.
ARM AMBA bus Two varieties: AHB is high-performance. APB is lower-speed, lower cost. AHB supports pipelining, burst transfers, split transactions, multiple bus masters. All devices are slaves on APB. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

279 Overheads for Computers as Components 2nd ed.
Memory components Several different types of memory: DRAM. SRAM. Flash. Each type of memory comes in varying: Capacities. Widths. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

280 Overheads for Computers as Components 2nd ed.
Random-access memory Dynamic RAM is dense, requires refresh. Synchronous DRAM is dominant type. SDRAM uses clock to improve performance, pipeline memory accesses. Static RAM is faster, less dense, consumes more power. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

281 Overheads for Computers as Components 2nd ed.
SDRAM operation © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

282 Overheads for Computers as Components 2nd ed.
Read-only memory ROM may be programmed at factory. Flash is dominant form of field-programmable ROM. Electrically erasable, must be block erased. Random access, but write/erase is much slower than read. NOR flash is more flexible. NAND flash is more dense. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

283 Overheads for Computers as Components 2nd ed.
Timers and counters Very similar: a timer is incremented by a periodic signal; a counter is incremented by an asynchronous, occasional signal. Rollover causes interrupt. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

284 Overheads for Computers as Components 2nd ed.
Watchdog timer Watchdog timer is periodically reset by system timer. If watchdog is not reset, it generates an interrupt to reset the host. host CPU interrupt watchdog timer reset © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
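
A watchdog can be modeled in software as a down-counter that is reloaded by a periodic "kick". This is a simulation sketch only: a real watchdog is a hardware peripheral, and the names and tick-based interface here are hypothetical. A nonzero timeout is assumed.

```c
#include <stdbool.h>

struct watchdog {
    unsigned timeout_ticks; /* reload value */
    unsigned remaining;     /* ticks left before the watchdog fires */
    bool fired;             /* set when the host should be reset */
};

void watchdog_init(struct watchdog *w, unsigned timeout_ticks)
{
    w->timeout_ticks = timeout_ticks;
    w->remaining = timeout_ticks;
    w->fired = false;
}

/* Called periodically by healthy software to reset the watchdog. */
void watchdog_kick(struct watchdog *w)
{
    w->remaining = w->timeout_ticks;
}

/* Called once per clock tick; returns true once the watchdog has fired. */
bool watchdog_tick(struct watchdog *w)
{
    if (!w->fired && w->remaining > 0 && --w->remaining == 0)
        w->fired = true;
    return w->fired;
}
```

If the software hangs and stops calling watchdog_kick, the counter runs out and the fired flag stands in for the interrupt that resets the host.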

285 Switch debouncing
A switch must be debounced to eliminate multiple contacts caused by mechanical bouncing.
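
Debouncing can also be done in software. The sketch below assumes the switch is sampled at a fixed period and accepts a new state only after several consecutive identical samples; the sample count and all names are hypothetical choices, and hardware debouncing is an equally valid approach.

```c
#include <stdbool.h>

#define DEBOUNCE_SAMPLES 4  /* hypothetical: stable samples required */

struct debouncer {
    bool stable;    /* debounced switch state */
    bool last_raw;  /* most recent raw sample */
    int count;      /* consecutive samples equal to last_raw */
};

void debouncer_init(struct debouncer *d, bool initial)
{
    d->stable = initial;
    d->last_raw = initial;
    d->count = 0;
}

/* Feed one raw sample; returns the debounced state. */
bool debouncer_sample(struct debouncer *d, bool raw)
{
    if (raw != d->last_raw) {
        d->last_raw = raw;      /* input changed: restart the count */
        d->count = 1;
    } else if (d->count < DEBOUNCE_SAMPLES) {
        d->count++;
    }
    if (d->count >= DEBOUNCE_SAMPLES)
        d->stable = d->last_raw;
    return d->stable;
}
```

Short bounces restart the count, so only a contact that stays put for DEBOUNCE_SAMPLES samples changes the reported state.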

286 Overheads for Computers as Components 2nd ed.
Encoded keyboard An array of switches is read by an encoder. N-key rollover remembers multiple key depressions. row © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

287 Overheads for Computers as Components 2nd ed.
LED Must use resistor to limit current: © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

288 Overheads for Computers as Components 2nd ed.
7-segment LCD display May use parallel or multiplexed input. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

289 Types of high-resolution display
Liquid crystal display (LCD) is dominant form. Plasma, OLED, etc. Frame buffer holds current display contents. Written by processor. Read by video. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

290 Overheads for Computers as Components 2nd ed.
Touchscreen Includes input and output device. Input device is a two-dimensional voltmeter: © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

291 Touchscreen position sensing
ADC voltage © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

292 Digital-to-analog conversion
Use resistor tree: R Vout bn 2R bn-1 4R bn-2 8R bn-3 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

293 Flash A/D conversion
N-bit result requires 2^N comparators: Vin encoder ...

294 Dual-slope conversion
Use counter to time required to charge/discharge capacitor. Charging, then discharging eliminates non-linearities. Vin timer © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

295 Overheads for Computers as Components 2nd ed.
Sample-and-hold Samples data: converter Vin © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

296 Bus-Based Computer Systems
Designing with microprocessors. Development and debugging. System-level performance analysis. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

297 Overheads for Computers as Components 2nd ed.
System architectures Architectures and components: software; hardware. Some software is very hardware-dependent. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

298 Hardware platform architecture
Contains several elements: CPU; bus; memory; I/O devices: networking, sensors, actuators, etc. How big/fast must each one be?

299 Software architecture
Functional description must be broken into pieces: division among people; conceptual organization; performance; testability; maintenance. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

300 Hardware and software architectures
Hardware and software are intimately related: software doesn’t run without hardware; how much hardware you need is determined by the software requirements: speed; memory. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

301 Overheads for Computers as Components 2nd ed.
Evaluation boards Designed by CPU manufacturer or others. Includes CPU, memory, some I/O devices. May include prototyping section. CPU manufacturer often gives out evaluation board netlist---can be used as starting point for your custom board design. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

302 Overheads for Computers as Components 2nd ed.
Adding logic to a board Programmable logic devices (PLDs) provide low/medium density logic. Field-programmable gate arrays (FPGAs) provide more logic and multi-level logic. Application-specific integrated circuits (ASICs) are manufactured for a single purpose. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

303 Overheads for Computers as Components 2nd ed.
The PC as a platform Advantages: cheap and easy to get; rich and familiar software environment. Disadvantages: requires a lot of hardware resources; not well-adapted to real-time. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

304 Typical PC hardware platform
CPU memory device CPU bus interface bus high-speed bus DMA controller intr ctrl timers low-speed bus bus interface device © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

305 Overheads for Computers as Components 2nd ed.
Typical busses PCI: standard for high-speed interfacing 33 or 66 MHz. PCI Express. USB (Universal Serial Bus), Firewire (IEEE 1394): relatively low-cost serial interface with high speed. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

306 Overheads for Computers as Components 2nd ed.
Software elements IBM PC uses BIOS (Basic I/O System) to implement low-level functions: boot-up; minimal device drivers. BIOS has become a generic term for the lowest-level system software. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

307 Overheads for Computers as Components 2nd ed.
Example: StrongARM StrongARM system includes: CPU chip (3.686 MHz clock) system control module ( kHz clock). Real-time clock; operating system timer general-purpose I/O; interrupt controller; power manager controller; reset controller. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

308 Debugging embedded systems
Challenges: target system may be hard to observe; target may be hard to control; may be hard to generate realistic inputs; setup sequence may be complex. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

309 Overheads for Computers as Components 2nd ed.
Host/target design Use a host system to prepare software for target system: target system serial line host system © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

310 Overheads for Computers as Components 2nd ed.
Host-based tools Cross compiler: compiles code on host for target system. Cross debugger: displays target state, allows target system to be controlled. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

311 Software debuggers
A monitor program residing on the target provides basic debugger functions. Debugger should have a minimal footprint in memory. User program must be careful not to destroy debugger program, but the debugger should be able to recover from some damage caused by user code.

312 Overheads for Computers as Components 2nd ed.
Breakpoints A breakpoint allows the user to stop execution, examine system state, and change state. Replace the breakpointed instruction with a subroutine call to the monitor program. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

313 Overheads for Computers as Components 2nd ed.
ARM breakpoints 0x400 MUL r4,r6,r6 0x404 ADD r2,r2,r4 0x408 ADD r0,r0,#1 0x40c B loop uninstrumented code 0x400 MUL r4,r6,r6 0x404 ADD r2,r2,r4 0x408 ADD r0,r0,#1 0x40c BL bkpoint code with breakpoint © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

314 Breakpoint handler actions
Save registers. Allow user to examine machine. Before returning, restore system state. Safest way to execute the instruction is to replace it and execute in place. Put another breakpoint after the replaced breakpoint to allow restoring the original breakpoint. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
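
Setting and removing a breakpoint amounts to swapping one instruction word. The sketch below models instruction memory as an array of 32-bit words; BKPT_WORD is a placeholder value standing in for the encoded call to the monitor, and the struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define BKPT_WORD 0xEB000000u  /* placeholder for the call to the monitor */

struct breakpoint {
    size_t index;       /* word index of the patched instruction */
    uint32_t saved;     /* original instruction word */
    int active;
};

/* Save the original instruction and replace it with the monitor call. */
void breakpoint_set(uint32_t *code, size_t index, struct breakpoint *bp)
{
    bp->index = index;
    bp->saved = code[index];
    bp->active = 1;
    code[index] = BKPT_WORD;
}

/* Put the original instruction back so it can execute in place. */
void breakpoint_clear(uint32_t *code, struct breakpoint *bp)
{
    if (bp->active) {
        code[bp->index] = bp->saved;
        bp->active = 0;
    }
}
```

To keep a breakpoint armed after it is hit, the handler would clear it, set a temporary breakpoint on the following instruction, and re-set the original one when the temporary breakpoint fires.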

315 Overheads for Computers as Components 2nd ed.
In-circuit emulators A microprocessor in-circuit emulator is a specially-instrumented microprocessor. Allows you to stop execution, examine CPU state, modify registers. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

316 Overheads for Computers as Components 2nd ed.
Logic analyzers A logic analyzer is an array of low-grade oscilloscopes: © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

317 Logic analyzer architecture
UUT sample memory microprocessor system clock vector address controller state or timing mode clock gen keypad display © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

318 Overheads for Computers as Components 2nd ed.
Boundary scan Simplifies testing of multiple chips on a board. Registers on pins can be configured as a scan chain. Used for debuggers, in-circuit emulators. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

319 Overheads for Computers as Components 2nd ed.
How to exercise code Run on host system. Run on target system. Run in instruction-level simulator. Run on cycle-accurate simulator. Run in hardware/software co-simulation environment. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

320 Debugging real-time code
Bugs in drivers can cause non-deterministic behavior in the foreground program. Bugs may be timing-dependent.

321 System-level performance analysis
Performance depends on all the elements of the system: CPU. Cache. Bus. Main memory. I/O device. CPU memory cache © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

322 Bandwidth as performance
Bandwidth applies to several components: Memory. Bus. CPU fetches. Different parts of the system run at different clock rates. Different components may have different widths (bus, memory). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

323 Bandwidth and data transfers
Video frame: 320 x 240 x 3 = 230,400 bytes. Transfer in 1/30 sec. Transfer 1 byte/µsec, 0.23 sec per frame. Too slow. Increase bandwidth: Increase bus width. Increase bus clock rate.
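
The arithmetic can be checked with a few helper functions. This is a sketch with hypothetical names; it assumes the 0.23 sec per frame figure corresponds to a rate of one byte per microsecond.

```c
/* Bytes in one frame: width x height pixels, bytes_per_pixel each. */
unsigned long frame_bytes(unsigned long width, unsigned long height,
                          unsigned long bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Microseconds to move nbytes at bytes_per_us bytes per microsecond,
   rounded up to a whole microsecond. */
unsigned long transfer_time_us(unsigned long nbytes, unsigned long bytes_per_us)
{
    return (nbytes + bytes_per_us - 1) / bytes_per_us;
}

/* Returns 1 if one frame can be moved within one frame period. */
int meets_frame_rate(unsigned long nbytes, unsigned long bytes_per_us,
                     unsigned long fps)
{
    return transfer_time_us(nbytes, bytes_per_us) <= 1000000UL / fps;
}
```

With these helpers, a 320 x 240 x 3 frame at one byte per microsecond takes 230,400 microseconds, well over the 33,333 microseconds available at 30 frames/sec, which is why a wider or faster bus is needed.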

324 Overheads for Computers as Components 2nd ed.
Bus bandwidth T: # bus cycles. P: time/bus cycle. Total time for transfer: t = TP. D: data payload length. O1 + O2 = overhead O. O1 D O2 W Tbasic(N) = (D+O)N/W © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

325 Bus burst transfer bandwidth
T: # bus cycles. P: time/bus cycle. Total time for transfer: t = TP. D: data payload length. O1 + O2 = overhead O. 1 2 B O W Tburst(N) = (BD+O)N/(BW) © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

326 Overheads for Computers as Components 2nd ed.
Memory aspect ratios 16 M 64 M 8 M 8 1 4 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

327 Overheads for Computers as Components 2nd ed.
Memory access times Memory component access times comes from chip data sheet. Page modes allow faster access for successive transfers on same page. If data doesn’t fit naturally into physical words: A = [(E/w)mod W]+1 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

328 Bus performance bottlenecks
Transfer 320 x 240 video 30 frames/sec = 612,000 bytes/sec. Is performance bottleneck bus or memory? CPU memory © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

329 Bus performance bottlenecks, cont’d.
Bus: assume 1 MHz bus, D=1, O=3: Tbasic = (1+3)612,000/2 = 1,224,000 cycles = sec. Memory: try burst mode B=4, width w=0.5. Tmem = (4*1+4)612,000/(4*0.5) = 2,448,000 cycles = sec. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
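
The two transfer-time formulas, Tbasic(N) = (D+O)N/W and Tburst(N) = (BD+O)N/(BW), translate directly into code. The function names are hypothetical, and double is used only because the example width of 0.5 is not an integer.

```c
/* Cycles for N bytes with non-burst transfers: (D + O) * N / W. */
double t_basic(double d, double o, double n, double w)
{
    return (d + o) * n / w;
}

/* Cycles with bursts of b transfers: (b * D + O) * N / (b * W). */
double t_burst(double b, double d, double o, double n, double w)
{
    return (b * d + o) * n / (b * w);
}

/* Convert a cycle count to seconds for a clock of clock_hz. */
double cycles_to_seconds(double cycles, double clock_hz)
{
    return cycles / clock_hz;
}
```

Plugging in the numbers above, t_basic(1, 3, 612000, 2) gives 1,224,000 cycles and t_burst(4, 1, 4, 612000, 0.5) gives 2,448,000 cycles. Converting either count to seconds requires the clock rate of the component in question.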

330 Performance spreadsheet

331 Overheads for Computers as Components 2nd ed.
Parallelism Speed things up by running several units at once. DMA provides parallelism if CPU doesn’t need the bus: DMA + bus. CPU. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

332 Bus-Based Computer Systems
Example: alarm clock © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

333 Overheads for Computers as Components 2nd ed.
Alarm clock interface Alarm on Alarm off buzzer PM Alarm ready light set time set alarm hour minute button © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

334 Overheads for Computers as Components 2nd ed.
Operations Set time: hold set time, depress hour, minute. Set alarm time: hold set alarm, depress hour, minute. Turn alarm on/off: depress alarm on/off. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

335 Alarm clock requirements

336 Alarm clock class diagram
1 1 1 1 Lights* Display Mechanism 1 1 1 Buttons* Speaker* 1 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

337 Alarm clock physical classes
Lights* Buttons* Speaker* digit-val() digit-scan() alarm-on-light() PM-light() set-time(): boolean set-alarm(): boolean alarm-on(): boolean alarm-off(): boolean minute(): boolean hour(): boolean buzz() © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

338 Overheads for Computers as Components 2nd ed.
Display class Display time[4]: integer alarm-indicator: boolean PM-indicator: boolean set-time() alarm-light-on() alarm-light-off() PM-light-on() PM-light-off() © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

339 Overheads for Computers as Components 2nd ed.
Mechanism class Mechanism Seconds: integer PM: boolean tens-hours, ones-hours: boolean tens-minutes, ones-minutes: boolean alarm-ready: boolean alarm-tens-hours, alarm-ones-hours: boolean alarm-tens-minutes, alarm-ones-minutes: scan-keyboard() update-time() © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

340 Update-time behavior update seconds with rollover
display.set-time(current time) F Time >= alarm and alarm-on? Rollover? F T T update hh:mm with rollover alarm.buzzer(true) PM->AM AM->PM PM=true PM=false © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

341 Scan-keyboard behavior
Set-time and not set-alarm and hours compute button activations Alarm-on Increment time tens w. rollover and AM/PM alarm-ready= true Alarm-off alarm-ready= false alarm.buzzer(false) Increment time ones w. rollover and AM/PM save button states Set-time and not set-alarm and minutes © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

342 Overheads for Computers as Components 2nd ed.
System architecture Includes: periodic behavior (clock); aperiodic behavior (buttons, buzzer activation). Two major software components: interrupt-driven routine updates time; foreground program deals with buttons, commands. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

343 Interrupt-driven routine
Timer probably can’t handle one-minute interrupt interval. Use software variable to convert interrupt frequency to seconds. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
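
The software variable can be sketched as a tick counter that rolls over into seconds and minutes. The interrupt rate of 100 per second and all names are assumptions for illustration; the real rate depends on the timer hardware.

```c
#define TICKS_PER_SECOND 100u   /* hypothetical timer interrupt rate */

struct clock_time {
    unsigned ticks;     /* interrupts counted within the current second */
    unsigned seconds;   /* 0..59 */
    unsigned minutes;   /* 0..59 */
};

/* Body of the timer interrupt handler: count ticks and roll over into
   seconds and minutes. Returns 1 when a minute boundary is crossed. */
int timer_tick(struct clock_time *t)
{
    if (++t->ticks < TICKS_PER_SECOND)
        return 0;
    t->ticks = 0;
    if (++t->seconds < 60)
        return 0;
    t->seconds = 0;
    t->minutes = (t->minutes + 1) % 60;
    return 1;
}
```

The return value gives the caller a convenient point at which to update hours, the AM/PM flag, and the display.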

344 Foreground program
Operates as while loop:
    while (TRUE) {
        read_buttons(button_values);
        process_command(button_values);
        check_alarm();
    }

345 Overheads for Computers as Components 2nd ed.
Testing Component testing: test interrupt code on the platform; can test foreground program using a mock-up. System testing: relatively few components to integrate; check clock accuracy; check recognition of buttons, buzzer, etc. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

346 Program design and analysis
Software components. Representations of programs. Assembly and linking. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

347 Software state machine
State machine keeps internal state as a variable, changes state based on inputs. Uses: control-dominated code; reactive systems. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

348 Overheads for Computers as Components 2nd ed.
State machine example no seat/- no seat/ buzzer off idle seat/timer on no seat/- no belt and no timer/- buzzer seated Belt/buzzer on belt/- belt/ buzzer off belted no belt/timer on © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

349 C implementation
    #define IDLE 0
    #define SEATED 1
    #define BELTED 2
    #define BUZZER 3

    switch (state) {
    case IDLE:
        if (seat) { state = SEATED; timer_on = TRUE; }
        break;
    case SEATED:
        if (belt) state = BELTED;
        else if (timer) state = BUZZER;
    }
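
A complete version of the state machine, covering all four states, might look like the sketch below. The transitions for BELTED and BUZZER are one reading of the state diagram, not code from the slides. Two further assumptions: leaving the seat takes priority in every state, and timer_on is a one-step pulse rather than a latched flag.

```c
#include <stdbool.h>

enum belt_state { IDLE, SEATED, BELTED, BUZZER };

struct belt_ctrl {
    enum belt_state state;
    bool timer_on;   /* output: start the timer (one-step pulse) */
    bool buzzer_on;  /* output: drive the buzzer */
};

/* Advance the controller by one step given the current inputs. */
void belt_step(struct belt_ctrl *c, bool seat, bool belt, bool timer)
{
    c->timer_on = false;
    switch (c->state) {
    case IDLE:
        if (seat) { c->state = SEATED; c->timer_on = true; }
        break;
    case SEATED:
        if (!seat) c->state = IDLE;
        else if (belt) c->state = BELTED;
        else if (timer) { c->state = BUZZER; c->buzzer_on = true; }
        break;
    case BELTED:
        if (!seat) c->state = IDLE;
        else if (!belt) { c->state = SEATED; c->timer_on = true; }
        break;
    case BUZZER:
        if (!seat) { c->state = IDLE; c->buzzer_on = false; }
        else if (belt) { c->state = BELTED; c->buzzer_on = false; }
        break;
    }
}
```

Keeping the state and outputs in a struct, instead of globals, makes the machine easy to drive from a test harness with scripted input sequences.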

350 Signal processing and circular buffer
Commonly used in signal processing: new data constantly arrives; each datum has a limited lifetime. Use a circular buffer to hold the data stream. time t time t+1 d1 d2 d3 d4 d5 d6 d7 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

351 Overheads for Computers as Components 2nd ed.
Circular buffer x1 x2 x3 x4 x5 x6 t1 t2 t3 Data stream x1 x5 x6 x2 x7 x3 x4 Circular buffer © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

352 Overheads for Computers as Components 2nd ed.
Circular buffers Indexes locate currently used data, current input data: d5 d1 input use d2 d2 input d3 d3 d4 d4 use time t1+1 time t1 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

353 Circular buffer implementation: FIR filter
    int circ_buffer[N], circ_buffer_head = 0;
    int c[N]; /* coefficients */
    int f, ibuf, ic;
    /* walk the buffer from the head, wrapping at the end of the array */
    for (f=0, ibuf=circ_buffer_head, ic=0; ic<N; ibuf=(ibuf==N-1 ? 0 : ibuf+1), ic++)
        f = f + c[ic]*circ_buffer[ibuf];

354 Overheads for Computers as Components 2nd ed.
Queues Elastic buffer: holds data that arrives irregularly. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

355 Buffer-based queues
    #define Q_SIZE 32
    #define Q_MAX (Q_SIZE-1)
    int q[Q_SIZE], head, tail; /* Q_SIZE entries: indexes run 0..Q_MAX */

    void initialize_queue() { head = tail = 0; }

    void enqueue(int val) {
        if (((tail+1)%Q_SIZE) == head) error(); /* queue is full */
        q[tail] = val;
        if (tail == Q_MAX) tail = 0; else tail++;
    }

    int dequeue() {
        int returnval;
        if (head == tail) error(); /* queue is empty */
        returnval = q[head];
        if (head == Q_MAX) head = 0; else head++;
        return returnval;
    }

356 Models of programs
Source code is not a good representation for programs: clumsy; leaves much information implicit. Compilers derive intermediate representations to manipulate and optimize the program.

357 Data flow graph
DFG: data flow graph. Does not represent control. Models basic block: code with no entry or exit except at the beginning and end. Describes the minimal ordering requirements on operations.

358 Single assignment form
Original basic block:
    x = a + b;
    y = c - d;
    z = x * y;
    y = b + d;
Single assignment form:
    x = a + b;
    y = c - d;
    z = x * y;
    y1 = b + d;

359 Overheads for Computers as Components 2nd ed.
Data flow graph x = a + b; y = c - d; z = x * y; y1 = b + d; single assignment form a b c d + - y x * + z y1 DFG © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

360 DFGs and partial orders
Pairs a+b, c-d and b+d, x*y: can do each pair of operations in any order. [Figure: DFG with inputs a, b, c, d; operators +, -, *, +; results x, y, z, y1.]

361 Control-data flow graph
CDFG: represents control and data. Uses data flow graphs as components. Two types of nodes: decision; data flow. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

362 Overheads for Computers as Components 2nd ed.
Data flow node Encapsulates a data flow graph: Write operations in basic block form for simplicity. x = a + b; y = c + d © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

363 Overheads for Computers as Components 2nd ed.
Control cond T v1 v4 value v3 v2 F Equivalent forms © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

364 Overheads for Computers as Components 2nd ed.
CDFG example T if (cond1) bb1(); else bb2(); bb3(); switch (test1) { case c1: bb4(); break; case c2: bb5(); break; case c3: bb6(); break; } cond1 bb1() F bb2() bb3() test1 c3 c1 c2 bb4() bb5() bb6() © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

365 Overheads for Computers as Components 2nd ed.
for loop for (i=0; i<N; i++) loop_body(); for loop i=0; while (i<N) { loop_body(); i++; } equivalent i=0 i<N F T loop_body() © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

366 Overheads for Computers as Components 2nd ed.
Assembly and linking Last steps in compilation: HLL -> compile -> assembly -> assemble -> link -> executable. Several HLL and assembly modules feed the same link step. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

367 Multiple-module programs
Programs may be composed from several files. Addresses become more specific during processing: relative addresses are measured relative to the start of a module; absolute addresses are measured relative to the start of the CPU address space. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

368 Overheads for Computers as Components 2nd ed.
Assemblers Major tasks: generate binary for symbolic instructions; translate labels into addresses; handle pseudo-ops (data, etc.). Generally one-to-one translation. Assembly labels: ORG 100 label1 ADR r4,c © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

369 Overheads for Computers as Components 2nd ed.
Symbol table ADD r0,r1,r2 xx ADD r3,r4,r5 CMP r0,r3 yy SUB r5,r6,r7 assembly code xx 0x8 yy 0x10 symbol table © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

370 Symbol table generation
Use program location counter (PLC) to determine address of each location. Scan program, keeping count of PLC. Addresses are generated at assembly time, not execution time. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

371 Overheads for Computers as Components 2nd ed.
Symbol table example PLC=0x7 ADD r0,r1,r2 xx ADD r3,r4,r5 CMP r0,r3 yy SUB r5,r6,r7 xx 0x8 PLC=0x7 yy 0x10 PLC=0x7 PLC=0x7 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

372 Overheads for Computers as Components 2nd ed.
Two-pass assembly Pass 1: generate symbol table Pass 2: generate binary instructions © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
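Pass 1 can be sketched in C as a single scan that advances the PLC and records it at each label. The fixed 4-byte instruction size, the starting PLC of 0x4 used in the note below (chosen so the labels land on the addresses shown in the symbol table slide), and the names `asm_line`, `symbol`, and `build_symbol_table` are assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

#define INSTR_BYTES 4    /* assumption: fixed-size 32-bit instructions */
#define MAX_LABEL   16

/* One source line: an optional label (NULL if none) and the instruction text. */
struct asm_line {
    const char *label;
    const char *text;
};

struct symbol {
    char     name[MAX_LABEL];
    unsigned address;
};

/* Pass 1: scan the program, advancing the PLC by one instruction per
   line and recording the PLC value for every labeled line.
   Returns the number of symbols written to table (at most max_syms). */
size_t build_symbol_table(const struct asm_line *prog, size_t n_lines,
                          unsigned start_plc,
                          struct symbol *table, size_t max_syms)
{
    unsigned plc = start_plc;
    size_t n_syms = 0;

    for (size_t i = 0; i < n_lines; i++) {
        if (prog[i].label != NULL && n_syms < max_syms) {
            strncpy(table[n_syms].name, prog[i].label, MAX_LABEL - 1);
            table[n_syms].name[MAX_LABEL - 1] = '\0';
            table[n_syms].address = plc;
            n_syms++;
        }
        plc += INSTR_BYTES;
    }
    return n_syms;
}
```

Starting the four-instruction example at PLC = 0x4 yields xx = 0x8 and yy = 0x10. Pass 2 would then re-scan the program and use this table to fill in label addresses while emitting binary.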

373 Relative address generation
Some label values may not be known at assembly time. Labels within the module may be kept in relative form. Must keep track of external labels---can’t generate full binary for instructions that use external labels. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

374 Overheads for Computers as Components 2nd ed.
Pseudo-operations Pseudo-ops do not generate instructions: ORG sets program location. EQU generates symbol table entry without advancing PLC. Data statements define data blocks. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

375 Overheads for Computers as Components 2nd ed.
Linking Combines several object modules into a single executable module. Jobs: put modules in order; resolve labels across modules. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

376 Externals and entry points
xxx ADD r1,r2,r3 B a yyy %1 a ADR r4,yyy ADD r3,r4,r5 external reference © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

377 Overheads for Computers as Components 2nd ed.
Module ordering Code modules must be placed in absolute positions in the memory space. Load map or linker flags control the order of modules. module1 module2 module3 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
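Once the module order is fixed, turning relative addresses into absolute ones is a matter of adding each module's base address. A minimal sketch, assuming modules are placed back to back with no alignment padding; `module`, `place_modules`, and `absolute_address` are hypothetical names.

```c
#include <stddef.h>

struct module {
    unsigned size;   /* bytes of code and data in the module */
    unsigned base;   /* absolute start address, filled in by place_modules */
};

/* Place modules back to back in the given order, starting at load_addr.
   Returns the first free address after the last module. */
unsigned place_modules(struct module *mods, size_t n, unsigned load_addr)
{
    unsigned next = load_addr;
    for (size_t i = 0; i < n; i++) {
        mods[i].base = next;
        next += mods[i].size;
    }
    return next;
}

/* Convert a module-relative address into an absolute address. */
unsigned absolute_address(const struct module *m, unsigned relative)
{
    return m->base + relative;
}
```

A real linker also honors alignment and section placement from the load map; this sketch shows only the base-plus-offset step.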

378 Overheads for Computers as Components 2nd ed.
Dynamic linking Some operating systems link modules dynamically at run time: shares one copy of library among all executing programs; allows programs to be updated with new versions of libraries. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

379 Program design and analysis
Compilation flow. Basic statement translation. Basic optimizations. Interpreters and just-in-time compilers. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

380 Overheads for Computers as Components 2nd ed.
Compilation Compilation strategy (Wirth): compilation = translation + optimization Compiler determines quality of code: use of CPU resources; memory access scheduling; code size. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

381 Basic compilation phases
HLL parsing, symbol table machine-independent optimizations machine-dependent optimizations assembly © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

382 Statement translation and optimization
Source code is translated into intermediate form such as CDFG. CDFG is transformed/optimized. CDFG is translated into instructions with optimization decisions. Instructions are further optimized. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

383 Arithmetic expressions
b a*b + 5*(c-d) a c d * - expression 5 * + DFG © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

384 Arithmetic expressions, cont’d.
Code for the DFG of a*b + 5*(c-d), one group per numbered node:
1 (*): ADR r4,a LDR r1,[r4] ADR r4,b LDR r2,[r4] MUL r3,r1,r2
2 (-): ADR r4,c LDR r1,[r4] ADR r4,d LDR r5,[r4] SUB r6,r1,r5
3 (*): MUL r7,r6,#5
4 (+): ADD r8,r7,r3
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

385 Control code generation
if (a+b > 0) x = 5; else x = 7; a+b>0 x=5 x=7 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

386 Control code generation, cont’d.
Code for each numbered block:
1 (a+b>0): ADR r5,a LDR r1,[r5] ADR r5,b LDR r2,[r5] ADDS r3,r1,r2 BLE label3
2 (x=5): MOV r3,#5 ADR r5,x STR r3,[r5] B stmtent
3 (x=7): label3 MOV r3,#7 ADR r5,x STR r3,[r5]
stmtent ...
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

387 Overheads for Computers as Components 2nd ed.
Procedure linkage Need code to: call and return; pass parameters and results. Parameters and returns are passed on stack. Procedures with few parameters may use registers. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

388 Overheads for Computers as Components 2nd ed.
Procedure stacks growth proc1 proc1(int a) { proc2(5); } FP frame pointer proc2 5 accessed relative to SP SP stack pointer © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

389 Overheads for Computers as Components 2nd ed.
ARM procedure linkage APCS (ARM Procedure Call Standard): r0-r3 pass parameters into procedure. Extra parameters are put on stack frame. r0 holds return value. r4-r7 hold register variables. r11 is frame pointer, r13 is stack pointer. r10 holds limiting address on stack size to check for stack overflows. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

390 Overheads for Computers as Components 2nd ed.
Data structures Different types of data structures use different data layouts. Some offsets into data structure can be computed at compile time, others must be computed at run time. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

391 One-dimensional arrays
C array name points to 0th element: a a[0] a[1] = *(a + 1) a[2] © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

392 Two-dimensional arrays
Row-major layout (as in C) for an array with N rows and M columns: a[0,0] a[0,1] ... a[1,0] a[1,1] ... Element a[i,j] is stored at a[i*M+j]. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
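The index arithmetic for a 2-D array stored row by row, as C does, can be sketched as follows. The helper names `row_major_index` and `get2d` are made up for illustration, and `m` stands for the number of columns (the slide's M).

```c
#include <stddef.h>

/* Flat index of element (i, j) in an array stored row by row,
   where m is the number of columns in each row. */
size_t row_major_index(size_t i, size_t j, size_t m)
{
    return i * m + j;
}

/* Read element (i, j) from a 2-D array held as one flat block of ints. */
int get2d(const int *flat, size_t i, size_t j, size_t m)
{
    return flat[row_major_index(i, j, m)];
}
```

For a 2 x 3 array, element (1, 0) lands at flat index 3, immediately after the three elements of row 0.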

393 Overheads for Computers as Components 2nd ed.
Structures Fields within structures are static offsets: struct mystruct { int field1; char field2; }; struct mystruct a, *aptr = &a; field1 is at aptr (4 bytes); field2 is at *((char *)aptr + 4). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
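A small sketch of field access as base address plus constant offset, using the standard `offsetof` macro. The function name `read_field2` is hypothetical. The exact offset of `field2` depends on the platform's `int` size, so the code asks the compiler rather than hard-coding 4.

```c
#include <stddef.h>

struct mystruct {
    int  field1;
    char field2;
};

/* Read field2 the way compiled code does: take the base address of the
   structure and add an offset that is fixed at compile time. */
char read_field2(const struct mystruct *p)
{
    const char *base = (const char *)p;
    return *(base + offsetof(struct mystruct, field2));
}
```

The cast to `const char *` matters: adding 4 to a `struct mystruct *` directly would advance by four whole structures, not four bytes.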

394 Expression simplification
Constant folding: 8+1 = 9 Algebraic: a*b + a*c = a*(b+c) Strength reduction: a*2 = a<<1 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
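The three simplifications can be written out as before/after pairs in C. The function names are illustrative only. The strength-reduction pair uses `unsigned` because left-shifting a negative signed value is not well defined in C.

```c
/* Each pair computes the same value; the second form is the kind of
   code an optimizer would aim for. */

int folded_before(void) { return 8 + 1; }
int folded_after(void)  { return 9; }        /* constant folded at compile time */

int algebraic_before(int a, int b, int c) { return a * b + a * c; }
int algebraic_after(int a, int b, int c)  { return a * (b + c); }  /* one multiply instead of two */

unsigned strength_before(unsigned a) { return a * 2u; }
unsigned strength_after(unsigned a)  { return a << 1; }            /* shift replaces multiply */
```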

395 Overheads for Computers as Components 2nd ed.
Dead code elimination Dead code: #define DEBUG 0 if (DEBUG) dbg(p1); Can be eliminated by analysis of control flow, constant folding. 1 dbg(p1); © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

396 Overheads for Computers as Components 2nd ed.
Procedure inlining Eliminates procedure linkage overhead: int foo(a,b,c) { return a + b - c;} z = foo(w,x,y); => z = w + x - y; © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

397 Overheads for Computers as Components 2nd ed.
Loop transformations Goals: reduce loop overhead; increase opportunities for pipelining; improve memory system performance. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

398 Overheads for Computers as Components 2nd ed.
Loop unrolling Reduces loop overhead, enables some other optimizations. for (i=0; i<4; i++) a[i] = b[i] * c[i]; => for (i=0; i<2; i++) { a[i*2] = b[i*2] * c[i*2]; a[i*2+1] = b[i*2+1] * c[i*2+1]; } © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

399 Loop fusion and distribution
Fusion combines two loops into 1: for (i=0; i<N; i++) a[i] = b[i] * 5; for (j=0; j<N; j++) w[j] = c[j] * d[j]; => for (i=0; i<N; i++) { a[i] = b[i] * 5; w[i] = c[i] * d[i]; } Distribution breaks one loop into two. Changes optimizations within loop body. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
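Distribution is the reverse of the fusion shown above. A sketch in C, with hypothetical function names, assuming the arrays do not overlap (with overlapping arrays the two forms could give different results):

```c
#include <stddef.h>

/* Fused form: one loop with two independent statements in the body. */
void fused(int *a, const int *b, int *w, const int *c, const int *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] * 5;
        w[i] = c[i] * d[i];
    }
}

/* Distributed form: the same work split into two loops, each with a
   smaller body and a smaller working set. */
void distributed(int *a, const int *b, int *w, const int *c, const int *d, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] * 5;
    for (size_t j = 0; j < n; j++)
        w[j] = c[j] * d[j];
}
```

Which form is faster depends on the target: fusion saves loop overhead, while distribution can reduce register pressure and cache conflicts inside each loop.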

400 Overheads for Computers as Components 2nd ed.
Loop tiling Breaks one loop into a nest of loops. Changes order of accesses within array. Changes cache behavior. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

401 Overheads for Computers as Components 2nd ed.
Loop tiling example for (i=0; i<N; i++) for (j=0; j<N; j++) c[i] = a[i,j]*b[i]; => for (i=0; i<N; i+=2) for (j=0; j<N; j+=2) for (ii=i; ii<min(i+2,N); ii++) for (jj=j; jj<min(j+2,N); jj++) c[ii] = a[ii,jj]*b[ii]; © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
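A compilable version of the tiling example, as a sketch. The array size of 5 and the names `untiled`, `tiled`, `TILE`, and `min_size` are assumptions; an odd size is used so that the last tile is partial and the `min` bound actually matters. The inner loops run from the tile origin (`ii = i`, `jj = j`) so that each element is visited exactly once.

```c
#include <stddef.h>

#define N    5   /* assumption: small odd size, so the last tile is partial */
#define TILE 2

static size_t min_size(size_t x, size_t y) { return x < y ? x : y; }

/* Original loop nest, with a[i,j] written as a[i][j]. */
void untiled(int c[N], int a[N][N], const int b[N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            c[i] = a[i][j] * b[i];
}

/* Tiled loop nest: the outer loops pick a TILE x TILE block,
   the inner loops walk the elements inside that block. */
void tiled(int c[N], int a[N][N], const int b[N])
{
    for (size_t i = 0; i < N; i += TILE)
        for (size_t j = 0; j < N; j += TILE)
            for (size_t ii = i; ii < min_size(i + TILE, N); ii++)
                for (size_t jj = j; jj < min_size(j + TILE, N); jj++)
                    c[ii] = a[ii][jj] * b[ii];
}
```

Both versions leave the same values in `c`; only the order in which `a` is read changes, which is what alters the cache behavior.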

402 Overheads for Computers as Components 2nd ed.
Array padding Add array elements to change mapping into cache: a[0,0] a[0,1] a[0,2] a[0,0] a[0,1] a[0,2] a[0,2] a[1,0] a[1,1] a[1,2] a[1,0] a[1,1] a[1,2] a[1,2] before after © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

403 Overheads for Computers as Components 2nd ed.
Register allocation Goals: choose register to hold each variable; determine lifespan of variable in the register. Basic case: within basic block. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

404 Register lifetime graph
w = a + b; x = c + w; y = c + d; t=1 a t=2 b c t=3 d w x y 1 2 3 time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

405 Instruction scheduling
Non-pipelined machines do not need instruction scheduling: any order of instructions that satisfies data dependencies runs equally fast. In pipelined machines, execution time of one instruction depends on the nearby instructions: opcode, operands. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

406 Overheads for Computers as Components 2nd ed.
Reservation table A reservation table relates instructions/time to CPU resources. Time/instr A B instr1 X instr2 X X instr3 X instr4 X © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

407 Overheads for Computers as Components 2nd ed.
Software pipelining Schedules instructions across loop iterations. Reduces instruction latency in iteration i by inserting instructions from iteration i+1. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

408 Instruction selection
May be several ways to implement an operation or sequence of operations. Represent operations as graphs, match possible instruction sequences onto graph. + + * + * * MUL ADD expression templates MADD © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

409 Overheads for Computers as Components 2nd ed.
Using your compiler Understand various optimization levels (-O1, -O2, etc.) Look at mixed compiler/assembler output. Modifying compiler output requires care: correctness; loss of hand-tweaked code. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

410 Interpreters and JIT compilers
Interpreter: translates and executes program statements on-the-fly. JIT compiler: compiles small sections of code into instructions during program execution. Eliminates some translation overhead. Often requires more memory. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

411 Program design and analysis
Program-level performance analysis. Optimizing for: Execution time. Energy/power. Program size. Program validation and testing. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

412 Program-level performance analysis
Need to understand performance in detail: Real-time behavior, not just typical. On complex platforms. Program performance ≠ CPU performance: Pipeline, cache are windows into program. We must analyze the entire program. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

413 Complexities of program performance
Varies with input data: Different-length paths. Cache effects. Instruction-level performance variations: Pipeline interlocks. Fetch times. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

414 How to measure program performance
Simulate execution of the CPU. Makes CPU state visible. Measure on real CPU using timer. Requires modifying the program to control the timer. Measure on real CPU using logic analyzer. Requires events visible on the pins. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
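A sketch of the timer-based approach in portable C. On an embedded target the calls to `clock()` would be replaced by reads of a hardware timer; `clock()` from `<time.h>` is used here only as a stand-in. The names `time_per_call` and `sample_work` are hypothetical.

```c
#include <time.h>

/* Example workload to be timed; volatile keeps the loop from being
   optimized away. */
void sample_work(void)
{
    volatile int sink = 0;
    for (int i = 0; i < 1000; i++)
        sink += i;
}

/* Run fn() reps times between two timer reads and return the average
   CPU time per call in seconds. Returns 0.0 if reps is 0. */
double time_per_call(void (*fn)(void), unsigned reps)
{
    if (reps == 0)
        return 0.0;

    clock_t start = clock();
    for (unsigned i = 0; i < reps; i++)
        fn();
    clock_t end = clock();

    return ((double)(end - start) / CLOCKS_PER_SEC) / reps;
}
```

Note that this is the modification the slide mentions: the program must be changed to start and stop the timer, and the loop overhead is included in the measurement.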

415 Program performance metrics
Average-case execution time. Typically used in application programming. Worst-case execution time. A component in deadline satisfaction. Best-case execution time. Task-level interactions can cause best-case program behavior to result in worst-case system behavior. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

416 Elements of program performance
Basic program execution time formula: execution time = program path + instruction timing Solving these problems independently helps simplify analysis. Easier to separate on simpler CPUs. Accurate performance analysis requires: Assembly/binary code. Execution platform. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

417 Data-dependent paths in an if statement
if (a || b) { /* T1 */
if ( c ) /* T2 */ x = r*s+t; /* A1 */
else y=r+s; /* A2 */
z = r+s+u; /* A3 */
} else {
if ( c ) /* T3 */ y = r-t; /* A4 */
}
Paths, selected by the values of a, b, c: T1=F, T3=F: no assignments. T1=F, T3=T: A4. T1=T, T2=F: A2, A3. T1=T, T2=T: A1, A3.
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

418 Overheads for Computers as Components 2nd ed.
Paths in a loop for (i=0, f=0; i<N; i++) f = f + c[i] * x[i]; i=0 f=0 N i=N Y f = f + c[i] * x[i] i = i + 1 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

419 Overheads for Computers as Components 2nd ed.
Instruction timing Not all instructions take the same amount of time. Multi-cycle instructions. Fetches. Execution times of instructions are not independent. Pipeline interlocks. Cache effects. Execution times may vary with operand value. Floating-point operations. Some multi-cycle integer operations. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

420 Measurement-driven performance analysis
Not so easy as it sounds: Must actually have access to the CPU. Must know data inputs that give worst/best case performance. Must make state visible. Still an important method for performance analysis. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

421 Overheads for Computers as Components 2nd ed.
Feeding the program Need to know the desired input values. May need to write software scaffolding to generate the input values. Software scaffolding may also need to examine outputs to generate feedback-driven inputs. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

422 Trace-driven measurement
Instrument the program. Save information about the path. Requires modifying the program. Trace files are large. Widely used for cache analysis. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

423 Overheads for Computers as Components 2nd ed.
Physical measurement In-circuit emulator allows tracing. Affects execution timing. Logic analyzer can measure behavior at pins. Address bus can be analyzed to look for events. Code can be modified to make events visible. Particularly important for real-world input streams. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

424 Overheads for Computers as Components 2nd ed.
CPU simulation Some simulators are less accurate. Cycle-accurate simulator provides accurate clock-cycle timing. Simulator models CPU internals. Simulator writer must know how CPU works. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

425 SimpleScalar FIR filter simulation
int x[N] = {8, 17, … }; int c[N] = {1, 2, … }; int main() { int i, k, f = 0; for (k=0; k<COUNT; k++) for (i=0; i<N; i++) f += c[i]*x[i]; }
Results: N = 100: 25854 total sim cycles, 259 sim cycles per filter execution. N = 1,000: 155759 total sim cycles, 156 per filter execution. N = 10,000: 145 sim cycles per filter execution.
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

426 Performance optimization motivation
Embedded systems must often meet deadlines. Faster may not be fast enough. Need to be able to analyze execution time. Worst-case, not typical. Need techniques for reliably improving execution time. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

427 Programs and performance analysis
Best results come from analyzing optimized instructions, not high-level language code: non-obvious translations of HLL statements into instructions; code may move; cache effects are hard to predict. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

428 Overheads for Computers as Components 2nd ed.
Loop optimizations Loops are good targets for optimization. Basic loop optimizations: code motion; induction-variable elimination; strength reduction (x*2 -> x<<1). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

429 Overheads for Computers as Components 2nd ed.
Code motion for (i=0; i<N*M; i++) z[i] = a[i] + b[i]; i<X i=0; X = N*M i=0; i<N*M N Y z[i] = a[i] + b[i]; i = i+1; © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
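The code motion in this example can be sketched in C as a before/after pair. The function names and the temporary `x` (the slide's X) are illustrative.

```c
#include <stddef.h>

/* Before: the bound n*m is written inside the loop test, so a naive
   translation multiplies on every iteration. */
void add_arrays_original(int *z, const int *a, const int *b, size_t n, size_t m)
{
    for (size_t i = 0; i < n * m; i++)
        z[i] = a[i] + b[i];
}

/* After code motion: the loop-invariant product is computed once,
   before the loop, and the test compares against the saved value. */
void add_arrays_hoisted(int *z, const int *a, const int *b, size_t n, size_t m)
{
    size_t x = n * m;
    for (size_t i = 0; i < x; i++)
        z[i] = a[i] + b[i];
}
```

The move is safe here because neither `n` nor `m` changes inside the loop body.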

430 Induction variable elimination
Induction variable: loop index. Consider loop: for (i=0; i<N; i++) for (j=0; j<M; j++) z[i,j] = b[i,j]; Rather than recompute i*M+j for each array in each iteration, share induction variable between arrays, increment at end of loop body. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
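A sketch of this transformation in C, with the 2-D arrays held as flat blocks so the index arithmetic is visible. The function names and the shared offset `zbi` are assumptions made for illustration.

```c
#include <stddef.h>

/* Straightforward form: i*m + j is recomputed for each array access. */
void copy_indexed(int *z, const int *b, size_t n, size_t m)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            z[i * m + j] = b[i * m + j];
}

/* After induction-variable elimination: one offset is shared by both
   arrays and incremented at the end of the loop body, so no multiply
   is needed inside the loop. */
void copy_induction(int *z, const int *b, size_t n, size_t m)
{
    size_t zbi = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++) {
            z[zbi] = b[zbi];
            zbi++;
        }
}
```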

431 Overheads for Computers as Components 2nd ed.
Cache analysis Loop nest: set of loops, one inside other. Perfect loop nest: no conditionals in nest. Because loops use large quantities of data, cache conflicts are common. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

432 Array conflicts in cache
1024 1024 4099 b[0,0] ... 4099 main memory cache © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

433 Array conflicts, cont’d.
Array elements conflict because they are in the same line, even if not mapped to same location. Solutions: move one array; pad array. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
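The conflict can be checked with a little address arithmetic. This sketch assumes a direct-mapped cache with 4-byte lines and 256 lines (1 KB total); those parameters and the function names are assumptions, not taken from the slides. With them, the addresses 1024 and 4099 from the previous figure fall in the same cache line without being at the same byte position.

```c
#define LINE_BYTES 4u     /* assumption: 4-byte cache lines */
#define NUM_LINES  256u   /* assumption: 256 lines, direct-mapped */

/* Cache line that an address maps to. */
unsigned cache_line(unsigned addr)
{
    return (addr / LINE_BYTES) % NUM_LINES;
}

/* Tag: identifies which memory block currently occupies the line. */
unsigned cache_tag(unsigned addr)
{
    return addr / (LINE_BYTES * NUM_LINES);
}

/* Two addresses conflict when they compete for the same line but come
   from different memory blocks. Returns 1 on conflict, 0 otherwise. */
int conflicts(unsigned x, unsigned y)
{
    return cache_line(x) == cache_line(y) && cache_tag(x) != cache_tag(y);
}
```

Under these assumptions, padding or moving the second array by one line (4 bytes) shifts it to the next cache line and removes the conflict.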

434 Performance optimization hints
Use registers efficiently. Use page mode memory accesses. Analyze cache behavior: instruction conflicts can be handled by rewriting code, rescheduling; conflicting scalar data can easily be moved; conflicting array data can be moved, padded. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

435 Energy/power optimization
Energy: ability to do work. Most important in battery-powered systems. Power: energy per unit time. Important even in wall-plug systems---power becomes heat. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

436 Measuring energy consumption
Execute a small loop, measure current: I while (TRUE) a(); © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

437 Sources of energy consumption
Relative energy per operation (Catthoor et al): memory transfer: 33 external I/O: 10 SRAM write: 9 SRAM read: 4.4 multiply: 3.6 add: 1 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

438 Cache behavior is important
Energy consumption has a sweet spot as cache size changes: cache too small: program thrashes, burning energy on external memory accesses; cache too large: cache itself burns too much power. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

439 Overheads for Computers as Components 2nd ed.
Cache sweet spot [Li98] © 1998 IEEE © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

440 Overheads for Computers as Components 2nd ed.
Optimizing for energy First-order optimization: high performance = low energy. Not many instructions trade speed for energy. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

441 Optimizing for energy, cont’d.
Use registers efficiently. Identify and eliminate cache conflicts. Moderate loop unrolling eliminates some loop overhead instructions. Eliminate pipeline stalls. Inlining procedures may help: reduces linkage, but may increase cache thrashing. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

442 Overheads for Computers as Components 2nd ed.
Efficient loops General rules: Don’t use function calls. Keep loop body small to enable local repeat (only forward branches). Use unsigned integer for loop counter. Use <= to test loop counter. Make use of compiler---global optimization, software pipelining. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

443 Single-instruction repeat loop example
STM #4000h,AR2 ; load pointer to source STM #100h,AR3 ; load pointer to destination RPT #(1024-1) MVDD *AR2+,*AR3+ ; move © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

444 Optimizing for program size
Goal: reduce hardware cost of memory; reduce power consumption of memory units. Two opportunities: data; instructions. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

445 Data size minimization
Reuse constants, variables, data buffers in different parts of code. Requires careful verification of correctness. Generate data using instructions. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

446 Overheads for Computers as Components 2nd ed.
Reducing code size Avoid function inlining. Choose CPU with compact instructions. Use specialized instructions where possible. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

447 Program validation and testing
But does it work? Concentrate here on functional verification. Major testing strategies: Black box doesn’t look at the source code. Clear box (white box) does look at the source code. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

448 Overheads for Computers as Components 2nd ed.
Clear-box testing Examine the source code to determine whether it works: Can you actually exercise a path? Do you get the value you expect along a path? Testing procedure: Controllability: provide program with inputs. Execute. Observability: examine outputs. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

449 Controlling and observing programs
firout = 0.0;
for (j=curr, k=0; j<N; j++, k++) firout += buff[j] * c[k];
for (j=0; j<curr; j++, k++) firout += buff[j] * c[k];
if (firout > 100.0) firout = 100.0;
if (firout < -100.0) firout = -100.0;
Controllability: Must fill circular buffer with desired N values. Other code governs how we access the buffer. Observability: Want to examine firout before limit testing. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
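One way to get both properties is to wrap the filter in a function whose inputs set the buffer state directly and whose extra output exposes the sum before limiting. This is a sketch: the tap count of 4, the name `fir_limited`, the `raw_out` parameter, and the symmetric lower limit of -100.0 are all assumptions.

```c
#include <stddef.h>

#define N 4   /* assumption: small tap count for illustration */

/* Circular-buffer FIR filter with output limiting.
   Controllability: the caller supplies buff, c, and curr directly.
   Observability: if raw_out is non-NULL it receives the filter sum
   before the limit tests are applied. */
double fir_limited(const double buff[N], const double c[N],
                   size_t curr, double *raw_out)
{
    double firout = 0.0;
    size_t j, k;

    for (j = curr, k = 0; j < N; j++, k++)
        firout += buff[j] * c[k];
    for (j = 0; j < curr; j++, k++)
        firout += buff[j] * c[k];

    if (raw_out != NULL)
        *raw_out = firout;

    if (firout > 100.0) firout = 100.0;
    if (firout < -100.0) firout = -100.0;   /* assumed symmetric limit */
    return firout;
}
```

A test can then choose `curr` to exercise the wrap-around and compare `*raw_out` against a hand-computed sum, independent of whether the limiter fires.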

450 Execution paths and testing
Paths are important in functional testing as well as performance analysis. In general, an exponential number of paths through the program. Show that some paths dominate others. Heuristically limit paths. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

451 Choosing the paths to test
Possible criteria: Execute every statement at least once. Execute every branch direction at least once. Equivalent for structured programs. Not true for gotos. not covered © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

452 Overheads for Computers as Components 2nd ed.
Basis paths Approximate CDFG with undirected graph. Undirected graphs have basis paths: All paths are linear combinations of basis paths. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

453 Cyclomatic complexity
Cyclomatic complexity is a bound on the size of basis sets: e = # edges n = # nodes p = number of graph components M = e – n + 2p. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
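The formula is simple enough to compute directly. In this sketch the function name is hypothetical; the edge and node counts in the note below were worked out by hand for small control flow graphs.

```c
/* Cyclomatic complexity M = e - n + 2p for a control flow graph with
   e edges, n nodes, and p connected components. */
int cyclomatic_complexity(int edges, int nodes, int components)
{
    return edges - nodes + 2 * components;
}
```

A single if-else has 4 nodes (test, then-block, else-block, join) and 4 edges, giving M = 4 - 4 + 2 = 2. Straight-line code with 2 nodes and 1 edge gives M = 1.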

454 Overheads for Computers as Components 2nd ed.
Branch testing Heuristic for testing branches. Exercise true and false branches of conditional. Exercise every simple condition at least once. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

455 Branch testing example
Correct: if (a || (b >= c)) { printf("OK\n"); } Incorrect: if (a && (b >= c)) { printf("OK\n"); } Test: a = F, (b >= c) = T. Example: Correct: [0 || (3 >= 2)] = T. Incorrect: [0 && (3 >= 2)] = F. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

456 Another branch testing example
Correct: if ((x == good_pointer) && (x->field1 == 3)) { printf("got the value\n"); } Incorrect: if ((x = good_pointer) && (x->field1 == 3)) { printf("got the value\n"); } Incorrect code changes pointer. Assignment returns new LHS in C. Test that catches error: (x != good_pointer) && (x->field1 == 3) © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

457 Overheads for Computers as Components 2nd ed.
Domain testing Heuristic test for linear inequalities. Test on each side + boundary of inequality. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

458 Overheads for Computers as Components 2nd ed.
Def-use pairs Variable def-use: Def when value is assigned (defined). Use when used on right-hand side. Exercise each def-use pair. Requires testing correct path. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

459 Overheads for Computers as Components 2nd ed.
Loop testing Loops need specialized tests to be tested efficiently. Heuristic testing strategy: Skip loop entirely. One loop iteration. Two loop iterations. # iterations much below max. n-1, n, n+1 iterations where n is max. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

460 Overheads for Computers as Components 2nd ed.
Black-box testing Complements clear-box testing. May require a large number of tests. Tests software in different ways. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

461 Black-box test vectors
Random tests. May weight distribution based on software specification. Regression tests. Tests of previous versions, bugs, etc. May be clear-box tests of previous versions. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

462 How much testing is enough?
Exhaustive testing is impractical. One important measure of test quality---bugs escaping into field. Good organizations can test software to give very low field bug report rates. Error injection measures test quality: Add known bugs. Run your tests. Determine % injected bugs that are caught. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

463 Program design and analysis
Software modem. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

464 Overheads for Computers as Components 2nd ed.
Theory of operation Frequency-shift keying: separate frequencies for 0 and 1. 1 time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

465 Overheads for Computers as Components 2nd ed.
FSK encoding Generate waveforms based on current bit: bit-controlled waveform generator © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

466 Overheads for Computers as Components 2nd ed.
FSK decoding zero filter detector 0 bit A/D converter one filter detector 1 bit © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

467 Overheads for Computers as Components 2nd ed.
Transmission scheme Send data in 8-bit bytes. Arbitrary spacing between bytes. Byte starts with 0 start bit. Receiver measures length of start bit to synchronize itself to remaining 8 bits. start (0) bit 1 bit 2 bit 3 bit 8 ... © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

468 Overheads for Computers as Components 2nd ed.
Requirements © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

469 Overheads for Computers as Components 2nd ed.
Specification Line-in* Receiver 1 1 input() sample-in() bit-out() Transmitter Line-out* 1 1 bit-in() sample-out() output() © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

470 Overheads for Computers as Components 2nd ed.
System architecture Interrupt handlers for samples: input and output. Transmitter. Receiver. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

471 Overheads for Computers as Components 2nd ed.
Transmitter Waveform generation by table lookup. float sine_wave[N_SAMP] = { 0.0, 0.5, 0.866, 1, 0.866, 0.5, 0.0, -0.5, -0.866, -1.0, -0.866, -0.5, 0}; time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
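Table lookup can produce both FSK tones from a single table by changing how fast the index advances. The sketch below is an assumption about how that might look, not the design from the slides: the table holds one sine period at 30-degree steps without repeating the end point, a 0 bit advances one entry per output sample, and a 1 bit advances two entries (twice the frequency). The name `generate_bit` is hypothetical.

```c
#include <stddef.h>

#define N_SAMP 12   /* one sine period at 30-degree steps, end point not repeated */

static const float sine_wave[N_SAMP] = {
    0.0f,  0.5f,  0.866f,  1.0f,  0.866f,  0.5f,
    0.0f, -0.5f, -0.866f, -1.0f, -0.866f, -0.5f
};

/* Fill out[0..n_out-1] with waveform samples for one bit.
   Assumption: bit 0 steps through the table one entry at a time,
   bit 1 steps two entries at a time. */
void generate_bit(int bit, float *out, size_t n_out)
{
    size_t step = bit ? 2 : 1;
    size_t idx = 0;

    for (size_t i = 0; i < n_out; i++) {
        out[i] = sine_wave[idx];
        idx = (idx + step) % N_SAMP;   /* wrap around the table */
    }
}
```

The lookup replaces a run-time call to `sin()` with an array read, which suits processors without floating-point trigonometry support.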

472 Overheads for Computers as Components 2nd ed.
Receiver Filters (FIR for simplicity) use circular buffers to hold data. Timer measures bit length. State machine recognizes start bits, data bits. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

473 Overheads for Computers as Components 2nd ed.
Hardware platform CPU. A/D converter. D/A converter. Timer. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

474 Component design and testing
Easy to test transmitter and receiver on host. Transmitter can be verified with speaker outputs. Receiver verification tasks: start bit recognition; data bit recognition. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

475 System integration and testing
Use loopback mode to test components against each other. Loopback in software or by connecting D/A and A/D converters. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

476 Processes and operating systems
Multiple tasks and multiple processes. Specifications of process timing. Preemptive real-time operating systems. Processes and UML. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

477 Overheads for Computers as Components 2nd ed.
Reactive systems Respond to external events. Engine controller. Seat belt monitor. Requires real-time response. System architecture. Program implementation. May require a chain reaction among multiple processors. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

478 Overheads for Computers as Components 2nd ed.
Tasks and processes A task is a functional description of a connected set of operations. (Task can also mean a collection of processes.) A process is a unique execution of a program. Several copies of a program may run simultaneously or at different times. A process has its own state: registers; memory. The operating system manages processes. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

479 Why multiple processes?
Multiple tasks means multiple processes. Processes help with timing complexity: multiple rates multimedia automotive asynchronous input user interfaces communication systems © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

480 Overheads for Computers as Components 2nd ed.
Multi-rate systems Tasks may be synchronous or asynchronous. Synchronous tasks may recur at different rates. Processes run at different rates based on computational needs of the tasks. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

481 Example: engine control
Tasks: spark control crankshaft sensing fuel/air mixture oxygen sensor Kalman filter engine controller © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

482 Typical rates in engine controllers
Variable; Full range time (ms); Update period (ms)
Engine spark timing 300 2
Throttle 40
Air flow 30 4
Battery voltage 80
Fuel flow 250 10
Recycled exhaust gas 500 25
Status switches 100 20
Air temperature Seconds 400
Barometric pressure 1000
Spark (dwell) 1
Fuel adjustment 8
Carburetor
Mode actuators
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

483 Overheads for Computers as Components 2nd ed.
Real-time systems Perform a computation to conform to external timing constraints. Deadline frequency: Periodic. Aperiodic. Deadline type: Hard: failure to meet deadline causes system failure. Soft: failure to meet deadline causes degraded response. Firm: late response is useless but some late responses can be tolerated. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

484 Timing specifications on processes
Release time: time at which process becomes ready. Deadline: time at which process must finish. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

485 Release times and deadlines
P1 P1 P1 time initiating event period period aperiodic process periodic process initiated at start of period © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

486 Rate requirements on processes
Period: interval between process activations. Rate: reciprocal of period. Initiation rate may be higher than period: several copies of process run at once. CPU 1 P11 CPU 2 P12 CPU 3 P13 CPU 4 P14 time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

487 Overheads for Computers as Components 2nd ed.
Timing violations What happens if a process doesn’t finish by its deadline? Hard deadline: system fails if missed. Soft deadline: user may notice, but system doesn’t necessarily fail. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

488 Example: Space Shuttle software error
Space Shuttle’s first launch was delayed by a software timing error: Primary control system PASS and backup system BFS. BFS failed to synchronize with PASS. Change to one routine added delay that threw off start time calculation. 1 in 67 chance of timing problem. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

489 Overheads for Computers as Components 2nd ed.
Task graphs Tasks may have data dependencies---must execute in certain order. Task graph shows data/control dependencies between processes. Task: connected set of processes. Task set: One or more tasks. P1 P2 P5 P3 P6 P4 task 1 task 2 task set © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

490 Communication between tasks
Task graph assumes that all processes in each task run at the same rate and that tasks do not communicate. In reality, some amount of inter-task communication is necessary. It’s hard to require immediate response for multi-rate communication. MPEG system layer MPEG audio MPEG video © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

491 Process execution characteristics
Process execution time Ti. Execution time in absence of preemption. Possible time units: seconds, clock cycles. Worst-case, best-case execution time may be useful in some cases. Sources of variation: Data dependencies. Memory system. CPU pipeline. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

492 Overheads for Computers as Components 2nd ed.
Utilization CPU utilization: Fraction of the CPU that is doing useful work. Often calculated assuming no scheduling overhead. Utilization: $U = \dfrac{\text{CPU time for useful work}}{\text{total available CPU time}} = \dfrac{\sum_{t_1 \le t \le t_2} T(t)}{t_2 - t_1} = \dfrac{T}{t}$ © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

493 Overheads for Computers as Components 2nd ed.
State of a process A process can be in one of three states: executing on the CPU; ready to run; waiting for data. executing gets data and CPU gets CPU preempted needs data gets data ready waiting needs data © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

494 The scheduling problem
Can we meet all deadlines? Must be able to meet deadlines in all cases. How much CPU horsepower do we need to meet our deadlines? © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

495 Scheduling feasibility
Resource constraints make schedulability analysis NP-hard. Must show that the deadlines are met for all timings of resource requests. P1 P2 I/O device © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

496 Simple processor feasibility
Assume: No resource conflicts. Constant process execution times. Require: $T \ge \sum_i T_i$. Can’t use more than 100% of the CPU. T1 T2 T3 T © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

497 Overheads for Computers as Components 2nd ed.
Hyperperiod Hyperperiod: least common multiple (LCM) of the task periods. Must look at the hyperperiod schedule to find all task interactions. Hyperperiod can be very long if task periods are not chosen carefully. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

498 Overheads for Computers as Components 2nd ed.
Hyperperiod example Long hyperperiod: P1 7 ms. P2 11 ms. P3 15 ms. LCM = 1155 ms. Shorter hyperperiod: P1 8 ms. P2 12 ms. P3 16 ms. LCM = 48 ms. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
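The hyperperiod can be computed mechanically as a running least common multiple. The following is a minimal sketch, assuming integer periods in a common unit such as milliseconds; the names `gcd_ul` and `hyperperiod` are illustrative and do not come from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Greatest common divisor by Euclid's algorithm. */
static unsigned long gcd_ul(unsigned long a, unsigned long b)
{
    while (b != 0) {
        unsigned long r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* Hyperperiod = least common multiple of all task periods.
   Periods are positive integers in one common unit (e.g. ms). */
unsigned long hyperperiod(const unsigned long *periods, size_t n)
{
    unsigned long lcm = 1;
    for (size_t i = 0; i < n; i++) {
        assert(periods[i] > 0);
        /* Divide before multiplying to keep intermediate values small. */
        lcm = lcm / gcd_ul(lcm, periods[i]) * periods[i];
    }
    return lcm;
}
```

Calling `hyperperiod` on the periods 7, 11, and 15 returns 1155, because the three values share no common factor. Periods that share factors give a much shorter hyperperiod.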

499 Simple processor feasibility example
P1 period 1 ms, CPU time 0.1 ms. P2 period 1 ms, CPU time 0.2 ms. P3 period 5 ms, CPU time 0.3 ms. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
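Worked through, the example needs 5 × 0.1 + 5 × 0.2 + 0.3 = 1.8 ms of CPU time in each 5 ms hyperperiod, a utilization of 36%. The sketch below computes the same figure as the sum of CPU time over period for each process. The struct and function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct proc {
    double period_ms;   /* interval between activations */
    double cpu_ms;      /* CPU time needed per activation */
};

/* Utilization: sum over processes of (CPU time / period). */
double utilization(const struct proc *p, size_t n)
{
    double u = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(p[i].period_ms > 0.0);
        u += p[i].cpu_ms / p[i].period_ms;
    }
    return u;
}

/* Necessary condition only: a set that needs more than 100% of the
   CPU cannot be feasible, whatever the scheduling policy. */
int fits_on_cpu(const struct proc *p, size_t n)
{
    return utilization(p, n) <= 1.0;
}
```

Passing this check does not prove that every deadline is met. It only rules out process sets that demand more CPU time than exists.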

500 Overheads for Computers as Components 2nd ed.
Cyclostatic/TDMA Schedule in time slots. Same process activation irrespective of workload. Time slots may be equal size or unequal. T1 T2 T3 T1 T2 T3 P P © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

501 Overheads for Computers as Components 2nd ed.
TDMA assumptions Schedule based on least common multiple (LCM) of the process periods. Trivial scheduler -> very small scheduling overhead. P1 P1 P1 P2 P2 PLCM © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

502 Overheads for Computers as Components 2nd ed.
TDMA schedulability Always same CPU utilization (assuming constant process execution times). Can’t handle unexpected loads. Must schedule a time slot for aperiodic events. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

503 TDMA schedulability example
TDMA period = 10 ms. P1 CPU time 1 ms. P2 CPU time 3 ms. P3 CPU time 2 ms. P4 CPU time 2 ms. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
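In this example the four slots total 1 + 3 + 2 + 2 = 8 ms, so they fit in the 10 ms TDMA period with 2 ms idle. A minimal sketch of that check follows, assuming slots are laid out back to back in index order; `tdma_layout` is a hypothetical helper, not part of the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Lay out fixed TDMA slots back to back inside one period.
   slot_len[i] is the CPU time reserved for process i; start[i]
   receives the offset of its slot from the start of the period.
   Returns the idle time left in the period, or -1 if the slots
   do not fit. All times share one unit (e.g. ms). */
int tdma_layout(const int *slot_len, int *start, size_t n, int period)
{
    int t = 0;
    for (size_t i = 0; i < n; i++) {
        assert(slot_len[i] >= 0);
        start[i] = t;
        t += slot_len[i];
    }
    return (t <= period) ? period - t : -1;
}
```

Because the slot table is fixed ahead of time, the run-time scheduler only has to step through `start[]` on timer events, which is why the scheduling overhead is so small.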

504 Overheads for Computers as Components 2nd ed.
Round-robin Schedule process only if ready. Always test processes in the same order. Variations: Constant system period. Start round-robin again after finishing a round. T1 T2 T3 T2 T3 P P © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

505 Round-robin assumptions
Schedule based on least common multiple (LCM) of the process periods. Best done with equal time slots for processes. Simple scheduler -> low scheduling overhead. Can be implemented in hardware. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

506 Round-robin schedulability
Can bound maximum CPU load. May leave unused CPU cycles. Can be adapted to handle unexpected load. Use time slots at end of period. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

507 Schedulability and overhead
The scheduling process consumes CPU time. Not all CPU time is available for processes. Scheduling overhead must be taken into account for exact schedule. May be ignored if it is a small fraction of total execution time. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

508 Running periodic processes
Need code to control execution of processes. Simplest implementation: process = subroutine. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

509 while loop implementation
Simplest implementation has one loop. No control over execution timing. while (TRUE) { p1(); p2(); } © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

510 Timed loop implementation
Encapsulate the set of all processes in a single function that implements the task set. Use timer to control execution of the task. No control over timing of individual processes. void pall(){ p1(); p2(); } © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

511 Multiple timers implementation
Each task has its own function. Each task has its own timer. May not have enough timers to implement all the rates. void pA(){ /* rate A */ p1(); p3(); } void pB(){ /* rate B */ p2(); p4(); p5(); } © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

512 Timer + counter implementation
Use a software count to divide the timer. Only works for clean multiples of the timer period. int p2count = 0; void pall(){ p1(); /* p2 runs on every third call */ if (p2count >= 2) { p2(); p2count = 0; } else p2count++; p3(); } © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
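The single hand-written counter generalizes to a table with one divider per process. This is a sketch under the assumption that `timer_tick` is called from one periodic timer handler. The struct, the function name, and the counting stubs `p1` and `p2` are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* One entry per process: run fn once every `divider` timer ticks. */
struct divided_proc {
    void (*fn)(void);
    unsigned divider;   /* must be >= 1 */
    unsigned count;     /* ticks since the process last ran */
};

/* Call once from the periodic timer handler. */
void timer_tick(struct divided_proc *tab, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(tab[i].divider >= 1);
        if (++tab[i].count >= tab[i].divider) {
            tab[i].count = 0;
            tab[i].fn();
        }
    }
}

/* Hypothetical processes that only count their own activations. */
static unsigned p1_runs, p2_runs;
static void p1(void) { p1_runs++; }
static void p2(void) { p2_runs++; }
```

With the table `{ {p1, 1, 0}, {p2, 3, 0} }`, `p1` runs on every tick and `p2` on every third tick. The limitation from the slide still applies: only whole multiples of the timer period can be expressed.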

513 Implementing processes
All of these implementations are inadequate. Need better control over timing. Need a better mechanism than subroutines. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

514 Processes and operating systems
© 2000 Morgan Kaufman Overheads for Computers as Components

515 Overheads for Computers as Components
Operating systems The operating system controls resources: who gets the CPU; when I/O takes place; how much memory is allocated. The most important resource is the CPU itself. CPU access controlled by the scheduler. © 2000 Morgan Kaufman Overheads for Computers as Components

516 Overheads for Computers as Components
Process state A process can be in one of three states: executing on the CPU; ready to run; waiting for data. executing gets data and CPU gets CPU preempted needs data gets data ready waiting needs data © 2000 Morgan Kaufman Overheads for Computers as Components

517 Operating system structure
OS needs to keep track of: process priorities; scheduling state; process activation record. Processes may be created: statically before system starts; dynamically during execution. © 2000 Morgan Kaufman Overheads for Computers as Components

518 Embedded vs. general-purpose scheduling
Workstations try to avoid starving processes of CPU access. Fairness = access to CPU. Embedded systems must meet deadlines. Low-priority processes may not run for a long time. © 2000 Morgan Kaufman Overheads for Computers as Components

519 Priority-driven scheduling
Each process has a priority. CPU goes to highest-priority process that is ready. Priorities determine scheduling policy: fixed priority; time-varying priorities. © 2000 Morgan Kaufman Overheads for Computers as Components

520 Priority-driven scheduling example
Rules: each process has a fixed priority (1 highest); highest-priority ready process gets CPU; process continues until done or until preempted by a higher-priority process. Processes P1: priority 1, execution time 10 P2: priority 2, execution time 30 P3: priority 3, execution time 20 © 2000 Morgan Kaufman Overheads for Computers as Components

521 Priority-driven scheduling example
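P2 ready at t=0. P1 ready at t=15. P3 ready at t=18. Resulting schedule: P2 runs from 0 to 15, P1 runs from 15 to 25 (preempting P2), P2 resumes from 25 to 40, P3 runs from 40 to 60. © 2000 Morgan Kaufman Overheads for Computers as Components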
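The timeline for this example can be reproduced with a small simulation. The sketch below assumes preemptive fixed-priority scheduling in unit time steps, which matches the P2, P1, P2, P3 ordering in the figure. The struct and function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

struct pproc {
    int priority;     /* 1 is highest */
    int ready_at;     /* time at which the process becomes ready */
    int remaining;    /* execution time still needed (> 0 at start) */
    int finished_at;  /* set by the simulation */
};

/* Simulate preemptive fixed-priority scheduling in unit time steps.
   In each step the highest-priority ready, unfinished process runs.
   Returns the time at which the last process finishes. */
int simulate_priority(struct pproc *p, size_t n)
{
    size_t left = n;
    int t = 0;
    for (size_t i = 0; i < n; i++)
        assert(p[i].remaining > 0 && p[i].ready_at >= 0);
    while (left > 0) {
        struct pproc *run = NULL;
        for (size_t i = 0; i < n; i++) {
            if (p[i].remaining > 0 && p[i].ready_at <= t &&
                (run == NULL || p[i].priority < run->priority))
                run = &p[i];
        }
        t++;   /* one time unit passes; the CPU idles if run == NULL */
        if (run != NULL && --run->remaining == 0) {
            run->finished_at = t;
            left--;
        }
    }
    return t;
}
```

For P1 (priority 1, ready at 15, execution time 10), P2 (priority 2, ready at 0, execution time 30), and P3 (priority 3, ready at 18, execution time 20), the simulation finishes P1 at 25, P2 at 40, and P3 at 60.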

522 The scheduling problem
Can we meet all deadlines? Must be able to meet deadlines in all cases. How much CPU horsepower do we need to meet our deadlines? © 2000 Morgan Kaufman Overheads for Computers as Components

523 Process initiation disciplines
Periodic process: executes on (almost) every period. Aperiodic process: executes on demand. Analyzing aperiodic process sets is harder---must consider worst-case combinations of process activations. © 2000 Morgan Kaufman Overheads for Computers as Components

524 Timing requirements on processes
Period: interval between process activations. Initiation rate: reciprocal of period. Initiation time: time at which process becomes ready. Deadline: time at which process must finish. © 2000 Morgan Kaufman Overheads for Computers as Components

525 Overheads for Computers as Components
Timing violations What happens if a process doesn’t finish by its deadline? Hard deadline: system fails if missed. Soft deadline: user may notice, but system doesn’t necessarily fail. © 2000 Morgan Kaufman Overheads for Computers as Components

526 Example: Space Shuttle software error
Space Shuttle’s first launch was delayed by a software timing error: Primary control system PASS and backup system BFS. BFS failed to synchronize with PASS. Change to one routine added delay that threw off start time calculation. 1 in 67 chance of timing problem. © 2000 Morgan Kaufman Overheads for Computers as Components

527 Interprocess communication
Interprocess communication (IPC): OS provides mechanisms so that processes can pass data. Two types of semantics: blocking: sending process waits for response; non-blocking: sending process continues. © 2000 Morgan Kaufman Overheads for Computers as Components

528 Overheads for Computers as Components
IPC styles Shared memory: processes have some memory in common; must cooperate to avoid destroying/missing messages. Message passing: processes send messages along a communication channel---no common address space. © 2000 Morgan Kaufman Overheads for Computers as Components

529 Overheads for Computers as Components
Shared memory Shared memory on a bus: memory CPU 1 CPU 2 © 2000 Morgan Kaufman Overheads for Computers as Components

530 Race condition in shared memory
Problem when two CPUs try to write the same location: CPU 1 reads flag and sees 0. CPU 2 reads flag and sees 0. CPU 1 sets flag to one and writes location. CPU 2 sets flag to one and overwrites location. © 2000 Morgan Kaufman Overheads for Computers as Components

531 Overheads for Computers as Components
Atomic test-and-set Problem can be solved with an atomic test-and-set: single bus operation reads memory location, tests it, writes it. ARM test-and-set provided by SWP: ADR r0,SEMAPHORE MOV r1,#1 GETFLAG SWP r1,r1,[r0] CMP r1,#0 BNE GETFLAG © 2000 Morgan Kaufman Overheads for Computers as Components
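The same idea can be written in portable C. This sketch uses C11's `atomic_flag_test_and_set` from `<stdatomic.h>`, which performs the read and the write as one indivisible operation. The wrapper names `try_acquire`, `acquire`, and `release` are illustrative, and the busy-wait loop is a simplification.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Flag guarding the shared location: clear = free, set = taken. */
static atomic_flag shared_flag = ATOMIC_FLAG_INIT;

/* One atomic read-modify-write: sets the flag and reports whether
   this caller was the one that changed it from clear to set. */
bool try_acquire(void)
{
    return !atomic_flag_test_and_set(&shared_flag);
}

/* Spin until the flag is obtained (busy-waits; fine for a sketch). */
void acquire(void)
{
    while (!try_acquire()) {
        /* another CPU holds the flag; retry */
    }
}

/* Clear the flag so that another caller can take it. */
void release(void)
{
    atomic_flag_clear(&shared_flag);
}
```

Because the test and the set cannot be separated, two CPUs can never both see the flag as clear, which removes the race described on the previous slide.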

532 Overheads for Computers as Components
Critical regions Critical region: section of code that cannot be interrupted by another process. Examples: writing shared memory; accessing I/O device. © 2000 Morgan Kaufman Overheads for Computers as Components

533 Overheads for Computers as Components
Semaphores Semaphore: OS primitive for controlling access to critical regions. Protocol: Get access to semaphore with P(). Perform critical region operations. Release semaphore with V(). © 2000 Morgan Kaufman Overheads for Computers as Components

534 Overheads for Computers as Components
Message passing Message passing on a network: CPU 1 CPU 2 message message message © 2000 Morgan Kaufman Overheads for Computers as Components

535 Process data dependencies
One process may not be able to start until another finishes. Data dependencies defined in a task graph. All processes in one task run at the same rate. P1 P2 P3 P4 © 2000 Morgan Kaufman Overheads for Computers as Components

536 Other operating system functions
Date/time. File system. Networking. Security. © 2000 Morgan Kaufman Overheads for Computers as Components

537 Processes and operating systems
Scheduling policies: RMS; EDF. Scheduling modeling assumptions. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

538 Overheads for Computers as Components 2nd ed.
Metrics How do we evaluate a scheduling policy: Ability to satisfy all deadlines. CPU utilization---percentage of time devoted to useful work. Scheduling overhead---time required to make scheduling decision. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

539 Rate monotonic scheduling
RMS (Liu and Layland): widely-used, analyzable scheduling policy. Analysis is known as Rate Monotonic Analysis (RMA). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

540 Overheads for Computers as Components 2nd ed.
RMA model All processes run on a single CPU. Zero context switch time. No data dependencies between processes. Process execution time is constant. Deadline is at end of period. Highest-priority ready process runs. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

541 Overheads for Computers as Components 2nd ed.
Process parameters Ti is computation time of process i; τi is period of process i. period τi Pi computation time Ti © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

542 Rate-monotonic analysis
Response time: time required to finish process. Critical instant: scheduling state that gives worst response time. Critical instant occurs when all higher-priority processes are ready to execute. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
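The worst-case response time at the critical instant can be computed with a standard fixed-point iteration. The slides do not show this procedure; it is brought in here as an illustration, under the RMA model assumptions (constant execution times, deadline at end of period, zero context switch time). Tasks are assumed to be stored in priority order, highest first, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct rm_task {
    long period;   /* also the deadline, as in the RMA model */
    long cpu;      /* constant execution time */
};

/* Worst-case response time of task k when tasks[0..k-1] all have
   higher priority and are released together with it (the critical
   instant). Iterates R = cpu_k + sum_j ceil(R / period_j) * cpu_j
   until R stops changing. Returns -1 if R exceeds the deadline. */
long response_time(const struct rm_task *tasks, size_t k)
{
    long r = tasks[k].cpu;
    for (;;) {
        long next = tasks[k].cpu;
        for (size_t j = 0; j < k; j++) {
            assert(tasks[j].period > 0);
            /* Integer ceiling of r / period_j. */
            long releases = (r + tasks[j].period - 1) / tasks[j].period;
            next += releases * tasks[j].cpu;
        }
        if (next > tasks[k].period)
            return -1;          /* deadline at end of period is missed */
        if (next == r)
            return r;           /* fixed point reached */
        r = next;
    }
}
```

For three tasks with (period, CPU time) of (4, 1), (6, 2), and (12, 3), the lowest-priority task has a worst-case response time of 10, within its period of 12.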

543 Overheads for Computers as Components 2nd ed.
Critical instant P1 P2 P3 interfering processes P1 P2 P3 critical instant P4 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

544 Overheads for Computers as Components 2nd ed.
RMS priorities Optimal (fixed) priority assignment: shortest-period process gets highest priority; priority inversely proportional to period; break ties arbitrarily. No fixed-priority scheme does better. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

545 Overheads for Computers as Components 2nd ed.
RMS example P2 period P2 P1 period P1 P1 P1 5 10 time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

546 Overheads for Computers as Components 2nd ed.
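RMS CPU utilization Utilization for n processes is $U = \sum_{i=1}^{n} T_i / \tau_i$, where $T_i$ is the computation time and $\tau_i$ the period of process i. As number of tasks approaches infinity, the maximum utilization for which RMS guarantees schedulability approaches 69% (ln 2). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.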
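The 69% figure is the limit of the Liu and Layland least upper bound on utilization. Written out, with $T_i$ the computation time and $\tau_i$ the period of process i, and n the number of processes:

```latex
U = \sum_{i=1}^{n} \frac{T_i}{\tau_i} \;\le\; n\left(2^{1/n} - 1\right),
\qquad
\lim_{n \to \infty} n\left(2^{1/n} - 1\right) = \ln 2 \approx 0.69
```

The bound is 1.0 for n = 1, about 0.83 for n = 2, and about 0.78 for n = 3. It is a sufficient condition: a process set above the bound may still be schedulable, but that has to be checked separately.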

547 RMS CPU utilization, cont’d.
RMS cannot, in general, use 100% of CPU, even with zero context switch overhead. Must keep idle cycles available to handle worst-case scenario. However, RMS guarantees all processes will always meet their deadlines as long as utilization stays within its bound. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

548 Overheads for Computers as Components 2nd ed.
RMS implementation Efficient implementation: scan processes; choose highest-priority active process. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

549 Earliest-deadline-first scheduling
EDF: dynamic priority scheduling scheme. Process closest to its deadline has highest priority. Requires recalculating process priorities at every timer interrupt. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

550 Overheads for Computers as Components 2nd ed.
EDF analysis EDF can use 100% of CPU. But EDF may fail to meet a deadline. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

551 Overheads for Computers as Components 2nd ed.
EDF implementation On each timer interrupt: compute time to deadline; choose process closest to deadline. Generally considered too expensive to use in practice. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
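The selection step on each timer interrupt amounts to a scan for the earliest deadline among ready processes. The following is a minimal sketch, assuming absolute deadlines on a shared time base; `edf_proc` and `edf_pick` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

struct edf_proc {
    int ready;       /* nonzero if the process can run now */
    long deadline;   /* absolute time by which it must finish */
};

/* Return the index of the ready process with the earliest deadline,
   or -1 if nothing is ready. Ties go to the lowest index. */
int edf_pick(const struct edf_proc *p, size_t n)
{
    int best = -1;
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (p[i].ready && (best < 0 || p[i].deadline < p[best].deadline))
            best = (int)i;
    }
    return best;
}
```

The scan is linear in the number of processes and must be repeated whenever deadlines change, which is the source of the run-time cost noted above.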

552 Fixing scheduling problems
What if your set of processes is unschedulable? Change deadlines in requirements. Reduce execution times of processes. Get a faster CPU. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

553 Overheads for Computers as Components 2nd ed.
Priority inversion Priority inversion: low-priority process keeps high-priority process from running. Improper use of system resources can cause scheduling problems: Low-priority process grabs I/O device. High-priority process needs I/O device, but can’t get it until low-priority process is done. Can cause deadlock. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

554 Solving priority inversion
Give priorities to system resources. Have process inherit the priority of a resource that it requests. Low-priority process inherits priority of device if higher. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

555 Overheads for Computers as Components 2nd ed.
Data dependencies Data dependencies allow us to improve utilization. Restrict combination of processes that can run simultaneously. P1 and P2 can’t run simultaneously. P1 P2 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

556 Context-switching time
Non-zero context switch time can push limits of a tight schedule. Hard to calculate effects---depends on order of context switches. In practice, OS context switch overhead is small (hundreds of clock cycles) relative to many common task periods (ms – ms). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

557 Processes and operating systems
Interprocess communication. Operating system performance. Power management. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

558 Interprocess communication
OS provides interprocess communication mechanisms: various efficiencies; communication power. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

559 Interprocess communication
Interprocess communication (IPC): OS provides mechanisms so that processes can pass data. Two types of semantics: blocking: sending process waits for response; non-blocking: sending process continues. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

560 Overheads for Computers as Components 2nd ed.
IPC styles Shared memory: processes have some memory in common; must cooperate to avoid destroying/missing messages. Message passing: processes send messages along a communication channel---no common address space. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

561 Overheads for Computers as Components 2nd ed.
Shared memory Shared memory on a bus: memory CPU 1 CPU 2 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

562 Race condition in shared memory
Problem when two CPUs try to write the same location: CPU 1 reads flag and sees 0. CPU 2 reads flag and sees 0. CPU 1 sets flag to one and writes location. CPU 2 sets flag to one and overwrites location. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

563 Overheads for Computers as Components 2nd ed.
Atomic test-and-set Problem can be solved with an atomic test-and-set: single bus operation reads memory location, tests it, writes it. ARM test-and-set provided by SWP: ADR r0,SEMAPHORE MOV r1,#1 GETFLAG SWP r1,r1,[r0] CMP r1,#0 BNE GETFLAG © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

564 Overheads for Computers as Components 2nd ed.
Critical regions Critical region: section of code that cannot be interrupted by another process. Examples: writing shared memory; accessing I/O device. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

565 Overheads for Computers as Components 2nd ed.
Semaphores Semaphore: OS primitive for controlling access to critical regions. Protocol: Get access to semaphore with P(). Perform critical region operations. Release semaphore with V(). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
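The P()/V() protocol can be sketched with C11 atomics. This is an illustration only: a real operating system semaphore would suspend the waiting process instead of spinning, and the type and function names here (`sem_sketch`, `sem_try_P`, `sem_P`, `sem_V`) are assumptions, not an OS API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Counting semaphore: value = number of callers that may still enter. */
typedef struct {
    atomic_int value;
} sem_sketch;

void sem_setup(sem_sketch *s, int initial)
{
    assert(initial >= 0);
    atomic_init(&s->value, initial);
}

/* Non-blocking P(): take one unit if one is available. */
bool sem_try_P(sem_sketch *s)
{
    int v = atomic_load(&s->value);
    while (v > 0) {
        /* On failure, v is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&s->value, &v, v - 1))
            return true;
    }
    return false;
}

/* P(): spin until a unit is available. */
void sem_P(sem_sketch *s)
{
    while (!sem_try_P(s)) {
        /* busy-wait; an OS would block the caller here */
    }
}

/* V(): give one unit back. */
void sem_V(sem_sketch *s)
{
    atomic_fetch_add(&s->value, 1);
}
```

A critical region is then bracketed as `sem_P(&s); /* critical region */ sem_V(&s);` with the semaphore set up to 1 for mutual exclusion.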

566 Overheads for Computers as Components 2nd ed.
Message passing Message passing on a network: CPU 1 CPU 2 message message message © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

567 Process data dependencies
One process may not be able to start until another finishes. Data dependencies defined in a task graph. All processes in one task run at the same rate. P1 P2 P3 P4 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

568 Overheads for Computers as Components 2nd ed.
Signals in UML More general than Unix signal---may carry arbitrary data: someClass <<signal>> aSig <<send>> sigbehavior() p : integer © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

569 Evaluating RTOS performance
Simplifying assumptions: Context switch costs no CPU time. We know the exact execution time of processes. WCET/BCET don’t depend on context switches. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

570 Scheduling and context switch overhead
P1: execution time 3, deadline 5. P2: execution time 3, deadline 10. With context switch overhead of 1, no feasible schedule: 2TP1 + TP2 = 2*(1+3) + (1+3) = 12 > 10. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
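The arithmetic can be checked directly: evaluating 2 × (1 + 3) + (1 + 3) gives 12 time units of demand in a window of 10. The sketch below assumes that every activation pays exactly one context switch and takes P2's execution time of 3 from the slide's own sum term. The function name is hypothetical.

```c
#include <assert.h>

/* CPU time demanded in one window when every activation pays one
   context switch before it runs. */
int demand_with_overhead(const int *exec, const int *activations,
                         int n, int overhead)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        assert(exec[i] >= 0 && activations[i] >= 0);
        total += activations[i] * (overhead + exec[i]);
    }
    return total;
}
```

With execution times {3, 3} and activation counts {2, 1}, the demand is 9 with no overhead and 12 with an overhead of 1, so the same process set moves from feasible to infeasible.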

571 Process execution time
Process execution time is not constant. Extra CPU time can be good. Extra CPU time can also be bad: Next process runs earlier, causing new preemption. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

572 Overheads for Computers as Components 2nd ed.
Processes and caches Processes can cause additional caching problems. Even if individual processes are well-behaved, processes may interfere with each other. Worst-case execution time with bad behavior is usually much worse than execution time with good cache behavior. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

573 Effects of scheduling on the cache
Schedule 1 (LRU cache): Process WCET Avg. CPU time P1 8 6 P2 4 3 P3 Schedule 2 (half of cache reserved for P1): © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

574 Overheads for Computers as Components 2nd ed.
Power optimization Power management: determining how system resources are scheduled/used to control power consumption. OS can manage for power just as it manages for time. OS reduces power by shutting down units. May have partial shutdown modes. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

575 Power management and performance
Power management and performance are often at odds. Entering power-down mode consumes energy, time. Leaving power-down mode consumes energy, time. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

576 Simple power management policies
Request-driven: power up once request is received. Adds delay to response. Predictive shutdown: try to predict how long you have before next request. May start up in advance of request in anticipation of a new request. If you predict wrong, you will incur additional delay while starting up. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

577 Probabilistic shutdown
Assume service requests are probabilistic. Optimize expected values: power consumption; response time. Simple probabilistic: shut down after time Ton, turn back on after waiting for Toff. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

578 Advanced Configuration and Power Interface
ACPI: open standard for power management services. applications device drivers OS kernel power management ACPI BIOS Hardware platform © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

579 ACPI global power states
G3: mechanical off. G2: soft off. G1: sleeping state, with sub-states S1: low wake-up latency with no loss of context; S2: low latency with loss of CPU/cache state; S3: low latency with loss of all state except memory; S4: lowest-power state with all devices off. G0: working state. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

580 Processes and operating systems
Telephone answering machine. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

581 Overheads for Computers as Components 2nd ed.
Theory of operation Compress audio using adaptive differential pulse code modulation (ADPCM). analog time ADPCM 3 2 1 -1 -2 -3 time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

582 Overheads for Computers as Components 2nd ed.
ADPCM coding Coded in a small alphabet with positive and negative values. {-3,-2,-1,1,2,3} Minimize error between predicted value and actual signal value. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

583 ADPCM compression system
quantizer inverse quantizer integrator encoder samples inverse quantizer integrator decoder © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

584 Telephone system terms
Subscriber line: line to phone. Central office: telephone switching system. Off-hook: phone active. On-hook: phone inactive. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

585 Real and simulated subscriber line
Real subscriber line: 90V RMS ringing signal; companded analog signals; lightning protection, etc. Simulated subscriber line: microphone input; speaker output; switches for ring, off-hook, etc. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

586 Overheads for Computers as Components 2nd ed.
Requirements © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

587 Overheads for Computers as Components 2nd ed.
Comments on analysis DRAM requirement influenced by DRAM price. Details of user interface protocol could be tested on a PC-based prototype. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

588 Answering machine class diagram
1 1 1 Microphone* 1 Controls Record * Outgoing- message 1 1 1 1 1 1 Line-in* * 1 * 1 Incoming- message 1 Playback Line-out* * 1 1 Lights Buttons* 1 1 Speaker* © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

589 Physical interface classes
Microphone* Line-in* Line-out* sample() sample() ring-indicator() sample() pick-up() Buttons* Lights* Speaker* record-OGM play messages num-messages sample() © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

590 Overheads for Computers as Components 2nd ed.
Message classes Message length start-adrs next-msg samples Outgoing-message Incoming-message length=30 sec msg-time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

591 Overheads for Computers as Components 2nd ed.
Operational classes Controls Record Playback operate() record-msg() playback-msg() © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

592 Overheads for Computers as Components 2nd ed.
Software components Front panel module. Speaker module. Telephone line module. Telephone input and output modules. Compression module. Decompression module. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

593 Controls activate behavior
Compute buttons, line activations Activations? Play OGM Record OGM Play ICM Erase Answer Play OGM Wait for timeout Allocate ICM Erase Record ICM © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

594 Record-msg/playback-msg behaviors
nextadrs = 0 nextadrs = 0 msg.samples[nextadrs] = sample(source) speaker.samples() = msg.samples[nextadrs]; nextadrs++ F F End(source) nextadrs=msg.length T T record-msg playback-msg © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

595 Overheads for Computers as Components 2nd ed.
Hardware platform CPU. Memory. Front panel. 2 A/Ds: subscriber line, microphone. 2 D/A: subscriber line, speaker. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

596 Component design and testing
Must test performance as well as functionality. Compression time shouldn’t dominate other tasks. Test for error conditions: memory overflow; try to delete empty message set, etc. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

597 System integration and testing
Can test partial integration on host platform; full testing requires integration on target platform. Simulate phone line for tests: it’s legal; easier to produce test conditions. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

598 Overheads for Computers as Components 2nd ed.
Multiprocessors Why multiprocessors? CPUs and accelerators. Multiprocessor performance analysis. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

599 Overheads for Computers as Components 2nd ed.
Why multiprocessors? Better cost/performance. Match each CPU to its tasks or use custom logic (smaller, cheaper). CPU cost is a non-linear function of performance. cost performance © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

600 Why multiprocessors? cont’d.
Better real-time performance. Put time-critical functions on less-loaded processing elements. Remember RMS utilization---extra CPU cycles must be reserved to meet deadlines. cost deadline w. RMS overhead deadline performance © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

601 Why multiprocessors? cont’d.
Using specialized processors or custom logic saves power. Desktop uniprocessors are not power-efficient enough for battery-powered applications. [Aus04] © 2004 IEEE Computer Society © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

602 Why multiprocessors? cont’d.
Good for processing I/O in real-time. May consume less energy. May be better at streaming data. May not be able to do all the work on even the largest single CPU. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

603 Overheads for Computers as Components 2nd ed.
Accelerated systems Use additional computational unit dedicated to some functions? Hardwired logic. Extra CPU. Hardware/software co-design: joint design of hardware and software architectures. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

604 Accelerated system architecture
accelerator request result CPU data data memory I/O © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

605 Accelerator vs. co-processor
A co-processor executes instructions. Instructions are dispatched by the CPU. An accelerator appears as a device on the bus. The accelerator is controlled by registers. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

606 Accelerator implementations
Application-specific integrated circuit. Field-programmable gate array (FPGA). Standard component. Example: graphics processor. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

607 Overheads for Computers as Components 2nd ed.
System design tasks Design a heterogeneous multiprocessor architecture. Processing element (PE): CPU, accelerator, etc. Program the system. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

608 Accelerated system design
First, determine that the system really needs to be accelerated. How much faster is the accelerator on the core function? How much data transfer overhead? Design the accelerator itself. Design CPU interface to accelerator. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

609 Accelerated system platforms
Several off-the-shelf boards are available for acceleration in PCs: FPGA-based core; PC bus interface. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

610 Accelerator/CPU interface
Accelerator registers provide control registers for CPU. Data registers can be used for small data objects. Accelerator may include special-purpose read/write logic. Especially valuable for large data transfers. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
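A minimal sketch of a register-controlled accelerator as seen from the CPU, assuming a made-up register layout. The names `accel_regs_t`, `ACCEL_START`, `ACCEL_DONE`, `accel_start`, and `accel_poll` are hypothetical; a real accelerator defines its own registers and bit assignments.

```c
#include <stdint.h>

/* Hypothetical register block; on real hardware this struct would be
   mapped at the accelerator's bus address. */
typedef struct {
    volatile uint32_t control;   /* CPU writes ACCEL_START to begin */
    volatile uint32_t status;    /* accelerator sets ACCEL_DONE when finished */
    volatile uint32_t data_in;   /* data register for a small operand */
    volatile uint32_t data_out;  /* data register for a small result */
} accel_regs_t;

#define ACCEL_START 0x1u
#define ACCEL_DONE  0x1u

/* Load a small operand into the data register, then start the accelerator. */
void accel_start(accel_regs_t *regs, uint32_t operand)
{
    regs->data_in = operand;
    regs->control = ACCEL_START;
}

/* Return 1 and copy out the result if the accelerator is done, else 0. */
int accel_poll(accel_regs_t *regs, uint32_t *result)
{
    if ((regs->status & ACCEL_DONE) == 0)
        return 0;
    *result = regs->data_out;
    return 1;
}
```

Large data objects would normally bypass the data registers and move through memory using the accelerator's own read/write logic.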

611 System integration and debugging
Try to debug the CPU/accelerator interface separately from the accelerator core. Build scaffolding to test the accelerator. Hardware/software co-simulation can be useful. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

612 Overheads for Computers as Components 2nd ed.
Caching problems Main memory provides the primary data transfer mechanism to the accelerator. Programs must ensure that caching does not invalidate main memory data. CPU reads location S. Accelerator writes location S. CPU writes location S. BAD © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

613 Overheads for Computers as Components 2nd ed.
Synchronization As with cache, main memory writes to shared memory may cause invalidation: CPU reads S. Accelerator writes S. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

614 Multiprocessor performance analysis
Effects of parallelism (and lack of it): Processes. CPU and bus. Multiple processors. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

615 Overheads for Computers as Components 2nd ed.
Accelerator speedup Critical parameter is speedup: how much faster is the system with the accelerator? Must take into account: Accelerator execution time. Data transfer time. Synchronization with the master CPU. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

616 Accelerator execution time
Total accelerator execution time: $t_{accel} = t_{in} + t_x + t_{out}$, where $t_{in}$ is the data input time, $t_x$ is the accelerated computation time, and $t_{out}$ is the data output time.

617 Overheads for Computers as Components 2nd ed.
Accelerator speedup Assume loop is executed n times. Compare accelerated system to non-accelerated system: $S = n(t_{CPU} - t_{accel}) = n[t_{CPU} - (t_{in} + t_x + t_{out})]$, where $t_{CPU}$ is the execution time on the CPU.
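The speedup formula can be evaluated directly. This sketch assumes all times are in the same unit (for example clock cycles); the function names are illustrative only. As the slide defines it, S is the total time saved over n iterations rather than a ratio, so a negative value means the accelerator makes the system slower.

```c
/* Total accelerator time: data input + accelerated computation + data output. */
long accel_time(long t_in, long t_x, long t_out)
{
    return t_in + t_x + t_out;
}

/* S = n * (t_cpu - t_accel): time saved over n loop iterations. */
long accel_speedup(long n, long t_cpu, long t_in, long t_x, long t_out)
{
    return n * (t_cpu - accel_time(t_in, t_x, t_out));
}
```

For example, with n = 10, a CPU time of 100, and accelerator times of 10, 20, and 10, the saving is 10 x (100 - 40) = 600 time units.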

618 Single- vs. multi-threaded
One critical factor is available parallelism: single-threaded/blocking: CPU waits for accelerator; multithreaded/non-blocking: CPU continues to execute along with accelerator. To multithread, CPU must have useful work to do. But software must also support multithreading. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

619 Overheads for Computers as Components 2nd ed.
Total execution time Single-threaded: Multi-threaded: P1 P1 P2 A1 P2 A1 P3 P3 P4 P4 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

620 Execution time analysis
Single-threaded: Count execution time of all component processes. Multi-threaded: Find longest path through execution. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
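A small sketch of the two analyses, assuming the process structure suggested by the preceding figure: P1 runs first, P2 and the accelerator task A1 can overlap in the multi-threaded case, and P3 then P4 follow once both have finished. That dependency structure is an assumption read off the figure labels, and the function names are illustrative.

```c
static long max_long(long a, long b)
{
    return a > b ? a : b;
}

/* Single-threaded: the CPU blocks on the accelerator, so all times add up. */
long single_threaded_time(long p1, long p2, long a1, long p3, long p4)
{
    return p1 + p2 + a1 + p3 + p4;
}

/* Multi-threaded: P2 and A1 overlap, so take the longest path through them. */
long multi_threaded_time(long p1, long p2, long a1, long p3, long p4)
{
    return p1 + max_long(p2, a1) + p3 + p4;
}
```

The gain from multithreading is bounded by the shorter of P2 and A1, since only the overlapped portion is hidden.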

621 Sources of parallelism
Overlap I/O and accelerator computation. Perform operations in batches, read in second batch of data while computing on first batch. Find other work to do on the CPU. May reschedule operations to move work after accelerator initiation. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

622 Data input/output times
Bus transactions include: flushing register/cache values to main memory; time required for CPU to set up transaction; overhead of data transfers by bus packets, handshaking, etc. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

623 Scheduling and allocation
Must: schedule operations in time; allocate computations to processing elements. Scheduling and allocation interact, but separating them helps. Alternatively allocate, then schedule. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

624 Example: scheduling and allocation
Task graph Hardware platform © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

625 Overheads for Computers as Components 2nd ed.
First design Allocate P1, P2 -> M1; P3 -> M2. M1 P1 P1C P2 P2C M2 P3 time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

626 Overheads for Computers as Components 2nd ed.
Second design Allocate P1 -> M1; P2, P3 -> M2: M1 P1 P1C M2 P2 P3 time © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

627 Example: adjusting messages to reduce delay
Task graph: Network: 3 4 execution time Transmission time = 4 allocation P1 P2 M1 M2 M3 d1 d2 P3 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

628 Overheads for Computers as Components 2nd ed.
Initial schedule M1 P1 M2 P2 M3 P3 network d1 d2 Time = 15 time 5 10 15 20 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

629 Overheads for Computers as Components 2nd ed.
New design Modify P3: reads one packet of d1, one packet of d2 computes partial result continues to next packet © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

630 Overheads for Computers as Components 2nd ed.
New schedule M1 P1 M2 P2 M3 P3 P3 P3 P3 network d1 d2 d1 d2 d1 d2 d1 d2 Time = 12 time 5 10 15 20 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

631 Buffering and performance
Buffering may sequentialize operations. Next process must wait for data to enter buffer before it can continue. Buffer policy (queue, RAM) affects available parallelism. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

632 Overheads for Computers as Components 2nd ed.
Buffers and latency Three processes separated by buffers: B1 A B2 B B3 C © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

633 Buffers and latency schedules
Schedule 1: A[0] A[1] … B[0] B[1] … C[0] C[1] … (must wait for all of A before getting any B). Schedule 2: A[0] B[0] C[0] A[1] B[1] C[1] …

634 Overheads for Computers as Components 2nd ed.
Multiprocessors Consumer electronics systems. Cell phones. CDs and DVDs. Audio players. Digital still cameras. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

635 Consumer electronics use cases
Multimedia: stored in compressed form, uncompressed on viewing. Data storage and management: keep track of your multimedia, etc. Communication: download, upload, chat. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

636 Non-functional requirements for CE
Often battery-operated, strict power budget. Very inexpensive. User interface must be capable but inexpensive.

637 Overheads for Computers as Components 2nd ed.
CE devices and hosts Many devices talk to host system. PC host does things that are hard to do on the device. Increasingly, CE devices communicate directly over the network, avoiding the host for access. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

638 Platforms and operating systems
Many CE devices use a DSP for signal processing and a RISC CPU for other tasks. I/O devices include buttons, screen, USB. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

639 Overheads for Computers as Components 2nd ed.
Flash file systems Flash is widely used for mass storage. Flash wears out on writing (up to 1 million cycles). Directory is most often written, wears out first. Flash file system has layer that moves contents to levelize wear. Hides wear leveling from API. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

640 Overheads for Computers as Components 2nd ed.
Cell phones Most popular CE device in history; most widely used computing device. 1 billion sold per year. Handset talks to cell. Cells hand off handset as it moves. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

641 Overheads for Computers as Components 2nd ed.
Cell phone platforms Today’s cell phones use analog front end, digital baseband processing. Future cell phones will perform IF processing with DSP. Baseband processing in DSP: Voice compression. Network protocol. Other processing: Multimedia functions. User interface. File system. Applications (contacts, etc.) © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

642 Overheads for Computers as Components 2nd ed.
CD/MP3 player Audio CPU memory Jog memory Analog out display Error corrector focus, tracking, sled, motor drive Servo CPU Analog in amp DAC head FE, TE, amp I2S memory © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

643 Overheads for Computers as Components 2nd ed.
CD medium Rotational speed: m/s (CLV). Track pitch: 1.6 microns. Diameter: 120 mm. Pit length: microns. Pit depth: .11 microns. Pit width: 0.5 microns. Laser wavelength: 780 nm. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

644 Overheads for Computers as Components 2nd ed.
CD mechanism Laser, lens, sled: CD focus track detectors diffraction grating sled laser track © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

645 Overheads for Computers as Components 2nd ed.
Laser focus Focus controlled by vertical position of lens. Unfocused beam causes irregular spot: Out of focus In focus Out of focus © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

646 Overheads for Computers as Components 2nd ed.
Laser pickup Detectors: A, B, C, D (main); E, F (side spot detectors). Level: A+B+C+D. Focus error: (A+C)-(B+D). Tracking error: E-F.
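The three pickup signals are simple sums and differences of the detector outputs. A minimal sketch, assuming the six detector readings are already available as integers; the `pickup_t` type and function names are illustrative.

```c
/* Detector readings: a..d are the main detectors, e and f the side spots. */
typedef struct {
    int a, b, c, d, e, f;
} pickup_t;

/* Level: A + B + C + D. */
int pickup_level(const pickup_t *p)
{
    return p->a + p->b + p->c + p->d;
}

/* Focus error: (A + C) - (B + D); zero when the spot is in focus. */
int focus_error(const pickup_t *p)
{
    return (p->a + p->c) - (p->b + p->d);
}

/* Tracking error: E - F; zero when the beam is centered on the track. */
int tracking_error(const pickup_t *p)
{
    return p->e - p->f;
}
```

The servo loops drive the lens and sled so that both error signals stay near zero.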

647 Overheads for Computers as Components 2nd ed.
Servo control Four main signals: focus 245 kHz; tracking 245 kHz; sled 800 Hz; Disc motor. Optical pickup © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

648 Overheads for Computers as Components 2nd ed.
EFM Eight-to-fourteen modulation: Fourteen-bit code guarantees a maximum distance between transitions. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

649 Overheads for Computers as Components 2nd ed.
Error correction CD capacity: 6.99 GB raw, 700 MB formatted. Reed-Solomon code: $g(x) = (x - a)(x - a^2) \cdots (x - a^{n-k-1})(x - a^{n-k})$. Produces data, erasure bits. Time to solve varies greatly depending on noise. CD interleaves Reed-Solomon blocks to reduce effects of large data gaps.

650 Control and error correction
Skips caused by physical disturbance. Wait for disturbance to subside. Retry. Read errors caused by disc/servo problems. Detect error. Choose location for retry. Fail and interpolate. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

651 Overheads for Computers as Components 2nd ed.
MPEG audio standards Layer 1: Lossless compression of subbands + optional simple masking model Layer 2: More advanced masking model. Layer 3: Additional processing for lower bit rates. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

652 Overheads for Computers as Components 2nd ed.
MPEG audio rates Input sampling rates: 32, 44.1, 48 kHz. Output bit rates: 32, 48, 64, 96, 112, 128, 192, 256, 384 kbits/sec. Output can be mono, dual-channel (bilingual, etc.), stereo.

653 Overheads for Computers as Components 2nd ed.
Other standards Dolby Digital (AC-3): Uses modified discrete cosine transform. ATRAC (MiniDisc): Uses subband + modified DCT. MPEG-2 AAC. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

654 Overheads for Computers as Components 2nd ed.
MPEG Layer 1 384 samples/block at all frequencies. Equals 8 ms at 48 kHz. Optional masking model. Driven by separate FFT for better accuracy. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

655 Overheads for Computers as Components 2nd ed.
MPEG Layer 1 data frame Bit allocation codes specify word length in each subband. Scale factors give gain for each band. header CRC bit allocation scale factors subband samples aux data © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

656 Overheads for Computers as Components 2nd ed.
MPEG Layer 1 encoder Choose Scale factor Filter bank mux requantize * 0101.. Masking model FFT © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

657 Overheads for Computers as Components 2nd ed.
MPEG Layer 1 decoder Scale factor demux inverse quantize Inverse filter bank 0101.. * * expand Step size © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

658 Overheads for Computers as Components 2nd ed.
Decoding is easier than encoding, but requires: decompression; filtering. Basic CD standard for data discs. No standards for MP3 disc file structure: player must understand Windows, Mac, Unix discs. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

659 Overheads for Computers as Components 2nd ed.
Audio players Audio players may use flash, hard disk, or CD for mass storage. Decompression requires small amount of CPU: 10% of ARM7. File system must be compatible (FAT). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

660 Overheads for Computers as Components 2nd ed.
Digital still cameras DSC must determine exposure before taking picture. After taking picture: Improve image quality. Compress. Save as file. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

661 Digital still camera architecture
DSC uses CPU for general-purpose processing, DSP for image processing. Internal memory buffers the passes on the image. Display is lower resolution than image sensor. Image must be downsampled. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

662 Overheads for Computers as Components 2nd ed.
Image capture Before taking picture: Determine exposure. Determine focus. Optimize white balance. Bayer pattern © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

663 Overheads for Computers as Components 2nd ed.
Image processing Must perform basic processing to get usable picture: Bayer->RGB interpolation. DSCs perform many functions formerly performed by photoprocessors for film: Image sharpening. Color balance. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

664 Overheads for Computers as Components 2nd ed.
File management EXIF standard gives format for digital pictures: Format of data in a file. Directory structure. EXIF file includes: Image (JPEG, etc.) Thumbnail. Metadata (camera type, date/time, etc.) © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

665 Overheads for Computers as Components 2nd ed.
Accelerators Example: video accelerator © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

666 Overheads for Computers as Components 2nd ed.
Concept Build accelerator for block motion estimation, one step in video compression. Perform two-dimensional correlation: Frame 1 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

667 Block motion estimation
MPEG divides frame into 16 x 16 macroblocks for motion estimation. Search for best match within a search range. Measure similarity with sum-of-absolute-differences (SAD): $\sum_{i,j} | M(i,j) - S(i - o_x, j - o_y) |$

668 Overheads for Computers as Components 2nd ed.
Best match Best match produces motion vector for motion block: © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

669 Overheads for Computers as Components 2nd ed.
Full search algorithm
bestx = 0; besty = 0; bestsad = MAXSAD;
for (ox = -SEARCHSIZE; ox < SEARCHSIZE; ox++) {
  for (oy = -SEARCHSIZE; oy < SEARCHSIZE; oy++) {
    int result = 0;
    for (i = 0; i < MBSIZE; i++) {
      for (j = 0; j < MBSIZE; j++) {
        /* accumulate the absolute pixel difference for offset (ox, oy) */
        result += iabs(mb[i][j] - search[i-ox+XCENTER][j-oy+YCENTER]);

670 Full search algorithm, cont’d.
      }
    }
    /* keep the offset with the smallest SAD seen so far */
    if (result <= bestsad) { bestsad = result; bestx = ox; besty = oy; }
  }
}
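A self-contained version of the full search, offered as a sketch rather than the accelerator's actual design. It departs from the slide code in stated ways: the offset loops are inclusive (-8 through 8) to match the 17 x 17 offset count used in the computational requirements, the standard `abs` replaces `iabs`, `XCENTER` and `YCENTER` are both assumed equal to `SEARCHSIZE`, and the search array is sized to exactly the rows and columns the loops touch. The `motion_vector_t` type and `SEARCHDIM` are hypothetical names.

```c
#include <limits.h>
#include <stdlib.h>

#define MBSIZE     16
#define SEARCHSIZE 8
#define SEARCHDIM  (MBSIZE + 2 * SEARCHSIZE)  /* rows/columns indexed below */
#define XCENTER    SEARCHSIZE                 /* assumed centering offsets */
#define YCENTER    SEARCHSIZE

typedef struct {
    int x, y;   /* best offset (motion vector) */
    int sad;    /* sum of absolute differences at that offset */
} motion_vector_t;

motion_vector_t full_search(unsigned char mb[MBSIZE][MBSIZE],
                            unsigned char search[SEARCHDIM][SEARCHDIM])
{
    motion_vector_t best = {0, 0, INT_MAX};

    for (int ox = -SEARCHSIZE; ox <= SEARCHSIZE; ox++) {
        for (int oy = -SEARCHSIZE; oy <= SEARCHSIZE; oy++) {
            int result = 0;
            for (int i = 0; i < MBSIZE; i++) {
                for (int j = 0; j < MBSIZE; j++) {
                    /* index range is 0 .. SEARCHDIM-1 for all offsets */
                    result += abs((int)mb[i][j] -
                                  (int)search[i - ox + XCENTER][j - oy + YCENTER]);
                }
            }
            /* <= keeps the last offset found among equal SAD values */
            if (result <= best.sad) {
                best.sad = result;
                best.x = ox;
                best.y = oy;
            }
        }
    }
    return best;
}
```

A macroblock copied out of the search area at a known offset should come back with that offset and a SAD of zero, which makes a convenient first test pattern.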

671 Computational requirements
Let MBSIZE = 16, SEARCHSIZE = 8. Search area is 8 + 8 + 1 = 17 in each dimension. Must perform: nops = (16 x 16) x (17 x 17) = 73,984 ops. CIF format has 352 x 288 pixels -> 22 x 18 macroblocks.
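The operation count follows from the macroblock size and the number of search offsets. A short sketch of the arithmetic, with illustrative function names; multiplying the two results gives the per-frame cost of an exhaustive search.

```c
/* Absolute-difference operations to search one macroblock exhaustively. */
long ops_per_macroblock(int mbsize, int searchsize)
{
    long offsets = 2L * searchsize + 1;   /* 8 + 8 + 1 = 17 per dimension */
    return (long)mbsize * mbsize * offsets * offsets;
}

/* Number of whole macroblocks in a frame. */
long macroblocks_per_frame(int width, int height, int mbsize)
{
    return (long)(width / mbsize) * (height / mbsize);
}
```

For CIF, 22 x 18 = 396 macroblocks at 73,984 operations each comes to 29,297,664 operations per frame, which is the workload that motivates an accelerator.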

672 Accelerator requirements
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

673 Accelerator data types, basic classes
Motion-vector Macroblock Search-area x, y : pos pixels[] : pixelval pixels[] : pixelval PC Motion-estimator memory[] compute-mv() © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

674 Overheads for Computers as Components 2nd ed.
Sequence diagram :PC :Motion-estimator compute-mv() Search area memory[] memory[] macroblocks memory[] © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

675 Architectural considerations
Requires large amount of memory: macroblock has 256 pixels; search area has 1,089 pixels. May need external memory (especially if buffering multiple macroblocks/search areas). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

676 Motion estimator organization
PE 0 search area network PE 1 comparator ctrl Address generator ... Motion vector macroblock network PE 15 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

677 Overheads for Computers as Components 2nd ed.
Pixel schedules M(0,0) S(0,2) © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

678 Overheads for Computers as Components 2nd ed.
System testing Testing requires a large amount of data. Use simple patterns with obvious answers for initial tests. Extract sample data from JPEG pictures for more realistic tests. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

679 Networking for Embedded Systems
Why we use networks. Network abstractions. Example networks. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

680 Overheads for Computers as Components 2nd ed.
Network elements distributed computing platform: PE PE communication link network PE PEs may be CPUs or ASICs. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

681 Networks in embedded systems
initial processing more processing PE sensor PE PE actuator © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

682 Overheads for Computers as Components 2nd ed.
Why distributed? Higher performance at lower cost. Physically distributed activities---time constants may not allow transmission to central site. Improved debugging---use one CPU in network to debug others. May buy subsystems that have embedded processors. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

683 Overheads for Computers as Components 2nd ed.
Network abstractions The International Organization for Standardization (ISO) developed the Open Systems Interconnection (OSI) model to describe networks: 7-layer model. Provides a standard way to classify network components and operations.

684 Overheads for Computers as Components 2nd ed.
OSI model application: end-use interface; presentation: data format; session: application dialog control; transport: connections; network: end-to-end service; data link: reliable data transport; physical: mechanical, electrical.

685 Overheads for Computers as Components 2nd ed.
OSI layers Physical: connectors, bit formats, etc. Data link: error detection and control across a single link (single hop). Network: end-to-end multi-hop data communication. Transport: provides connections; may optimize network resources. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

686 Overheads for Computers as Components 2nd ed.
OSI layers, cont’d. Session: services for end-user applications: data grouping, checkpointing, etc. Presentation: data formats, transformation services. Application: interface between network and end-user programs. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

687 Hardware architectures
Many different types of networks: topology; scheduling of communication; routing. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

688 Point-to-point networks
One source, one or more destinations, no data switching (serial port): PE 3 PE 1 PE 2 link 1 link 2 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

689 Overheads for Computers as Components 2nd ed.
Bus networks Common physical connection: PE 1 PE 2 PE 3 PE 4 header address data ECC packet format © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

690 Overheads for Computers as Components 2nd ed.
Bus arbitration Fixed: Same order of resolution every time. Fair: every PE has same access over long periods. Round-robin: rotate top priority among PEs. Example, with A, B, and C requesting simultaneously in two successive rounds: fixed grants A B C, then A B C; round-robin grants A B C, then B C A.
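The two-round A, B, C example can be reproduced with one function that resolves a set of simultaneous requests starting from a given top-priority PE. This is a sketch under assumed conventions: PEs are numbered from 0, requests are a bit mask, and the name `grant_order` is illustrative. Fixed arbitration always passes `top = 0`; round-robin advances `top` by one each round.

```c
/* Resolve one round of simultaneous requests.
   requests: bit i set means PE i wants the bus.
   top:      index of the PE that currently has highest priority.
   order:    receives the PEs in the order they are granted the bus.
   Returns the number of grants written to order[]. */
int grant_order(unsigned requests, int n_pes, int top, int order[])
{
    int count = 0;
    for (int k = 0; k < n_pes; k++) {
        int pe = (top + k) % n_pes;   /* scan from top priority, wrapping */
        if (requests & (1u << pe))
            order[count++] = pe;
    }
    return count;
}
```

With three PEs all requesting, `top = 0` yields the order 0, 1, 2 (A B C) and `top = 1` yields 1, 2, 0 (B C A), so no PE stays last across rounds.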

691 Overheads for Computers as Components 2nd ed.
Crossbar out4 out3 out2 out1 in1 in2 in3 in4 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

692 Crossbar characteristics
Non-blocking. Can handle arbitrary multi-cast combinations. Size proportional to $n^2$.

693 Overheads for Computers as Components 2nd ed.
Multi-stage networks Use several stages of switching elements. Often blocking. Often smaller than crossbar. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

694 Message-based programming
Transport layer provides message-based programming interface: send_msg(adrs,data1); Data must be broken into packets at source, reassembled at destination. Data-push programming: make things happen in network based on data transfers. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

695 Overheads for Computers as Components 2nd ed.
I2C bus Designed for low-cost, medium data rate applications. Characteristics: serial; multiple-master; fixed-priority arbitration. Several microcontrollers come with built-in I2C controllers. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

696 Overheads for Computers as Components 2nd ed.
I2C physical layer master 1 master 2 data line SDL clock line SCL slave 1 slave 2 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

697 Overheads for Computers as Components 2nd ed.
I2C data format SCL ... ... ... SDL ack start MSB © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

698 I2C electrical interface
Open collector interface: + SDL + SCL © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

699 Overheads for Computers as Components 2nd ed.
I2C signaling Sender pulls down bus for 0. Sender listens to bus---if it tried to send a 1 and heard a 0, someone else is simultaneously transmitting. Transmissions occur in 8-bit bytes. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

700 Overheads for Computers as Components 2nd ed.
I2C data link layer Every device has an address (7 bits in standard, 10 bits in extension). Bit 8 of address signals read or write. General call address allows broadcast. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

701 Overheads for Computers as Components 2nd ed.
I2C bus arbitration Sender listens while sending address. When sender hears a conflict, if its address is higher, it stops signaling. Low-priority senders relinquish control early enough in clock cycle to allow bit to be transmitted reliably. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
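The arbitration rule can be simulated bit by bit. The sketch below assumes 7-bit addresses sent most-significant bit first and models the open-collector line as the AND of every bit being driven; `i2c_arbitrate` is an illustrative name and the function models only the address phase.

```c
#include <stdint.h>

/* Simulate arbitration among masters that start sending at the same time.
   addrs[m] is the 7-bit address master m is sending.
   Returns the index of the master still signaling at the end, or -1 if
   n_masters is 0. Masters sending identical addresses are not separated
   here; the lowest index among them is returned. */
int i2c_arbitrate(const uint8_t addrs[], int n_masters)
{
    unsigned active = (1u << n_masters) - 1u;   /* all masters start sending */

    for (int bit = 6; bit >= 0; bit--) {
        /* Open collector: the line reads 0 if any active sender pulls down. */
        int bus = 1;
        for (int m = 0; m < n_masters; m++)
            if ((active >> m) & 1u)
                bus &= (addrs[m] >> bit) & 1;

        /* A sender that sent 1 but hears 0 stops signaling. */
        for (int m = 0; m < n_masters; m++)
            if (((active >> m) & 1u) && ((addrs[m] >> bit) & 1) != bus)
                active &= ~(1u << m);
    }

    for (int m = 0; m < n_masters; m++)
        if ((active >> m) & 1u)
            return m;
    return -1;
}
```

Because 0 dominates the bus, the numerically lowest address always survives, which is why arbitration is effectively fixed-priority.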

702 Overheads for Computers as Components 2nd ed.
I2C transmissions (S = start, P = stop) Multi-byte write: S, adrs, data, data, P. Read from slave: S, adrs, 1, data, P. Write, then read: S, adrs, data, S, adrs, 1, data, P.

703 Overheads for Computers as Components 2nd ed.
Ethernet Dominant non-telephone LAN. Versions: 10 Mb/s, 100 Mb/s, 1 Gb/s Goal: reliable communication over an unreliable medium. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

704 Overheads for Computers as Components 2nd ed.
Ethernet topology Bus-based system, several possible physical layers: A B C © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

705 Overheads for Computers as Components 2nd ed.
CSMA/CD Carrier sense multiple access with collision detection: sense collisions; exponentially back off in time; retransmit. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
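A sketch of the back-off step. The slides do not give the constants, so this uses the truncated binary exponential back-off of standard Ethernet, where the exponent is capped at 10; the names `BACKOFF_LIMIT`, `backoff_window`, and `backoff_slots` are illustrative, and `rand()` stands in for whatever random source a real controller uses.

```c
#include <stdlib.h>

#define BACKOFF_LIMIT 10u   /* exponent cap used by standard Ethernet */

/* Number of slot choices after the given number of collisions:
   2^collisions, capped at 2^BACKOFF_LIMIT. */
unsigned backoff_window(unsigned collisions)
{
    unsigned exponent = collisions < BACKOFF_LIMIT ? collisions : BACKOFF_LIMIT;
    return 1u << exponent;
}

/* Pick a random wait, in slot times, from 0 to window - 1. */
unsigned backoff_slots(unsigned collisions)
{
    return (unsigned)rand() % backoff_window(collisions);
}
```

Each further collision doubles the range of possible waits, so colliding senders spread out quickly, but the wait is random and therefore gives no hard deadline guarantee.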

706 Exponential back-off times
© 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

707 Ethernet packet format
Fields in order: preamble, start frame, source adrs, dest adrs, length, data payload, padding, CRC.

708 Overheads for Computers as Components 2nd ed.
Ethernet performance Quality-of-service tends to non-linearly decrease at high load levels. Can’t guarantee real-time deadlines. However, may provide very good service at proper load levels. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

709 Overheads for Computers as Components 2nd ed.
Fieldbus Used for industrial control and instrumentation---factories, etc. H1 standard based on MB/s twisted pair medium. High Speed Ethernet (HSE) standard based on 100 Mb/s Ethernet. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

710 Overheads for Computers as Components 2nd ed.
Networks Network-based design. Communication analysis. System performance analysis. Internet. Internet-enabled systems. Vehicles as networks. Sensor networks © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

711 Communication analysis
First, understand delay for single message. Delay for multiple messages depends on: network protocol; devices on network. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

712 Overheads for Computers as Components 2nd ed.
Message delay Assume: single message; no contention. Delay: $t_m = t_x + t_n + t_r$ = xmtr overhead + network xmit time + rcvr overhead

713 Example: I2C message delay
Network transmission time dominates. Assume 100 kbits/sec, one 8-bit byte. Number of bits in packet: $n_{packet}$ = start + address + data + stop = 1 + 8 + 8 + 1 = 18 bits. Time required to transmit: $1.8 \times 10^{-4}$ sec. 20 instructions on 8 MHz controller adds $2.5 \times 10^{-6}$ sec delay on xmtr, rcvr.
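The same numbers can be checked in code. This sketch keeps everything in integer nanoseconds and assumes one instruction per clock cycle, which is what the 20-instruction, 8 MHz figure implies; the function names are illustrative.

```c
/* Software overhead: instructions executed at one instruction per cycle. */
long long overhead_ns(long instructions, long clock_hz)
{
    return (long long)instructions * 1000000000LL / clock_hz;
}

/* Time on the wire for a packet of the given length. */
long long transmit_ns(long bits, long bits_per_sec)
{
    return (long long)bits * 1000000000LL / bits_per_sec;
}

/* Message delay = xmtr overhead + network xmit time + rcvr overhead. */
long long message_delay_ns(long long t_x, long long t_n, long long t_r)
{
    return t_x + t_n + t_r;
}
```

With 18 bits at 100 kbits/sec and 20 instructions on each side, the delay is 2,500 + 180,000 + 2,500 = 185,000 ns, so the network term accounts for about 97% of the total.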

714 Overheads for Computers as Components 2nd ed.
Multiple messages If messages can interfere with each other, analysis is more complex. Model total message delay: $t_y = t_d + t_m$ = wait time for network + message delay

715 Overheads for Computers as Components 2nd ed.
Arbitration and delay Fixed-priority arbitration introduces unbounded delay for all but highest-priority device. Unless higher-priority devices are known to have limited rates that allow lower devices to transmit. Round-robin arbitration introduces bounded delay proportional to N. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

716 Further complications
Acknowledgment time. Transmission errors. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

717 Priority inversion in networks
In many networks, a packet cannot be interrupted. Result is priority inversion: low-priority message holds up higher-priority message. Doesn’t cause deadlock, but can slow down important communications. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

718 Overheads for Computers as Components 2nd ed.
Multihop networks In multihop networks, one node receives message, then retransmits to destination (or intermediate). hop 1 hop 2 A B C Network 1 Network 2 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

719 System performance analysis
System analysis is difficult in general: multiprocessor performance analysis is hard; communication performance analysis is hard. Simple example: uncertainty in P1 finish time -> uncertainty in P2 start time. [Figure: P1 followed by P2.]

720 Overheads for Computers as Components 2nd ed.
Analysis challenges P2 and P3 can delay each other, even though they are in separate tasks. Delays in P1 propagate to P2, then P3, then to P4. P1 P2 P3 P4 © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

721 Overheads for Computers as Components 2nd ed.
Lower bounds on system Computational requirements: sum up process requirements over least-common multiple of periods, average over one period. Communication requirements: Count all transmissions in one period. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.
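One way to compute the computational lower bound, sketched under an assumed reading of the slide: total the execution time every process needs over the least-common multiple of the periods, then compare it with the length of that interval. The `process_t` type and function names are illustrative, and the bound ignores communication and scheduling overhead.

```c
typedef struct {
    long period;     /* time between activations */
    long exec_time;  /* execution time per activation */
} process_t;

static long gcd_long(long a, long b)
{
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Least-common multiple of all process periods. */
long hyperperiod(const process_t procs[], int n)
{
    long h = 1;
    for (int i = 0; i < n; i++)
        h = h / gcd_long(h, procs[i].period) * procs[i].period;
    return h;
}

/* Total execution time required over one hyperperiod. */
long demand_over_hyperperiod(const process_t procs[], int n)
{
    long h = hyperperiod(procs, n);
    long demand = 0;
    for (int i = 0; i < n; i++)
        demand += (h / procs[i].period) * procs[i].exec_time;
    return demand;
}

/* Lower bound on the number of PEs: ceiling of demand / hyperperiod. */
long min_pes(const process_t procs[], int n)
{
    long h = hyperperiod(procs, n);
    return (demand_over_hyperperiod(procs, n) + h - 1) / h;
}
```

The result is only a floor: a platform with fewer PEs cannot work, but one with exactly this many may still miss deadlines once scheduling and communication are considered.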

722 Hardware platform design
Need to choose: number and types of PEs; number and types of networks. Evaluate a platform by allocating processes, scheduling processes and communication. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

723 I/O-intensive systems
Start with I/O devices, then consider computation: inventory required devices; identify critical deadlines; choose devices that can share PEs; analyze communication times; choose PEs to go with devices.

724 Computation-intensive systems
Start with shortest-deadline tasks: Put shortest-deadline tasks on separate PEs. Check for interference on critical communications. Allocate low-priority tasks to common PEs wherever possible. Balance loads wherever possible. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

725 Overheads for Computers as Components 2nd ed.
Internet Protocol Internet Protocol (IP) is basis for Internet. Provides an internetworking standard: between two Ethernets, Ethernet and token ring, etc. Higher-level services are built on top of IP. © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

726 Overheads for Computers as Components 2nd ed.
IP in communication [Figure: nodes A and B each run the full stack (application, presentation, session, transport, network, data link, physical); the router between them implements only the network (IP), data link, and physical layers.]

727 Overheads for Computers as Components 2nd ed.
IP packet Includes: version, service type, length; time to live, protocol; source and destination address; data payload. Maximum data payload is 65,535 bytes.

728 Overheads for Computers as Components 2nd ed.
IP addresses 32 bits in early IP, 128 bits in IPv6. Typically written in form xxx.xx.xx.xx. Names (foo.baz.com) translated to IP address by domain name server (DNS). © 2008 Wayne Wolf Overheads for Computers as Components 2nd ed.

729 Internet routing
Best-effort routing: doesn’t guarantee data delivery at IP layer. Routing can vary: session to session; packet to packet.

730 Higher-level Internet services
Transmission Control Protocol (TCP) provides connection-oriented service. Quality-of-service (QoS) guaranteed services are under development.

731 The Internet service stack
[Figure: service stack. Application protocols (FTP, HTTP, SMTP, telnet, SNMP) on top; TCP and UDP (User Datagram Protocol) in the middle; IP at the bottom.]

732 Internet-enabled embedded system
Internet-enabled embedded system: any embedded system that includes an Internet interface (e.g., refrigerator). Internet appliance: embedded system designed for a particular Internet task (e.g. ). Examples: Cell phone. Laser printer.

733 Example: Javacam
Hardware platform: parallel-port camera; National Semi NS486SXF; 1.5 Mbytes memory. Uses memory-efficient Java Nanokernel.

734 Javacam architecture
[Figure labels: Web browser, QuickCam applet, HTTP, QuickCam server, QuickCam, Java VM, Java nanokernel, 486.]

735 Vehicles as networks
1/3 of cost of car/airplane is electronics/avionics. Dozens of microprocessors are used throughout the vehicle. Network applications: vehicle control; instrumentation; communication; passenger entertainment systems.

736 CAN bus
First used in 1991. Serial bus, 1 Mb/sec up to 40 m. Synchronous bus. Logic 0 dominates logic 1 on bus. Arbitrated with CSMA/AMP (Carrier Sense Multiple Access with Arbitration on Message Priority).
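Because logic 0 dominates logic 1, arbitration can be decided bit by bit while the contenders transmit their identifiers. The sketch below is a simplified simulation under stated assumptions: every contender starts at the same instant, identifiers are distinct 11-bit values sent most significant bit first, and a node drops out as soon as it sends a 1 but reads back a 0. The function name is hypothetical.

```python
def arbitrate(identifiers, width=11):
    """Return the identifier that survives bitwise bus arbitration."""
    contenders = set(identifiers)
    if not contenders:
        raise ValueError("at least one node must be transmitting")
    for bit in range(width - 1, -1, -1):  # most significant bit first
        sent = {ident: (ident >> bit) & 1 for ident in contenders}
        bus_level = min(sent.values())  # any 0 on the bus overrides every 1
        # Nodes whose bit differs from the bus level lose and stop sending.
        contenders = {ident for ident in contenders if sent[ident] == bus_level}
    (winner,) = contenders  # distinct identifiers leave exactly one node
    return winner


winner = arbitrate([0x65, 0x64, 0x7FF])
```

Under these assumptions the numerically lowest identifier always wins, which is why lower identifiers act as higher message priorities.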

737 CAN data frame
11-bit destination address. RTR bit determines read/write from/to destination. Any node can detect a bus error and interrupt the packet for retransmission.
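A short sketch of how the 11-bit address and the RTR bit can be packed into one 12-bit arbitration field. It assumes the standard CAN convention that the identifier is sent first and that RTR is 0 for a data frame (write) and 1 for a remote request (read); the function names are hypothetical.

```python
def arbitration_field(identifier: int, remote_request: bool) -> int:
    """Pack an 11-bit identifier and the RTR bit into a 12-bit field."""
    if not 0 <= identifier < (1 << 11):
        raise ValueError("identifier must fit in 11 bits")
    return (identifier << 1) | int(remote_request)


def split_arbitration_field(field: int):
    """Recover (identifier, remote_request) from a 12-bit field."""
    return field >> 1, bool(field & 1)


data_frame = arbitration_field(0x123, remote_request=False)
remote_frame = arbitration_field(0x123, remote_request=True)
```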

738 CAN controller
Controller implements physical and data link layers. No network layer needed: the bus provides end-to-end connections.

739 Other vehicle busses
FlexRay is next generation: time-triggered protocol; 10 Mb/s. Local Interconnect Network (LIN) connects devices in a small area (e.g., door). Passenger entertainment networks: Bluetooth; Media Oriented Systems Transport (MOST).

740 Avionics
Anything permanently attached to the aircraft must be certified by FAA/national agency. Traditional architecture uses separate electronics for each instrument/device. Line replaceable unit (LRU) can be physically removed and replaced. Federated architecture shares processors across a subsystem (nav/comm, etc.).

741 Sensor networks
Wireless networks, small nodes. Ad hoc networks organize themselves without a system administrator: must be able to declare membership in network, find other networks; must be able to determine routes for data; must update configuration as nodes enter/leave.

742 Node capabilities
Must be able to turn radio on/off quickly with low power overhead. Ratio of communication power to computation power is about 100x. Radios should operate at several different power levels to avoid interference with other nodes. Must buffer, route network traffic.

743 Networks
Example: elevator controller.

744 Terminology
Elevator car: holds passengers. Hoistway: elevator shaft. Car control panel: buttons in each car. Floor control panel: elevator request, etc. per floor.

745 Elevator system
[Figure: five floors served by Hoistway 1 and Hoistway 2.]

746 Theory of operation
Each floor has control panel, display. Each car has control panel: one button per floor; emergency stop. Controlled by a single controller.

747 Elevator position sensing
[Figure labels: sensor, fine, coarse.]

748 Elevator control
Elevator control has up and down. To stop, disable both. Master controller: reads elevator positions; reads requests; schedules elevators; controls movement; controls doors.
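The up/down interface can be sketched as a function that picks the two motor signals from the current and target floors. This is only an illustration: the type, the function name, and the emergency-stop parameter are assumptions, and a real controller would also use the coarse and fine position sensors.

```python
from typing import NamedTuple


class MotorSignals(NamedTuple):
    up: bool
    down: bool


def drive_toward(current_floor: int, target_floor: int,
                 emergency_stop: bool = False) -> MotorSignals:
    """Choose the up/down signals; stopping means both are disabled."""
    if emergency_stop or current_floor == target_floor:
        return MotorSignals(up=False, down=False)
    if target_floor > current_floor:
        return MotorSignals(up=True, down=False)
    return MotorSignals(up=False, down=True)


going_up = drive_toward(current_floor=2, target_floor=5)
stopped = drive_toward(current_floor=5, target_floor=5)
```

Returning both signals together makes it impossible for this function to assert up and down at the same time.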

749 Elevator system requirements

750 Elevator system class diagram
[Figure: class diagram relating Controller, Car, Floor, Master-control-panel*, Car-control-panel*, Floor-control-panel*, Coarse-sensor*, Fine-sensor*, and Motor*. Multiplicity labels shown: 1, N, F.]

751 Physical interfaces
Sensor* Car-control-panel* hit: boolean Floors[1..F]: boolean emergency-stop: boolean open-door, close-door: Coarse-sensor* Fine-sensor* Master-control-panel... Motor* Floor-control-panel* speed: {o,s,f} up, down: boolean

752 Car and Floor classes
Car Floor request-lights[1..F]: boolean current-floor: integer up-light, down-light: boolean

753 Controller class
Controller car-floor[1..H]: integer emergency-stop[1..H]: integer scan-cars() scan-floors() scan-master-panel() operate()

754 Architecture
Computation and I/O occur at: floor control panels/displays; elevator cars; system controller.

755 Panels and cab controller
Panels are straightforward: no real-time requirements. Cab controller: read buttons and send events to system controller; read sensor inputs and send to system controller.

756 System controller
Must take inputs from many sources: car controllers; floors. Must control cars to hard real-time deadlines. User interface, scheduling are soft deadlines.

757 Testing
Build an elevator simulator using an FPGA: simulate multiple elevators; simulate real-time control demands.

758 System design techniques
Design methodologies. Requirements and specification.

759 Design methodologies
Process for creating a system. Many systems are complex: large specifications; multiple designers; interface to manufacturing. Proper processes improve: quality; cost of design and manufacture.

760 Product metrics
Time-to-market: beat competitors to market; meet marketing window (back-to-school). Design cost. Manufacturing cost. Quality.

761 Mars Climate Orbiter
Lost at Mars in September 1999. Requirements problem: requirements did not specify units; Lockheed Martin used English units; JPL wanted metric. Not caught by manual inspections.

762 Design flow
Design flow: sequence of steps in a design methodology. May be partially or fully automated. Use tools to transform, verify design. Design flow is one component of methodology. Methodology also includes management organization, etc.

763 Waterfall model
Early model for software development: requirements; architecture; coding; testing; maintenance.

764 Waterfall model steps
Requirements: determine basic characteristics. Architecture: decompose into basic modules. Coding: implement and integrate. Testing: exercise and uncover bugs. Maintenance: deploy, fix bugs, upgrade.

765 Waterfall model critique
Only local feedback: may need iterations between coding and requirements, for example. Doesn’t integrate top-down and bottom-up design. Assumes hardware is given.

766 Spiral model
[Figure: spiral cycling through requirements, design, and test, with successive turns labeled system feasibility, specification, prototype, initial system, and enhanced system.]

767 Spiral model critique
Successive refinement of system. Start with mock-ups, move through simple systems to full-scale systems. Provides bottom-up feedback from previous stages. Working through stages may take too much time.

768 Successive refinement model
[Figure: two passes through specify, architect, design, build, test; the first produces the initial system, the second the refined system.]

769 Hardware/software design flow
[Figure: requirements and specification; architecture; software design and hardware design; integration; testing.]

770 Co-design methodology
Must architect hardware and software together: provide sufficient resources; avoid software bottlenecks. Can build pieces somewhat independently, but integration is major step. Also requires bottom-up feedback.

771 Hierarchical design flow
Embedded systems must be designed across multiple levels of abstraction: system architecture; hardware and software systems; hardware and software components. Often need design flows within design flows.

772 Hierarchical HW/SW flow
[Figure: three nested flows. System: spec, architecture, HW and SW, integrate, test. Hardware: spec, HW architecture, detailed design, integration, test. Software: spec, SW architecture, detailed design, integration, test.]

773 Concurrent engineering
Large projects use many people from multiple disciplines. Work on several tasks at once to reduce design time. Feedback between tasks helps improve quality, reduce number of later design problems.

774 Concurrent engineering techniques
Cross-functional teams. Concurrent product realization. Incremental information sharing. Integrated product management. Supplier involvement. Customer focus.

775 AT&T PBX concurrent engineering
Benchmark against competitors. Identify breakthrough improvements. Characterize current process. Create new process. Verify new process. Implement. Measure and improve.

776 Requirements analysis
Requirements: informal description of what customer wants. Specification: precise description of what design team should deliver. Requirements phase links customers with designers.

777 Types of requirements
Functional: input/output relationships. Non-functional: timing; power consumption; manufacturing cost; physical size; time-to-market; reliability.

778 Good requirements
Correct. Unambiguous. Complete. Verifiable: is each requirement satisfied in the final system? Consistent: requirements do not contradict each other.

779 Good requirements, cont’d.
Modifiable: can update requirements easily. Traceable: know why each requirement exists; go from source documents to requirements; go from requirement to implementation; back from implementation to requirement.

780 Setting requirements
Customer interviews. Comparison with competitors. Sales feedback. Mock-ups, prototypes. Next-bench syndrome (HP): design a product for someone like you.

781 Specifications
Capture functional and non-functional properties: verify correctness of spec; compare spec to implementation. Many specification styles: control-oriented vs. data-oriented; textual vs. graphical. UML is one specification/design language.

782 SDL
Used in telecommunications protocol design. Event-oriented state machine model. [Figure: SDL fragment for a telephone. Labels: telephone on-hook; caller goes off-hook; dial tone; caller gets dial tone.]

783 Statecharts
Ancestor of UML state diagrams. Provided composite states: OR states; AND states. Composite states reduce the size of the state transition graph.

784 Statechart OR state
[Figure: a traditional state machine with states S1, S2, S3, S4 and inputs i1, i2, shown beside the equivalent statechart in which S1, S2, and S3 are grouped into the OR state s123.]

785 Statechart AND state
[Figure: a traditional state machine with states S1-3, S1-4, S2-3, S2-4, S5 and inputs a, b, c, d, r, shown beside the equivalent statechart in which the AND state sab combines S1/S2 with S3/S4, plus S5.]

786 AND-OR tables
Alternate way of specifying complex conditions: cond1 or (cond2 and !cond3)
cond1   T   -
cond2   -   T
cond3   -   F
Columns are ORed together; entries within a column are ANDed. A dash means the condition is not tested in that column.
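An AND-OR table can be evaluated mechanically. The sketch below stores each column as a dictionary of required values, leaves out the don't-care entries, and treats the table as true when any column is fully satisfied. The data layout and the function name are illustrative choices, not part of the slides.

```python
from itertools import product

# One dict per column; don't-care entries are simply omitted.
TABLE = [
    {"cond1": True},
    {"cond2": True, "cond3": False},
]


def evaluate(table, values):
    """True if any column (OR) has all of its entries satisfied (AND)."""
    return any(
        all(values[name] == wanted for name, wanted in column.items())
        for column in table
    )


# Compare the table with the Boolean expression for all eight input combinations.
matches_expression = all(
    evaluate(TABLE, {"cond1": c1, "cond2": c2, "cond3": c3})
    == (c1 or (c2 and not c3))
    for c1, c2, c3 in product([False, True], repeat=3)
)
```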

787 TCAS II specification
TCAS II: aircraft collision avoidance system. Monitors aircraft and air traffic info. Provides audio warnings and directives to avoid collisions. Leveson et al. used the RSML language to capture the TCAS specification.

788 RSML
State description: [Figure: box labeled state1 with sections for inputs, state description, and outputs.] Transition bus for transitions between many states: [Figure: states a, b, c, d connected by a transition bus.]

789 TCAS top-level description
power-off power-on Inputs: TCAS-operational-status {operational,not-operational} fully-operational C own-aircraft other-aircraft i:[1..30] standby mode-s-ground-station i:[1..15]

790 Own-Aircraft AND state
CAS Inputs: own-alt-radio: integer standby-discrete-input: {true,false} own-alt-barometric:integer, etc. Climb-inhibit Descend-inhibit Effective-SL Alt-SL Alt-layer ... ... ... 1 1 Increase-climb-inhibit 2 2 ... ... Increase-Descend-inhibit ... ... Advisory-Status ... 7 7 Outputs: sound-aural-alarm: {true,false} aural-alarm-inhibit: {true, false} combined-control-out: enumerated, etc.

791 CRC cards
Well-known method for analyzing a system and developing an architecture. CRC: classes; responsibilities of each class; collaborators are other classes that work with a class. Team-oriented methodology.

792 CRC card format
Front: Class name; Superclasses; Subclasses; Responsibilities; Collaborators. Back: Class name; Class’s function; Attributes.

793 CRC methodology
Develop an initial list of classes. Simple description is OK. Team members should discuss their choices. Write initial responsibilities/collaborators. Helps to define the classes. Create some usage scenarios. Major uses of system and classes.

794 CRC methodology, cont’d.
Walk through scenarios. See what works and doesn’t work. Refine the classes, responsibilities, and collaborators. Add class relationships: superclass, subclass.

795 CRC cards for elevator
Real-world classes: elevator car, passenger, floor control, car control, car sensor. Architectural classes: car state, floor control reader, car control reader, car control sender, scheduler.

796 Elevator responsibilities and collaborators

797 System design techniques
Quality assurance.

798 Quality assurance
Quality judged by how well product satisfies its intended function. May be measured in different ways for different kinds of products. Quality assurance (QA) makes sure that all stages of the design process help to deliver a quality product.

799 Therac-25 radiation therapy machine (Leveson and Turner)
Six known accidents: radiation overdoses leading to death and serious injury. Radiation gun controlled by PDP-11. Four major software components: stored data; scheduler; set of tasks; interrupt services.

800 Therac-25 tasks
Treatment monitor controlled and monitored setup and delivery of treatment in eight phases. Servo task controlled radiation gun. Housekeeper task took care of status interlocks and limit checks.

801 Treatment monitor task
Treat was main monitor task. Eight subroutines. Treat rescheduled itself after every subroutine.

802 Software timing race
Timing-dependent use of mode and energy: if the keyboard handler sets the completion variable before the operator changes the mode/energy data, the Datent task will not detect the change, but the Hand task will.
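The ordering problem can be replayed deterministically. The following is a toy model, not the Therac-25 code: the class, the event names, and the mode strings are all hypothetical, and the only behavior modeled is that Datent latches the mode once the completion variable is set while Hand reads the latest value.

```python
class SharedState:
    """Hypothetical shared data used by the toy tasks below."""

    def __init__(self):
        self.mode = "xray"
        self.entry_complete = False


def replay(events):
    """Replay events; return (mode Datent acted on, mode Hand acted on)."""
    state = SharedState()
    datent_mode = None
    for event, arg in events:
        if event == "keyboard_done":
            state.entry_complete = True
        elif event == "operator_edit":
            state.mode = arg
        elif event == "datent_poll":
            # Datent latches the mode once and never looks again.
            if state.entry_complete and datent_mode is None:
                datent_mode = state.mode
    hand_mode = state.mode  # Hand always reads the latest value
    return datent_mode, hand_mode


# Edit arrives after the completion variable is set: the tasks disagree.
late_edit = replay([("keyboard_done", None), ("datent_poll", None),
                    ("operator_edit", "electron"), ("datent_poll", None)])
# Edit arrives first: both tasks see the same mode.
early_edit = replay([("operator_edit", "electron"), ("keyboard_done", None),
                     ("datent_poll", None)])
```

The two replays differ only in event order, which is what makes the outcome timing-dependent.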

803 Software timing errors
Changes to parameters made by operator may show on screen but not be sensed by Datent task. One accident caused by entering mode/energy, changing mode/energy, and returning to command line within 8 seconds. Skilled operators typed faster and were more likely to exercise the bug.

804 Leveson and Turner observations
Performed limited safety analysis: guessed at error probabilities, etc. Did not use mechanical backups to check machine operation. Used overly complex programs written in unreliable styles.

805 ISO 9000
Developed by the International Organization for Standardization (ISO). Applies to a broad range of industries. Concentrates on process. Validation based on extensive documentation of organization’s process.

806 CMU Capability Maturity Model
Five levels of organizational maturity: Initial: poorly organized process, depends on individuals. Repeatable: basic tracking mechanisms. Defined: processes documented and standardized. Managed: makes detailed measurements. Optimizing: measurements used for improvement.

807 Verification
Verification and testing are important throughout the design flow. Early bugs are more expensive to fix. [Figure: cost to fix versus time, with curves for a requirements bug and a coding bug.]

808 Verifying requirements and specification
Requirements: prototypes; prototyping languages; pre-existing systems. Specifications: usage scenarios; formal techniques.

809 Design review
Uses meetings to catch design flaws. Simple, low-cost. Proven by experiments to be effective. Use other people in the project/company to help spot design problems.

810 Design review players
Designers: present design to rest of team, make changes. Review leader: coordinates process. Review scribe: takes notes of meetings. Review audience: looks for bugs.

811 Before the design review
Design team prepares documents used to describe the design. Leader recruits audience, coordinates meetings, distributes handouts, etc. Audience members familiarize themselves with the documents before they go to the meeting.

812 Design review meeting
Leader keeps meeting moving; scribe takes notes. Designers present the design: use handouts; explain what is going on; go through details.

813 Design review audience
Look for any problems: Is the design consistent with the specification? Is the interface correct? How well is the component’s internal architecture designed? Did they use good design/coding practices? Is the testing strategy adequate?

814 Follow-up
Designers make suggested changes. Document changes. Leader checks on results of changes, may distribute to audience for further review or additional reviews.

815 Measurements
Measurements help ground our beliefs: Do our practices really work? Do they work where we think they work? Types of measurements: bugs found at different stages of design; bugs as a function of time; bugs in different types of components; how bugs are found.
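Two of these measurements, bugs per design stage and how bugs are found, can be tallied from a simple bug log. The sketch below uses made-up log entries purely for illustration; the record layout is an assumption, not a format from the slides.

```python
from collections import Counter

# Hypothetical bug log: (design stage where the bug was found, how it was found).
bug_log = [
    ("requirements", "design review"),
    ("coding", "testing"),
    ("coding", "design review"),
    ("architecture", "design review"),
    ("coding", "field report"),
]

# Bugs found at different stages of design.
bugs_by_stage = Counter(stage for stage, _ in bug_log)
# How bugs are found.
bugs_by_method = Counter(method for _, method in bug_log)

for stage, count in bugs_by_stage.most_common():
    print(f"{stage}: {count}")
```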

