CS152 Computer Architecture and Engineering Lecture 5 Cost and Design ©UCB Fall 2001
Clock Skew’s Effect on Cycle Time
[Figure: timing diagram of Clk1 and Clk2 with the clock skew marked]
The worst case scenario for cycle time consideration:
-- The input register sees CLK1
-- The output register sees CLK2
-- Cycle Time - Clock Skew >= CLK-to-Q + Longest Delay + Setup
-- Cycle Time >= CLK-to-Q + Longest Delay + Setup + Clock Skew
Let’s look at an example here. Consider the worst case scenario where the input register sees the clock signal Clock One (CLK1). Because the delay differs through different parts of the clock distribution network, the output register sees the clock signal Clock Two (CLK2). Here (points to Clock Skew) I have shown that Clock Two arrives at the output register slightly earlier than Clock One arrives at the input register. Consequently, the minimum cycle time for this circuit to work is the sum of: (a) the Clock-to-Q time of the input register, (b) the longest delay path through the combinational logic, (c) the Setup time of the output register, and (d) the purpose of this slide, the clock skew of the clock distribution network.
In your homework and lab assignments you will probably be using a relatively slow clock, so clock skew is probably not a big problem. After you graduate, you may be lucky enough to find a job working on some very high speed digital design, and then clock skew can be a major problem (clock skew is usually kept below 10% of the cycle time in very high speed systems). In those high speed designs, if you are not careful, the sum of the Clock-to-Q time, the Setup time, and the clock skew can become a major part of your cycle time. Notice that if your flip-flops have lousy Clock-to-Q and Setup times and your clock distribution is so poorly designed that clock skew is big, then even with the fastest logic gates in the world you still will not have a super fast design. You can slow down the clock to “fix” a setup violation; there is not a whole lot you can do about a hold time problem!
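The cycle-time bound on this slide can be checked with a few lines of Python. This is only a sketch: the function name and the picosecond values are made up for illustration and do not come from the lecture.

```python
def min_cycle_time(clk_to_q, longest_delay, setup, clock_skew):
    """Smallest cycle time that still meets setup at the output register.

    Implements: Cycle Time >= CLK-to-Q + Longest Delay + Setup + Clock Skew.
    All arguments use the same time unit (picoseconds here).
    """
    return clk_to_q + longest_delay + setup + clock_skew


# Hypothetical numbers in picoseconds, chosen only to illustrate the formula.
cycle = min_cycle_time(clk_to_q=500, longest_delay=6000, setup=300, clock_skew=400)
overhead = 500 + 300 + 400  # time not spent in the logic itself
print(cycle, "ps minimum cycle time")
print(f"{overhead / cycle:.0%} of the cycle is register and skew overhead")
```

With these assumed values the registers and the skew consume one sixth of the cycle, which is the point of the slide: faster gates alone do not shrink that part.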
A hold time violation because of clock skew
[Figure: timing diagram of Clk1 and Clk2 with Clk-to-Q + Delay, Hold Time, and Clock Skew marked around the combinational logic]
For no violation:
-- (CLK-to-Q + Shortest Delay Path) > Hold Time + Clock Skew
-- or (CLK-to-Q + Shortest Delay Path - Clock Skew) > Hold Time
In the real world there will be some clock skew. How will clock skew affect your hold time consideration? Once again, let’s look at the worst case scenario. As far as hold time is concerned, the worst case occurs where the input register sees the clock signal Clock Two (CLK2) and, because the delay differs through different parts of the clock distribution network, the output register sees the clock signal Clock One (CLK1). Here (points to Clock Skew) I have shown that Clock Two arrives at the input register slightly earlier than Clock One arrives at the output register. Consequently, we have to make sure that AFTER we subtract the clock skew from the sum of (a) the Clock-to-Q time of the input register and (b) the shortest delay path through the combinational logic, we STILL have a time GREATER than the hold time requirement of the output register.
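The hold-time condition can be written as a slack computation. The sketch below is illustrative only; the function names and the picosecond values are assumptions, not part of the lecture.

```python
def hold_slack(clk_to_q, shortest_delay, hold, clock_skew):
    """Margin left after the skew: (CLK-to-Q + Shortest Delay - Skew) - Hold."""
    return clk_to_q + shortest_delay - clock_skew - hold


def hold_ok(clk_to_q, shortest_delay, hold, clock_skew):
    """True when there is no hold violation (the slack is strictly positive)."""
    return hold_slack(clk_to_q, shortest_delay, hold, clock_skew) > 0


# Hypothetical values in picoseconds.
print(hold_ok(clk_to_q=500, shortest_delay=200, hold=300, clock_skew=100))  # passes
print(hold_ok(clk_to_q=500, shortest_delay=200, hold=300, clock_skew=450))  # fails
```

Note that the cycle time does not appear anywhere in the check, which is why slowing the clock cannot repair a hold violation.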
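Integrated Circuit Costs
$$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
$$\text{Dies per wafer} = \frac{\pi \times (\text{Wafer\_diam}/2)^2}{\text{Die\_Area}} - \frac{\pi \times \text{Wafer\_diam}}{\sqrt{2 \times \text{Die\_Area}}} - \text{Test dies} \approx \frac{\text{Wafer Area}}{\text{Die Area}}$$
Die yield: assume defects are randomly distributed,
$$\text{Die yield} = \text{Wafer yield} \times \left\{ 1 + \frac{\text{Defects\_per\_unit\_area} \times \text{Die\_Area}}{\alpha} \right\}^{-\alpha}$$
Die cost goes roughly with (die area)^3 or (die area)^4
Also Packaging and Test Cost: can easily exceed die cost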
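A short numeric sketch makes the steep growth of die cost with area concrete. It assumes the yield model Wafer yield x (1 + Defects_per_unit_area x Die_Area / alpha)^(-alpha); the default alpha = 3 and every number in the example are assumptions chosen for illustration, not values from the lecture.

```python
import math


def dies_per_wafer(wafer_diam, die_area, test_dies=0):
    """Wafer area over die area, minus dies lost at the edge and test dies."""
    return (math.pi * (wafer_diam / 2) ** 2 / die_area
            - math.pi * wafer_diam / math.sqrt(2 * die_area)
            - test_dies)


def die_yield(wafer_yield, defects_per_area, die_area, alpha=3.0):
    """Fraction of good dies; alpha is an assumed process parameter."""
    return wafer_yield * (1 + defects_per_area * die_area / alpha) ** (-alpha)


def die_cost(wafer_cost, wafer_diam, die_area, wafer_yield, defects_per_area,
             alpha=3.0, test_dies=0):
    """Die cost = wafer cost / (dies per wafer * die yield)."""
    good_dies = (dies_per_wafer(wafer_diam, die_area, test_dies)
                 * die_yield(wafer_yield, defects_per_area, die_area, alpha))
    return wafer_cost / good_dies


# Hypothetical wafer: 20 cm diameter, $1000, 1 defect per cm^2, perfect wafer yield.
small = die_cost(1000, 20, die_area=1.0, wafer_yield=1.0, defects_per_area=1.0)
large = die_cost(1000, 20, die_area=2.0, wafer_yield=1.0, defects_per_area=1.0)
print(f"1 cm^2 die: ${small:.2f}, 2 cm^2 die: ${large:.2f}, ratio {large / small:.1f}x")
```

Doubling the die area in this example raises the cost by much more than a factor of two, because fewer dies fit on the wafer and a smaller fraction of them work.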
The Design Process
"To Design Is To Represent"
Design activity yields description/representation of an object
-- Traditional craftsman does not distinguish between the conceptualization and the artifact
-- Separation comes about because of complexity
-- The concept is captured in one or more representation languages
-- This process IS design
Design Begins With Requirements
-- Functional Capabilities: what it will do
-- Performance Characteristics: Speed, Power, Area, Cost, . . .
One way to think about design is: To Design is to Represent. The result of your design activity is a representation of the object you are designing. Every design begins with a set of requirements. First is the functional requirement: a statement of WHAT it will do. Then there is a list of performance requirements: what speed it needs to run at, how much power it is allowed to consume, how big your design can be, how much it will cost, and so on (COST/PERFORMANCE).
Design Process (cont.)
Design Finishes As Assembly
-- Design understood in terms of components and how they have been assembled
-- Top-down decomposition of complex functions (behaviors) into more primitive functions
-- Bottom-up composition of primitive building blocks into more complex assemblies
[Figure: hierarchy with labels CPU, Datapath, Control, ALU, Regs, Shifter, Nand Gate]
Design is a "creative process," not a simple method
One of the fun parts of being a designer is that you get to be a little kid playing “lego” again, except this time you will be designing the building blocks as well as putting them together. The two approaches you will use are Top Down and Bottom Up. You use the top-down approach to decompose a complex function into primitive functions. After the primitive functions are implemented, you then need to integrate them back together to implement the original complex function. For example, when you design a CPU, you use the top-down approach to break the CPU into these primitive blocks. Once you have these blocks implemented, you put them together to form the CPU. This is pretty clean cut. In many other design problems, you cannot just apply top-down and then bottom-up once. You need to repeat the process several times, using top-down and bottom-up together, because design is a creative process, NOT a simple method.
Design Refinement
[Figure: refinement chain with increasing level of detail: Informal System Requirement, Initial Specification, Intermediate Specification, Final Architectural Description, Intermediate Specification of Implementation, Final Internal Specification, Physical Implementation]
Unless you are a real genius like Mozart who could do everything right the first time, the “creative” process means a succession of refinements. You start out with an informal system requirement and keep refining it until you have your final physical implementation. As you refine your design, you keep increasing the level of detail in the specification, for example from a high-level function (add) down to gates.
Design as Search
[Figure: search tree with labels Problem A, Strategy 1, Strategy 2, SubProb 1, SubProb2, SubProb3, and building blocks BB1, BB2, BB3, ... BBn]
Design involves educated guesses and verification
-- Given the goals, how should these be prioritized?
-- Given alternative design pieces, which should be selected?
-- Given design space of components & assemblies, which part will yield the best solution?
Feasible (good) choices vs. Optimal choices
One way to think about the design process is that it is a search for the proper solution through the design space (point to the diagram). How do you know where to find the proper solution? Usually you don’t. What you need to do is make educated guesses and then verify whether your guesses are correct. If you are correct, you congratulate yourself. If you are wrong, you try again. You will have a set of design goals: some are given to you by your supervisors and some you may set for yourself. In any case, some of these goals may conflict, so you must learn how to prioritize them.
The thing to remember about design is that there are many ways to do the same thing. There is really no such thing as the absolute “right way.” Ideally we would always pick the best solution, but remember that your goal is the best solution for the ORIGINAL problem (Problem A). For the sub-problems down here, you may not need to make the optimal choice each time. Sometimes all you need is a reasonably good choice. Making the best choice at every level takes design time, and running out of design time can jeopardize a project in a fast-moving technology. If you have a choice that is good enough for a sub-problem, be happy with it and move on to other sub-problems that require your attention. Remember, even the world’s fastest ALU will not do you any good unless you have an equally fast controller to control it.
Problem: Design a “fast” ALU for the MIPS ISA Requirements? Must support the Arithmetic / Logic operations Tradeoffs of cost and speed based on frequency of occurrence, hardware budget
MIPS ALU requirements
-- Add, AddU, Sub, SubU, AddI, AddIU => 2’s complement adder/sub with overflow detection
-- And, Or, AndI, OrI, Xor, Xori, Nor => Logical AND, logical OR, XOR, nor
-- SLTI, SLTIU (set less than) => 2’s complement adder with inverter, check sign bit of result
-- ALU from CS 150 / P&H book chapter 4 supports these ops
MIPS arithmetic instruction format
Bit positions marked on the slide: 31, 25, 20, 15, 5
R-type: op | Rs | Rt | Rd | funct
I-Type: op | Rs | Rt | Immed 16

Type    op   funct
ADDI    10   xx
ADDIU   11   xx
SLTI    12   xx
SLTIU   13   xx
ANDI    14   xx
ORI     15   xx
XORI    16   xx
LUI     17   xx

Type    op   funct
ADD     00   40
ADDU    00   41
SUB     00   42
SUBU    00   43
AND     00   44
OR      00   45
XOR     00   46
NOR     00   47

Type    op   funct
        00   50
        00   51
SLT     00   52
SLTU    00   53

Signed arith generate overflow, no carry
Design Trick: divide & conquer
Break the problem into simpler problems, solve them and glue together the solution
Example: assume the immediates have been taken care of before the ALU
-- 10 operations (4 bits)
   00 add
   01 addU
   02 sub
   03 subU
   04 and
   05 or
   06 xor
   07 nor
   12 slt
   13 sltU
Refined Requirements
(1) Functional Specification
-- inputs: 2 x 32-bit operands A, B, 4-bit mode
-- outputs: 32-bit result S, 1-bit carry, 1-bit overflow
-- operations: add, addu, sub, subu, and, or, xor, nor, slt, sltU
(2) Block Diagram (powerview symbol, VHDL entity)
[Figure: ALU block with 32-bit inputs A and B, 4-bit input m, 32-bit output S, and outputs c and ovf]
Behavioral Representation: VHDL
Entity ALU is
  generic (c_delay: time := 20 ns;
           S_delay: time := 20 ns);
  port ( signal A, B: in vlbit_vector (0 to 31);
         signal m: in vlbit_vector (0 to 3);
         signal S: out vlbit_vector (0 to 31);
         signal c: out vlbit;
         signal ovf: out vlbit);
end ALU;
-- c_delay is the carry delay
-- S_delay is the delay for the sum (S)
. . .
S <= A + B;
Some signals are bit vectors (A, B, S, m), some are single bits (c, ovf)
Design Decisions
[Figure: decision tree with labels ALU, bit slice, 7-to-2 C/L, 7 3-to-2 C/L, PLD, Gates, CL0 ... CL6, mux]
-- Simple bit-slice
   big combinational problem
   many little combinational problems
   partition into 2-step problem
-- Bit slice with carry look-ahead
-- . . .
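Refined Diagram: bit-slice ALU
[Figure: 32-bit inputs A and B split into bits a0/b0 through a31/b31 feeding slices ALU0 through ALU31; each slice has inputs m and cin and outputs co and one result bit (s0 through s31); the 4-bit M input drives every slice; outputs are Ovflw and the 32-bit S]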
7-to-2 Combinational Logic
start turning the crank . . .

Row   Function   Inputs: M0 M1 M2 M3 A B Cin   Outputs: S Cout   K-Map
0     add                0  0  0  0  0 0 0              0 0
...
127

Just fill in all combinations that you want
Seven plus a MUX ?
Design trick 2: take pieces you know (or can imagine) and try to put them together
Design trick 3: solve part of the problem and extend
[Figure: 1-bit ALU. Inputs A and B feed the and, or, and add (1-bit Full Adder with CarryIn and CarryOut) paths; a Mux controlled by S-select picks the Result]
Now that I have shown you how to build a 1-bit full adder, we have all the major components needed for this 1-bit ALU. To build a 4-bit ALU, we simply connect four 1-bit ALUs in series, feeding the CarryOut of one ALU to the CarryIn of the next. Even though I called this an ALU, I actually lied a little. Something is missing: this ALU can NOT perform the subtract operation. Let’s see how we can fix this problem.
Additional operations
A - B = A + (-B) = A + (NOT B) + 1
-- form the two’s complement by inverting and adding one
[Figure: 1-bit ALU as before, with an invert control on the B input; labels S-select, invert, CarryIn, and, or, add, 1-bit Full Adder, Mux, Result, CarryOut]
Set-less-than? Left as an exercise
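The invert-and-add-one trick can be simulated bit by bit. The following Python sketch models a 1-bit slice with an invert control and chains four of them; the function names and the string-valued select input are conveniences invented here, not the lecture's signal encoding.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder: returns (sum, carry_out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return total, carry_out


def alu_slice(a, b, carry_in, invert, select):
    """One bit of the ALU: optionally invert B, then mux and / or / add."""
    b = b ^ invert                      # invert = 1 flips the B input
    total, carry_out = full_adder(a, b, carry_in)
    result = {"and": a & b, "or": a | b, "add": total}[select]
    return result, carry_out


def ripple_alu(a, b, op, n=4):
    """n-bit ALU built from slices; op is 'add', 'sub', 'and' or 'or'."""
    invert = 1 if op == "sub" else 0
    select = "add" if op in ("add", "sub") else op
    carry = invert                      # CarryIn of bit 0 supplies the "+ 1"
    result = 0
    for i in range(n):
        bit, carry = alu_slice((a >> i) & 1, (b >> i) & 1, carry, invert, select)
        result |= bit << i
    return result


print(ripple_alu(7, 3, "sub"))               # 4
print(format(ripple_alu(3, 5, "sub"), "04b"))  # 1110, i.e. -2 in 4-bit 2's complement
```

Subtraction costs only an inverter per bit plus a 1 on the first carry input, which is why the adder hardware can be shared between add and sub.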
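Revised Diagram
LSB and MSB need to do a little extra
[Figure: bit-slice ALU with 32-bit inputs A and B, slices for bits a0/b0 through a31/b31 connected through cin and co, result bits s0 through s31, a 4-bit M input feeding C/L that produces select, comp, c-in, and outputs Ovflw (source marked "?") and the 32-bit S]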
Overflow
Examples: 7 + 3 = 10 but ... -4 - 5 = -9 but ...

Decimal  Binary     Decimal  2’s Complement
0        0000       0        0000
1        0001       -1       1111
2        0010       -2       1110
3        0011       -3       1101
4        0100       -4       1100
5        0101       -5       1011
6        0110       -6       1010
7        0111       -7       1001
                    -8       1000

     0 1 1 1    7          1 1 0 0   -4
  +  0 0 1 1    3       +  1 0 1 1   -5
     1 0 1 0   -6          0 1 1 1    7

Well, so far so good, but life is not always perfect. Consider the case 7 plus 3: you should get 10. But if you perform the binary arithmetic on our 4-bit adder you get 1010, which is negative 6. Similarly, if you add negative 4 and negative 5 together, you should get negative 9. But the binary arithmetic gives you 0111, which is 7. So what went wrong? The problem is overflow. The numbers you get are simply too big, in the positive 10 case, and too small, in the negative 9 case, to be represented by four bits.
Overflow Detection
Overflow: the result is too large (or too small) to represent properly
-- Example: -8 <= 4-bit binary number <= 7
When adding operands with different signs, overflow cannot occur!
Overflow occurs when adding:
-- 2 positive numbers and the sum is negative
-- 2 negative numbers and the sum is positive
On your own: Prove you can detect overflow by: Carry into MSB != Carry out of MSB
[Figure: the same two worked examples, 7 + 3 and -4 - 5, with their carries]
Recall from some earlier slides that the biggest positive number you can represent using 4 bits is 7 and the smallest negative number you can represent is negative 8. So any time your addition results in a number bigger than 7 or less than negative 8, you have an overflow. Keep in mind that whenever you add two numbers that have different signs, that is, a negative number and a positive number, overflow can NOT occur. Overflow occurs when you add two positive numbers and the sum has a negative sign, or when you add two negative numbers and the sum has a positive sign. If you spend some time, you can convince yourself that if the carry into the most significant bit is NOT the same as the carry coming out of the MSB, you have an overflow.
Overflow Detection Logic
Carry into MSB != Carry out of MSB
-- For an N-bit ALU: Overflow = CarryIn[N - 1] XOR CarryOut[N - 1]
[Figure: four chained 1-bit ALUs (A0/B0 through A3/B3, Result0 through Result3, CarryIn0 through CarryIn3, CarryOut0 through CarryOut3); an XOR gate on CarryIn3 and CarryOut3 produces Overflow. Inset: XOR truth table with columns X, Y, X XOR Y]
Recall that the XOR gate implements the not-equal function: its output is 1 only if the inputs have different values. Therefore all we need to do is connect the carry into the most significant bit and the carry out of the most significant bit to an XOR gate. The output of the XOR gate then gives us the Overflow signal.
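The XOR rule is easy to check exhaustively for a 4-bit adder. The Python sketch below is an illustration only (the function names are invented here): it ripples the carries by hand and reports Overflow = CarryIn[N - 1] XOR CarryOut[N - 1].

```python
def add_with_overflow(a, b, n=4):
    """Ripple-carry add of two n-bit patterns; returns (result, overflow)."""
    carry = 0
    result = 0
    for i in range(n):
        carry_in = carry                       # carry into bit i
        ai, bi = (a >> i) & 1, (b >> i) & 1
        result |= (ai ^ bi ^ carry_in) << i
        carry = (ai & bi) | (ai & carry_in) | (bi & carry_in)
    # After the loop: carry_in is the carry into the MSB, carry is the carry out.
    return result, carry_in ^ carry


def to_signed(x, n=4):
    """Interpret an n-bit pattern as a 2's complement number."""
    return x - (1 << n) if x & (1 << (n - 1)) else x


for x, y in [(7, 3), (-4, -5), (3, -5)]:
    bits, overflow = add_with_overflow(x & 0xF, y & 0xF)
    print(f"{x} + {y}: {bits:04b} = {to_signed(bits)}, overflow={overflow}")
```

The first two calls reproduce the slide examples (1010 and 0111, both flagged), while the mixed-sign case is never flagged.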
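More Revised Diagram
LSB and MSB need to do a little extra
[Figure: bit-slice ALU with 32-bit inputs A and B, slices for bits a0/b0 through a31/b31 connected through cin and co, result bits s0 through s31, a 4-bit M input feeding C/L that produces select, comp, c-in; Ovflw is produced from signed-arith and (cin xor co) at the MSB; 32-bit output S]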
But What about Performance?
Critical Path of n-bit ripple-carry adder is n*CP
[Figure: four chained 1-bit ALUs with inputs A0/B0 through A3/B3, outputs Result0 through Result3, and carries CarryIn0/CarryOut0 through CarryIn3/CarryOut3]
Design Trick: Throw hardware at it
Carry Look Ahead (Design trick: peek)

A  B  C-out
0  0  0      “kill”
0  1  C-in   “propagate”
1  0  C-in   “propagate”
1  1  1      “generate”

G = A and B
P = A xor B
C0 = Cin
C1 = G0 + C0 P0
C2 = G1 + G0 P1 + C0 P0 P1
C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2
C4 = . . .
[Figure: four bit positions (A0/B0 through A3/B3), each producing S, G, and P, with a block-level G and P]
Names:
-- suppose G0 is 1 => carry no matter what else => generates a carry
-- suppose G0 = 0 and P0 = 1 => carry IFF C0 is a 1 => propagates a carry
Like dominoes
What about more than 4 bits?
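The look-ahead equations for C1 through C3 can be checked against ordinary addition. The sketch below is an illustration with invented names; it computes every carry directly from G, P, and C0, with no rippling, and forms each sum bit as P xor C.

```python
def cla_add4(a, b, c0=0):
    """4-bit sum using the look-ahead carry equations C1..C3 from the slide."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]      # G = A and B
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]    # P = A xor B
    c1 = g[0] | (c0 & p[0])
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = (g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2])
          | (c0 & p[0] & p[1] & p[2]))
    carries = [c0, c1, c2, c3]
    # Sum bit i is P_i xor (carry into bit i).
    return sum((p[i] ^ carries[i]) << i for i in range(4))


# Every carry depends only on the inputs, yet the result matches plain addition.
mismatches = [(a, b, c) for a in range(16) for b in range(16) for c in (0, 1)
              if cla_add4(a, b, c) != (a + b + c) & 0xF]
print("mismatches:", len(mismatches))
```

Each carry is a two-level AND/OR function of the inputs, so its delay does not grow with bit position; the price is the widening product terms, which is the "throw hardware at it" trade.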
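Cascaded Carry Look-ahead (16-bit): Abstraction
[Figure: four 4-bit Adders, each supplying a block G and P (G0, P0, ...) to a second-level look-ahead unit that computes the carries between blocks]
C1 = G0 + C0 P0
C2 = G1 + G0 P1 + C0 P0 P1
C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2
C4 = . . .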
Design Trick: Guess (or “Precompute”)
[Figure: two n-bit adders in series] CP(2n) = 2*CP(n)
[Figure: one n-bit adder for the low half, two n-bit adders for the high half (carry-in 0 and carry-in 1), and a mux selected by the low half's Cout] CP(2n) = CP(n) + CP(mux)
Carry-select adder
Use multiplexor to save time: guess both ways and then select (assumes mux is faster than adder)
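The guess-both-ways idea can be sketched in Python. This is a behavioral illustration with invented names: each n-bit adder is modeled with ordinary integer addition, and the final selection plays the role of the mux.

```python
def carry_select_add(a, b, carry_in=0, n=4):
    """Add two 2n-bit numbers with n-bit adders; returns (sum, carry_out)."""
    mask = (1 << n) - 1
    # Low half: one n-bit adder.
    low = (a & mask) + (b & mask) + carry_in
    low_sum, low_carry = low & mask, low >> n
    # High half: compute both guesses; in hardware these run in parallel
    # with the low half.
    high_if_0 = (a >> n) + (b >> n)
    high_if_1 = (a >> n) + (b >> n) + 1
    # Mux: the low half's carry out selects the correct guess.
    high = high_if_1 if low_carry else high_if_0
    return ((high & mask) << n) | low_sum, high >> n


print(carry_select_add(0x7F, 0x01))   # low half carries into the high half
print(carry_select_add(0xFF, 0xFF, 1))
```

The three n-bit additions are independent of each other, so the critical path is one n-bit adder plus the mux, at the cost of a duplicated high-half adder.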
Carry Skip Adder: reduce worst case delay
[Figure: two 4-bit Ripple Adders (inputs starting at A0/B and A4/B, outputs S); each block's propagate signals P0, P1, P2, P3 are combined to let the carry skip the block]
Just speed up the slowest case for each block
Exercise: optimal design uses variable block sizes
Additional MIPS ALU requirements
-- Mult, MultU, Div, DivU (next lecture) => Need 32-bit multiply and divide, signed and unsigned
-- Sll, Srl, Sra (next lecture) => Need left shift, right shift, right shift arithmetic by 0 to 31 bits
-- Nor (leave as exercise to reader) => logical NOR or use 2 steps: (A OR B) XOR 1111....1111
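The two-step NOR mentioned above is a one-liner to confirm. The sketch below is illustrative; the name nor32 and the 32-bit mask constant are introduced here only for the example.

```python
MASK32 = 0xFFFFFFFF  # thirty-two 1 bits


def nor32(a, b):
    """NOR in two steps: OR the operands, then XOR with all ones."""
    return ((a | b) ^ MASK32) & MASK32


print(f"{nor32(0x0000FFFF, 0x00FF0000):08x}")  # ff000000
```

XOR with all ones flips every bit, so an ALU that already has OR and XOR does not strictly need a dedicated NOR path.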
Elements of the Design Process
-- Divide and Conquer (e.g., ALU): Formulate a solution in terms of simpler components. Design each of the components (subproblems)
-- Generate and Test (e.g., ALU): Given a collection of building blocks, look for ways of putting them together that meet the requirements
-- Successive Refinement (e.g., carry lookahead): Solve "most" of the problem (i.e., ignore some constraints or special cases), examine and correct shortcomings
-- Formulate High-Level Alternatives (e.g., carry select): Articulate many strategies to "keep in mind" while pursuing any one approach
-- Work on the Things you Know How to Do: The unknown will become “obvious” as you make progress
Here are some key elements of the design process. First is divide and conquer: (a) you formulate a solution in terms of simpler components, and (b) you concentrate on designing each component. Once you have the individual components built, you need to find a way to put them together to solve the original problem. Unless you are really good or really lucky, you probably won’t have a perfect solution the first time, so you will need to apply successive refinement to your design. While you are pursuing any one approach, keep alternate strategies in mind in case what you are pursuing does not work out. One of the most important pieces of advice I can give you is to work first on the things you know how to do. As you make forward progress, a lot of the unknowns will become clear. If you sit around and wait until you know everything before you start, you will never get anything done.
Summary of the Design Process
Hierarchical Design to manage complexity
Top Down vs. Bottom Up vs. Successive Refinement
Importance of Design Representations:
-- Block Diagrams
-- Decomposition into Bit Slices
-- Truth Tables, K-Maps
-- Circuit Diagrams
-- Other Descriptions: state diagrams, timing diagrams, reg xfer, . . .
[Figure annotation: top down and bottom up; mux design meets at TT]
Optimization Criteria: Gate Count [Package Count], Area, Logic Levels, Fan-in/Fan-out, Delay, Power, Pin Out, Cost, Design time
This slide summarizes some of the key points of the design process. Using a hierarchical design style is the best way to manage complexity because it allows you to ignore low-level details while concentrating on the big picture. However, you cannot ignore the details forever. That’s why you need to use both the top-down and bottom-up strategies as you make successive refinements to your design. Some of the optimization criteria are: chip or board area, pin count if you are designing a chip, delay, power, cost, and last but not least, design time. As I pointed out in the last lecture, the computer market is so competitive that if your product is late by a year, you may fall way behind on the performance curve, so design time can be one of the most important considerations.
Why should you keep a design notebook?
-- Keep track of the design decisions and the reasons behind them
   Otherwise, it will be hard to debug and/or refine the design
   Write it down so that you can remember in a long project: 2 weeks -> 2 yrs
   Others can review notebook to see what happened
-- Record insights you have on certain aspects of the design as they come up
-- Record of the different design & debug experiments
   Memory can fail when very tired
-- Industry practice: learn from others’ mistakes
The goal of this part of the lecture is to convince EACH of you to keep your OWN design notebook. Why? First of all, you need to keep track of all the design decisions you make and, maybe more importantly, the reasons behind them. This may not be that important when your project life span is only a few weeks, but after you graduate you will work on projects that last for 2 to 3 years. If you don’t write things down, you may not remember how you did certain things and why, and you may find it very hard to debug and refine your design. Also, sometimes when you are working on one part of the design, you may suddenly get some insight into another part. You may not have time to follow up on your insight immediately, and if you don’t write it down, you may never be able to reconstruct it later when you have time. Finally, it is very important to write down everything you see in the tests or experiments you run when you are debugging your design.
Why do we keep it on-line?
-- You need to force yourself to take notes
   Open a window and leave an editor running while you work
   1) Acts as reminder to take notes
   2) Makes it easy to take notes
   1) + 2) => will actually do it
-- Take advantage of the window system’s “cut and paste” features
-- It is much easier to read your typing than your writing
-- Also, paper log books have problems
   Limited capacity => end up with many books
   May not have right book with you at time vs. networked screens
   Can use computer to search files/index files to find what you are looking for
The next question some of you may want to ask is: OK, I will keep a notebook, but why should I keep it on-line? Well, let’s be honest with ourselves. All of us need a little reminder to force ourselves to take notes while we work. One of the best reminders I have found is the window system of a modern workstation. Keeping an extra window open with an editor running makes taking notes very easy, and the editor also serves as a constant reminder to take notes. Also, by keeping your notebook on-line, you can take advantage of the window system’s cut and paste feature to drop important “print outs” into your notebook. Finally, although you may be able to read your own handwriting much better than anybody else can, it is still easier to read your own typing than your own writing.
How should you do it?
-- Keep it simple
   DON’T make it so elaborate that you won’t use it (fonts, layout, ...)
-- Separate the entries by dates
   type “date” command in another window and cut&paste
-- Start day with problems going to work on today
-- Record output of simulation into log with cut&paste; add date
   May help sort out which version of simulation did what
-- Record key email with cut&paste
-- Record of what works & doesn’t helps team decide what went wrong after you left
-- Index: write a one-line summary of what you did at end of each day
How should you keep your on-line notebook? By all means, Keep It Simple. The on-line notebook should help you track down and solve your problems. It should NOT become one of your problems. To keep the notebook easy to read, separate your entries by dates. Furthermore, before you sign off each day, write a one-line summary of what you did; this will serve as the index to your notebook. Let me show you some examples.
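For anyone who would rather script the bookkeeping than cut and paste the output of “date”, here is a minimal Python sketch of the same routine. Everything in it is hypothetical: the helper names, the separator width, and the file name notebook.log are choices made for this example, not part of the course handout.

```python
from datetime import datetime
from pathlib import Path

SEPARATOR = "=" * 68


def stamp(now=None):
    """Date and time of day, similar to the output of the date command."""
    return (now or datetime.now()).strftime("%a %b %d %H:%M:%S %Y")


def start_entry(log, goal, now=None):
    """Open a session: separator, date and time, and the goal for the session."""
    with open(log, "a") as f:
        f.write(f"+ {SEPARATOR}\n{stamp(now)}\nGoal: {goal}\n\n")


def end_entry(log, summary, now=None):
    """Close a session and add its one-line summary to the index at the front."""
    path = Path(log)
    text = path.read_text()
    cut = text.find("+ " + SEPARATOR)   # the first entry marks the end of the index
    if cut < 0:
        cut = len(text)
    index, entries = text[:cut], text[cut:]
    entries += f"{stamp(now)}\n- {SEPARATOR}\n"
    path.write_text(index + f"{stamp(now)} - {summary}\n" + entries)


if __name__ == "__main__":
    start_entry("notebook.log", "Test the comparator component")
    # ... work, pasting simulator output and key email into notebook.log ...
    end_entry("notebook.log", "Tested the comparator")
```

Notes taken during the session are simply typed or pasted into the file between the two calls; the script only handles the date stamps and the index line.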
On-line Notebook Example Refer to the handout “Example of On-Line Log Book” on cs 152 home page Spend 10 minutes on the notebook example: 6 minutes per page. +12 = 75 min. (Y:55)
1st page of On-line notebook (Index + Wed. 9/6/95)
Wed Sep 6 00:47:28 PDT 1995 - Created the 32-bit comparator component
Thu Sep 7 14:02:21 PDT 1995 - Tested the comparator
Mon Sep 11 12:01:45 PDT 1995 - Investigated bug found by Bart in comp32 and fixed it
+ ====================================================================
Wed Sep 6 00:47:28 PDT 1995
Goal: Layout the schematic for a 32-bit comparator
I've layed out the schemtatics and made a symbol for the comparator. I named it comp32. The files are
~/wv/proj1/sch/comp32.sch
~/wv/proj1/sch/comp32.sym
Wed Sep 6 02:29:22 PDT 1995
- ====================================================================
Slide annotations:
-- Add 1 line index at front of log file at end of each session: date+summary
-- Start with date, time of day + goal
-- Make comments during day, summary of work
-- End with date, time of day (and add 1 line summary at front of file)
2nd page of On-line notebook (Thursday 9/7/95)
+ ====================================================================
Thu Sep 7 14:02:21 PDT 1995
Goal: Test the comparator component
I've written a command file to test comp32. I've placed it in ~/wv/proj1/diagnostics/comp32.cmd.
I ran the command file in viewsim and it looks like the comparator is working fine. I saved the output into a log file called ~/wv/proj1/diagnostics/comp32.log
Notified the rest of the group that the comparator is done.
Thu Sep 7 16:15:32 PDT 1995
- ====================================================================
3rd page of On-line notebook (Monday 9/11/95)
+ ====================================================================
Mon Sep 11 12:01:45 PDT 1995
Goal: Investigate bug discovered in comp32 and hopefully fix it
Bart found a bug in my comparator component. He left the following e-mail.
-------------------
From bart@simpsons.residence Sun Sep 10 01:47:02 1995
Received: by wayne.manor (NX5.67e/NX3.0S) id AA00334; Sun, 10 Sep 95 01:47:01 -0800
Date: Wed, 10 Sep 95 01:47:01 -0800
From: Bart Simpson <bart@simpsons.residence>
To: bruce@wanye.manor, old_man@gokuraku, hojo@sanctuary
Subject: [cs152] bug in comp32
Status: R
Hey Bruce,
I think there's a bug in your comparator. The comparator seems to think that ffffffff and fffffff7 are equal.
Can you take a look at this?
Bart
----------------
4th page of On-line notebook (9/11/95 contd)
I verified the bug. here's a viewsim of the bug as it appeared.. (equal should be 0 instead of 1)
------------------
SIM>stepsize 10ns
SIM>v a_in A[31:0]
SIM>v b_in B[31:0]
SIM>w a_in b_in equal
SIM>a a_in ffffffff\h
SIM>a b_in fffffff7\h
SIM>sim
time = 10.0ns A_IN=FFFFFFFF\H B_IN=FFFFFFF7\H EQUAL=1
Simulation stopped at 10.0ns.
-------------------
Ah. I've discovered the bug. I mislabeled the 4th net in the comp32 schematic.
I corrected the mistake and re-checked all the other labels, just in case.
I re-ran the old diagnostic test file and tested it against the bug Bart found. It seems to be working fine. hopefully there aren’t any more bugs:)
5th page of On-line notebook (9/11/95 contd)
On second inspectation of the whole layout, I think I can remove one level of gates in the design and make it go faster. But who cares! the comparator is not in the critical path right now. the delay through the ALU is dominating the critical path. so unless the ALU gets a lot faster, we can live with a less than optimal comparator.
I e-mailed the group that the bug has been fixed
Mon Sep 11 14:03:41 PDT 1995
- ====================================================================
Slide annotation: Perhaps the critical path changes later; what was the idea to make the comparator faster? Check the log book!
Added benefit: cool post-design statistics Sample graph from the Alewife project: For the Communications and Memory Management Unit (CMMU) These statistics came from on-line record of bugs
Lecture Summary
Cost and Price
-- Die size determines chip cost: cost ≈ (die size)^(α+1)
-- Cost v. Price: business model of company, pay for engineers
-- R&D must return $8 to $14 for every $1 invested
An Overview of the Design Process
-- Design is an iterative process, multiple approaches to get started
-- Do NOT wait until you know everything before you start
-- Example: Instruction Set drives the ALU design
On-line Design Notebook
-- Open a window and keep an editor running while you work; cut&paste
-- Refer to the handout as an example
-- Former CS 152 students (and TAs) say they use an on-line notebook for programming as well as hardware design; one of the most valuable skills