+ CS 325: CS Hardware and Software Organization and Architecture Exam 2: Study Guide
+ Combinational Logic: Combinational Logic Classifications; 1-Bit Binary Half-Adder; Binary Full-Adder; 4-Bit Binary Adder; N-Bit Binary Adder; Binary Subtractor; Binary Decoder; 2-Bit ALU
+ Classification of Combinational Logic: a Combinational Logic Circuit performs Arithmetic & Logical Functions (Adders, Subtractors, Comparators); Data Transmission (Multiplexers, Demultiplexers, Encoders, Decoders); or Code Conversion (Binary, BCD, 7-segment).
+ 1-bit Binary Half-Adder Uses AND and XOR gates. “Adds” two single-bit binary numbers to produce two outputs: Sum and Carry. Truth table: A B = 0 0 → Sum 0, Carry 0; 0 1 → Sum 1, Carry 0; 1 0 → Sum 1, Carry 0; 1 1 → Sum 0, Carry 1.
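+ The half-adder's gate behavior can be sketched in Python with bitwise operators (an illustrative sketch, not course-provided code):

```python
def half_adder(a, b):
    """Add two single bits: XOR produces the Sum, AND produces the Carry."""
    return a ^ b, a & b

# The four input combinations reproduce the truth table:
# (0,0) -> (0,0), (0,1) -> (1,0), (1,0) -> (1,0), (1,1) -> (0,1)
```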
+ Binary Full Adder w/ Carry-In Uses AND, OR, and XOR gates. Basically two half-adders connected together. Three inputs: A, B, and Carry-In. Two outputs: Sum and Carry-Out. Sum = A ⊕ B ⊕ Cin; Carry-Out = 1 whenever two or more of the three inputs are 1.
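+ The "two half-adders plus an OR gate" construction can be sketched directly (a hypothetical illustration):

```python
def half_adder(a, b):
    return a ^ b, a & b

def full_adder(a, b, cin):
    """Two half-adders in series; an OR gate merges the two carry outputs."""
    s1, c1 = half_adder(a, b)      # first half-adder: A + B
    s, c2 = half_adder(s1, cin)    # second half-adder: partial sum + carry-in
    return s, c1 | c2              # carry-out if either stage carried
```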
+ 4-bit Binary Adder Simply 4 full adders cascaded together. Each full adder represents a single weighted column. Carry signals are connected, producing a “ripple” effect from the least-significant column to the most-significant column. Also called the “Ripple Carry Binary Adder”.
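+ The cascade can be sketched by chaining a full-adder function, passing each stage's carry-out to the next stage's carry-in (an illustrative sketch; bit lists are least-significant-bit first):

```python
def full_adder(a, b, cin):
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)  # majority of the three inputs
    return s, cout

def ripple_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first). Each stage must wait
    for the previous stage's carry: the 'ripple'."""
    out, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

# 5 + 3 = 8: [1,0,1,0] + [1,1,0,0] -> [0,0,0,1] with final carry 0
```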
+ N-bit Binary Adder Cascading full adders can accommodate N-bit binary numbers. Problems with the N-bit Binary Adder? The carry must ripple through all N stages before the result is valid, so the worst-case delay grows linearly with N.
+ 4-bit Binary Subtractor Since we know how to add two 4-bit binary numbers, how can we go about subtracting them? Example: A - B. Special subtraction combinational circuits? Not needed! We can convert B to its 2’s complement equivalent and still use our 4-bit binary adder. This can be achieved by using a NOT gate on each input of B. To complete the 2’s complement, we set the first carry-in to “1”, which adds 1 to the 1’s complement of B.
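+ The invert-and-add-one trick can be sketched on 4-bit integer values (an illustrative sketch of the datapath, not a gate-level model):

```python
def subtract4(a, b):
    """Compute A - B on 4-bit values as A + NOT(B) + 1, reusing the adder."""
    ones_complement = (~b) & 0xF            # NOT gate on each bit of B
    return (a + ones_complement + 1) & 0xF  # carry-in of 1 completes 2's complement
```

Results wrap modulo 16, so a negative difference appears in 2's complement form (e.g. 3 - 5 yields 14, the 4-bit encoding of -2).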
+ Binary Decoder – Simple Example The simplest example of a Binary decoder is the NOT gate, which can be viewed as a 1-to-2 Binary decoder: when A = 0, Q0 = 1 and Q1 = 0; when A = 1, Q0 = 0 and Q1 = 1.
+ 2-to-4 Binary Decoder The following is an example of a 2-input, 4-output Binary Decoder. The inputs, A and B, determine which output, Q0 to Q3, is “high” while the remaining outputs are “low”. Only one output can be active at any given time. Truth table: A B = 0 0 → Q0; 0 1 → Q1; 1 0 → Q2; 1 1 → Q3.
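+ A minimal sketch of the 2-to-4 decoder, assuming A is the high-order select bit (the slide does not state the bit ordering):

```python
def decoder_2to4(a, b):
    """Exactly one of Q0..Q3 goes high for each (A, B) combination."""
    idx = (a << 1) | b  # assumption: A is the high-order bit of the select value
    return tuple(1 if i == idx else 0 for i in range(4))
```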
+ 3-to-8 Binary Decoder Binary Decoders can also be represented by block notation: three inputs (A, B, C) select exactly one of eight outputs (Q0 – Q7).
+ Simple Arithmetic Logic Unit (ALU)
+ Sequential Circuits: Sequential Circuits Overview; Clock Signals; Classification of Sequential Circuits; Latches/Flip-Flops (S-R Latch, S-R Flip-Flop, D Flip-Flop, J-K Flip-Flop)
+ Sequential Circuit Representation
+ S–R Latch Two stable states: momentarily setting S to 1 forces Q = 1 (set); momentarily setting R to 1 forces Q = 0 (reset).
+ S–R Latch Definition State Table The full table lists current inputs (S, R), current state (Qn), and next state (Qn+1). Simplified state table: S R = 0 0 → Qn+1 = Qn (hold); 0 1 → Qn+1 = 0 (reset); 1 0 → Qn+1 = 1 (set); 1 1 → invalid (forbidden).
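+ The state table can be sketched as a next-state function (an illustrative sketch; the forbidden input is flagged rather than modeled):

```python
def sr_next(s, r, q):
    """Next state of an S-R latch given current state q; S = R = 1 is forbidden."""
    if s and r:
        raise ValueError("invalid input state: S = R = 1")
    if s:
        return 1   # set
    if r:
        return 0   # reset
    return q       # hold current state
```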
+ Clocked S-R Latch (S-R Flip-Flop) Synchronous sequential circuit Based on clock pulse Events in a computer are typically synchronized to a clock pulse, so that changes occur only when a clock pulse changes state.
+ D Flip-Flop
+ D Flip-Flop Block Diagram
+ J-K Flip-Flop Synchronous sequential circuit Based on clock pulse The J-K Flip-Flop is the most widely used of all flip-flop designs. The sequential operation is exactly the same as for the S-R Flip-Flop. The difference is the J-K Flip-Flop has no invalid or forbidden input states. Characteristic table: J K = 0 0 → Qn+1 = Qn (hold); 0 1 → 0 (reset); 1 0 → 1 (set); 1 1 → Qn' (toggle).
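+ The J-K characteristic table follows the standard characteristic equation Q(n+1) = J·Q' + K'·Q, which can be sketched as (an illustrative sketch):

```python
def jk_next(j, k, q):
    """Next state of a J-K flip-flop: J=K=0 holds, J=K=1 toggles."""
    return (j & (q ^ 1)) | ((k ^ 1) & q)  # Q(n+1) = J AND NOT Q, OR, NOT K AND Q
```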
+ Evolution and Performance: Generations in Computer Organization; Milestones in Computer Organization; Von Neumann Architecture; Moore’s Law; CPU Transistor Sizes and Counts; Memory Hierarchy Performance; Cache Memory; Performance Issues and Solutions
+ Structure of Von Neumann Architecture
+ Milestones in Computer Organization Moore’s Law: Number of transistors on a chip doubles every 18 months.
+ Performance Balance CPU performance increasing Memory capacity increasing Memory speed lagging behind CPU performance
+ Core Memory 1950’s – 1970’s 1 core = 1 bit Polarity determines logical “1” or “0” Roughly 1 MHz clock rate. Up to 32 kB storage.
+ Semiconductor Memory 1970’s – Today Fairchild’s first chip, about the size of a single core (i.e., 1 bit of magnetic-core storage), held 256 bits. Non-destructive read, but volatile. SDRAM most common; uses capacitors to store bits. Much faster than core. Today: 1.3 – 3.1 GHz. Capacity approximately doubles each year. Today: 64 GB per single DIMM.
+ Problems with Clock Speed and Logic Density Power Power density increases with density of logic and clock speed Dissipating heat Resistor-Capacitor (RC) delay Speed at which electrons flow limited by resistance and capacitance of metal wires connecting them Delay increases as RC product increases Wire interconnects thinner, increasing resistance Wires closer together, increasing capacitance Memory latency Memory speeds lag processor speeds Solution: More emphasis on organizational and architectural approaches
+ Evolution and Performance 2: Von Neumann Architecture; Processor Hierarchy; Registers; ALU; Processor Categories; Processor Performance; Amdahl’s Law; Computer Benchmarks
+ Parts of a Conventional Processor ALU Status Flags: Neg, Zero, Carry, Overflow. Shifter: shift left = multiplication by 2; shift right = division by 2. Complementer: logical NOT.
+ Processor Categories and Roles Many possible roles for individual processors: Coprocessors Microcontrollers Microsequencers Embedded system processors General purpose processors
+ Basic Performance Equation Define: N = Number of instructions executed in the program. S = Average number of cycles per instruction in the program. R = Clock rate. T = Program execution time. Then: T = (N × S) / R
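+ The equation is straightforward to evaluate; a quick sketch with made-up example numbers:

```python
def exec_time(n, s, r):
    """T = (N * S) / R: total cycles executed divided by cycles per second."""
    return (n * s) / r

# e.g. 10^9 instructions at 2 cycles each on a 2 GHz clock take 1 second
```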
+ Improve Performance To improve performance: Decrease N and/or S Increase R Parameters are not independent: Increasing R may increase S as well N is primarily controlled by compiler Processors with large R may not have the best performance Due to larger S Making logic circuits faster/smaller is a definite win Increases R while S and N remain unchanged
+ Amdahl’s Law Potential speedup of a program using multiple processors. Concluded that: code needs to be parallelizable; speedup is bounded, giving diminishing returns for additional processors; gains are task dependent. Servers gain by maintaining multiple connections on multiple processors. Databases can be split into parallel tasks.
+ Amdahl’s Law Most important principle in computer design: Make the common case fast Optimize for the normal case Enhancement: any change/modification in the design of a component Speedup: how much faster a task will execute using an enhanced component versus using the original component. Speedup = Execution time (original) / Execution time (enhanced)
+ Amdahl’s Law The enhanced feature may not be used all the time. Let F be the fraction of the computation time in which the enhanced feature is used, and Se the speedup when the enhanced feature is used. The execution time with the enhancement is: Ex_new = Ex_old × (1 – F) + Ex_old × (F / Se). This gives the overall speedup (So) as: So = Ex_old / Ex_new = 1 / ((1 – F) + (F / Se))
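+ The overall-speedup formula can be sketched as a one-line function; the assertions below check it against the two worked examples that follow:

```python
def amdahl_speedup(f, se):
    """Overall speedup So when fraction f of execution time is sped up by se."""
    return 1.0 / ((1.0 - f) + f / se)

# f = 0.4, se = 10  -> 1 / 0.64    = 1.5625
# f = 0.7, se = 15  -> 1 / 0.3467  ~ 2.88
```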
+ Amdahl’s Law – Example 1 Suppose that we are considering an enhancement that runs 10 times faster than the original component but is usable only 40% of the time. What is the overall speedup gained by incorporating the enhancement? Se = 10 F = 40 / 100 = 0.4 So = 1 / ((1 – F) + (F / Se)) = 1 / (0.6 + (0.4 / 10)) = 1 / 0.64 ≈ 1.56
+ Amdahl’s Law – Example 2 Suppose that we hired a guru programmer who made 70% of our program run 15x faster than the original program. What is the speedup of the enhanced program? Se = 15 F = 70 / 100 = 0.7 So = 1 / ((1 – F) + (F / Se)) = 1 / (0.3 + (0.7 / 15)) = 1 / 0.3467 ≈ 2.88