Alternative Parallel Processing Approaches

Alternative Parallel Processing Approaches
Some people argue that real breakthroughs in computational power will occur only by abandoning the von Neumann model. Numerous efforts are underway to devise systems that could change the way we think about computers and computation. In this section, we will look at three of these: dataflow computing, neural networks, and systolic arrays.

Alternative Parallel Processing Approaches – Dataflow Computing
Von Neumann machines exhibit sequential control flow: a linear stream of instructions is fetched from memory and acts upon data, and program flow changes only under the direction of branching instructions. In dataflow computing, program execution is driven directly by data dependencies: there is no program counter and no shared storage. Data flows continuously and is available to multiple instructions simultaneously. A dataflow graph represents the computation in a dataflow computer; its nodes contain the instructions and its arcs indicate the data dependencies.

Alternative Parallel Processing Approaches – Dataflow Computing
When a node has all of the data tokens it needs, it fires: it performs the required operation, consumes its input tokens, and places the result on an output arc. The architecture of a dataflow computer consists of processing elements that communicate with one another. Each processing element has an enabling unit that sequentially accepts tokens and stores them in memory. When the node to which a token is addressed has received all of its tokens, the input tokens are extracted from memory and combined with the node itself to form an executable packet.
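
The firing rule can be made concrete with a small sketch. The Python fragment below (the class, the ready-list scheduling, and the example graph are all invented for illustration) builds a dataflow graph for (a + b) * (c - d); the add and subtract nodes fire as soon as their tokens arrive, in no fixed order, and the multiply fires only once both results reach it.

```python
class Node:
    def __init__(self, op, num_inputs, consumers):
        self.op = op                # function applied when the node fires
        self.num_inputs = num_inputs
        self.consumers = consumers  # (node, port) pairs fed by this node's output arc
        self.tokens = {}            # input tokens that have arrived, keyed by port

    def receive(self, port, value, ready):
        self.tokens[port] = value
        if len(self.tokens) == self.num_inputs:   # all tokens present: node may fire
            ready.append(self)

    def fire(self, ready):
        result = self.op(*(self.tokens[p] for p in sorted(self.tokens)))
        self.tokens.clear()                       # firing consumes the input tokens
        for node, port in self.consumers:         # result is placed on the output arcs
            node.receive(port, result, ready)
        return result

# Dataflow graph for (a + b) * (c - d).
mul = Node(lambda x, y: x * y, 2, [])
add = Node(lambda x, y: x + y, 2, [(mul, 0)])
sub = Node(lambda x, y: x - y, 2, [(mul, 1)])

ready = []
add.receive(0, 2, ready); add.receive(1, 3, ready)   # a = 2, b = 3
sub.receive(0, 7, ready); sub.receive(1, 4, ready)   # c = 7, d = 4
while ready:                                          # no program counter: fire whatever is enabled
    last = ready.pop(0).fire(ready)
print(last)   # 15, produced when the multiply node finally fires
```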

Alternative Parallel Processing Approaches – Neural Networks
Neural network computers consist of a large number of simple processing elements that individually solve a small piece of a much larger problem. They are particularly useful in dynamic situations where behavior is an accumulation of previous behavior and an exact algorithmic solution cannot be formulated. Neural networks can deal with imprecise, probabilistic information and allow for adaptive interactions. Neural network processing elements (PEs) multiply a set of input values by an adaptable set of weights to yield a single output value. The computation carried out by each PE is simplistic compared to that of a traditional microprocessor; their power lies in their massively parallel architecture and their ability to adapt to the dynamics of the problem space. Neural networks learn from their environments; a built-in learning algorithm directs this process.

Alternative Parallel Processing Approaches – Neural Networks
The simplest neural net PE (processing element) is the perceptron. Perceptrons are trainable neurons: a perceptron produces a Boolean output based upon the values it receives from several inputs, and it is trainable because its threshold and input weights are modifiable. The output Z is true (1) if the net input w1x1 + w2x2 + ... + wnxn is greater than the threshold T. The biggest problem with neural nets is that once they consist of more than 10 or 20 neurons, it is impossible to understand how the net arrives at its results; yet they can derive meaning from data that are too complex to be analyzed by people. Despite early setbacks, neural nets are gaining credibility in sales forecasting, data validation, and facial recognition.
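
As a deliberately tiny illustration of that threshold rule, the Python sketch below implements the weighted-sum test; the weights and threshold are invented values chosen so the perceptron behaves like a two-input AND gate.

```python
# Minimal perceptron sketch: output is 1 when the weighted sum of the inputs
# exceeds the threshold T (weights and threshold invented for illustration).

def perceptron(inputs, weights, threshold):
    net = sum(w * x for w, x in zip(weights, inputs))   # w1*x1 + w2*x2 + ... + wn*xn
    return 1 if net > threshold else 0

weights, T = [0.6, 0.6], 1.0        # only the input (1, 1) clears the threshold
for x1 in (0, 1):
    for x2 in (0, 1):
        print((x1, x2), perceptron((x1, x2), weights, T))
# (0, 0) 0   (0, 1) 0   (1, 0) 0   (1, 1) 1
```

Training a perceptron amounts to adjusting the weights and threshold after each wrong answer, which is exactly what makes these PEs adaptable.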

Alternative Parallel Processing Approaches – Systolic Arrays
Systolic arrays are networks of processing elements that methodically compute data by circulating it through the system. A variation of SIMD computers, systolic arrays use simple processors that process data by pumping it through vector pipelines. Systolic arrays can sustain great throughput because they employ a high degree of parallelism. Connections are short, and the design is simple and scalable. They are robust, efficient, and cheap to produce. They are, however, highly specialized and limited as to the types of problems they can solve.
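
The "circulate data through simple PEs" idea can be sketched in a few lines. The model below is a simplified, purely illustrative one-dimensional systolic array computing a matrix-vector product: on each beat every processing element performs one multiply-accumulate as the staggered input vector flows past it (this is not any particular hardware design).

```python
# Illustrative sketch of a 1-D systolic array computing y = A @ x.
# Each processing element holds one row's running sum; data is pumped one
# position further along on every beat.

def systolic_matvec(A, x):
    n = len(A)
    acc = [0] * n                          # one accumulator per processing element
    for t in range(2 * n - 1):             # the whole product takes 2n - 1 beats
        for i in range(n):
            j = t - i                      # PE i sees x[t - i] on beat t (staggered flow)
            if 0 <= j < n:
                acc[i] += A[i][j] * x[j]   # one multiply-accumulate per PE per beat
    return acc

A = [[1, 2], [3, 4]]
x = [5, 6]
print(systolic_matvec(A, x))   # [17, 39]
```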

Future Alternatives
Computers as we know them are binary, transistor-based systems, but transistor-based systems are straining to keep up with our computational demands: transistors are becoming so small that it is hard for them to hold electrons in the way we are accustomed to.
One alternative to transistor-based systems is optical (photonic) computing:
- Rather than electrons performing logic, photons of laser light are used.
- The speed of light becomes the upper limit on speed.
- There are no heat-dissipation issues, and light beams can travel in parallel, increasing performance even more.
- It will be many years before this alternative reaches the mainstream.
Another alternative is biological computing, which uses components from living organisms instead of inorganic silicon ones:
- Example 1: using neurons from leeches, whose behavior can be controlled.
- Example 2: using DNA as software and enzymes as hardware (called DNA computing), where DNA strands test all candidate solutions at once and output a correct answer.
- Example 3: using bacteria that can turn genes on and off in predictable ways.

Future Alternatives
Another alternative to transistor-based systems is quantum computing. Quantum computing uses quantum bits (qubits) that can be in multiple states at once. The "state" of a qubit is determined by the spin of an electron; a thorough discussion of spin belongs to the domain of quantum physics. A qubit's ability to be in multiple states at the same time is called superposition. A 3-qubit register can simultaneously hold the values 0 through 7, so 8 operations can be performed at the same time; this phenomenon is called quantum parallelism. A system with 600 qubits can superpose 2^600 states.
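
The "holds all values at once" claim can be made concrete with a classical state-vector simulation. The NumPy sketch below is purely illustrative (a real quantum computer does not store this vector explicitly): it puts a 3-qubit register into an equal superposition of the values 0 through 7 by applying a Hadamard gate to each qubit.

```python
import numpy as np

n = 3
state = np.zeros(2**n, dtype=complex)   # an n-qubit register has 2**n amplitudes
state[0] = 1.0                          # start in |000>

# Hadamard gate on each qubit -> equal superposition of all 2**n basis states.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
op = H
for _ in range(n - 1):
    op = np.kron(op, H)
state = op @ state

print(np.round(np.abs(state) ** 2, 3))  # eight equal probabilities of 0.125
# Measurement collapses the register to a single value 0..7, which is why
# exploiting quantum parallelism requires carefully designed algorithms.
```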

Quantum Computing
Quantum computers may be applied in the areas of cryptography, true random-number generation, and the solution of other intractable problems. Making effective use of quantum computers requires rethinking our approach to problems and developing new algorithms. Rose's Law states that the number of qubits that can be assembled to successfully perform computations will double every 12 months; this has been the case for the past nine years. One of the largest obstacles to the progress of quantum computation is the tendency of qubits to decay into a state of decoherence. Decoherence causes uncorrectable errors. Advanced error-correction algorithms have been applied to this problem and show promise, but much research remains to be done.

Quantum Computing
The realization of quantum computing has raised questions about the technological singularity. The technological singularity is the theoretical point at which human technology has fundamentally and irreversibly altered human development; it is the point when civilization changes to an extent that its technology is incomprehensible to previous generations. Are we there now?

Pipelining
Pipelining is a way of effectively organizing concurrent activities to improve performance.
[Figure 8.1. Basic idea of instruction pipelining: (a) sequential execution of instructions I1, I2, I3 through Fetch and Execute steps; (b) hardware organization, with an interstage buffer B1 between the instruction fetch unit and the execution unit; (c) pipelined execution, with one instruction completing per clock cycle once the pipeline is full.]
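
A quick back-of-the-envelope sketch shows why the pipelined organization in Figure 8.1 wins on throughput. It assumes the two-stage Fetch/Execute pipeline from the figure and one clock per stage; the numbers are illustrative.

```python
# Sequential vs. pipelined cycle counts for a 2-stage (Fetch, Execute) pipeline.

def sequential_cycles(n_instructions, stages=2):
    return n_instructions * stages            # each instruction runs start to finish alone

def pipelined_cycles(n_instructions, stages=2):
    return stages + (n_instructions - 1)      # fill the pipe once, then one completion per cycle

n = 100
print(sequential_cycles(n), pipelined_cycles(n))   # 200 vs 101 cycles
# As n grows, throughput approaches one instruction per clock cycle.
```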

Chapter 8, Section 1 - Pipelining
Pipelining only works well if the various tasks take about the same amount of time; fetching from main memory can take ten times longer than the other tasks, which is why caches are used. When a stage must wait, the pipeline stalls. A stall that occurs because a source or destination operand is not available when it is expected in the pipeline is called a "data hazard."
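
A minimal sketch of the data-hazard case, assuming an instruction's result is written back three cycles after it issues and the dependent instruction wants to read that register one cycle later (both numbers are invented for illustration):

```python
# Data (RAW) hazard sketch: I2 reads a register that I1 has not yet written,
# so I2 must stall until the operand is actually available.

def first_safe_read_cycle(writer_issue_cycle, writeback_delay=3):
    """Cycle in which a dependent instruction may read the result (assumed model)."""
    return writer_issue_cycle + writeback_delay

i1_issue = 1                  # I1 writes R3
i2_read_cycle = 2             # I2 wants to read R3 in the very next cycle
ready = first_safe_read_cycle(i1_issue)
stalls = max(0, ready - i2_read_cycle)
print(stalls)                 # 2 bubble cycles, unless the hardware forwards the result
```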

Chapter 8, Section 1 - Pipelining
You can also experience a stall because an instruction is not available when expected. This stall is called an "instruction hazard" or "control hazard"; it is typically caused by a cache miss that forces the processor to fetch the instruction from main memory. For example, instruction I1 is fetched normally, but the fetch of instruction I2 is delayed by a cache miss, holding up everything behind it.
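
The effect of a cache miss on the fetch stage can be shown with a small model. The sketch below assumes a fetch takes 1 cycle on a hit and 10 on a miss, and that three more one-cycle stages (decode, execute, write) follow; all numbers are invented for illustration.

```python
# Instruction (control) hazard sketch: a slow fetch for I2 delays I2 and every
# instruction behind it, because the next fetch cannot start until memory
# delivers the current one.

def completion_cycles(fetch_latencies, other_stages=3):
    completions, fetch_start = [], 1
    for lat in fetch_latencies:
        fetch_end = fetch_start + lat - 1
        completions.append(fetch_end + other_stages)   # remaining stages take 1 cycle each
        fetch_start = fetch_end + 1                     # next fetch waits for this one
    return completions

print(completion_cycles([1, 1, 1]))    # [4, 5, 6]   all cache hits: one completion per cycle
print(completion_cycles([1, 10, 1]))   # [4, 14, 15] I2's miss stalls everything behind it
```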

Chapter 8, Section 1 - Pipelining
You can also experience a stall when two instructions require the same hardware resource at the same time. This stall is called a "structural hazard." A good example is one instruction needing access to memory for an execute or write step while another instruction needs access at the same time for a fetch; having separate instruction and data caches helps with this problem. Example: for Load X(R1),R2, the memory address X + [R1] is computed in step E2, the memory access takes place in cycle 5, and the operand that is read is written into register R2 in cycle 6. This causes a stall for I3, because both instructions require access to the same register during cycle 6.
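
The register-file conflict in that Load example can be modelled as a contest for a single write port. The sketch below assumes a five-stage Load (fetch, decode, execute, memory, write) issuing in cycle 2, four-stage neighbours, and only one register write per cycle; the stage counts and the one-write-port rule are assumptions made for illustration, not the textbook's exact figure.

```python
# Structural hazard sketch: instructions that would write the register file in
# the same cycle collide, and the later one is pushed back (a stall).

def schedule_writes(instrs):
    """instrs: list of (name, number_of_stages). Instructions issue one cycle
    apart; only one register write is allowed per cycle."""
    busy, writes = set(), {}
    for issue, (name, n_stages) in enumerate(instrs, start=1):
        w = issue + n_stages - 1          # cycle of this instruction's write stage
        while w in busy:                  # write port already claimed: stall a cycle
            w += 1
        busy.add(w)
        writes[name] = w
    return writes

program = [("I1", 4),                     # ordinary F-D-E-W instruction
           ("I2 Load X(R1),R2", 5),       # extra memory-access cycle: writes R2 in cycle 6
           ("I3", 4)]                     # also wants cycle 6, so it slips to cycle 7
print(schedule_writes(program))
# {'I1': 4, 'I2 Load X(R1),R2': 6, 'I3': 7}
```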