Efficient Representation of Data Structures on Associative Processors Jalpesh K. Chitalia (Advisor Dr. Robert A. Walker) Computer Science Department Kent.

Presentation transcript:

Efficient Representation of Data Structures on Associative Processors Jalpesh K. Chitalia (Advisor Dr. Robert A. Walker) Computer Science Department Kent State University

Presentation Outline
- ASC Processor Architecture
- Associative Features
- Structure Codes
- Represent Data Structures
- Structure Code Operations
- Implementation of Structure Codes
- Summary and Future Work

The ASC Processor
- A scalable design implemented on a million-gate Altera FPGA
- SIMD-like architecture
- Currently, 36 8-bit Processing Elements (PEs) are available
- 8-bit Instruction Stream (IS) control unit with 8-bit instruction and data addresses and 32-bit instructions

The ASC Architecture
- Each PE listens to the IS through the broadcast and reduction network
- PEs can communicate amongst themselves using the PE Network
- Under the control of its Mask Stack, a PE either executes or ignores the microcode instruction broadcast by the IS

The ASC Features
- Associative Search: each PE can search its local memory for a key under the control of the IS
- Responder Resolution: a special circuit signals whether ‘at least one’ record was found
- Masked Operation: local Mask Stacks can turn the execution of instructions from the IS on or off

The ASC Example
Select * from Students where Grade > 90
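The query on this slide can be simulated in software to show how the three associative features fit together. This is only a sketch under stated assumptions: the `PE` class, the function names, and the student records are hypothetical, and the Python loop stands in for a comparison that every PE would perform at the same time on the real hardware.

```python
from dataclasses import dataclass


@dataclass
class PE:
    """One processing element holding a single record (hypothetical model)."""
    name: str
    grade: int
    responder: bool = False  # set by the associative search


def associative_search(pes, threshold):
    """Broadcast the comparison; each PE tests its own record.

    On SIMD hardware all PEs compare at once. The loop only simulates
    that single broadcast step.
    """
    for pe in pes:
        pe.responder = pe.grade > threshold


def any_responder(pes):
    """Responder resolution: did at least one PE find a match?"""
    return any(pe.responder for pe in pes)


def masked_select(pes):
    """Masked operation: only responders take part in the output step."""
    return [(pe.name, pe.grade) for pe in pes if pe.responder]


# Made-up sample table, one tuple per PE.
students = [PE("Ann", 95), PE("Bob", 72), PE("Cy", 91), PE("Dee", 90)]

associative_search(students, 90)   # where Grade > 90
found = any_responder(students)
selected = masked_select(students)
print(found, selected)
```

Because each tuple lives in its own PE, the search step does not depend on the number of records, as long as there are enough PEs to hold the table.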

The ASC Features
- Constant Time Associative Operations
  - Associative Search
  - Finding minimum or maximum in a field
- Ideal for Database processing
  - Data is organized in tabular format
  - Each tuple in a table can be processed by one PE
- PE Network
  - Many parallel algorithms require all PEs to move contents in a regular pattern
  - E.g.: Image Convolution, Matrix Multiplication

Non-Tabular Data Structures
- Many applications use linked-list-based data structures
  - For example, plain HTML parsing can be done using a tree structure
  - Similarly, XML or object-relational databases can be represented using only trees
  - Game programming uses tree-based algorithms and demands a great deal of processing power
  - Compiler construction uses directed acyclic graphs and tree structures

Data Structure Codes
- A unique coding scheme
  - Allows any data structure to be represented in a tabular format
  - The tabular format allows the data to be divided amongst the PEs
  - Also known as a “structure code”
- Different data structures may require different coding schemes
- Uses the Associative Search feature of the ASC Processor

Simple List-based Structures
- The left figure shows a possible representation of 1D and 2D arrays using structure codes
- The right figure represents a stack or a queue, depending on which functions are used

Complex List-based Structures
- Each digit-position indicates the level of a tree
- Each value in that position indicates the position of that child from the left
- Discussions henceforth are confined to trees
- Graphs are read in a slightly different manner
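The coding rule on this slide can be sketched by flattening a small tree into a table of (structure code, label) rows. The details are assumptions made for illustration: the root is given the code `(1,)`, codes are stored as Python tuples rather than packed digits, and the function name and the sample HTML-like tree are hypothetical.

```python
def assign_structure_codes(tree, code=(1,)):
    """Flatten a nested (label, [children]) tree into {code: label}.

    Assumed scheme: the tuple length is the node's level, and each entry
    is the position of that ancestor (or the node itself) counted from
    the left, starting at 1. The root code (1,) is an assumption.
    """
    label, children = tree
    table = {code: label}
    for position, child in enumerate(children, start=1):
        table.update(assign_structure_codes(child, code + (position,)))
    return table


# Hypothetical document tree.
html = ("html", [
    ("head", [("title", [])]),
    ("body", [("h1", []), ("p", [])]),
])

table = assign_structure_codes(html)
for code, label in sorted(table.items()):
    # Joining the digits is readable only while every node has fewer
    # than ten children.
    print("".join(map(str, code)), label)
```

Each row of the resulting table is independent of the others, so the rows can be handed out one per PE in the same way as tuples of a database table.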

Structure Code Operations
- Two sets of constant-time operations: scalar and parallel
- Scalar instructions are simple mathematical operations
  - Search for the parent, child or root
  - Find the value of the structure code for the next or previous node
- Parallel instructions use complex associative operations
  - Find the code for the next sibling, the previous sibling, or both
  - ‘Locate’ the required node for further processing

Scalar Operations
- fstcd: leftmost child
- nxtcd: right sibling
- prvcd: left sibling
- trncd: truncate (parent) node
- trnacd: truncate all (root) node
- Can be used to allocate a new node
  - Limited use in searching records
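The five scalar operations can be read as pure arithmetic on a single code, which is why they need no access to the PEs. The sketch below assumes codes are tuples of 1-based child positions with a one-entry root code. The function names follow the slide, but the bodies and the error handling are guesses at the semantics, not the ASC microcode.

```python
def fstcd(code):
    """Leftmost child: go one level down, position 1."""
    return code + (1,)


def nxtcd(code):
    """Right sibling: increment the last position."""
    return code[:-1] + (code[-1] + 1,)


def prvcd(code):
    """Left sibling: decrement the last position."""
    if code[-1] <= 1:
        # Assumed behaviour; the slides do not say what the hardware does here.
        raise ValueError("leftmost node has no left sibling")
    return code[:-1] + (code[-1] - 1,)


def trncd(code):
    """Truncate: drop the last level to get the parent."""
    if len(code) == 1:
        raise ValueError("root has no parent")
    return code[:-1]


def trnacd(code):
    """Truncate all: keep only the first level, i.e. the root."""
    return code[:1]


node = (1, 2, 3)
print(fstcd(node), nxtcd(node), prvcd(node), trncd(node), trnacd(node))
```

None of these functions checks whether the resulting code belongs to a stored node. That fits their use for allocating a new node and explains why they are of limited use for searching existing records.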

Parallel Operations
- Index instructions: flag the resulting node to ease further processing
  - nxtdex (next or right), prvdex (previous or left) and sibdex (siblings, i.e. both left and right)
- Value instructions: return the exact structure code of the result
  - nxtval (next or right) and prvval (previous or left)
- Can be used to ‘locate’ nodes in any tree
- Use parallel and associative hardware resources

Applications of DSC Ops
Scalar
- Associative searching: the parent or the root in a given data structure space
- Allocating elements in a data structure space
Value
- Finding the value of resultant structure code
Index
- Most useful in associative search (e.g., tree traversal)

Implementation Requirements
- Finding the left or right neighbor – scalar and parallel operations
- Implementation involves Input and Output assembly code
  - Input: from scalar memory to IS, and to the PEs if necessary
  - Output: from PEs or IS to scalar memory
- I/O was non-functional in previous versions

Implementation Requirements
- Hardware functionality
  - Debugging and developing I/O functionality
  - Debugging a few other instructions
- Structure Code Operations
  - Scalar Operations: nxtcd, prvcd
  - Parallel Operations: nxtdex, nxtval, prvdex, prvval, sibdex
- Parallel Operations
  - The most complex set amongst the ASC operations

Implementation – Scalar Ops
- Scalar Operations
  - Involve mathematical manipulation of the input structure code
  - The input need not be part of any data structure
- Scalar I/O
  - Input is read directly from the scalar memory
  - Output is written directly to the destination address in scalar memory

Implementation – Value Ops
- Processing of Parallel Value Ops
  - Associative search, associative min/max
- Input Cycle
  - All the elements of the data structure are distributed amongst the PEs sequentially
  - The reference node is broadcast to them
- Output Cycle
  - The multi-byte structure code is stored at the destination address (scalar memory)
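One plausible reading of the value operations combines the two associative primitives named above: a search that keeps only the reference node's siblings, followed by an associative minimum (for nxtval) or maximum (for prvval). The sketch assumes tuple-based codes of 1-based child positions. The list `codes` stands in for the nodes distributed amongst the PEs, and the helper `_siblings` is hypothetical.

```python
def _siblings(codes, ref):
    """Associative search: keep nodes at ref's level with ref's parent."""
    return [c for c in codes if len(c) == len(ref) and c[:-1] == ref[:-1]]


def nxtval(codes, ref):
    """Code of the nearest existing right sibling of ref, or None."""
    right = [c for c in _siblings(codes, ref) if c[-1] > ref[-1]]
    return min(right) if right else None   # associative minimum


def prvval(codes, ref):
    """Code of the nearest existing left sibling of ref, or None."""
    left = [c for c in _siblings(codes, ref) if c[-1] < ref[-1]]
    return max(left) if left else None     # associative maximum


# Made-up tree with a gap: the root's second child is missing.
codes = [(1,), (1, 1), (1, 3), (1, 4), (1, 3, 1)]

print(nxtval(codes, (1, 1)))
print(prvval(codes, (1, 3)))
print(nxtval(codes, (1, 4)))
```

Unlike a scalar increment of the last position, which would yield `(1, 2)` for the reference `(1, 1)`, this version returns only codes that are actually stored, which is what makes it suitable for locating nodes.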

Implementation – Index Ops
- Processing of Parallel Index Ops
  - The resulting element or elements are ‘marked’ for further processing
  - Note: sibdex may select two results
- Input and Output
  - Input: same as in value operations
  - Output: bit flags for all the input nodes are stored sequentially at the destination address (scalar memory)
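The index operations can be sketched in the same style, returning one bit flag per input node instead of a code. As before, the tuple-based codes, the helper names `_nearest` and `_flags`, and the sample data are assumptions for illustration. The slides specify only that the flags are stored sequentially for all input nodes and that sibdex may mark two of them.

```python
def _nearest(codes, ref, side):
    """Index of ref's nearest left (side=-1) or right (side=+1) sibling."""
    best = None
    for i, c in enumerate(codes):
        same_parent = len(c) == len(ref) and c[:-1] == ref[:-1]
        if not same_parent or (c[-1] - ref[-1]) * side <= 0:
            continue  # not a sibling, or on the wrong side of ref
        if best is None or abs(c[-1] - ref[-1]) < abs(codes[best][-1] - ref[-1]):
            best = i
    return best


def _flags(codes, hits):
    """One bit per input node, in input order."""
    return [1 if i in hits else 0 for i in range(len(codes))]


def nxtdex(codes, ref):
    return _flags(codes, {_nearest(codes, ref, +1)})


def prvdex(codes, ref):
    return _flags(codes, {_nearest(codes, ref, -1)})


def sibdex(codes, ref):
    # May mark two nodes: the nearest sibling on each side.
    return _flags(codes, {_nearest(codes, ref, -1), _nearest(codes, ref, +1)})


codes = [(1,), (1, 1), (1, 3), (1, 4), (1, 3, 1)]
print(nxtdex(codes, (1, 3)))
print(prvdex(codes, (1, 3)))
print(sibdex(codes, (1, 3)))
```

A flag vector of this kind can serve directly as the mask for the next broadcast instruction, so the marked nodes can be processed without their codes ever leaving the PEs.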

Summary
- SIMD-based computers are more suited for database processing
- The ASC processor, with its associative operations, makes them more efficient
- Structure codes translate non-tabular data structures into a tabular format
  - Trees and graphs can be represented and evenly divided like records in a table
    - Object-relational databases, HTML processing
  - Stacks, queues and multi-dimensional arrays can be efficiently processed
    - Structure codes not required for even representation

Future Work
- Efforts are in place to clean up the architecture
  - To allow a multiplier/divider on each PE
  - To accommodate a RISC instruction set
- Unpacked bytes are used to support variable-length structure codes
  - Can be avoided with an efficient divider unit
- Input/Output
  - Special structure code operations are defined but not implemented in this work
  - Parallel input/output is also required
- Developing applications that use structure codes

Acknowledgements
- Professor Walker
- Professor Potter
- Committee members for their time
- Kevin Schaffer, Hong Wang, Meiduo Wu, Lei Xie, Ping Xu

Questions? Thank You!