PLX : Instruction Set Architecture Shih-Hsueh, Chang.

Outline: Introduction; Instruction Set Architecture; Datapath Scalability; Predication; ALU instructions; Shift and Permute instructions; Multiply instructions; Other instructions; Conclusion

Introduction (1/3): What is PLX? PLX is a small, general-purpose, subword-parallel instruction set architecture (ISA) designed at the Department of Electrical Engineering, Princeton University. PLX is designed to be a simple yet high-performance ISA for multimedia information processing.

Introduction (2/3): PLX history. In Fall 2001, design goals and architecture for PLX were specified by Prof. Ruby B. Lee of Princeton University. PLX 0.1 was encoded, documented, and implemented as a project for the ELE-572 Class during Spring 2001 by Princeton graduates R. Adler '01 and G. Reis '01.

Introduction (3/3) PLX 1.1 was released in February The software toolset includes an assembler, a compiler, and a simulator. Currently PLX 1.2 is being maintained and developed by PALMS.

PLX 1.1 Toolset:
assembler - contains the asm script, which is the PLX assembler
benchmarks - contains the PLX benchmarks written in PLX assembly and C
build - initially empty; this is where the binary executable for the simulator is built
compiler - source code for the PLX compiler (see the README and INSTALL files in this directory for how to install the compiler)
scripts - Perl scripts used to automate the build process
simulator - source code for the PLX simulator
ISA - contains the PLX architecture definition
vhdl - contains the VHDL model for a PLX processor

Instruction Set Architecture: Multimedia ISAs such as PLX exploit two properties of multimedia applications: (1) huge amounts of data parallelism and (2) extensive use of low-precision data. Subword parallelism exploits both properties well.

Instruction Set Architecture: The datapath is partitioned into multiple lower-precision segments called subwords. All instructions are 32 bits long, and subword sizes are 1, 2, 4, and 8 bytes.

Instruction Set Architecture The instructions operate in parallel on these subwords.
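The idea of operating on subwords in parallel can be illustrated with a small software model. The sketch below is not PLX code: the helper names (`split_subwords`, `join_subwords`, `packed_add`) are hypothetical, a register is modeled as a Python integer, and lane 0 is assumed to be the least significant subword.

```python
def split_subwords(word, word_bits, subword_bits):
    """Split a register value into subwords, least significant lane first."""
    mask = (1 << subword_bits) - 1
    return [(word >> shift) & mask for shift in range(0, word_bits, subword_bits)]


def join_subwords(subwords, subword_bits):
    """Pack a list of subwords (lane 0 first) back into one register value."""
    word = 0
    for lane, value in enumerate(subwords):
        word |= value << (lane * subword_bits)
    return word


def packed_add(a, b, word_bits=64, subword_bits=16):
    """Add corresponding subwords independently; carries never cross lanes."""
    mask = (1 << subword_bits) - 1
    lanes = zip(split_subwords(a, word_bits, subword_bits),
                split_subwords(b, word_bits, subword_bits))
    return join_subwords([(x + y) & mask for x, y in lanes], subword_bits)


# Four 16-bit additions in one operation. The lowest lane (0xFFFF + 1)
# wraps to 0 without disturbing its neighbor.
result = packed_add(0x000100020003FFFF, 0x0001000100010001)
print(hex(result))  # 0x2000300040000
```

A single call stands in for one subword-parallel instruction; real hardware performs the lane additions simultaneously rather than in a loop.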

Instruction Set Architecture: PLX instructions can be classified into three major groups: ALU instructions, shift and permute instructions, and multiply instructions.

Datapath Scalability PLX can be implemented as a 32-bit, 64-bit or 128-bit architecture without any changes to the ISA.
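As a rough illustration of what scalability means for subword parallelism, the sketch below counts how many subwords of each size fit in a register for each datapath width. The function name `lanes` is hypothetical; the widths and subword sizes are the ones listed on the slides.

```python
SUBWORD_BYTES = (1, 2, 4, 8)      # subword sizes named on the slides
DATAPATH_BITS = (32, 64, 128)     # datapath widths named on the slides


def lanes(datapath_bits, subword_bytes):
    """Number of subwords per register; 0 means the subword does not fit."""
    return datapath_bits // (8 * subword_bytes)


for bits in DATAPATH_BITS:
    print(bits, {size: lanes(bits, size) for size in SUBWORD_BYTES})
```

The same instruction processes more lanes on a wider datapath, which is why the instruction encoding does not need to change.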

Predication: All PLX instructions are predicated. Predication reduces conditional branches. PLX has 128 one-bit predicate registers, organized into 16 predicate register sets of 8 predicate registers each. The registers in each set are numbered P0 through P7.

Predication (cont.): The predicate registers P1 to P7 can be set and cleared using compare instructions (P0 is always true). Only one set is active at any time, and the active set is changed by software. An instruction therefore needs only 3 bits to name its predicate register.

Predication (cont.) Two types of compare instructions set the predicate registers in PLX.
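The predicate organization described above can be modeled in a few lines. This is a minimal sketch under stated assumptions: the class `PredicateFile`, the helper `predicated`, and the compare `compare_gt` (which writes one predicate and its complement) are hypothetical illustrations, not the actual PLX compare instructions, whose two types are not spelled out in the transcript.

```python
class PredicateFile:
    """16 sets of 8 one-bit predicates; P0 of the active set always reads true."""
    NUM_SETS = 16
    REGS_PER_SET = 8

    def __init__(self):
        self.sets = [[False] * self.REGS_PER_SET for _ in range(self.NUM_SETS)]
        self.active = 0

    def select_set(self, index):
        """Software switches the active set."""
        if not 0 <= index < self.NUM_SETS:
            raise ValueError("predicate set index out of range")
        self.active = index

    def read(self, p):
        """p is the 3-bit predicate field of an instruction (0..7)."""
        return True if p == 0 else self.sets[self.active][p]

    def write(self, p, value):
        if p != 0:  # writes to P0 are ignored in this model
            self.sets[self.active][p] = bool(value)


def compare_gt(preds, p_true, p_false, a, b):
    """Hypothetical compare: p_true <- (a > b), p_false <- not (a > b)."""
    preds.write(p_true, a > b)
    preds.write(p_false, not (a > b))


def predicated(preds, p, operation, old_value):
    """Run the operation only if predicate p is true; otherwise keep the old value."""
    return operation() if preds.read(p) else old_value


# Branch-free maximum: result = a; if not (a > b) then result = b.
preds = PredicateFile()
a, b = 3, 7
compare_gt(preds, 1, 2, a, b)
result = predicated(preds, 0, lambda: a, None)    # P0: always executes
result = predicated(preds, 2, lambda: b, result)  # executes only when a <= b
print(result)  # 7
```

Because only one set is active, the 3-bit field is enough to address any predicate an instruction can use.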

ALU instructions

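Saturation arithmetic: Ordinary arithmetic does no special handling on overflow; it usually just discards the leftmost bit. In a multimedia application, when green is added to green we expect a darker green, but if overflow occurs the result may be a very odd color. With saturation arithmetic, green plus green becomes at most black, rather than some strange color.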
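The difference between wraparound and saturation can be shown on a single 8-bit unsigned value. This is a sketch with hypothetical function names; it models one subword only and assumes unsigned saturation.

```python
def add_wraparound(x, y, bits=8):
    """Ordinary modular add: the carry out of the top bit is discarded."""
    return (x + y) & ((1 << bits) - 1)


def add_saturating_unsigned(x, y, bits=8):
    """Saturating add: results above the maximum are clamped to the maximum."""
    return min(x + y, (1 << bits) - 1)


print(add_wraparound(200, 100))           # 44
print(add_saturating_unsigned(200, 100))  # 255
```

With wraparound, two bright values sum to a dark one (44); with saturation the result stays at the extreme of the range (255).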

Low-cost multiplication Pshift [left|right] add instructions allow low-cost integer and fixed-point multiplication in the ALU without a separate multiplier.
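Multiplying by a constant with shift-and-add steps might look like the sketch below. The helper names are hypothetical, and the shift amounts used here are assumptions for illustration; the transcript does not state which shift amounts the PLX instructions support.

```python
def shift_left_add(x, y, shift, bits=16):
    """Model of a shift-left-and-add step: (x << shift) + y, truncated to the subword."""
    return ((x << shift) + y) & ((1 << bits) - 1)


def shift_right_add(x, y, shift, bits=16):
    """Model of a shift-right-and-add step: (x >> shift) + y, truncated to the subword."""
    return ((x >> shift) + y) & ((1 << bits) - 1)


def times_10(x):
    """Integer multiply by 10 in two ALU steps, with no multiplier."""
    five_x = shift_left_add(x, x, 2)      # 4x + x = 5x
    return shift_left_add(five_x, 0, 1)   # 2 * 5x = 10x


def times_1_25(x):
    """Fixed-point style scaling by 1.25: x/4 + x."""
    return shift_right_add(x, x, 2)


print(times_10(123))   # 1230
print(times_1_25(8))   # 10
```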

pmax & pmin: These instructions are very useful for sorting algorithms. Used together on the same two source registers, pmax and pmin order each pair of subwords, which amounts to a conditional swap for multiple pairs of subwords in a single cycle.
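The compare-and-swap step that sorting relies on can be sketched by modeling registers as lists of subwords. The Python functions `pmin` and `pmax` below only borrow the instruction names; they are an illustrative model, not the PLX definitions.

```python
def pmin(a, b):
    """Lane-wise minimum of two registers modeled as lists of subwords."""
    return [min(x, y) for x, y in zip(a, b)]


def pmax(a, b):
    """Lane-wise maximum of two registers modeled as lists of subwords."""
    return [max(x, y) for x, y in zip(a, b)]


a = [5, 1, 9, 4]
b = [3, 8, 2, 7]
low, high = pmin(a, b), pmax(a, b)
print(low)   # [3, 1, 2, 4]
print(high)  # [5, 8, 9, 7]
```

After the two operations every lane satisfies `low[i] <= high[i]`, so four compare-and-swap steps of a sorting network are done without any branch.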

PLX Allows Parallel Writing of Predicate Registers

Shift and Permute instructions

shift right pair: Two source registers are concatenated and shifted right. This is useful for entities that span two registers. When the same register is used for both sources, the result is a rotation of that register.
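A small model makes the rotation property easy to check. The function name and argument order (`high` forms the upper half of the concatenation) are assumptions for illustration, and the shift amount is assumed to be between 0 and the register width.

```python
def shift_right_pair(high, low, amount, bits=64):
    """Concatenate high:low, shift right by amount, keep the low `bits` bits."""
    mask = (1 << bits) - 1
    combined = ((high & mask) << bits) | (low & mask)
    return (combined >> amount) & mask


# A value that spans two 32-bit registers is extracted in one step.
print(hex(shift_right_pair(0x0000ABCD, 0xEF000000, 24, bits=32)))  # 0xabcdef

# Using the same register twice rotates it right.
print(hex(shift_right_pair(0x12345678, 0x12345678, 8, bits=32)))   # 0x78123456
```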

Mix right/left Mix instructions are very useful for performing matrix transposition.

Matrix Transposition
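One way to see why mix operations suit transposition is the sketch below, which transposes a 4x4 matrix with eight mix steps. The semantics assumed here (mix left interleaves the even-numbered subword groups of two registers, mix right the odd-numbered ones) follow the usual description of mix instructions but are an assumption with respect to the exact PLX definition; registers are modeled as lists and the function names are hypothetical.

```python
def mix(x, y, group, side):
    """Interleave alternate groups of subwords from x and y.

    side="left" takes groups 0, 2, ...; side="right" takes groups 1, 3, ...
    `group` is the number of list elements treated as one subword.
    """
    gx = [x[i:i + group] for i in range(0, len(x), group)]
    gy = [y[i:i + group] for i in range(0, len(y), group)]
    start = 0 if side == "left" else 1
    out = []
    for k in range(start, len(gx), 2):
        out.extend(gx[k])
        out.extend(gy[k])
    return out


def transpose_4x4(rows):
    """Transpose four 4-element registers using eight mix operations."""
    r0, r1, r2, r3 = rows
    # Stage 1: mix single subwords of adjacent rows.
    t0 = mix(r0, r1, 1, "left")    # a0 b0 a2 b2
    t1 = mix(r0, r1, 1, "right")   # a1 b1 a3 b3
    t2 = mix(r2, r3, 1, "left")    # c0 d0 c2 d2
    t3 = mix(r2, r3, 1, "right")   # c1 d1 c3 d3
    # Stage 2: mix double-width subwords to assemble the columns.
    return [mix(t0, t2, 2, "left"), mix(t1, t3, 2, "left"),
            mix(t0, t2, 2, "right"), mix(t1, t3, 2, "right")]


matrix = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
print(transpose_4x4(matrix))
```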

permute The permute instruction works on 1-byte and 2-byte subwords, and performs a small set of carefully selected permutation primitives.

permute variable: Uses a second source register to specify the permutation control bits. It can perform any arbitrary permutation of 1-byte or 2-byte subwords, with or without repetition of any subword.
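The effect of a register-controlled permutation can be sketched as an index lookup. In this hypothetical model the control register is represented as a list of source indices, one per destination subword; the actual encoding of the control bits in PLX is not given in the transcript.

```python
def permute_variable(source, control):
    """Destination lane i receives source[control[i]]; indices may repeat."""
    if any(not 0 <= index < len(source) for index in control):
        raise ValueError("control index out of range")
    return [source[index] for index in control]


pixels = ["A", "B", "C", "D"]
print(permute_variable(pixels, [3, 2, 1, 0]))  # ['D', 'C', 'B', 'A']
print(permute_variable(pixels, [0, 0, 0, 0]))  # ['A', 'A', 'A', 'A']
```

The second call shows a permutation with repetition: one subword is broadcast to every lane.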

Multiply instructions

Pmultiply shift right right-shifts the products before writing the lower-order half of the bits to the destination register, which allows selection of the desired 16 bits of each product. Pmultiply odd and pmultiply even multiply only the odd- or even-indexed subwords of the source registers and produce full-length products.
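The two multiply styles can be contrasted with a short model. This is a sketch under stated assumptions: subwords are unsigned 16-bit values held in lists, index 0 counts as an even position, and the function names are hypothetical stand-ins for the instructions.

```python
def pmultiply_shift_right(a, b, shift, bits=16):
    """Multiply lane-wise, shift each product right, keep the low `bits` bits."""
    mask = (1 << bits) - 1
    return [((x * y) >> shift) & mask for x, y in zip(a, b)]


def pmultiply_even(a, b):
    """Full-length products of the even-indexed subwords only."""
    return [a[i] * b[i] for i in range(0, len(a), 2)]


def pmultiply_odd(a, b):
    """Full-length products of the odd-indexed subwords only."""
    return [a[i] * b[i] for i in range(1, len(a), 2)]


# Fixed point with 8 fractional bits: 1.5 * 2.0 = 3.0 (0x0180 * 0x0200 >> 8 = 0x0300).
print([hex(v) for v in pmultiply_shift_right([0x0180], [0x0200], 8)])  # ['0x300']

# Half as many products, each twice as wide as a source subword.
print(pmultiply_even([1, 2, 3, 4], [10, 20, 30, 40]))  # [10, 90]
print(pmultiply_odd([1, 2, 3, 4], [10, 20, 30, 40]))   # [40, 160]
```

The shift-right form keeps the result the same width as the operands, while the even and odd forms trade lane count for full precision.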

Other instructions PLX has load and store instructions for accessing memory. Program flow can be changed with jump instructions. This includes jump and link instructions for procedure calls. Conditional branches are achieved with predicated jump instructions.

Conclusion: Subword parallelism. Datapath scalability. Novel definition of predication. Extended instructions for multimedia. Low cost and very high multimedia performance.