The Crusoe processor: a 128-bit microprocessor built for mobile computing devices where low power consumption is required


1 INTRODUCTION
- The Crusoe processor is a 128-bit microprocessor built for mobile computing devices where low power consumption is required.
- A VLIW-based processor and x86 Code Morphing software together provide an x86-compatible mobile platform solution.
- The processor core operates at 500-700 MHz.

2 Crusoe Processor Family
- TM5400: 500-700 MHz, 256 KB L2 cache
- TM5500: 667-800 MHz, 256 KB L2 cache
- TM5600: 500-700 MHz, 512 KB L2 cache
- TM5800: 667-800 MHz, 512 KB L2 cache

3 Multiple Issue Microprocessors
- Several functional units (integer ALUs, floating point unit, load/store…)
- Multiple instructions issued per cycle
- Requires higher memory bandwidth and more registers
- Two main flavors: superscalar and VLIW

4 Intel’s Superscalar Approach
- Superscalar: issue a variable number of instructions per cycle.
- Pentium Pro, Pentium II, Pentium III are all superscalar, with a single pipeline.
- Processor core is RISC-based with an x86 front end.

5 VLIW Approach
- Very Long Instruction Word processor
- Multiple functional units (FUs), each explicitly programmed by every instruction
- A Very Long Instruction Word is called a molecule.
- Each molecule contains up to 4 atoms: one instruction for each FU.
- A molecule is either 128 bits or 64 bits wide.
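The molecule/atom idea above can be sketched as a small data structure. This is only an illustration: the slot names are taken from the core diagram on the next slide, while the `Atom` class, the `pack_molecule` helper, and the rule "use the 64-bit form when at most two atoms are present" are assumptions made for the example, not Transmeta's actual encoding.

```python
from dataclasses import dataclass
from typing import Dict, List

# Functional-unit slots as named on the core diagram (assumed slot names).
SLOTS = ("FPU", "ALU", "LOAD_STORE", "BRANCH")


@dataclass
class Atom:
    unit: str  # functional unit that executes this atom
    op: str    # mnemonic, e.g. "ADD"


def pack_molecule(atoms: List[Atom]) -> Dict[str, object]:
    """Group atoms into one molecule, at most one atom per functional unit."""
    if not 1 <= len(atoms) <= 4:
        raise ValueError("a molecule holds between 1 and 4 atoms")
    units = [a.unit for a in atoms]
    if any(u not in SLOTS for u in units):
        raise ValueError("unknown functional unit")
    if len(set(units)) != len(units):
        raise ValueError("only one atom per functional unit in a molecule")
    # Assumption for illustration: the short 64-bit form carries up to 2 atoms.
    width = 64 if len(atoms) <= 2 else 128
    return {"width_bits": width, "atoms": {a.unit: a.op for a in atoms}}


full = pack_molecule([
    Atom("FPU", "FADD"), Atom("ALU", "ADD"),
    Atom("LOAD_STORE", "LD"), Atom("BRANCH", "BRCC"),
])
short = pack_molecule([Atom("ALU", "ADD")])
```

The point of the sketch is that the scheduling decision (which atoms run together) is made before execution, by whoever builds the molecule, rather than by issue logic in hardware.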

6 Transmeta’s Crusoe Core
128-bit molecule: FADD | ADD | LD | BRCC
Each atom goes to its own unit: FADD to the Floating Point Unit, ADD to Integer ALU #0, LD to the Load/Store Unit, BRCC to the Branch Unit.

7 Code Morphing: Crusoe’s key
- x86 instructions are converted to the Crusoe instruction set through a software layer.
- During instruction translation, optimizations and scheduling tricks can be performed.
- The Crusoe processor architecture is decoupled from application software.

8 Code Morphing basics
- Code Morphing software resides in ROM.
- Translations are performed dynamically and are cached.
- Successively more aggressive optimizations are performed each time a block is executed.
- Diagram layers (from hardware up to applications): VLIW Processor Core | Code Morphing Software | x86 OS/BIOS | x86 Applications

9 Code Translation
- The superscalar approach translates one instruction at a time.
- Code Morphing examines a block of instructions at a time and creates a translation from the block.
- Translations are saved in a translation cache.
- Successive executions of the translation invoke only the optimizer, not the translator.
- The cost of translation is amortized over successive executions.
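The amortization argument can be made concrete with a toy translation cache. This is a minimal sketch under stated assumptions: the cycle costs, the `TranslationCache` class, and `toy_translate` are all hypothetical and only illustrate the caching idea, not the real Code Morphing software (re-optimization on later runs is left out).

```python
from typing import Callable, Dict, List

TRANSLATE_COST = 100  # assumed cycles to translate a block once
NATIVE_COST = 5       # assumed cycles to run an already translated block


class TranslationCache:
    """Translate each x86 block once, then reuse the cached translation."""

    def __init__(self, translate: Callable[[List[str]], List[str]]) -> None:
        self._translate = translate
        self._cache: Dict[int, List[str]] = {}
        self.cycles = 0

    def run(self, block_addr: int, x86_block: List[str]) -> List[str]:
        if block_addr not in self._cache:
            # First execution: pay the one-time translation cost.
            self._cache[block_addr] = self._translate(x86_block)
            self.cycles += TRANSLATE_COST
        self.cycles += NATIVE_COST
        return self._cache[block_addr]


def toy_translate(x86_block: List[str]) -> List[str]:
    # Stand-in for a real translator: just tags each instruction.
    return [f"vliw({insn})" for insn in x86_block]


cache = TranslationCache(toy_translate)
block = ["mov eax, 1", "add eax, ebx"]
for _ in range(10):
    cache.run(0x1000, block)
average = cache.cycles / 10  # 150 cycles over 10 runs
```

With these made-up numbers the first run costs 105 cycles, but ten runs cost 150 in total, so the average per execution drops to 15 as the block is reused.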

10 Hardware Support for Code Morphing
- Explicit setting of condition codes
- All registers holding x86 state are shadowed.
- A commit operation copies the active state to the shadow registers.
- A “translated bit” in the page table detects self-modifying code.
- Alias hardware allows load instructions to be ordered ahead of store instructions.

11 Exception Handling
- x86 exceptions are precise, which is problematic for out-of-order execution of instructions.
- On an exception, processor state is rolled back to the most recent commit.
- Execution then proceeds in in-order mode until the fault location is found.
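The commit and rollback behavior described on the last two slides can be sketched in a few lines. The class and function names here (`ShadowedRegisters`, `run_translation`) are hypothetical, and a Python `ArithmeticError` stands in for an x86 fault; the real mechanism is implemented in hardware.

```python
from typing import Callable, Dict, Iterable, List


class ShadowedRegisters:
    """Working registers plus a shadow copy holding the last committed state."""

    def __init__(self, names: Iterable[str]) -> None:
        self.working: Dict[str, int] = {n: 0 for n in names}
        self.shadow: Dict[str, int] = dict(self.working)

    def commit(self) -> None:
        # Copy the active state to the shadow registers.
        self.shadow = dict(self.working)

    def rollback(self) -> None:
        # Discard speculative work and restore the last committed state.
        self.working = dict(self.shadow)


def run_translation(regs: ShadowedRegisters,
                    steps: List[Callable[[Dict[str, int]], None]]) -> bool:
    """Run steps speculatively; roll back on a fault, commit on success."""
    try:
        for step in steps:
            step(regs.working)
    except ArithmeticError:  # stand-in for an x86 exception
        regs.rollback()
        return False
    regs.commit()
    return True


def set_eax_5(r: Dict[str, int]) -> None:
    r["eax"] = 5


def set_eax_99(r: Dict[str, int]) -> None:
    r["eax"] = 99


def divide_by_ebx(r: Dict[str, int]) -> None:
    r["eax"] = r["eax"] // r["ebx"]  # faults while ebx is 0


regs = ShadowedRegisters(["eax", "ebx"])
first_ok = run_translation(regs, [set_eax_5])
second_ok = run_translation(regs, [set_eax_99, divide_by_ebx])
```

After the second call the write of 99 has been undone, so the registers again hold the state from the last commit. In the real processor, the rolled-back block would then be re-executed in order to locate the faulting x86 instruction.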

12 LongRun: Dynamic Power Management
- Typical approach 1: switch off the processor quickly to save power (can give glitches).
- Typical approach 2: change the clock rate by suspending the processor and restarting it.
- Crusoe 1: adjust the clock rate dynamically, without suspension.
- Crusoe 2: adjust the voltage level.
- Result: roughly cubic power reduction (power scales with clock rate times voltage squared), up to 30%.
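The "cubic" claim follows from the standard CMOS dynamic power relation, power proportional to capacitance × voltage² × frequency. The sketch below works through the arithmetic; the function name and the scaling factors are illustrative assumptions, not LongRun's actual operating points.

```python
def relative_power(freq_scale: float, volt_scale: float) -> float:
    """Dynamic power relative to the baseline, using P ~ C * V^2 * f.

    freq_scale and volt_scale are ratios to the baseline clock and voltage.
    Capacitance is assumed unchanged, so it cancels out.
    """
    return freq_scale * volt_scale ** 2


# Lowering only the clock by 10% gives a linear saving (about 0.9).
clock_only = relative_power(0.9, 1.0)

# Lowering clock and voltage together by 10% gives a cubic saving
# (0.9 ** 3, about 0.729, i.e. roughly 27% less power).
clock_and_voltage = relative_power(0.9, 0.9)
```

This is why adjusting the voltage along with the clock rate matters: frequency scaling alone saves power only in proportion to the slowdown.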

13 Performance of Crusoe Processor
- The heatsink on the TM5400 Crusoe processor is quite small.
- Execution time
  - Comparable to direct hardware implementations by Intel or AMD
  - A TM5400 at 667 MHz performs about the same as a Pentium III running at 500 MHz.
- Low cost
  - Much simpler hardware: the Crusoe TM5400 has about 7 million transistors (P4 is at 41 million).
  - Easier to design, more scalable, easier to reach high clock rates, more room for caches, better yield, etc.
- Low power

14 Crusoe vs. PIII, heat generation
- PIII: 105.5 °C
- Crusoe: 48.2 °C
- Both processors were playing a DVD.

15 Drawbacks
- Code optimization doesn’t start until a block of code has been executed more than a few times.
- Code translation requires clock cycles that could otherwise be used for application computation.

16 Where Transmeta could go next
- The current emphasis is on mobile computing.
- Different applications of Code Morphing could be made to allow a different emphasis or target.
- Optimization techniques could be tailored to different target architectures.
- Workstation/server chips were hinted at in the documentation.

17 Conclusions
- Transmeta has built an x86-compatible Crusoe processor based on VLIW technology.
- Code Morphing offers a new approach to implementing an instruction set architecture.
- Crusoe offers the performance of a high-performance Intel processor while consuming a fraction of the power.
