LZRW3 Decompressor dual semester project Part A Mid Presentation Students: Peleg Rosen Tal Czeizler Advisors: Moshe Porian Netanel Yamin 22.6.2014.



Presentation Content
- Project Goals
- Project Requirements
- Algorithm Overview
- Project Top Block Diagram
- Decompression Core Top View
- Decompression Core Design and Data Flow
- Stages Overview
- Problems and Solutions
- Project Schedule and Gantt

Project Goals
- Implementation of an LZRW3 data decompression core.
- Implementation of a verification environment.

Project Requirements

Part A:
Core requirements:
– Process data at a rate of 1 Gbps.
– Support data blocks with an output size of 2 KB – 32 KB.
– Rely only on the FPGA's internal memory.
– VHDL implementation.
Full simulation environment (golden model and checkers).

Part B:
Synthesis & implementation on an FPGA device (Xilinx Virtex-5).
GUI implementation in Visual Studio.

LZRW3 compression algorithm
Example input: BABDACABDBCAA
- Send every 3 literals to the hash function.
- Put the offset in the hash table, at the slot address returned by the hash function.
- If the slot is occupied and the literals match, make a copy item.
Output item (copy item): [slot address, length]. In this case, output item = [, ]

Algorithm Overview – Structure
File header (8 bytes)
Groups:
- control bytes (2 bytes)
- data bytes ( bytes)
* The last group might be smaller

Algorithm Overview – Items
File header: Decode the header to determine the file size and whether the file is compressed or not.
Control bytes: Decode the control bytes to determine the position and type of the items in the group, and where the next control bytes are.
Literal items: Write as is to the output file.
Copy items: Decode to determine the offset and length of a literal sequence to be copied to the output file. Write from the output memory to itself accordingly.
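The item decoding described above can be sketched in Python. This is a minimal sketch under assumptions taken from the public LZRW3 reference code rather than from the slides: the two control bytes form a 16-bit word read least-significant bit first, a 0 bit marks a 1-byte literal, a 1 bit marks a 2-byte copy item, and a copy item packs a 12-bit hash table index with a 4-bit length field (length minus 3). The function name `split_group` is hypothetical.

```python
def split_group(control: int, data: bytes):
    """Split one group's data bytes into items using its 16-bit control word.

    Assumed layout (not spelled out in the slides): bit i of the control
    word describes item i; 0 = literal (1 byte), 1 = copy item (2 bytes).
    Returns the list of items and the number of data bytes consumed.
    """
    items = []
    pos = 0
    for i in range(16):
        if pos >= len(data):          # the last group might be smaller
            break
        if (control >> i) & 1:
            b0, b1 = data[pos], data[pos + 1]
            index = ((b0 & 0xF0) << 4) | b1   # 12-bit hash table slot
            length = (b0 & 0x0F) + 3          # copy length, 3..18 bytes
            items.append(("copy", index, length))
            pos += 2
        else:
            items.append(("literal", data[pos]))
            pos += 1
    return items, pos


# Three literals followed by one copy item (control bit 3 set).
items, used = split_group(0b1000, b"ABC" + bytes([0x42, 0xF0]))
print(items, used)
```

The position of the next control bytes follows from the number of data bytes consumed, which is why the function returns it.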

Project Top Block Diagram LZRW3 DECOMPRESSION CORE

Decompression Core Top view DECOMPRESSION CORE

Decompression Core Design and Data Flow

Stages Overview – Core Management Unit

Core Management Unit
Goals:
- Communicate with the core's periphery.
- Receive the input data and parse it.
- Transmit the appropriate control signals to the next stages.
Method:
- The unit starts in ‘clear’ mode, which initializes the core.
- The following 10 clock cycles are dedicated to header and control bytes decoding.
- From this point on, the unit determines the mode and sets the appropriate controls according to the current byte and the previous 4 bytes.

Core Management Unit – Mode selection

Core Management Unit – Outputs

Stages Overview – 5 Bytes Buffer

Five Bytes Buffer (diagram): the new byte (8 bits) enters a chain of five 8-bit registers: New Byte Register, Mid Byte Register, Old Byte Register, Older Byte Register, Oldest Byte Register.

Stages Overview – Hash Function

#3 Hash Function Stage
TABLE INDEX = (((40543*(((*(PTR))<<8)^((*((PTR)+1))<<4)^(*((PTR)+2))))>>4) & 0xFFF)
PTR points to the first byte.
TABLE INDEX range: 0 to 4095.
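As a quick check of the hash stage, here is a minimal Python sketch of the three-byte hash as it appears in the public LZRW3 reference code. The function name `lzrw3_hash` and the `ptr` parameter are illustrative, not taken from the project.

```python
def lzrw3_hash(data: bytes, ptr: int = 0) -> int:
    """Map the three bytes starting at data[ptr] to a table index 0..4095."""
    b0, b1, b2 = data[ptr], data[ptr + 1], data[ptr + 2]
    # The product stays below 2**32, so no extra masking is needed to
    # mimic 32-bit unsigned arithmetic.
    return ((40543 * ((b0 << 8) ^ (b1 << 4) ^ b2)) >> 4) & 0xFFF


print(lzrw3_hash(b"123"))  # index of a sequence starting with "123"
```

For the bytes "123" this evaluates to 1264, which matches the index that the Default String slide initializes.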

Stages Overview – Hash Table Stage

Block Overview – Write Address Counter

Write Address Counter
According to the Mode signal:
- For literal items, increments by 1.
- For copy items, increments by Length.
- Otherwise, does not increment.
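The counter rule can be sketched as a small Python function. The mode names `"literal"` and `"copy"` and the function name are placeholders for illustration; the actual Mode encoding is not given in the slides.

```python
def next_write_address(address: int, mode: str, length: int = 0) -> int:
    """Return the write address after one item, following the Mode signal."""
    if mode == "literal":
        return address + 1        # one byte written
    if mode == "copy":
        return address + length   # a whole copied sequence written
    return address                # header, control bytes, idle: no change


addr = 0
for mode, length in [("literal", 0), ("literal", 0), ("control", 0), ("copy", 5)]:
    addr = next_write_address(addr, mode, length)
print(addr)
```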

Block Overview – Hash Table

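Hash Table (diagram): 4096 rows of 16 bits. Each entry holds a 5-bit memory number and an 11-bit memory address. Signals shown: Offset in (from the Write Address Counter), Offset out, Read Index (12 bits), Write Index (12 bits), and Hash Table Select, with sources labeled "From Hash Func" and "From Core Mgmt".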
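A hash table entry can be modeled as a packed 16-bit word. The sketch below assumes the memory number sits in the upper 5 bits and the memory address in the lower 11 bits; the slide gives only the field widths, so the field order and the function names are assumptions.

```python
MEM_NUMBER_BITS = 5
MEM_ADDRESS_BITS = 11


def pack_entry(memory_number: int, memory_address: int) -> int:
    """Pack a (memory number, memory address) pair into one 16-bit entry."""
    if not 0 <= memory_number < (1 << MEM_NUMBER_BITS):
        raise ValueError("memory number does not fit in 5 bits")
    if not 0 <= memory_address < (1 << MEM_ADDRESS_BITS):
        raise ValueError("memory address does not fit in 11 bits")
    return (memory_number << MEM_ADDRESS_BITS) | memory_address


def unpack_entry(entry: int):
    """Split a 16-bit entry back into (memory number, memory address)."""
    return entry >> MEM_ADDRESS_BITS, entry & ((1 << MEM_ADDRESS_BITS) - 1)


entry = pack_entry(17, 2047)
print(hex(entry), unpack_entry(entry))
```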

Hash Table – Default String
The LZRW3 algorithm dictates that the string “ ” is set as default. This means that when a sequence starting with “123..” is received, a copy item is created even if it is the first time the sequence appears. The index ‘1264’ is initialized with zeroes, which stand for the default string.

Block Overview – First 2 Bytes

Why is First 2 Bytes needed?
In the original file: ABCXYZABC
In the compressed file: ABCXYZC1C2
If we wish to keep our hash table identical to the hash table of the compressor, we must somehow fetch AB instead of C1C2.

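First Two Bytes (diagram): 4096 rows of 16 bits. Each entry holds an 8-bit first byte and an 8-bit second byte. Signals shown: Old byte & Mid byte (input), Bypass, Read Index (12 bits), Write Index (12 bits), Two bytes out, and Hash Table Select, with sources labeled "From Hash Func" and "From Core Mgmt".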

Block Overview – First 2 Bytes (animated diagram: the bytes X, Y, Z and the copy item bytes C1, C2 move through the buffer, and the bytes A and B are fetched by INDEX)

Stages Overview – Address Manager

Stages Overview – Output Memory

Output Memory – Copy Mode (animated diagram: READ ENABLE, WRITE ENABLE, DATA 1, DATA 2, DATA 3)

Timing Considerations
The project requirements dictate a clock frequency of 125 MHz (1 Gbps when one byte is processed per clock cycle). Our concern was that the memory stage's muxes would limit the frequency. After writing the VHDL code for the memory stage, we synthesized it and ran a timing analysis (the result is shown as a figure on the slide).
Conclusion: The timing requirements will be met.

Primary vs Final Design (diagram)
Blocks shown: 4 Kbyte FIFO, Header Decoder, Control Bytes Decoder, Copy Item Decoder (12-bit index, 4-bit length), Hash Function, Hash Table, Write Address Counter, Copy Counter, Controller, 3 Byte buffer, Address Manager, Output Memory (32 Kbyte).
Stages shown: Fetch stage, Decode stage, Calc Address stage, Output Memory stage.

Problems and Solutions
Problem #1: Performing a copy procedure
In the initial design: only 1 output memory.
The problems:
- Wasting as many clock cycles as the copy length in order to copy an item.
- Must stop the pipe and store the incoming data in a FIFO located at the core's beginning while copying.
- Demands a very complicated controller.
The solution: 18 different memory blocks, which enable us to perform every copy in 2 clock cycles: the first for reading the data from all the required memories, and the second for writing the data back to the right memories. No dependency on copy length!
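The two-cycle copy can be illustrated with a small Python model. The slides do not say how bytes are distributed over the 18 memories; the sketch assumes consecutive output addresses are interleaved across the banks (bank = address mod 18), so that any run of up to 18 consecutive bytes touches each bank at most once. The class and method names are hypothetical, and the model assumes the source and destination ranges do not overlap.

```python
NUM_BANKS = 18      # number of memory blocks from the slide
ROWS = 2048         # rows per bank, matching an 11-bit memory address


def locate(address: int):
    """Assumed interleaving: return (bank, row) for an output address."""
    return address % NUM_BANKS, address // NUM_BANKS


class BankedMemory:
    def __init__(self):
        self.banks = [bytearray(ROWS) for _ in range(NUM_BANKS)]

    def write_byte(self, address: int, value: int) -> None:
        bank, row = locate(address)
        self.banks[bank][row] = value

    def read_byte(self, address: int) -> int:
        bank, row = locate(address)
        return self.banks[bank][row]

    def copy(self, src: int, dst: int, length: int) -> None:
        """Copy `length` bytes in two steps, one access per bank per step."""
        assert 1 <= length <= NUM_BANKS
        # Cycle 1: read every source byte; each one lives in a different bank.
        assert len({locate(src + i)[0] for i in range(length)}) == length
        data = [self.read_byte(src + i) for i in range(length)]
        # Cycle 2: write every byte back; again one access per bank.
        for i, value in enumerate(data):
            self.write_byte(dst + i, value)


mem = BankedMemory()
for i, value in enumerate(b"ABCXYZ"):
    mem.write_byte(i, value)
mem.copy(src=0, dst=6, length=3)
print(bytes(mem.read_byte(i) for i in range(9)))
```

Under this interleaving, 32 KB of output needs at most 1821 rows per bank, which fits the 11-bit memory address stored in the hash table.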

Problems and Solutions
Problem #2: Ignoring the Control Bytes
In the initial design: 3 bytes buffer.
The problem: The Control Bytes are needed for the core management unit to operate correctly, but must be ignored in the data flow (they must not be written to the hash table, and we need to remember the preceding items). The problem was how to ignore them without losing data.
The solution: Enlarging the buffer from 3 bytes to 5 bytes, which enables us to remember the items that preceded the Control Bytes. With this done, we can select the preceding items and 'bypass' the Control Bytes with the 5 bytes buffer mux.
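A rough Python model shows why five bytes are enough. It assumes each buffered byte carries a flag saying whether it is a control byte, and that the mux picks the three most recent data bytes; the real selection logic is not described in the slides, so the class and method names here are hypothetical.

```python
from collections import deque


class FiveByteBuffer:
    """Five-stage byte buffer; index 0 holds the newest byte."""

    def __init__(self):
        self.slots = deque(maxlen=5)

    def push(self, byte: int, is_control: bool = False) -> None:
        # appendleft on a full deque drops the oldest byte on the right.
        self.slots.appendleft((byte, is_control))

    def last_three_data_bytes(self):
        """Return the three newest non-control bytes, oldest first."""
        data = [byte for byte, is_control in self.slots if not is_control]
        if len(data) < 3:
            return None
        return bytes(reversed(data[:3]))


buf = FiveByteBuffer()
buf.push(ord("A"))
buf.push(ord("B"))
buf.push(0x08, is_control=True)   # two control bytes arrive mid-stream
buf.push(0x00, is_control=True)
buf.push(ord("C"))
print(buf.last_three_data_bytes())
```

With only three stages, A and B would already have been shifted out by the time C arrives; with five stages the two control bytes can be skipped without losing them.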

Problems and Solutions
Problem #3: Maintaining the Hash Table Correctly
In the initial design: No First 2 Bytes memory.
The problem: Before acting on a copy item, the first two bytes of the literal sequence represented by the copy should be concatenated with the previous literal items.
The solution: Maintaining the First 2 Bytes memory, which holds the first 2 bytes of each literal sequence whose offset is written to the hash table. This way, concatenation is possible by extracting the necessary bytes from the First 2 Bytes memory.

New Problem
Problem #4: Copy adjacent to the sequence it points to
The problem: When trying to concatenate the first 2 bytes of a copy, there is a problem if the copy item arrives straight after the literal sequence that created it. The first 2 bytes are not yet stored, and thus cannot be retrieved.
The proposed solution: A comparator, which determines whether the index of the copy item is the last index written to in the Hash Table. If so, the relevant data is bypassed.
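The proposed comparator can be sketched as a read-after-write bypass. This Python model assumes a write to the First 2 Bytes memory takes effect one step after it is issued, so a read of the same index in between must take the data from the pending write. The class, its method names, and the one-step delay are assumptions made for illustration.

```python
class FirstTwoBytesMemory:
    """4096 x 16-bit memory with a bypass for the most recent write."""

    def __init__(self):
        self.table = [b"\x00\x00"] * 4096
        self.pending = None            # (index, two_bytes) not yet stored

    def write(self, index: int, two_bytes: bytes) -> None:
        self.commit()                  # the previous write lands first
        self.pending = (index, bytes(two_bytes))

    def commit(self) -> None:
        if self.pending is not None:
            index, data = self.pending
            self.table[index] = data
            self.pending = None

    def read(self, index: int) -> bytes:
        # Comparator: is this the last index written to?
        if self.pending is not None and self.pending[0] == index:
            return self.pending[1]     # bypass the memory
        return self.table[index]


mem = FirstTwoBytesMemory()
mem.write(1264, b"AB")
print(mem.table[1264], mem.read(1264))   # stored value vs bypassed value
```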

Project Schedule
Date: Goals
21/3/2014 – 5/4/2014: Project characterization & algorithm interpretation
6/4/2014: Characterization presentation
7/4/2014 – 2/6/2014: Full characterization of all blocks
3/6/2014 – 21/6/2014: System blocks VHDL design
22/6/2014: Mid presentation
23/6/2014 – 25/7/2014: Work on project paused for exams
30/7/2014 – 4/9/2014: VHDL design cont.
21/9/2014 – 20/10/2014: Building a simulation environment
21/10/2014 – 21/11/2014: Simulation run & debug
22/11/2014: Part A final presentation
23/11/2014 – 10/12/2014: FPGA synthesis & implementation
11/12/2014 – 25/12/2014: GUI implementation
26/12/2014 – 24/1/2015: Tests & debug
25/1/2015: Final project presentation

Project Gantt (chart, by weeks): Characterization & interpretation, Characterization presentation, Blocks characterization, VHDL blocks implementation, Mid presentation, Exams, VHDL cont., Building sim env, Part A final presentation, Sim & debug, FPGA synthesis, GUI implementation, Tests & debug, Writing portfolio, Final presentation.