Compression for Fixed-Width Memories
Ori Rottenstreich, Amit Berman, Yuval Cassuto and Isaac Keslassy
Technion, Israel


Data Compression: Communication Channel vs. Memory
- Communication channel: minimize the average word length.
- Memory: maximize the probability that a data unit can be stored in a row.

Memory Hierarchy
[Figure: the memory hierarchy, from the processor through SRAM and DRAM down to Flash / HDD; lower levels offer longer rows (and more capacity), higher levels offer better performance.]

Joint Compression for Two Data Blocks
- Long encoded rows cannot be stored within a fixed-width memory word of L bits.
- We would like to maximize the probability of a successful encoding within the fixed width; rows that are too long are stored in a slower memory with a longer access time.
- Related work: compression with a low probability of buffer overflow [Jelinek 1968, Humblet 1981].

Encoding Example
An encoding scheme consists of a code $C_1$ for the first field and a code $C_2$ for the second field. In the example, four possible data entries are encoded within L = 3 bits by the encoding scheme, and the success probability is the total probability of those entries. [The code tables and the list of successfully-encoded entries were figures in the original slides.]
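Since the example's tables did not survive the transcript, here is a minimal sketch in Python of the computation the slide illustrates: given the codeword lengths of the two prefix codes and the field probabilities, the success probability is the total probability of the entry pairs whose combined width fits in L bits. The function name and representation are mine, not the authors'.

```python
from itertools import product

def success_probability(lens1, probs1, lens2, probs2, L):
    """Probability that an entry (one element per field, fields drawn
    independently) is encoded in at most L bits, given the codeword
    lengths of the two prefix codes."""
    return sum(q1 * q2
               for (l1, q1), (l2, q2) in product(zip(lens1, probs1),
                                                 zip(lens2, probs2))
               if l1 + l2 <= L)
```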

Huffman-based Encoding may not be optimal
Both schemes face the same entry distribution:
First field: $s_{1,1}$: 0.9, $s_{1,2}$: 0.05, $s_{1,3}$: 0.03, $s_{1,4}$: 0.02
Second field: $s_{2,1}$: 0.5, $s_{2,2}$: 0.2, $s_{2,3}$: 0.15, $s_{2,4}$: 0.15
A Huffman-based encoding (a Huffman code for each field) successfully encodes one set of entries; a better encoding successfully encodes a set of entries of larger total probability. [The codeword tables were figures in the original slides; see the sketch below.]
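The codeword lengths from the slide were lost, but they can be rederived: Huffman's algorithm gives lengths (1, 2, 3, 3) for each of the two fields above. Under that assumption, this continuation of the earlier sketch reproduces the phenomenon: replacing the second Huffman code by a fixed-length 2-bit code worsens the average width but raises the success probability from 0.655 to 0.9.

```python
# Entry distribution from the slide; width bound L = 3.
p1 = [0.9, 0.05, 0.03, 0.02]
p2 = [0.5, 0.2, 0.15, 0.15]
L = 3

# Huffman codeword lengths for each field taken separately.
huff1 = [1, 2, 3, 3]
huff2 = [1, 2, 3, 3]

# Alternative: keep the Huffman code in the first field, but use a
# fixed-length 2-bit code (a valid prefix code for 4 symbols) in the second.
alt2 = [2, 2, 2, 2]

print(success_probability(huff1, p1, huff2, p2, L))  # 0.655
print(success_probability(huff1, p1, alt2, p2, L))   # 0.9
```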

Problem Definition
Definition (Entry Distribution): An entry distribution D is characterized by two (ordered) sets of elements $S_1 = \{s_{1,1}, \dots, s_{1,n_1}\}$ and $S_2 = \{s_{2,1}, \dots, s_{2,n_2}\}$ with their corresponding vectors of positive appearance probabilities $P_1, P_2$ s.t. $\sum_{j=1}^{n_1} p_{1,j} = 1$ and $\sum_{k=1}^{n_2} p_{2,k} = 1$.
Definition (Encoding Scheme): An encoding scheme of an entry distribution D is a pair $C = (C_1, C_2)$ of two prefix codes, where $C_i$ is a prefix code of $S_i$, the set of elements in the first or second field.

Problem Definition (2)
Definition (Encoding Width Bound): Given an encoding scheme $C = (C_1, C_2)$ and an encoding width bound of L bits, we say that an entry $(s_{1,j}, s_{2,k})$ is encoded successfully if its encoding width is not larger than the encoding width bound, i.e. $|C_1(s_{1,j})| + |C_2(s_{2,k})| \le L$.
Definition (Success Probability): The success probability of an encoding scheme is the probability that the encoding of an arbitrary entry would be successful, i.e. $\sum_{\{(j,k)\,:\,|C_1(s_{1,j})| + |C_2(s_{2,k})| \le L\}} p_{1,j} \cdot p_{2,k}$.

Problem Definition (3)
Definition (Optimal Success Probability): For an entry distribution D, we denote the optimal success probability that can be obtained by any encoding scheme.
Our goal: find an encoding scheme that maximizes the success probability.
Constraint on the encoding scheme (Kraft's inequality): there exists a prefix encoding of the elements in a set with codeword lengths $\ell_1, \dots, \ell_n$ iff $\sum_{j=1}^{n} 2^{-\ell_j} \le 1$.
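Kraft's inequality translates directly into a feasibility test on candidate length vectors. A small helper (not from the slides) that the later sketches reuse:

```python
def kraft_ok(lengths):
    """A binary prefix code with these codeword lengths exists
    iff sum(2^-l) <= 1 (Kraft's inequality)."""
    return sum(2.0 ** -l for l in lengths) <= 1.0

assert kraft_ok([1, 2, 3, 3])    # the Huffman lengths above
assert kraft_ok([2, 2, 2, 2])    # a fixed-length 2-bit code
assert not kraft_ok([1, 1, 2])   # no prefix code has these lengths
```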

Optimization Problem
For an entry distribution D, we define the following optimization problem: maximize $\sum_{\{(j,k)\,:\,\ell_{1,j} + \ell_{2,k} \le L\}} p_{1,j} \cdot p_{2,k}$ s.t. $\sum_{j=1}^{n_1} 2^{-\ell_{1,j}} \le 1$ and $\sum_{k=1}^{n_2} 2^{-\ell_{2,k}} \le 1$.
- We would like to maximize the success probability while satisfying Kraft's inequality for each of the two prefix codes.
- We have to represent all elements in $S_1$ and $S_2$ (including those that will never be part of a successfully-encoded entry).
- The codeword lengths $\ell_{i,j}$ should be positive integers.

Outline
- Introduction and Problem Definition
- General Properties
- Bounds on the success probability
- Optimal Conditional Encoding
- Summary

General Properties
We assume that for $i \in \{1, 2\}$ the elements of $S_i$ are ordered in non-increasing order of probabilities: $p_{i,j} \ge p_{i,k}$ if $j \le k$.
Property: For $L < 2$ no entry can be encoded successfully (each codeword takes at least one bit); for $L \ge \lceil \log_2 n_1 \rceil + \lceil \log_2 n_2 \rceil$ every entry can.
Property: The encoding scheme composed of two fixed-length codes with codewords of $\lceil \log_2 n_1 \rceil$ and $\lceil \log_2 n_2 \rceil$ bits is optimal for $L \ge \lceil \log_2 n_1 \rceil + \lceil \log_2 n_2 \rceil$ and is not optimal, in general, for smaller L.
Property: An encoding scheme with an average encoding width $E[\ell]$ has a success probability of at least $1 - E[\ell]/(L+1)$, by Markov's inequality.

Monotone Coding
Definition (Monotone Encoding Scheme): An encoding scheme of an entry distribution is called monotone if, for $i \in \{1, 2\}$, $j \le k$ implies that $|C_i(s_{i,j})| \le |C_i(s_{i,k})|$, i.e., more probable elements never get longer codewords.
Lemma: For any distribution D and any width bound L, there exists a monotone optimal encoding scheme.
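The lemma also has algorithmic value: when searching for an optimal scheme it suffices to consider sorted length vectors. A brute-force sketch for tiny instances (my construction, reusing the helpers above), which confirms that 0.9 is in fact optimal for the running example:

```python
from itertools import combinations_with_replacement, product

def optimal_success_probability(p1, p2, L, max_len):
    """Exhaustive search for the best pair of prefix codes under width
    bound L.  With probabilities sorted non-increasingly, the monotone
    lemma lets us enumerate only non-decreasing length vectors that
    satisfy Kraft's inequality.  Exponential -- tiny instances only."""
    def feasible(n):
        return [ls for ls in combinations_with_replacement(
                    range(1, max_len + 1), n) if kraft_ok(ls)]
    return max(success_probability(ls1, p1, ls2, p2, L)
               for ls1, ls2 in product(feasible(len(p1)), feasible(len(p2))))

print(optimal_success_probability(p1, p2, L=3, max_len=6))  # 0.9
```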

Outline
- Introduction and Problem Definition
- General Properties
- Bounds on the success probability
- Optimal Conditional Encoding
- Summary

Bounds on the optimal success probability
Theorem (lower bound): For any split $L_1 + L_2 = L$ with $L_1, L_2 \ge 1$, the optimal success probability is at least $F_1(k_1) \cdot F_2(k_2)$, where $F_i(k)$ is the total probability of the $k$ most likely elements of $S_i$, and $k_i = n_i$ if $n_i \le 2^{L_i}$ and $k_i = 2^{L_i} - 1$ otherwise (one codeword's worth of Kraft budget is reserved for the remaining elements).
Proof outline: We can encode the first $k_1$ elements in $S_1$ using $L_1$ bits and the first $k_2$ elements in $S_2$ using $L_2$ bits. An entry composed of two such elements has a width of $L_1 + L_2 = L$ bits and is encoded successfully.

Bounds on the optimal success probability (2)
Theorem (upper bound): For any split $L_1 + L_2 = L - 1$ with $L_1, L_2 \ge 0$, the optimal success probability is at most $1 - (1 - F_1(2^{L_1}))(1 - F_2(2^{L_2}))$.
Proof outline: In any monotone encoding scheme, the last $n_1 - 2^{L_1}$ elements in $S_1$ are encoded in at least $L_1 + 1$ bits, and the last $n_2 - 2^{L_2}$ elements in $S_2$ in at least $L_2 + 1$ bits. An entry composed of two such elements has a width of at least $L + 1$ bits and cannot be encoded successfully.
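The bound expressions above are reconstructed from the proof outlines (the slide formulas themselves were lost), so treat the exact constants as an assumption. In code, continuing the running example, the two bounds bracket the optimum 0.9:

```python
def head_prob(p):
    """F(k) = total probability of the k most likely elements, k = 0..n."""
    out = [0.0]
    for x in p:
        out.append(out[-1] + x)
    return out

def lower_bound(p1, p2, L):
    F1, F2 = head_prob(p1), head_prob(p2)
    k = lambda n, Li: n if n <= 2 ** Li else 2 ** Li - 1
    return max(F1[k(len(p1), L1)] * F2[k(len(p2), L - L1)]
               for L1 in range(1, L))

def upper_bound(p1, p2, L):
    F1, F2 = head_prob(p1), head_prob(p2)
    tail = lambda F, n, Li: 1.0 - F[min(n, 2 ** Li)]  # mass forced past Li bits
    return min(1.0 - tail(F1, len(p1), L1) * tail(F2, len(p2), L - 1 - L1)
               for L1 in range(0, L))

print(lower_bound(p1, p2, 3), upper_bound(p1, p2, 3))  # 0.9  0.985
```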

Outline
- Introduction and Problem Definition
- General Properties
- Bounds on the success probability
- Optimal Conditional Encoding
- Summary

Optimal Conditional Encoding
Given the code of one column, we would like to find a code for the second column that maximizes the success probability (the remaining degenerate cases are trivial).
Idea: We suggest a dynamic-programming algorithm. For each element in $S_2$, we consider the possible codeword lengths.
Lemma: We can limit the search space of codeword lengths to [1, 3W] bits.
We define the weight of a codeword of length $\ell$ as $2^{3W - \ell}$; Kraft's inequality then becomes an integer constraint: the sum of weights of all codewords in $C_2$ should be at most $2^{3W}$.

Optimal Conditional Encoding (2)
Definition (Function F(n,k)): For $0 \le n \le n_2$ and $0 \le k \le 2^{3W}$, we denote by $F(n,k)$ the maximal sum of probabilities of entries that can be encoded successfully and satisfy:
- The second element in the entry is one of the first $n$ elements in $S_2$.
- The sum of weights of the first $n$ codewords is at most $k$.
Property: The maximal success probability of a conditional encoding scheme is given by $F(n_2, 2^{3W})$.

Optimal Conditional Encoding (3)
Theorem: The function $F$ satisfies
- $F(0, k) = 0$ for $k \ge 0$, and $F(n, k)$ is undefined (infeasible) when the remaining budget $k$ cannot cover $n$ codewords.
- For $n \ge 1$: $F(n, k) = \max_{\ell \in [1, 3W]} \{ F(n-1,\, k - 2^{3W - \ell}) + p_{2,n} \cdot Q(\ell) \}$, where $Q(\ell)$ is the total probability of the elements of $S_1$ whose given codeword is at most $L - \ell$ bits long.
To calculate $F(n_2, 2^{3W})$, we suggest a dynamic-programming algorithm based on the recursive formula.
Property: For the recursion above, the time complexity of the algorithm is $O(n_2 \cdot W \cdot 2^{3W})$ (number of DP states times candidate lengths).
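The recursion on the slide did not survive the transcript, so the sketch below implements the reconstruction above: a memoized search over (next element of $S_2$, remaining integer Kraft budget), with the weight normalization $2^{L_{\max} - \ell}$ and a small $L_{\max}$ standing in for 3W. On the running example, with the first-field Huffman code fixed, it recovers the optimal conditional success probability 0.9:

```python
from functools import lru_cache

def best_conditional_code(lens1, probs1, probs2, L, max_len):
    """DP for the optimal second-field code given the first-field code.
    A codeword of length l costs 2^(max_len - l) out of an integer
    Kraft budget of 2^max_len."""
    # Q[l] = probability that a first-field element fits beside a
    # second-field codeword of length l.
    Q = [sum(pa for la, pa in zip(lens1, probs1) if la + l <= L)
         for l in range(max_len + 1)]

    @lru_cache(maxsize=None)
    def F(n, budget):
        if n == len(probs2):
            return 0.0
        rest = len(probs2) - n - 1  # later codewords cost at least 1 each
        return max((probs2[n] * Q[l] + F(n + 1, budget - 2 ** (max_len - l))
                    for l in range(1, max_len + 1)
                    if budget - 2 ** (max_len - l) >= rest),
                   default=float('-inf'))

    return F(0, 2 ** max_len)

print(best_conditional_code([1, 2, 3, 3], p1, p2, L=3, max_len=6))  # 0.9
```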

Concluding Remarks
- A new approach for compression in fixed-width memories
- Analysis of the optimal success probability
- Finding the optimal conditional encoding