Digital yet Deliberately Random: Synthesizing Logical Computation on Stochastic Bit Streams Weikang Qian Ph.D. Candidate Electrical & Computer Engineering University of Minnesota Advisor: Marc D. Riedel Ph.D. Final Defense July 25, 2011
Hierarchy of Modern Digital Systems Application Level System Level Logical Level Physical Level I will introduce my research through the hierarchy of modern digital systems. At the very bottom is the physical level; at the very top is the application level, where different applications run.
Challenges: Physical Level Gordon Moore Transistor Scaling: Approaching Physical Limits As technologies advance, we are indeed faced with several challenges, including increasing concerns about variability and errors.
Challenges: Physical Level Emerging Device Technologies Carbon Nanotubes Nanowire Crossbar People have also looked at emerging technologies to replace CMOS, such as carbon nanotubes and nanowire crossbar arrays. These bring randomness in connections and high defect rates.
Challenges: Application Level Today: process data. Future: comprehend data (machine learning, pattern recognition, data mining, …). Today's computers are already very powerful: they are good at manipulating vast amounts of data very quickly. However, they are not good at understanding data, and with the amount of digital data growing every day, we also need computers to help us understand it. These applications are inherently probabilistic; there is no single right answer. (Is she Lady Gaga?) We look for an answer with a high probability of being true.
Deterministic Paradigm Application Level System Level Logical Level Physical Level Deterministic encoding at the logical level: the arithmetic unit encodes numbers by binary radix; the control unit defines instructions as deterministic sets of zeros and ones.
Deterministic Paradigm vs. Stochastic Paradigm Application level: probabilistic applications. Physical level: probabilistic, random. On a probabilistic physical substrate it is hard to maintain the accuracy that a deterministic encoding requires, and for probabilistic applications a deterministic encoding is unnecessary; a stochastic encoding at the logical level fits both. This motivates my research.
Synthesizing Logic that Computes on Stochastic Bit Streams 0,1,1,0,1,0,1,0,… combinational logic 1,1,0,1,0,1,1,0,… 0,1,1,0,1,0,0,0,… 1,0,0,0,0,0,1,0,… 1,0,0,0,1,1,0,0,… 1,0,1,1,0,1,1,1,… Applicable to arbitrary arithmetic functions. In today's talk, I will cover two topics. The first is synthesizing logic that computes on stochastic bit streams: I propose a general method to synthesize arbitrary arithmetic functions in this paradigm. A running example is the gamma correction function, and I will return to it later in the talk.
Synthesizing Logic that Generates Probabilities Transform a source set of probabilities to a target set entirely through combinational logic: for example, combinational logic with inputs at probabilities 0.4 and 0.5 producing the output stream 0,0,0,0,1,0,1,0,0,0, … with probability 0.119. A premise of the first topic is that we need stochastic bit streams with specified probabilities.
Outline Preliminaries Synthesizing Logic that Computes on Stochastic Bit Streams Synthesizing Logic that Generates Probabilities Future Work
Logical Computation On Sequences of Random Bits combinational logic 0,1,1,0,1,0,1,0,… 1,1,0,1,0,1,1,0,… 0,1,1,0,1,0,0,0,… 1,0,0,0,0,0,1,0,… 1,0,0,0,1,1,0,0,… 1,0,1,1,0,1,1,1,…
Representing a Value by a Sequence of Random Bits A real value x in [0, 1] is represented by a sequence of random bits, each of which has probability x of being one and probability of 1 − x of being zero. 0,1,0,1,1,0,0 x = 3/7
Serial versus Parallel Stochastic Bit Streams Serial bit stream: 0,1,0,1,1,0,0 represents x = 3/7. Parallel (a probabilistic bundle of wires, 3 of 7 carrying a one) also represents x = 3/7. Probabilistic bundles are applicable to nanowire crossbar arrays.
Stochastic Bit Streams as Inputs/Outputs Probability values are the input and output signals. combinational logic 0,1,1,0,1,0,1,0 0,1,1,0,1,0,0,0 1,0,0,0,0,0,1,0 1,0,1,1,0,1,1,1 4/8 3/8 2/8 6/8 1,1,0,1,0,1,1,0 1,0,0,0,1,1,0,0 3/8 5/8
A Single AND Gate Performs Multiplication! Input A: 1,1,0,1,0,1,1,1 (a = 6/8). Input B: 1,1,0,0,1,0,1,0 (b = 4/8). Output C: 1,1,0,0,0,0,1,0 (3/8 = 6/8 · 4/8). Indeed, this is not a coincidence. Lower-case letters represent probabilities. Assume the two input bit streams are independent.
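A minimal Python sketch of this fact (simulation only; `stochastic_stream` and `estimate` are my helper names, not from the thesis):

```python
# Simulate multiplication with a single AND gate on stochastic bit streams.
import random

def stochastic_stream(p, length):
    """Generate a bit stream whose bits are 1 with probability p."""
    return [1 if random.random() < p else 0 for _ in range(length)]

def estimate(stream):
    """Decode a stochastic stream back to a probability estimate."""
    return sum(stream) / len(stream)

length = 10_000
a, b = 6/8, 4/8
A = stochastic_stream(a, length)
B = stochastic_stream(b, length)          # independent of A

C = [x & y for x, y in zip(A, B)]         # bitwise AND of the two streams

print(f"expected a*b = {a*b:.3f}, observed = {estimate(C):.3f}")
```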
A Conventional Multiplier HA: half adder, 2 basic gates (AND and XOR). FA: full adder, 5 basic gates (AND, OR, and XOR). A conventional multiplier works on binary-radix-encoded numbers. As any student in an introductory course knows, it is complicated: 30 gates in total!
Error due to Stochastic Variance x = P(X = 1) = 2/5 Ideal: 0,1,0,0,1,1,0,1,0,0 (2/5). Practical: 1,0,1,0,1,0,0,1,1,0 (1/2). The effect of the error is small, and the error can be reduced by increasing the bit length. We target applications that can tolerate small errors, e.g., image processing.
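A Python sketch illustrating how the estimation error shrinks with bit length (simulation only; it relies on the standard fact that for a stream of length N the standard deviation of the estimate is sqrt(x(1 − x)/N)):

```python
# Quadrupling the stream length roughly halves the average estimation error.
import random

def estimate_once(x, length):
    """One trial: estimate x from a stream of the given length."""
    ones = sum(1 for _ in range(length) if random.random() < x)
    return ones / length

x = 2/5
for length in (10, 100, 1000, 10000):
    trials = [abs(estimate_once(x, length) - x) for _ in range(200)]
    print(f"N = {length:5d}  mean |error| = {sum(trials)/len(trials):.4f}")
```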
Precision versus Bit Length Binary radix encoding is positional and compact: to represent 2^n different values, it needs n bits. Example: (1001)_2 → 9. Stochastic encoding is uniform and not compact: to represent 2^n different values, it needs 2^n bits. Example: (0101100010) → 0.4. For applications that can tolerate small errors, we don't need a large n.
Fault Tolerance Stochastic encoding: a bit flip does not substantially change the probability: 1010111001 (0.6) → 1010011001 (0.5). Binary radix encoding: a bit flip in the most significant bit causes a huge change in the value: (1010)_2 = 10 → (0010)_2 = 2.
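A small Python sketch of the two bit-flip examples above:

```python
# Stochastic: flipping any one bit of a length-10 stream changes the value by 1/10.
# Binary radix: flipping the MSB of a 4-bit number changes the value by 8.

stream = [1,0,1,0,1,1,1,0,0,1]            # encodes 6/10 = 0.6
flipped = stream.copy()
flipped[4] = 1 - flipped[4]               # flip one bit -> 5/10 = 0.5
print(sum(stream)/10, sum(flipped)/10)    # 0.6 -> 0.5

word = 0b1010                             # binary radix value 10
print(word, word ^ 0b1000)                # flipping the MSB: 10 -> 2
```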
Comparison of Encodings
Circuit area: large for binary radix encoding (positional, weighted), small for stochastic encoding (uniform).
Fault tolerance: bad for binary radix encoding (positional), good for stochastic encoding (uniform, long stream).
Delay: short for binary radix encoding (compact, efficient), long for stochastic encoding (not compact, long stream).
Spectrum of Encoding: binary radix encoding at one end, stochastic encoding at the other.
Outline Preliminaries Synthesizing Logic that Computes on Stochastic Bit Streams Synthesizing Logic that Generates Probabilities Future Work
Prior Work: Stochastic Bit Streams Gaines showed how to implement basic operations such as addition and multiplication (1969). Brown and Card showed how to implement the tanh function and the linear gain function (2001). Gaudet and Rapley implemented a low-density parity-check (LDPC) decoder (2003). There has been prior work; in particular, researchers in neural networks have considered this. These works focused on implementing specific functions.
Contributions Showed what kinds of functions can be implemented by combinational logic operating on stochastic bit streams: the function must be an arithmetic polynomial. Proposed a general method to synthesize arbitrary polynomial functions. Generalized the above method to synthesize arbitrary non-polynomial functions via approximation. (Running example: the gamma correction function.)
Mathematical Model Combinational logic with independent random Boolean input variables X1, X2, …, Xn and a random Boolean output variable Y. Mathematically, I consider combinational logic built from standard gates such as AND and OR, and assume the gates are perfect, with no defects. What function F relates the input probabilities to the output probability?
F is a polynomial in x1, …, xn with integer coefficients and degree no more than one in each variable. Example: a multiplexer with select probability s and data input probabilities a and b. In terms of probabilities, the probability of C being 1 equals c = sa + b − sb.
Implementing General Polynomials Can we implement polynomials with real coefficients and degree more than one on stochastic bit streams? From the multiplexer, c = sa + b − sb; setting s = a = t and b = 0.8 gives c = t^2 − 0.8t + 0.8. In general, combinational logic whose inputs x1, …, x5 carry independent probabilities implements a special polynomial F(x1, …, x5); feeding it the variable probability t and constant probabilities c0, c1 realizes the general polynomial g(t) = t^2 − 0.8t + 0.8.
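A Python sketch of this multiplexer identity (simulation only; `stream` and `mux` are my helper names). Note that the select input and the data input that both carry probability t must be independent streams:

```python
# A multiplexer on stochastic streams computes c = s*a + b - s*b.
# With s = a = t and b = 0.8, the output probability is g(t) = t^2 - 0.8t + 0.8.
import random

def stream(p, n):
    return [random.random() < p for _ in range(n)]

def mux(sel, a, b):
    # output = a when sel is 1, b when sel is 0
    return [ai if si else bi for si, ai, bi in zip(sel, a, b)]

n = 100_000
t = 0.3
S = stream(t, n)      # select input, probability t
A = stream(t, n)      # data input selected when S = 1 (independent copy of t)
B = stream(0.8, n)    # constant data input
C = mux(S, A, B)
print(sum(C)/n, t*t - 0.8*t + 0.8)   # observed vs. g(t)
```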
The Problem: Synthesizing the Circuit Target: a general polynomial g(t) = 1.2t^2 − t^3, to be implemented by combinational logic whose inputs are the variable probability t and constant probabilities c0, c1, … (all independent). Here t is not the random variable itself, but the probability of the random variable being one. I illustrate with univariate polynomials, but the method generalizes to multivariate polynomials.
Synthesizing Circuit to Implement Polynomial Step 1: Convert the power-form polynomial into a Bernstein form. A Bernstein polynomial of degree n is B(t) = sum over i of b_{i,n} C(n, i) t^i (1 − t)^(n−i), where b_{i,n} is a Bernstein coefficient and C(n, i) t^i (1 − t)^(n−i) is a Bernstein basis polynomial (degree 2 in this example).
Synthesizing Circuit to Implement Polynomial Converting the power-form polynomial may leave some Bernstein coefficients outside the unit interval (e.g., less than 0). As a mathematical contribution, I have shown that by manipulating the polynomial, specifically by elevating the degree of its Bernstein form, we can always obtain a Bernstein polynomial with all coefficients in the unit interval. Step 1: Convert the polynomial into a Bernstein form. Step 2: Elevate the degree of the Bernstein polynomial until all coefficients are in the unit interval. Step 3: Implement the Bernstein polynomial with all coefficients in the unit interval by "generalized multiplexing."
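As a sketch of Steps 1 and 2 (using standard Bernstein identities, not code from the thesis; the helper names are mine), the following Python converts a power-form polynomial to Bernstein form and elevates its degree:

```python
# Standard identities used:
#   b_k  = sum_{j<=k} C(k,j)/C(n,j) * a_j             (power form -> Bernstein form)
#   b'_k = (k/(n+1)) b_{k-1} + (1 - k/(n+1)) b_k      (degree elevation n -> n+1)
from math import comb

def power_to_bernstein(a):
    """a[j] is the coefficient of t**j; returns Bernstein coefficients b[0..n]."""
    n = len(a) - 1
    return [sum(comb(k, j) / comb(n, j) * a[j] for j in range(k + 1))
            for k in range(n + 1)]

def elevate(b):
    """Rewrite a degree-n Bernstein form as an equivalent degree-(n+1) form."""
    n = len(b) - 1
    return [(k / (n + 1)) * (b[k - 1] if k > 0 else 0.0)
            + (1 - k / (n + 1)) * (b[k] if k <= n else 0.0)
            for k in range(n + 2)]

# g(t) = t^2 - 0.8t + 0.8  ->  Bernstein coefficients [0.8, 0.4, 1.0]
print(power_to_bernstein([0.8, -0.8, 1.0]))
print(elevate(power_to_bernstein([0.8, -0.8, 1.0])))
```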
Synthesizing Circuit to Implement Polynomial (evaluating at t = 1/2, where g(1/2) = 1/4) The inputs are independent with P(Xi = 1) = t (= 1/2); the data inputs satisfy P(Zi = 1) = b_{i,3} (the Bernstein coefficients). Why does this work? The output of the adder obeys a binomial distribution.
Generalized Multiplexing The n inputs Xi are independent with P(Xi = 1) = t; the adder output follows a binomial distribution, so the probability of each count is a Bernstein basis polynomial. The data inputs satisfy P(Zi = 1) = b_{i,n}, with 0 ≤ b_{i,n} ≤ 1. The logic is traditional combinational logic.
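A minimal Python simulation of this construction (my helper names; a software stand-in for the hardware adder and multiplexer):

```python
# n streams with probability t drive an adder; the count selects one of n+1
# streams whose probabilities are the Bernstein coefficients, so the output
# probability is the Bernstein polynomial evaluated at t.
import random

def bit(p):
    return 1 if random.random() < p else 0

def bernstein_circuit(t, b, n_bits=100_000):
    n = len(b) - 1                               # polynomial degree
    ones = 0
    for _ in range(n_bits):
        count = sum(bit(t) for _ in range(n))    # adder over X_1..X_n
        ones += bit(b[count])                    # multiplexer selects Z_count
        # each call to bit() draws a fresh independent random bit
    return ones / n_bits

# g(t) = t^2 - 0.8t + 0.8 with Bernstein coefficients [0.8, 0.4, 1.0]
t = 0.5
print(bernstein_circuit(t, [0.8, 0.4, 1.0]), t*t - 0.8*t + 0.8)
```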
Non-Polynomial Functions Find a Bernstein polynomial with coefficients in the unit interval that approximates the non-polynomial g(t): find real values b_{0,n}, …, b_{n,n} that minimize the squared approximation error between the Bernstein polynomial and g(t) over the unit interval, subject to 0 ≤ b_{i,n} ≤ 1. Solved by quadratic programming.
Example: Gamma Correction Function Coefficients of degree-6 Bernstein polynomial approximation: b0,6 = 0.0955, b1,6 = 0.7207, b2,6 = 0.3476, b3,6 = 0.9988, b4,6 = 0.7017, b5,6 = 0.9695, b6,6 = 0.9939
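As an illustration of the constrained fit (not the thesis code), here is a Python sketch that discretizes the squared-error objective and solves it as a bounded least-squares problem. The exponent 0.45 for the gamma correction curve and the use of scipy's lsq_linear in place of a general quadratic-programming solver are my assumptions, so the printed coefficients need not match the slide's exactly:

```python
# Fit a degree-6 Bernstein polynomial to a gamma correction curve with all
# coefficients constrained to the unit interval.
import numpy as np
from math import comb
from scipy.optimize import lsq_linear

def bernstein_basis(n, k, t):
    return comb(n, k) * t**k * (1 - t)**(n - k)

n = 6
ts = np.linspace(0, 1, 201)                     # discretize the unit interval
A = np.column_stack([bernstein_basis(n, k, ts) for k in range(n + 1)])
y = ts ** 0.45                                  # assumed gamma correction curve

res = lsq_linear(A, y, bounds=(0.0, 1.0))       # coefficients kept in [0, 1]
print(np.round(res.x, 4))
```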
Hardware Cost Comparison Compare the conventional implementation to the stochastic implementation of polynomial functions, mapped onto an FPGA (counting the number of LUTs). Conventional implementation: 10-bit binary radix. Stochastic implementation: bit stream of length 2^10.
Fault-Tolerance Comparison Conventional vs. stochastic implementation of the gamma correction function, with noise injected at rates of 1%, 2%, and 10%.
Outline Preliminaries Synthesizing Logic that Computes on Stochastic Bit Streams Synthesizing Logic that Generates Probabilities Future Work
Generating Stochastic Bit Streams A premise for logical computation on stochastic bit streams, and needed in many other probabilistic applications: Monte Carlo simulation; testing of digital circuits, where weighted random patterns are generated. In such applications we need a random bit that has some specified probability of letting us go left.
General Random Bit Generators A random source R is compared with a constant C, the desired probability of a one: if R < C, output a one; if R ≥ C, output a zero. Output stream: 1, 0, 1, …
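A Python sketch of this comparator scheme (random.random stands in for the LFSR or physical random source):

```python
# Compare a uniform random value against a constant C; the output bit is 1
# exactly when R < C, so it is 1 with probability C.
import random

def random_bit(c, source=random.random):
    r = source()          # uniform in [0, 1); from an LFSR or a physical
                          # source in hardware
    return 1 if r < c else 0

bits = [random_bit(0.3) for _ in range(100_000)]
print(sum(bits) / len(bits))   # close to 0.3
```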
Types of Random Sources Pseudorandom number generator: linear feedback shift register (expensive). Physical random source: thermal noise (cheap).
Challenge with Physical Random Sources The random source itself is cheap, but each constant value C1, C2, … requires an expensive voltage regulator. Suppose many different probabilities are needed: {0.2, 0.78, 0.2549, 0.43, 0.671, 0.012, 0.82, …}. It is costly to generate them directly, since many expensive constant values would be required.
Opportunity with Physical Random Sources The random sources themselves are cheap; a single expensive constant can drive several of them, yielding independent bit streams with the same probability, e.g., 1,1,0,0,0, …; 0,1,0,1,0, …; 0,0,1,0,1, …
Solution When we need many different probabilities: {0.2, 0.78, 0.2549, 0.43, 0.671, 0.012, 0.82, …} Generate a few source probabilities directly from random bit generators. Synthesize combinational logic to generate other probabilities. Probability: Probability of a signal being logical one
Basic Problem Choose a small set S of input probabilities {p1, p2}, and synthesize a logic circuit that takes independent streams with probabilities p1 and p2 from random bit generators and produces the other probabilities needed: q1, q2, q3, q4, … Two questions: how do we design the logic circuit that takes probabilities from the input set and generates the needed probabilities, and how do we choose the set S (with |S| small)?
Example An inverter: input P(x = 1) = 0.4 (1,0,1,1,0,1,0,0,0,0), output 0,1,0,0,1,0,1,1,1,1, so P(z = 1) = P(x = 0) = 0.6. An AND gate: inputs P(x = 1) = 0.4 (0,1,0,1,0,0,1,1,0,0) and P(y = 1) = 0.5 (1,0,1,1,0,0,1,0,0,1), output 0,0,0,1,0,0,1,0,0,0, so P(z = 1) = P(x = 1) P(y = 1) = 0.2.
Generating Decimal Probabilities Goal: generate arbitrary decimal probabilities from a small set S = {p1, p2, p3, …} of source probabilities, using combinational logic fed with independent streams. I will present a set S of two elements that can generate arbitrary decimal probabilities, and then a set S of only one element. The single-element set is a purely mathematical result, so in this talk I focus on the two-element set.
Generating Decimal Probabilities Theorem: With S = {0.4, 0.5}, we can synthesize arbitrary decimal output probabilities. Constructive proof. Derived a synthesis algorithm. As an example, we show the circuit synthesized to generate probability 0.757. All input probabilities of the circuit are either 0.4 or 0.5.
Algorithm Example: Synthesize q = 0.757 from S = {0.4, 0.5}. The synthesized chain of probabilities is: 0.4 → (1 −) → 0.6 → (× 0.5) → 0.3 → (1 −) → 0.7 → (× 0.5) → 0.35 → (× 0.4) → 0.14 → (1 −) → 0.86 → (× 0.5) → 0.43 → (× 0.5) → 0.215 → (1 −) → 0.785 → (× 0.5) → 0.3925 → (1 −) → 0.6075 → (× 0.4) → 0.243 → (1 −) → 0.757. (Black dots in the circuit are inverters.) For each 1 − operation, we place an inverter in the circuit; for each multiplication, we place an AND gate. For a probability value with n decimal digits, we need at most 3n AND gates.
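A quick Python sketch (simulation only; helper names are mine) that checks the chain above by drawing fresh independent source bits at every stage:

```python
# Each "not" is an inverter; each "and bit(0.4)" / "and bit(0.5)" ANDs the
# running stream with a fresh independent source stream.
import random

def bit(p):
    return random.random() < p

def chain_output_bit():
    x = bit(0.4)
    x = not x                 # 0.6
    x = x and bit(0.5)        # 0.3
    x = not x                 # 0.7
    x = x and bit(0.5)        # 0.35
    x = x and bit(0.4)        # 0.14
    x = not x                 # 0.86
    x = x and bit(0.5)        # 0.43
    x = x and bit(0.5)        # 0.215
    x = not x                 # 0.785
    x = x and bit(0.5)        # 0.3925
    x = not x                 # 0.6075
    x = x and bit(0.4)        # 0.243
    return not x              # 0.757

n = 200_000
print(sum(chain_output_bit() for _ in range(n)) / n)   # close to 0.757
```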
Practical Aspect of the Implementation: Reduce Circuit Depth
Logic-Level Optimization: Balancing It is like balancing a tree: move an AND gate to another branch so as to reduce the depth. Compare the circuit before balancing and after balancing (a and b are primary inputs).
High-Level Optimization: Factorization of Fractions Example: Synthesize q = 0.49 from S = {0.4, 0.5}. Compare the basic chain with the version that factors 0.49 = 0.7 × 0.7. (Black dots are inverters; when we count the depth of the circuit, we ignore inverters.)
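A small Python sketch of the factored version, assuming each 0.7 factor is produced as 1 − 0.5 × (1 − 0.4), the same operations used in the chain on the previous slide (simulation only; helper names are mine):

```python
# Two independent 0.7 branches ANDed together give 0.49 with a shallower
# circuit than one long chain.
import random

def bit(p):
    return random.random() < p

def branch_07():
    return not (bit(0.5) and not bit(0.4))   # probability 1 - 0.5*0.6 = 0.7

n = 200_000
ones = sum(branch_07() and branch_07() for _ in range(n))
print(ones / n)    # close to 0.49
```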
Outline Preliminaries Synthesizing Logic that Computes on Stochastic Bit Streams Synthesizing Logic that Generates Probabilities Future Work
Spectrum of Encoding Binary radix encoding (compact, positional) at one end; stochastic encoding (not compact, uniform) at the other. A mixture of encodings: are there possible encodings in the middle with the advantages of both?
Logic Optimization Minimize the area cost of circuits for probabilistic computation. Example: generate 0.3 from source probabilities 0.4 and 0.5; among functionally equivalent circuits, one can be better (smaller) than another. The general problem I am considering is how to find this optimal design.
Challenges of Optimization Traditional logic synthesis manipulates the representations of Boolean functions while implementing the same function (e.g., y = ac + bc). In logic synthesis for probabilistic computation, the Boolean functions themselves can be different!
Emerging Nanotechnologies Opportunities: high density of bits/interconnects. Challenges: inherent structural randomness; high defect rates. Nanowire Crossbar (Idealized): the yellow bars in the figure represent nanowires. With self-assembly techniques, each vertical nanowire in the array contains a randomly located doped region, shown as a red rectangle. Where such random doping occurs, the intersection of the horizontal and vertical wires forms a PMOS-like junction: when the voltage on the horizontal nanowire is low (high), the voltage at the output of the vertical nanowire is high (low). In an ideal situation, each horizontal nanowire intersects the doped region of exactly one vertical nanowire. With this randomness, such an array can be viewed as a collection of inverters with shuffled outputs (A1, A2, A3, A4).
Nanowire Crossbar Array: Shuffled AND In my preliminary work, I have explored the ways in which the randomness due to self-assembly can be used. Inputs A1–A4 and B1–B4 feed AND gates whose pairings are shuffled, producing, e.g., A4B3, A1B2, A2B4, A3B1.
Shuffled AND: Multiplication Even though the inputs to the AND gates are shuffled, with probabilistic bundles (e.g., x = 3/6) the output bundle still performs multiplication: c = P(C = 1) = P(A = 1) · P(B = 1) = a · b.
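A Python sketch (simulation only, with my helper names) of a shuffled AND on probabilistic bundles:

```python
# Randomly permuting one bundle before the bitwise AND does not change the
# expected fraction of ones in the output, which stays close to a*b.
import random

def bundle(p, width):
    return [1 if random.random() < p else 0 for _ in range(width)]

width = 10_000
a, b = 3/6, 4/6
A = bundle(a, width)
B = bundle(b, width)
random.shuffle(B)                         # the crossbar's random wiring
C = [x & y for x, y in zip(A, B)]         # shuffled AND
print(sum(C) / width, a * b)
```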
Other Emerging Technologies Carbon nanotubes Molecular switches DNA
Thank You!