Error Correcting Codes for Serial links : an update


1 Error Correcting Codes for Serial links : an update
Sergio Cavaliere, Department of Physics, University of Napoli “Federico II”, Italy, and INFN Sezione di Napoli, Italy
In this talk I will present some results from a preliminary study of Forward Error Correcting codes for the SuperB serial links, together with simulation results and the tools built for the purpose. This is ongoing work, which will soon require closer integration with the architectures being studied for the actual link.
XV SuperB Workshop – Caltech - Dec, 2010

2 Overview
- Recall the problems with serial link failures and errors due to the radiation environment
- Recall the relevant parameters for the performance of the error correcting code
- Define bit error rate and bit error frequency, and the related Poisson statistics
- Probability analysis of the Bit Error Rate reduction in Hamming codes
- Probability analysis of the Bit Error Rate reduction in Reed Solomon codes
- Analysis of some proposed coding structures
- Conclusion and future work

3 Problems with serial link failures and errors
Two main problems arise from errors due to the radiation environment:
- Loss of Lock (LoL): due to failures on fixed bits in the SERDES, analyzed at the last Frascati meeting. Conclusion: a direct fast link between transmitter and receiver is needed in order to signal promptly the occurrence of LoL.
- Bit errors due to the radiation environment: these affect data integrity and data quality. Solution: an Error Correcting Code (ECC) is needed.
Evaluation of the required performance of the code:
- start from a presumed bit error rate (LHC?) [compared with the extreme technology limit of 10^-15]
- arrive at a desired time between errors

4 Relevant parameters for the performance of the error correcting code
In the usual communication approach, the relevant parameter for serial link improvement is the coding gain: the increase in channel noise which can be balanced by error correction codes. This allows reducing costs at the same performance, or increasing speed at the same cost, because it relaxes the SNR requirements.
In our case what matters is the bit error rate (BER) reduction obtained by the ECC.
From the BER we may compute an overall failure rate for each serial link and for the whole apparatus at a fixed data rate.

5 Bit Error Rate and time between errors
In unit time: 1/T transmitted bits, λ error events and hence λ faulty bits (T = transmit clock period)
BER = no. of errored bits / no. of transmitted bits
f = transmission frequency
λ = BEF (bit error frequency) = BER * f
μ = average time between errors = 1/BEF
e.g. f = 1.1 GHz, BER = 10^-10: λ = BER * f, μ = 1/λ = 9 s
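The definitions above can be checked numerically. The following is a minimal sketch that only restates λ = BER * f and μ = 1/λ; the function names are illustrative and do not come from the talk.

```python
def bit_error_frequency(ber: float, f_hz: float) -> float:
    """Bit error frequency lambda = BER * f, in errors per second."""
    return ber * f_hz


def mean_time_between_errors(ber: float, f_hz: float) -> float:
    """Average time between errors mu = 1 / lambda, in seconds."""
    return 1.0 / bit_error_frequency(ber, f_hz)


# Figures quoted on the slide: f = 1.1 GHz, BER = 1e-10.
f_hz = 1.1e9
ber = 1e-10
print(f"lambda = {bit_error_frequency(ber, f_hz):.3g} errors/s")
print(f"mu     = {mean_time_between_errors(ber, f_hz):.3g} s")
```

With the slide's figures this gives about 0.11 errors per second, i.e. roughly 9 s between errors.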

6 Bit errors: Poisson statistics
Bit errors caused by events taking place in a radiation environment follow the usual (Poisson) statistics, with these features:
- events take place one after the other and independently of each other
- the average number of events in unit time is constant, equal to λ
λ is the average number of events in unit time (frequency or rate)
μ = 1/λ is the average time from one event to the next
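For a Poisson process with rate λ, the probability of observing k events in a time window t is (λt)^k e^(-λt) / k!. The sketch below evaluates this, assuming λ = 0.11 errors/s (BER = 10^-10 at f = 1.1 GHz); the window lengths are arbitrary illustration values, not figures from the talk.

```python
import math


def poisson_pmf(k: int, lam: float, t: float) -> float:
    """Probability of exactly k error events in a window of t seconds."""
    mean = lam * t
    return mean ** k * math.exp(-mean) / math.factorial(k)


def prob_at_least_one_error(lam: float, t: float) -> float:
    """1 - P(0 events); expm1 keeps precision when lam * t is tiny."""
    return -math.expm1(-lam * t)


lam = 0.11  # errors per second, assumed from BER = 1e-10 at f = 1.1 GHz
for t in (1.0, 9.0, 60.0):  # illustrative window lengths in seconds
    print(f"t = {t:5.1f} s  P(0) = {poisson_pmf(0, lam, t):.4f}  "
          f"P(>=1) = {prob_at_least_one_error(lam, t):.4f}")
```

Over a window equal to the mean time μ = 1/λ, the probability of at least one error is 1 - 1/e, about 63 %.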

7 Bit Error Rate and time between errors
The diagram shows how a value for BER translates into the average time between errors (in the case of continuous data exchange) at a fixed operating frequency.
e.g. BER = 10^-10: average time between errors = 9 s
Use error correction coding to achieve BER = 10^-16: average time between errors = 4 months

8 How to evaluate how much ECC power is needed?
Assume a command length of 100 bits (actual figures will be ... bits)
Assume a reference BER = 10^-10 for each link
For a single link:
- correction of 0 bits per frame leaves the reference BER unchanged: time between errors 9 s
- correction of 1 bit per frame will deliver BER = 5*10^-17: time between errors ... years
- correction of 2 bits per frame will deliver BER = 2*10^-25: time between errors many years
Binomial formula: probability of having n errors in a frame of m bits with bit error probability p: $P(n) = \binom{m}{n} p^n (1-p)^{m-n}$
We may argue that an ECC of moderate complexity may be adopted.
Observation: such low probability values would require very long simulations.
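The binomial estimate can be sketched as follows. The quoted figures for 1 and 2 corrected bits appear to correspond to the probability that a 100-bit frame contains more than t errors; that reading is an assumption, and the function names are illustrative.

```python
import math


def prob_n_errors(n: int, m: int, p: float) -> float:
    """Binomial probability of exactly n bit errors in a frame of m bits."""
    return math.comb(m, n) * p ** n * (1 - p) ** (m - n)


def prob_more_than_t_errors(t: int, m: int, p: float) -> float:
    """Probability that a frame has more errors than a t-error-correcting code can fix.

    The tail is summed directly, because 1 - P(0) - ... - P(t) would lose
    all precision for probabilities this small.
    """
    return sum(prob_n_errors(n, m, p) for n in range(t + 1, m + 1))


m, p = 100, 1e-10  # frame length and reference BER from the slide
for t in (1, 2):
    print(f"t = {t}: P(more than {t} errors) = {prob_more_than_t_errors(t, m, p):.2e}")
```

For t = 1 this gives about 5*10^-17, and for t = 2 about 1.6*10^-25, which the slide rounds to 2*10^-25.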

9 Bit Error Rate reduction in Hamming codes
Probability of a word error for block codes (n = word length, t = no. of corrected bits)
Probability of a bit error for block codes (n = word length, t = no. of corrected bits)
Probability of a bit error for Hamming codes (t = 1, n = 2^m - 1)
In log-log scale the Hamming curve is a straight line with angular coefficient 2, with the factor (n-1) setting its offset.
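The formulas themselves are not reproduced in this transcript. A minimal sketch, assuming the standard textbook approximations, is given below: bit error probability of a t-error-correcting block code p_b ≈ (1/n) Σ_{j=t+1..n} j C(n,j) p^j (1-p)^(n-j), which for Hamming codes (t = 1) reduces to p_b ≈ p - p(1-p)^(n-1). The function names are illustrative.

```python
import math


def block_bit_error_probability(p: float, n: int, t: int) -> float:
    """Approximate decoded bit error probability of a t-error-correcting (n, k) block code."""
    tail = sum(j * math.comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(t + 1, n + 1))
    return tail / n


def hamming_bit_error_probability(p: float, n: int) -> float:
    """Closed form p - p * (1 - p)**(n - 1), evaluated without cancellation."""
    return -p * math.expm1((n - 1) * math.log1p(-p))


n = 15  # H(15,11)
for p in (1e-4, 1e-7, 1e-10):
    pb = hamming_bit_error_probability(p, n)
    print(f"p = {p:.0e}  p_b = {pb:.3e}  (n-1)*p^2 = {(n - 1) * p * p:.3e}")
```

For small p the closed form behaves as (n-1) p^2, which is why the curve is a straight line of slope 2 on a log-log plot.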

10 Bit Error Rate reduction in shortened Hamming codes
Due to the 18-bit constraint in the SERDES we must use shortened Hamming codes.
For a Hamming code H(n,k) shortened to H(ns,ks):
Example: H(31,26) shortened to H(18,13), i.e. n = 31, k = 26, ns = 18, ks = 13
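Shortening can be illustrated with a small encoder and decoder. This is a sketch of one conventional construction (parity bits at power-of-two positions, unused high positions treated as zero and not transmitted); the bit ordering and the choice of which positions are dropped are assumptions, since the slides do not specify them.

```python
def data_positions(ns: int) -> list[int]:
    """1-based positions carrying message bits: every position that is not a power of two."""
    return [pos for pos in range(1, ns + 1) if pos & (pos - 1)]


def encode(data: list[int], ns: int) -> list[int]:
    """Encode message bits into a shortened Hamming codeword of length ns."""
    positions = data_positions(ns)
    if len(data) != len(positions):
        raise ValueError(f"expected {len(positions)} message bits, got {len(data)}")
    word = [0] * (ns + 1)  # index 0 is unused
    syndrome = 0
    for pos, bit in zip(positions, data):
        word[pos] = bit
        if bit:
            syndrome ^= pos
    # Choose the parity bits so that the XOR of all set positions is zero.
    for i in range(ns.bit_length()):
        word[1 << i] = (syndrome >> i) & 1
    return word[1:]


def decode(word: list[int]) -> list[int]:
    """Correct at most one flipped bit and return the message bits."""
    ns = len(word)
    syndrome = 0
    for pos, bit in enumerate(word, start=1):
        if bit:
            syndrome ^= pos
    if syndrome > ns:
        # Points at a position removed by shortening: more than one bit is wrong.
        raise ValueError("error pattern not correctable")
    fixed = list(word)
    if syndrome:
        fixed[syndrome - 1] ^= 1
    return [fixed[pos - 1] for pos in data_positions(ns)]


message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1]  # 13 arbitrary message bits
codeword = encode(message, 18)                      # H(31,26) shortened to (18,13)
codeword[6] ^= 1                                    # inject a single bit error
print(decode(codeword) == message)
```

A side effect of shortening in this construction is that a syndrome pointing beyond position ns flags an error the code cannot correct.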

11 Bit Error Rate reduction in multi-Hamming codes
Our codes will be made of a combination of shortened Hamming codes. Example for an 18-bit word:
- H(15,11) shortened by 3 bits to (12,8)
- H(7,4) shortened by 1 bit to (6,3)
pmulti = bit error probability for the overall code
p1 = bit error probability of branch no. 1 (shortened Hamming)
k1red = no. of message bits of branch no. 1
p2 = bit error probability of branch no. 2 (shortened Hamming)
k2red = no. of message bits of branch no. 2
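The combining formula is not reproduced in this transcript. One natural reading of the symbols defined above, stated here purely as an assumption, is an average of the branch probabilities weighted by the number of message bits each branch carries: pmulti = (k1red*p1 + k2red*p2) / (k1red + k2red). The sketch below implements that reading; the branch probabilities in the example are hypothetical placeholders, not results from the talk.

```python
def multi_code_bit_error_probability(branches: list[tuple[float, int]]) -> float:
    """Message-bit-weighted average of per-branch bit error probabilities.

    branches holds one (p_i, k_i_red) pair per shortened Hamming branch.
    This weighting is an assumed form, not a formula confirmed by the slides.
    """
    total_bits = sum(k for _, k in branches)
    return sum(p * k for p, k in branches) / total_bits


# Hypothetical branch probabilities, with the message sizes of the 18-bit example.
p_multi = multi_code_bit_error_probability([(1.0e-19, 8), (0.5e-19, 3)])
print(f"p_multi = {p_multi:.3e}")
```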

12 Bit Error Rate reduction in Reed Solomon codes
ps = probability that a symbol is in error
p = probability that a bit is in error
m = symbol length (in bits)
pew = probability that a word made of n symbols is in error
pib = probability that a bit of the message is in error after ECC coding
The same work as for Hamming codes was done to obtain the features of shortened and combined codes.
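The relations between these quantities are not reproduced in this transcript. As a sketch under the assumption of independent bit errors, a symbol of m bits is wrong when any of its bits is wrong, ps = 1 - (1-p)^m, and a word of n symbols is wrong when more than t symbols are wrong. The RS(15,11) parameters below (m = 4, t = 2) are a hypothetical example, and the final step from pew to pib is left out because its exact form is not given.

```python
import math


def symbol_error_probability(p: float, m: int) -> float:
    """ps = 1 - (1 - p)**m, evaluated without cancellation for tiny p."""
    return -math.expm1(m * math.log1p(-p))


def word_error_probability(ps: float, n: int, t: int) -> float:
    """pew: probability that more than t of the n symbols are in error."""
    return sum(math.comb(n, j) * ps ** j * (1 - ps) ** (n - j)
               for j in range(t + 1, n + 1))


# Hypothetical example: RS(15,11) over 4-bit symbols corrects t = 2 symbols.
p, m, n, t = 1e-10, 4, 15, 2
ps = symbol_error_probability(p, m)
print(f"ps  = {ps:.3e}")
print(f"pew = {word_error_probability(ps, n, t):.3e}")
```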

13 Hamming code: features of a selected test code
- transmitted frame: 2 x 18 bit = 36 bit
- no polarity control
- codes: 2 x H(15,11) + Hs(6,3) {from H(7,4)}
- Ecc = 12 %, Overhead = 44 %
- BER after coding: 10^-19
[Block diagram: data to transmit (25 bit), buffer & scrambler, encoder (15 + 15 + 6 = 36 bit), SERDES (2 x 18 bit), serial link, decoder, buffer & descrambler, data to distribute]

14 Hamming code: features of a selected test code
e.g. f = 1.1 GHz
uncorrected BER = 10^-10: average time between failures 9 s
after coding, corrected BER = 10^-19: average time between failures 244 years

15 Reed Solomon codes
Similar examples may be made for Reed Solomon codes. We do not show one here, also because the greater hardware complexity of both encoding and decoding may lead us to the Hamming solution, which is:
- simpler in terms of hardware complexity
- faster in terms of the delays involved

16 Conclusions
- We have developed a thorough statistical analysis of the bit error probability after ECC coding for complex, shortened and mixed codes, for both Hamming and Reed Solomon codes, with some simulation.
- The above considerations on error rates apply to a single link. The multiplicity of links will obviously raise the bit error frequency in the apparatus by that multiplying factor.
- Even taking this into account, we may argue that only a moderate correction capability is needed to reduce the error rate to a suitable value. This will be assessed as soon as we have precise figures on the error rate in our radiation environment.
- We will therefore revert to very simple ECC structures, fully compatible with a proper hardware implementation in terms of both available hardware resources and processing time.

17 To be done
- obtain precise figures on the bit error rate in our radiation environment
- define and analyze Hamming (Reed Solomon) coding structures with the purpose of reducing silicon area and improving operating speed in the implementation
- analyze thoroughly the impact of error rates on the performance of the overall apparatus and on the related data quality
- evaluate practical implementations

