Readout Processing and Noise Elimination Firmware for the Fermilab Beam Loss Monitor System Wu, Jinyuan C. Drennan, R. Thurman-Keup, Z. Shi, A. Baumbaugh.

Presentation transcript:

Readout Processing and Noise Elimination Firmware for the Fermilab Beam Loss Monitor System
Wu, Jinyuan; C. Drennan; R. Thurman-Keup; Z. Shi; A. Baumbaugh; J. Lewis
Fermilab, April 2007

The Digitizer Card for the Fermilab Beam Loss Monitor System
Beam loss input signals from ion chambers are integrated and digitized. Sliding sums are accumulated and compared with pre-loaded thresholds. Over-threshold conditions in several places cause a beam abort, based on pre-defined settings. Beam loss signals are also filtered and "de-rippled" for display purposes. The sequence is controlled by the "Seq128" block.
[Block diagram labels: Ion Chamber Input; ADC, 21 µs/sample; RAM; Immediate, Fast, Slow and Very Slow Sliding Sums, each with an A>B comparison against Threshold I, F, S or V; Abort Logic; CIC Sums; De-ripple Process; Seq128.]
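The threshold comparison described on this slide can be sketched in software as follows. This is only an illustration: the window lengths, the threshold values and the rule "any sum over its threshold requests an abort" are assumptions made for the example, not the settings or abort logic of the actual firmware.

```python
from collections import deque


class SlidingSum:
    """Running sum of the most recent `length` samples."""

    def __init__(self, length):
        self.window = deque([0] * length, maxlen=length)
        self.total = 0

    def update(self, sample):
        # Add the newest sample and drop the oldest one.
        self.total += sample - self.window[0]
        self.window.append(sample)
        return self.total


def abort_requested(samples, lengths, thresholds):
    """Return True as soon as any sliding sum exceeds its threshold.

    `lengths` and `thresholds` are hypothetical dicts keyed by sum name.
    """
    sums = {name: SlidingSum(n) for name, n in lengths.items()}
    for x in samples:
        for name, s in sums.items():
            if s.update(x) > thresholds[name]:
                return True
    return False


# Hypothetical settings: a 1-sample "immediate" sum and a 4-sample "fast" sum.
LENGTHS = {"immediate": 1, "fast": 4}
THRESHOLDS = {"immediate": 100, "fast": 50}
quiet = abort_requested([10] * 10, LENGTHS, THRESHOLDS)
lossy = abort_requested([10, 10, 20, 20, 20], LENGTHS, THRESHOLDS)
```

With these made-up numbers, the steady input never trips a threshold, while the second input pushes the 4-sample sum over 50 even though no single sample exceeds 100.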

The Problem: 3-Phase 60 Hz AC
Rectification noise from power supplies running on 3-phase 60 Hz AC is picked up by the input cables lying in the accelerator tunnel.
[Plots: time domain and frequency domain of the ADC data, 21 µs/sample.]

Filter Functions
Two filter functions are shown: the sliding sum and the 2nd-order cascaded integrator-comb (CIC) sum. The CIC sum is a sliding sum of sliding sums. Its frequency response is a sinc²(x) function, which has 2nd-order zeros and better stop-band suppression.
[Plot annotations: "First 360 Hz"; frequency axis; 21 µs/sample; 124 samples.]
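A small numerical sketch of the two filter functions follows. It assumes a short window (K = 20 samples instead of the 124 quoted on the slide) and pure sine ripples, purely for illustration. It shows that the CIC sum has a triangular impulse response, that both filters null a ripple whose period equals the window length, and that the CIC sum leaks less when the ripple is slightly off the zero.

```python
import math


def sliding_sum(x, k):
    """Sum of the latest k samples; samples before the start count as zero."""
    return [sum(x[max(0, n - k + 1):n + 1]) for n in range(len(x))]


def cic_sum(x, k):
    """2nd-order CIC sum, built as a sliding sum of sliding sums."""
    return sliding_sum(sliding_sum(x, k), k)


def settled_peak(y, k):
    """Largest magnitude after the 2k-sample start-up transient."""
    return max(abs(v) for v in y[2 * k:])


K = 20  # illustrative window length, not the value used in the firmware

# Impulse response of the CIC sum: 1, 2, ..., K, K-1, ..., 1 (a triangle).
impulse_response = cic_sum([1] + [0] * (2 * K), K)

# A ripple whose period equals the window sits exactly on a filter zero.
on_zero = [math.sin(2 * math.pi * n / K) for n in range(10 * K)]
# A ripple with a 5% longer period sits just beside the zero.
near_zero = [math.sin(2 * math.pi * n / (1.05 * K)) for n in range(10 * K)]

# Leakage normalized by the DC gain of each filter (K and K squared).
ss_leak = settled_peak(sliding_sum(near_zero, K), K) / K
cic_leak = settled_peak(cic_sum(near_zero, K), K) / K ** 2
```

Because the normalized CIC response is the square of the normalized sliding-sum response, `cic_leak` comes out much smaller than `ss_leak` near the zero.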

Filtering Works, But Only Partially
Noise above 360 Hz, the dominant portion, is filtered out by both filter functions. The CIC sum is a lot smoother than the sliding sum. But small signals are still buried under ripples at 60 and 180 Hz.
[Plot labels: Sliding Sum; CIC Sum; Signals.]

Why Not Filter Further?
Filtering is an averaging process over many periods, and there is not much time after reset. The noise before and after the accelerator ramp differs in amplitude and shape. A "de-ripple" algorithm has been developed instead.
[Plot label: Ramping.]

De-ripple Process (1.1): Waveform Extraction, Storage and Validation
The CIC sum is stored into the waveform buffer and accumulated for the waveform mean.
[Diagram: Waveform Buffer Page 0 and Page 1, each accumulated into a Waveform Mean.]

De-ripple Process (1.2): Waveform Extraction, Storage and Validation
When the data shows good periodicity, the waveform becomes valid.
[Diagram: Waveform Buffer Page 0 and Page 1, each accumulated into a Waveform Mean.]

De-ripple Process (1.3): Waveform Extraction, Storage and Validation
If the data is non-periodic, the waveform becomes invalid.
[Diagram: Waveform Buffer Page 0 and Page 1, each accumulated into a Waveform Mean.]

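De-ripple Process (2): Waveform Subtraction
The waveform mean is subtracted from the stored waveform, so that removing the waveform from the CIC sum preserves the DC component in the final result, the de-rippled sum.
[Diagram: Waveform Buffer Page 0 and Page 1 with their Waveform Means, two subtractions, output "The De-rippled Sum".]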
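The extraction, validation and subtraction steps of the de-ripple process can be sketched as below. The slides do not state the validation criterion, so the rule used here (two consecutive periods must agree within a tolerance) is an assumption, as are the function names, the 4-sample ripple period and the sample values.

```python
def extract_waveform(cic, period, tolerance):
    """Store one ripple period and check it against the previous one.

    Returns (waveform, mean, valid). Assumed rule: the waveform is valid
    when page 0 and page 1 agree sample by sample within `tolerance`.
    """
    if len(cic) < 2 * period:
        raise ValueError("need at least two full periods")
    page0 = cic[:period]
    page1 = cic[period:2 * period]
    valid = all(abs(a - b) <= tolerance for a, b in zip(page0, page1))
    mean = sum(page1) / period
    return page1, mean, valid


def de_ripple(cic, waveform, mean, phase=0):
    """Subtract (waveform - mean) so the ripple goes and the DC level stays."""
    period = len(waveform)
    return [y - (waveform[(phase + n) % period] - mean)
            for n, y in enumerate(cic)]


# Hypothetical data: DC level 10 plus a 4-sample ripple, three periods long,
# with a small signal of +1 hidden in the last period.
data = [10, 13, 10, 7] * 3
data[10] += 1

waveform, mean, valid = extract_waveform(data[:8], period=4, tolerance=1)
cleaned = de_ripple(data[8:], waveform, mean)
```

In this toy case `cleaned` is flat at the DC level except for the sample carrying the small signal, which was invisible under the ripple of amplitude 3.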

Results of the De-ripple Process
Small signals that were otherwise hard to see now become visible. DC and very slow signals are also preserved.

Filter Implementation
Recursive != IIR.
[Table: Recursive Implementation and Non-Recursive Implementation against Finite Impulse Response (FIR) and Infinite Impulse Response (IIR); cell labels: Possible, Yes, NO, Resource Friendly.]
Sliding sum, non-recursive implementation: s[n] is the sum of the last K values of x. It needs 124 memory fetches and 124 additions, and more operations for longer sum lengths.
Sliding sum, recursive implementation: s[n] = s[n-1] + x[n] - x[n-K]. It needs 1 memory fetch and 2 add/sub operations regardless of sum length.
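The equivalence of the two sliding-sum implementations can be checked with a short sketch. Samples before the start of the record are assumed to be zero, and the function names are illustrative.

```python
def sliding_sum_direct(x, k):
    """Non-recursive form: re-add the latest k samples for every output."""
    return [sum(x[max(0, n - k + 1):n + 1]) for n in range(len(x))]


def sliding_sum_recursive(x, k):
    """Recursive form: s[n] = s[n-1] + x[n] - x[n-k]."""
    out, s = [], 0
    for n, sample in enumerate(x):
        oldest = x[n - k] if n >= k else 0  # the single memory fetch
        s = s + sample - oldest             # one add and one subtract
        out.append(s)
    return out


samples = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
direct = sliding_sum_direct(samples, 4)
recursive = sliding_sum_recursive(samples, 4)
```

Both functions return the same list, but the work per output sample in the recursive form does not grow with the window length. The filter is still FIR: the feedback term only carries the running total.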

Recursive Implementation of CIC Sum
Non-recursive implementation: y[n] is a weighted sum of past samples, with weights h1, h2, ..., h[K]. It needs 248 memory fetches, 248 multiplications and 248 additions, and more operations for longer sum lengths.
CIC sum constructed as a sliding sum of sliding sums: s[n] = s[n-1] + x[n] - x[n-K], then y[n] = y[n-1] + s[n] - s[n-K]. It needs 2 memory fetches, 0 multiplications and 4 add/sub operations for any sum length.
Re-formulated CIC sum: u[n] = u[n-1] + x[n] - 2x[n-K] + x[n-2K], then y[n] = y[n-1] + u[n]. It uses the raw data buffer rather than a separate buffer.
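The three forms of the CIC sum can be compared numerically. In this sketch the weights of the non-recursive form are taken to be the triangular sequence 1, 2, ..., K, ..., 2, 1, which is what a sliding sum of sliding sums produces. The function names and the zero-history start-up are assumptions of the example.

```python
def cic_direct(x, k):
    """Non-recursive form: weighted sum with triangular weights."""
    h = list(range(1, k + 1)) + list(range(k - 1, 0, -1))
    return [sum(h[j] * x[n - j] for j in range(len(h)) if n - j >= 0)
            for n in range(len(x))]


def cic_cascaded(x, k):
    """Sliding sum of sliding sums; needs a second buffer holding s[n]."""
    s_hist, out, s, y = [], [], 0, 0
    for n, sample in enumerate(x):
        s += sample - (x[n - k] if n >= k else 0)
        y += s - (s_hist[n - k] if n >= k else 0)
        s_hist.append(s)
        out.append(y)
    return out


def cic_reformulated(x, k):
    """u[n] = u[n-1] + x[n] - 2x[n-k] + x[n-2k]; y[n] = y[n-1] + u[n]."""
    def past(i):
        return x[i] if i >= 0 else 0  # raw data buffer, zero before start

    out, u, y = [], 0, 0
    for n, sample in enumerate(x):
        u += sample - 2 * past(n - k) + past(n - 2 * k)
        y += u
        out.append(y)
    return out


samples = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
results = [f(samples, 3) for f in (cic_direct, cic_cascaded, cic_reformulated)]
impulse = cic_reformulated([1, 0, 0, 0, 0, 0, 0], 3)
```

All three agree on the same input. The re-formulated version reads only `x[n-k]` and `x[n-2k]` from the raw data, and multiplying by 2 is a bit shift in hardware.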

Process Sequencing
A flat design is fast but uses a lot of logic elements. Sequencing the process saves logic elements significantly. A partially flat, partially sequenced design is sometimes a better arrangement in an FPGA.
[Diagram: for channels CH0 to CH3, the steps Sum1, Sum2, Sum3, Sum4, CIC1, CIC2, WF E,S,V (waveform extraction, storage, validation) and WF SUB (waveform subtraction), shown in flat and in sequenced arrangements.]

BLM DC Process Sequencing
The sliding sum and CIC sum calculations are fully sequenced. The de-ripple processor is flat along the process path, but it operates sequentially over the 4 channels.
[Diagram labels: Fully Sequencing; Partially Flat.]

FPGA Process Sequencing Options
Option | Program Type | Program Length (CLK cycles) | Reprogram | Resource Usage
Finite State Machine (FSM) | Fixed Wired | 10 | Hard | Small
Enclosed Loop Micro-Sequencer (ELMS) | Memory Stored Program | | Easy | Small
Microprocessor (MP) | Memory Stored Program | >1000 | Easy | Large

ELMS: Enclosed Loop Micro-Sequencer
A program counter (PC) plus a ROM is a good sequencer in an FPGA. Adding Conditional Branch Logic allows the program to jump back, as in microprocessors. The Loop & Return Logic + Stack is the special feature of the ELMS: it supports FOR loops at the machine code level.
[Block diagram: Program Counter; ROM, 128 x 36 bits; Conditional Branch Logic; Loop & Return Logic + Stack; inputs Reset and CLK; output Control Signals.]
Example program (table columns: PC, Control Signals, Operation):
           LD   R1, #n
           LD   R2, #addr_a
           LD   R3, #addr_X
           LD   R7, #0
    BckA1  LD   R4, (R2)
           INC  R2
           LD   R5, (R3)
           INC  R3
           MUL  R6, R4, R5
    EndA1  ADD  R7, R7, R6    (PC 0a)
           DEC  R1            (PC 0b)
           BRNZ BckA1         (PC 0c)

ELMS: Detailed Block Diagram
The Stack supports nested loops, up to 128 layers.
[Diagram label: User Control Signals.]
Example program using the FOR instruction:
           FOR  BckA1 EndA1 #n
           LD   R2, #addr_a
           LD   R3, #addr_X
           LD   R7, #0
    BckA1  LD   R4, (R2)
           INC  R2
           LD   R5, (R3)
           INC  R3
           MUL  R6, R4, R5
    EndA1  ADD  R7, R7, R6
           LD   R8, R7

Software: Using a Spreadsheet as the Compiler

What’s Good About ELMS: FOR Loops at Machine Code Level
In this example the looping sequence is known before entering the loop, yet a regular microprocessor treats the sequence as unknown. The ELMS supports FOR loops with a pre-defined number of iterations at the machine code level. Execution time is saved, and the micro-complexities associated with conditional branches (branch penalty, pipeline bubbles, etc.) are avoided.
[Diagram labels: Microprocessor; The ELMS; Conditional Branch; 25%.]
Microprocessor version, with a conditional branch:
           LD   R1, #n
           LD   R2, #addr_a
           LD   R3, #addr_X
           LD   R7, #0
    BckA1  LD   R4, (R2)
           INC  R2
           LD   R5, (R3)
           INC  R3
           MUL  R6, R4, R5
    EndA1  ADD  R7, R7, R6
           DEC  R1
           BRNZ BckA1
ELMS version, with the FOR instruction:
           FOR  BckA1 EndA1 #n
           LD   R2, #addr_a
           LD   R3, #addr_X
           LD   R7, #0
    BckA1  LD   R4, (R2)
           INC  R2
           LD   R5, (R3)
           INC  R3
           MUL  R6, R4, R5
    EndA1  ADD  R7, R7, R6
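A rough instruction count for the two listings on this slide can be sketched as follows. It assumes one instruction per clock cycle and ignores branch penalties, so it understates the saving on a pipelined processor. Reading the "25%" label as the share of per-iteration instructions removed by the FOR loop (2 of 8) is an interpretation, not something the slide states.

```python
SETUP = 4          # 4 LDs, or FOR plus 3 LDs, before the loop
BODY = 6           # LD, INC, LD, INC, MUL, ADD inside the loop
LOOP_OVERHEAD = 2  # DEC R1 and BRNZ BckA1, needed only without FOR


def cycles_conditional_branch(n_iter):
    """Instruction count when the loop is closed by DEC and BRNZ."""
    return SETUP + n_iter * (BODY + LOOP_OVERHEAD)


def cycles_for_loop(n_iter):
    """Instruction count when the loop is handled by the FOR instruction."""
    return SETUP + n_iter * BODY


per_iteration_saving = LOOP_OVERHEAD / (BODY + LOOP_OVERHEAD)
total_saving = 1 - cycles_for_loop(100) / cycles_conditional_branch(100)
```

For long loops `total_saving` approaches `per_iteration_saving`, since the setup instructions are executed only once.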

Conclusion
The de-ripple algorithm is a useful alternative method for eliminating low-frequency periodic noise. The ELMS is a handy sequence controller for FPGAs that uses a small amount of resources.

The End Thanks

What’s Good About ELMS: No ALU => Small Resource Usage
The Princeton architecture is more suitable at the system level, while the Harvard architecture is better suited to the micro-structure level. Regular microprocessors cannot run a looped program without an ALU. The ALU takes a large amount of resources, yet it may not be efficiently utilized for data processing tasks in an FPGA. The ELMS can run nested-loop programs without an ALU, so further separation of program and data is possible, and the ELMS is kept small.
[Diagram: Princeton Architecture (Program Control, ALU, shared Program/DATA Memory); Harvard Architecture (Program Control, ALU, Program Memory, DATA Memory); Fermilab Architecture(?) (Sequencer (ELMS) with Program Memory; Data Processor with DATA Memory).]