Published by Cameron Bennett. Modified over 9 years ago.
1
Project Presentation: Physical Unclonable Functions
Michelle Dickson
2
Outline Project Goals Resource Selection PUF Architecture
Implementation Results Status & Future Work Conclusion
3
Project Goals Implement a Physical Unclonable Function
Determine feasibility of an authentication scheme based on the PUF’s unique key generation Is such an implementation robust enough to withstand environmental variations? I initially took interest in this project topic because I was skeptical about the robustness of the Physical Unclonable Function. I wanted to determine if an implemented PUF could produce consistent results across variations in the environment in order to decide if an authentication scheme based on the PUF’s ability to produce a unique key would be feasible.
4
Resource Selection Hardware: Virtex-II Pro™ FF1152 Development Board
2VP20 FPGA QTY: 2 I was able to obtain two development boards for use for the project. The development boards are Virtex-II Pro FF1152 Development Boards with 2VP20 FPGAs. I wanted to have two boards so I could confirm that the results produced by the implemented PUF did indeed vary with different hardware.
5
Resource Selection Tools Xilinx ISE Version 9.2i
Project Navigator iMPACT FPGA Editor Constraints Editor PACE Modelsim XE III Version 6.2g In order to implement the PUF on the FPGA, I also had to obtain a few tools. For implementation, I used the Xilinx ISE Version 9.2i toolset. This includes the Project Navigator, iMPACT for programming the device, FPGA editor for manually controlling placement and routing, Constraints Editor for setting up timing constraints, and PACE for setting area constraints. I also used Modelsim XE III Version 6.2g for simulation.
6
PUF Architecture Common PUF Architectures Arbiter PUF
Ring Oscillator PUF After selecting the hardware for use, I had to decide on the architecture for my design. There are three common PUF architectures that are discussed by researchers. The first is the arbiter PUF. In this scheme, a signal pulse is input to two 2-input multiplexors. The path lengths must be identical. A selector bit determines the output of the two multiplexors, and that output is routed as input into two more multiplexors as shown in the image. An array of these stages is cascaded, and the output of the last multiplexor is latched. In this implementation, the output is based solely on the variation in trace lengths that is introduced by manufacturing. In essence, the output is the result of a race between two different paths that are theoretically the same length. The second type of PUF is a Ring Oscillator PUF. In this architecture, a series of inverters is connected end to end and the final output is looped back around as input to the series. The result is an oscillator whose frequency depends on the delay in the inverters and in the interconnecting traces. Again due to manufacturing variations, the frequency will differ for every ring oscillator implementation. The oscillations of two ring oscillators are counted and the counts are compared, and a bit is produced as a result of the comparison.
7
PUF Architecture Common PUF Architectures (continued) Butterfly PUF
The third common type of PUF implementation is the butterfly PUF. This method is based on the idea of creating structures that behave similarly to an SRAM cell during the startup phase. The butterfly PUF cell is a cross-coupled circuit which can be brought to a floating or unstable state before allowing it to settle to one of two possible states. The structure is constructed with symmetric routing. Each latch has a PRE input (which turns Q to 1) and a CLR input (which turns Q to 0). The PRE signal on the top latch and the CLR signal on the bottom latch are tied low, while the excite signal is tied to the CLR signal on the top and the PRE signal on the bottom. The outputs of the latch are cross-coupled. The excite signal is brought high for a few clock cycles and then returned to low. The output will settle to a state, which produces the single output bit.
8
Selected PUF Architecture
Ring Oscillator PUF Implementation Each RO is comprised of one NAND gate and 40 inverters 16 ROs implemented on the FPGA Compare the outputs of 2 ROs If the result is greater than, output is 1 If the result is less than or equal, output is 0 Output is 8-bit signature Motivation for selecting RO implementation Fairly simple to implement Does not require careful routing or layout Differences in oscillator frequencies will dominate skews in routing Extensive work published on RO implementations For my project, I chose to implement a Ring Oscillator PUF. I chose this because it seemed like it would be fairly simple to implement and it does not require the careful manual routing or layout that the other two require. This is because the differences in the oscillator frequencies will dominate the skews that are caused by asymmetric routing. Finally, I chose this method because there is a lot of published data available for Ring Oscillator PUF implementations, and many of them claim to be resilient to environmental variation. For my implementation, I did a simple 8-bit PUF. I have 16 Ring Oscillators implemented on the FPGA and each Ring Oscillator is comprised of one NAND gate and 40 inverters. The outputs of 2 Ring Oscillators are compared to produce a single bit of the 8-bit signature. For my implementation, if the comparison results in A greater than B, the output is a 1. Otherwise, the bit is a 0. The 8 comparisons produce an 8-bit signature for the device.
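The comparison rule described on this slide can be sketched in a few lines of Python. This is only an illustration, not the VHDL used in the project: the function name `ro_signature`, the pairing of adjacent oscillators (0 with 1, 2 with 3, and so on), and the choice to place the first pair in the most significant bit are all assumptions, since the transcript does not say how the 16 oscillators were paired or ordered.

```python
from typing import Sequence


def ro_signature(counts: Sequence[int]) -> int:
    """Build a signature from ring-oscillator counts.

    Assumption: oscillators are compared in adjacent pairs (0,1), (2,3), ...
    and the first pair becomes the most significant bit.
    """
    if len(counts) % 2 != 0:
        raise ValueError("need an even number of oscillator counts")
    signature = 0
    for i in range(0, len(counts), 2):
        a, b = counts[i], counts[i + 1]
        # Rule from the slide: A > B gives 1; less than or equal gives 0.
        bit = 1 if a > b else 0
        signature = (signature << 1) | bit
    return signature


# Hypothetical counts for 16 oscillators (not measured data).
example_counts = [10, 9, 5, 5, 3, 8, 7, 1, 2, 3, 9, 4, 6, 6, 8, 2]
print(format(ro_signature(example_counts), "08b"))
```

Note that a tie between two counts always yields 0 under this rule, so a pair of oscillators with nearly equal frequencies is a likely source of an unstable bit.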
9
Implementation Ring Oscillator component implemented in schematic
Using the Xilinx ISE tools, I implemented a single Ring Oscillator by creating it in a schematic. Here we see the NAND gate, 40 inverters, the output of the final inverter is looped back to the input of the NAND gate and is also connected to the clock input on the 16-bit counter. A clock enable signal tells the counter when to increment, and a clear signal will reset the count output to 0.
10
Implementation Inputs to Ring Oscillator Output from Ring Oscillator
Feed input: tied to 1 PUF enable input: enables the counter Clear input: clears the counter to 0s Output from Ring Oscillator Output is 15 bit count value There are three inputs to the Ring Oscillator: the feed input, the PUF enable input, and the clear input. The Feed input is tied high. The PUF enable input is high when the counter should be incrementing and counting the oscillation cycles produced by the combinational logic. The clear input is tied to a system reset and will reset the counter output to 0. The output of the Ring Oscillator is a 15 bit count value.
11
Implementation The rest of the circuit is implemented in VHDL
Instantiate 16 ROs Compare the count of two ROs after a certain period of time to produce a bit If A is greater than B, the bit value is 1 Else, the bit value is 0 Oscillation time varied from several thousand clock cycles to 40 seconds The 8-bit output value is displayed on LEDs on the development board To verify functionality of the rest of the circuit Create a testbench with skewed clock inputs for the ROs Run simulation in Modelsim Verify that PUF bits accurately reflect the variation in oscillator frequencies The rest of the circuit is implemented in VHDL. I instantiate 16 Ring Oscillators and map the signals appropriately. Then I compare the count for two Ring Oscillators and produce a bit based on the outcome of the comparison. When I implemented this design, I varied the oscillation time from several thousand clock cycles all the way up to 40 seconds. The 8-bit output value was displayed on the 8 on-board LEDs. An illuminated LED corresponded to a bit of 0 while a non-illuminated LED corresponded to 1. A push-button on the board generated a system reset. The ring oscillator circuit cannot be verified through simulation because the tool will continue to evaluate the combinational logic loop and will quickly time-out. Therefore, I was only able to verify the rest of the circuit’s functionality through simulation. To do this, I created a testbench and added skewed clock inputs for the Ring oscillators. I was then able to verify that each PUF bit accurately reflected the variation in the oscillator frequencies.
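The testbench idea above (drive each counter with a clock of a slightly different period, then check the resulting bit) can be modelled outside the simulator. The Python sketch below is a rough stand-in for that check, not the Modelsim testbench itself; the names `cycles_in_window` and `expected_bit` and all period and window values are hypothetical.

```python
def cycles_in_window(period_ns: int, window_ns: int) -> int:
    """Number of full clock periods that fit in the counting window."""
    if period_ns <= 0:
        raise ValueError("period must be positive")
    return window_ns // period_ns


def expected_bit(period_a_ns: int, period_b_ns: int, window_ns: int) -> int:
    """Bit the comparator should produce for two skewed clock inputs."""
    count_a = cycles_in_window(period_a_ns, window_ns)
    count_b = cycles_in_window(period_b_ns, window_ns)
    return 1 if count_a > count_b else 0


# A is faster than B (shorter period), so A counts more cycles: bit is 1.
print(expected_bit(10, 11, 10_000))
# Swapping the inputs gives 0.
print(expected_bit(11, 10, 10_000))
# With a window that is too short, a 1% period difference is not resolved.
print(expected_bit(100, 101, 450))
```

The last call shows why the counting window matters in this model: if both counters reach the same value, the "less than or equal" branch wins and the bit reads 0 regardless of which oscillator is actually faster.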
12
Implementation After verifying the rest of the circuit through simulation, implement the PUF in actual hardware Synthesize design Implement Generate programming file and configure device After I verified that the rest of the circuit functioned as expected, I was ready to implement the PUF in the FPGA. Using the Xilinx ISE tools, I synthesized the design. The generated RTL schematic is shown here. I verified that the RTL schematic looked accurate, implemented the design, and generated the programming file to configure the device.
13
Implementation Obstacles
First, I had to set up the development environment and familiarize myself with the board and the tools Search the internet for board documentation Verify that I am able to program the FPGA and drive the outputs Xilinx tool attempts to optimize the circuit and removes the useful components The result is an empty logic design that cannot be mapped Limited visibility of internal logic values makes troubleshooting difficult Sometimes the circuit appeared to be functioning as desired, but in reality it was not Every time a bit file is created and synthesized in hardware, the results vary Another troubleshooting hurdle At this point, I encountered a number of obstacles. The first obstacle was just to set up my development environment and become familiar with the hardware and the tools. I hadn’t done any VHDL coding in a couple of years and had never set up a development environment from scratch. In addition, the board I acquired did not come with any documentation, but thanks to the internet, I was able to find a user’s manual. So before I could do anything else, I needed to prove that I could program the FPGA and get it to behave in an expected manner. After this was accomplished, I was ready to try implementing my PUF design. Here I found a number of hurdles. First off, the Xilinx tools don’t like it when you use combinational logic to drive clock inputs, and they complain about it quite a bit. In addition, if a circuit function doesn’t directly drive an output, the tools treat it as unnecessary and remove it from the design by default. The result was that the synthesis phase would succeed, but the implementation (map, place, and route) would fail because it would trim out all of the logic. Once I was able to actually get something to place and route successfully, troubleshooting was a challenge because I only had 8 LEDs and a reset button to interface with. 
The result was that sometimes things would appear to be working, but as I continued to troubleshoot, I would learn that they weren’t in fact doing what I thought. Another anomaly made troubleshooting difficult as well. This was the fact that every time I implemented the design (after a small VHDL change, for example) the mapping and place and route were not the same as the previous time. This meant that between minor modifications, results could vary drastically.
14
Implementation Solutions to some obstacles
To prevent the Xilinx tool from removing the circuit Each net has to be assigned a “KEEP” attribute with a value of TRUE Disable equivalent register removal Disable optimization properties Disable trim unconnected signals *This allowed me to actually synthesize the design in hardware To create more consistent results between implementations on the same device Limit max fanout to 5 Create area constraints for each Ring Oscillator *This simplified my troubleshooting efforts I was able to overcome most of these obstacles. First, to prevent the Xilinx tool from removing the circuit, I had to assign a KEEP attribute to each net and set the value to TRUE. In addition, I had to tweak synthesis and implementation properties, including disable equivalent register removal, disable optimization properties, and disable the trim unconnected signals feature. This allowed me to actually synthesize the design in the hardware. Second, to create more consistent results between implementations on the same device (after making small schematic or VHDL changes) I limited the max fanout to 5 and created area constraints for each ring oscillator. This simplified my troubleshooting efforts because between each change, I could more accurately see the effect of the change.
15
Implementation Placed and routed design
This image just shows the placed and routed design. 8 main blocks can be seen. These are the locations to which I constrained the Ring Oscillators.
16
Results I tested my PUF implementation on two different development boards Tests for the reference were completed at ambient temperature with nominal power inputs Through troubleshooting, I found Oscillators were indeed oscillating at different frequencies To verify this, I simply changed the VHDL to check for count equality instead of inequality and verified that each comparison consistently produced a FALSE value Varying the oscillation time before checking the PUF output did not seem to make a large difference Whether I waited several thousand clock cycles or 40 seconds, the output seemed to have the same consistency For this reason, I chose a shorter oscillation period such that the counter would not cycle back to 0x0000 and result in potentially inconsistent comparison results Results were most consistent when the ring oscillator counter is always enabled and when the ring oscillator feed input is always tied to 1 The alternative was to only activate these inputs during the oscillation time before checking for inequality As I mentioned before, I obtained two development boards so that I could compare results between the two boards. Up to this point, all tests have been conducted at ambient temperature with nominal power inputs. Through troubleshooting, I was able to determine that the oscillators were indeed oscillating at different frequencies. To verify this, I simply changed the VHDL to check for count equality instead of inequality and verified each comparison consistently produced a false value. I was also able to determine that varying the length of time I allowed the oscillators to oscillate before capturing the count value didn’t seem to make a big difference for the output. Whether I waited several thousand clock cycles or 40 seconds, the output seemed to be the same with the same consistency. 
As a result, I chose a shorter oscillation period such that the counter would not cycle back to 0x0000 because I worried that this had higher potential to produce inconsistent results. Finally, I found that the results were the most consistent when the ring oscillator counter was always enabled and when the feed input was always tied to 1. For this reason, both these inputs were held high and constant. The alternative was to only bring these inputs high during the oscillation time before checking for inequality, but as I stated, these results were less consistent.
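The concern about the counter cycling back to 0x0000 can be made concrete with a small model. The sketch below assumes a 16-bit counter (the schematic slide mentions a 16-bit counter) and uses a purely hypothetical 100 MHz oscillator frequency, since the transcript does not report measured frequencies; the function names are illustrative.

```python
def wrapped_count(true_cycles: int, width: int = 16) -> int:
    """Value left in a free-running counter of the given width."""
    return true_cycles % (1 << width)


def compare_bit(count_a: int, count_b: int) -> int:
    """PUF comparison rule: A > B gives 1, otherwise 0."""
    return 1 if count_a > count_b else 0


def max_safe_window_s(freq_hz: float, width: int = 16) -> float:
    """Longest counting window before a counter of this width wraps."""
    return ((1 << width) - 1) / freq_hz


# Oscillator A is faster, so without wraparound the bit is 1.
print(compare_bit(65_540, 65_530))
# After a 16-bit wrap, A reads 4 and B reads 65530, so the bit flips to 0.
print(compare_bit(wrapped_count(65_540), wrapped_count(65_530)))
# Hypothetical 100 MHz oscillator: the counter wraps after about 655 microseconds.
print(max_safe_window_s(100e6))
```

In this model a wrapped comparison is still deterministic for fixed counts, but the result then depends on exactly where each counter sits relative to the wrap point, which is one plausible reason to prefer a window short enough to avoid wrapping.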
17
Results
Using the same bit file, each board produced a unique 8-bit output; however, they only differed by one bit
Board 1 produced the output
Board 2 produced the output
The results were not very consistent, even at ambient conditions
Out of 100 trials, Board 1 produced a different output 10 times
  Two bits were not consistent
  Bit 6 varied 3 times out of 100
  Bit 2 varied 7 times out of 100
Out of 100 trials, Board 2 produced a different output 40 times
  Bit 6 varied 2 times out of 100
  Bit 4 varied 39 times out of 100
  NOTE: On one trial, both bit 6 and bit 4 varied
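The Board 2 tallies are consistent with each other: 2 + 39 flips, minus the one trial in which both bits flipped, gives 40 differing outputs. The Python sketch below shows one way such tallies could be computed from a list of trial outputs. The trial list is synthetic, built only to reproduce the reported counts; the reference value 0x00 is a placeholder because the actual board outputs are not preserved in this transcript, and numbering bit 0 as the least significant bit is an assumption.

```python
from typing import List, Sequence, Tuple


def flip_stats(reference: int, trials: Sequence[int],
               width: int = 8) -> Tuple[int, List[int]]:
    """Return (number of trials that differ, flips per bit position)."""
    mask = (1 << width) - 1
    differing = 0
    per_bit = [0] * width
    for value in trials:
        diff = (value ^ reference) & mask
        if diff:
            differing += 1
        for bit in range(width):
            if (diff >> bit) & 1:
                per_bit[bit] += 1
    return differing, per_bit


REFERENCE = 0x00          # placeholder, not the real Board 2 output
BIT6, BIT4 = 1 << 6, 1 << 4

# Synthetic trials matching the reported Board 2 tallies.
trials = ([REFERENCE ^ (BIT6 | BIT4)] * 1   # both bits flipped once
          + [REFERENCE ^ BIT6] * 1          # bit 6 alone
          + [REFERENCE ^ BIT4] * 38         # bit 4 alone
          + [REFERENCE] * 60)               # stable trials

differing, per_bit = flip_stats(REFERENCE, trials)
print(differing, per_bit[6], per_bit[4])
```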
18
Status & Future Work Since I haven’t been able to obtain consistent results at ambient, I have not experimented with any environmental variations To improve the consistency of the results, future work would include Calculate the actual frequency produced by each oscillator Select those oscillators with frequencies that are farther apart for comparison Pro: Results will be more consistent and presumably less susceptible to environment variations Con: This means that each time a PUF is implemented in hardware, it requires manual tweaking to ensure consistency
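The proposed pair selection could look something like the following Python sketch. It is one possible approach under stated assumptions, not something implemented in the project: the function name `select_pairs`, the pairing strategy (sort by frequency, then pair each oscillator in the slower half with the corresponding one in the faster half), and the frequency values are all hypothetical.

```python
from typing import List, Sequence, Tuple


def select_pairs(freqs_hz: Sequence[float],
                 min_gap_hz: float) -> List[Tuple[int, int]]:
    """Pick oscillator pairs whose frequencies differ by at least min_gap_hz.

    Each pair is returned as (lower index, higher index) so that the
    resulting bit still depends on which of the two is faster on a
    given device.
    """
    if len(freqs_hz) % 2 != 0:
        raise ValueError("need an even number of oscillators")
    order = sorted(range(len(freqs_hz)), key=lambda i: freqs_hz[i])
    half = len(order) // 2
    pairs = []
    for k in range(half):
        slow, fast = order[k], order[k + half]
        if freqs_hz[fast] - freqs_hz[slow] >= min_gap_hz:
            pairs.append((min(slow, fast), max(slow, fast)))
    return pairs


# Hypothetical measured frequencies for 8 oscillators (arbitrary units).
freqs = [1000, 1030, 1010, 1100, 1040, 1020, 1080, 1005]
print(select_pairs(freqs, min_gap_hz=32))
```

Because the selected pairs differ from device to device, they would have to be determined and stored per device, which is the manual tweaking cost named in the "Con" above.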
19
Status & Future Work To improve the randomness of results across different hardware Implement the Ring Oscillator as a hard macro Don’t put area constraints on the place and route tool Pro: This will make the output unique for each piece of hardware in which the PUF is implemented Con: Can’t guarantee consistency To improve practicality Expand the circuit to generate a 128 bit key instead of an 8 bit key Set up a challenge-response based authentication scheme and use board communication channel Cascade two boards together to determine feasibility of a system- level signature
20
Conclusion Although I haven’t been able to obtain consistent results to date, I can see that a Ring Oscillator PUF could be used to generate a unique hardware ID. However, implementation difficulty has been over-simplified by researchers. From my project experience, I don’t believe consistent results can be obtained without manual intervention and significant testing This results in added production costs Environmental variations would only exacerbate the problem Adding multiplexors to select which oscillators to compare would make a challenge-response authentication scheme possible Again, manual intervention and testing would be required during production to ensure adequate results
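The multiplexor idea in the conclusion can be sketched as follows. This is a hypothetical design, not part of the project: it assumes 16 oscillators, an 8-bit challenge whose upper four bits select oscillator A and whose lower four bits select oscillator B, and the same "A greater than B" rule used for the signature bits. The function names and count values are illustrative.

```python
from typing import List, Sequence


def response_bit(counts: Sequence[int], challenge: int) -> int:
    """One response bit for an 8-bit challenge selecting two of 16 oscillators."""
    if len(counts) != 16:
        raise ValueError("this sketch assumes exactly 16 oscillators")
    if not 0 <= challenge <= 0xFF:
        raise ValueError("challenge must fit in 8 bits")
    a = (challenge >> 4) & 0xF   # upper nibble drives the first multiplexor
    b = challenge & 0xF          # lower nibble drives the second multiplexor
    return 1 if counts[a] > counts[b] else 0


def respond(counts: Sequence[int], challenges: Sequence[int]) -> List[int]:
    """Response bits for a sequence of challenges."""
    return [response_bit(counts, c) for c in challenges]


# Hypothetical counts: oscillator i has count 100 + i.
demo_counts = list(range(100, 116))
print(respond(demo_counts, [0x10, 0x01, 0x33]))
```

A challenge that selects the same oscillator twice (0x33 here) always returns 0, and pairs with nearly equal frequencies would give unstable responses, so a practical scheme would likely restrict the set of allowed challenges per device.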
21
Questions? Please contact Michelle Dickson michelle.k.dickson@lmco.com