Published by Zachary Gallegos. Modified over 11 years ago.
1
Deep Packet Inspection: Which Implementation Platform?
Sarang Dharmapurikar, Cisco
2
Implementation Platform
Several choices, each with pros and cons
– ASICs
– FPGAs
– Network processors
– Graphics processors (NVIDIA)
– Multi-core, multi-threaded commodity processors
Needs evaluation with respect to
– Cost
– Speed
– Overall system performance (DPI is just a small piece of the puzzle)
– Ease of use and upgrading
A hardware-software co-design approach
– Profile a DPI system and push some components into hardware if the overall speedup is worthwhile (Amdahl's law)
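The Amdahl's law check mentioned above can be sketched in a few lines. This is a minimal illustration, not a measurement: the function name `amdahl_speedup` and the 30% / 10x figures are assumptions made up for the example, not numbers from the talk.

```python
def amdahl_speedup(fraction_accelerated: float, component_speedup: float) -> float:
    """Overall speedup when a fraction of the runtime is made faster.

    fraction_accelerated: share of total time spent in the offloaded component (0..1).
    component_speedup:    how many times faster that component becomes in hardware.
    """
    if not 0.0 <= fraction_accelerated <= 1.0:
        raise ValueError("fraction_accelerated must be in [0, 1]")
    if component_speedup <= 0:
        raise ValueError("component_speedup must be positive")
    # Amdahl's law: the untouched part still runs at the old speed.
    return 1.0 / ((1.0 - fraction_accelerated)
                  + fraction_accelerated / component_speedup)


# Hypothetical profile: pattern matching is 30% of per-packet time and a
# hardware engine makes that part 10x faster.
print(round(amdahl_speedup(0.30, 10.0), 2))  # prints 1.37
```

Under these assumed numbers the whole system gets only about 1.37x faster, which is why the profile, not the raw speed of the accelerator, should drive the decision.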
3
ASIC
Examples: ClassiPi, NetLogic, Tarari, some Cisco ASICs
Requires too much investment
– Non-recurring engineering (NRE) cost is close to a million dollars
A long design cycle
– Most of the time is consumed in verification
Hard to upgrade
– Algorithms evolve
– It is hard to build a flexible enough ASIC
Applications get locked to a platform
– Migrating to a new platform requires a lot of software rewriting
4
FPGA
Very flexible but expensive and power-hungry
– Virtex-5 offers 330,000 lookup table units
– 4 MB of SRAM
The latest Xilinx FPGAs contain multiple PowerPC cores
Possible to design hybrid hardware/software systems
– Components that assist DPI, such as TCP reassembly, normalization, and flow classification, are done in hardware
Several FPGA platforms for networking acceleration are available today
– NetFPGA
– FPX
Need to be careful with the DPI approach
– Raw signature-matching techniques that spend FPGA logic resources on each signature won't scale
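To make the scaling concern concrete, one common alternative to per-signature logic is a single table-driven automaton held in memory, so that adding signatures grows a table rather than the matching logic. The sketch below uses the well-known Aho-Corasick construction as an illustration; it is not necessarily the approach the talk has in mind, and the names `build_matcher` and `scan` are hypothetical.

```python
from array import array
from collections import deque

ALPHABET = 256  # one table column per byte value


def build_matcher(signatures):
    """Build an Aho-Corasick DFA stored as one flat transition table.

    Returns (table, outputs): table[state * 256 + byte] is the next state,
    outputs[state] is the set of signature indices that end in that state.
    """
    # Phase 1: insert every signature into a trie; -1 marks a missing edge.
    table = array("i", [-1] * ALPHABET)
    outputs = [set()]
    for index, signature in enumerate(signatures):
        if not signature:
            raise ValueError("empty signature")
        state = 0
        for byte in signature:
            nxt = table[state * ALPHABET + byte]
            if nxt == -1:
                nxt = len(outputs)              # allocate a new state row
                table[state * ALPHABET + byte] = nxt
                table.extend([-1] * ALPHABET)
                outputs.append(set())
            state = nxt
        outputs[state].add(index)

    # Phase 2: breadth-first pass that fills missing edges via failure links,
    # so scanning needs exactly one table lookup per input byte.
    fail = [0] * len(outputs)
    queue = deque()
    for byte in range(ALPHABET):
        nxt = table[byte]
        if nxt == -1:
            table[byte] = 0                     # root loops to itself
        else:
            queue.append(nxt)                   # depth-1 states fail to root
    while queue:
        state = queue.popleft()
        outputs[state] |= outputs[fail[state]]  # inherit suffix matches
        base, fbase = state * ALPHABET, fail[state] * ALPHABET
        for byte in range(ALPHABET):
            nxt = table[base + byte]
            if nxt == -1:
                table[base + byte] = table[fbase + byte]
            else:
                fail[nxt] = table[fbase + byte]
                queue.append(nxt)
    return table, outputs


def scan(table, outputs, payload):
    """Yield (end_offset, signature_index) for every match in payload."""
    state = 0
    for offset, byte in enumerate(payload):
        state = table[state * ALPHABET + byte]
        for index in sorted(outputs[state]):
            yield offset + 1, index


table, outputs = build_matcher([b"he", b"she", b"his", b"hers"])
for end, index in scan(table, outputs, b"ushers"):
    print(end, index)
```

Each new signature adds at most one 256-entry row per byte, so the cost of growth is memory, which is the resource the FPGA and commodity platforms above have in quantity. The uncompressed table is wasteful; practical designs compress it, which this sketch does not attempt.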
5
Network Processors
Intel IXP2850
– 16 micro-engines with 2 KB D$, 8 KB I$, and a 16-entry CAM
– An integrated XScale processor for the control path, with 32 KB I$ and 32 KB D$
– 2 crypto units
– 16 KB shared scratchpad SRAM
Cisco QuantumFlow processor
– 40 packet processing engines (PPEs), each at 1.2 GHz
– 4 threads per PPE
– Dedicated hardware for queuing, buffering, IP lookup, and classification
6
Commodity processors
Really powerful server-class processors are coming up
– Intel's Nehalem: 8 cores, 2 threads per core; 32 KB L1, 256 KB L2, 10+ MB of shared L3 cache
– Sun's Niagara 2: 8 cores, 8 threads per core; 16 KB I$ and 8 KB D$ per core, 4 MB shared L2 cache; integrated cryptographic coprocessor units
Need to think multi-core, multi-threaded
– Think in terms of a complete system, not just pattern matching
– Which core should do what?
Need to design cache-friendly data structures
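One possible answer to "which core should do what?" is to pin each flow to a single worker, so that per-flow state such as TCP reassembly buffers stays in one core's cache. The sketch below is an assumption-laden illustration: `FiveTuple`, `flow_key`, and `pick_worker` are hypothetical names, and CRC32 stands in for whatever hash a real system would use.

```python
import zlib
from collections import namedtuple

# Hypothetical packet header summary used only for this example.
FiveTuple = namedtuple("FiveTuple", "src_ip dst_ip src_port dst_port protocol")


def flow_key(pkt: FiveTuple) -> bytes:
    """Direction-independent key: both halves of a connection map to it."""
    # Order the two endpoints so that A->B and B->A produce the same key.
    low, high = sorted([(pkt.src_ip, pkt.src_port), (pkt.dst_ip, pkt.dst_port)])
    return f"{low[0]}:{low[1]}|{high[0]}:{high[1]}|{pkt.protocol}".encode()


def pick_worker(pkt: FiveTuple, num_workers: int) -> int:
    """Choose the worker (core or hardware thread) that owns this flow."""
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    return zlib.crc32(flow_key(pkt)) % num_workers


request = FiveTuple("10.0.0.1", "192.168.1.9", 40000, 80, "tcp")
reply = FiveTuple("192.168.1.9", "10.0.0.1", 80, 40000, "tcp")
# Both directions of the connection land on the same worker.
print(pick_worker(request, 16) == pick_worker(reply, 16))  # prints True
```

Hashing keeps the dispatch stateless, at the price of uneven load when a few flows dominate the traffic; whether that trade is acceptable depends on the traffic mix.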
7
Conclusion
While hardware can assist DPI systems, building proprietary hardware is not a good idea
Let's understand the actual performance needs
– Let's not be misguided by marketing needs
Need to think of hardware-software co-design
– Requires careful profiling of DPI systems to identify the components that can be pushed to hardware
Need to design algorithms for multi-core, multi-threaded processors