Published by Hendra Hermanto. Modified over 5 years ago.
1
DPDK: Prevention & Detection of DP/CP memory corruption
Amol Patel, Sivaprasad Tumalla, 9 March 2018
Quick intro: Sys App Eng Manager at Intel; Sivaprasad is part of my team. Our background is in Network Processor based Dataplane packet-processing applications, and we are now transitioning to DPDK-based DP packet processing on x86 platforms. We recently ran into a DP memory corruption issue that took us 4 weeks to debug.
2
Background
Most custom packet-processing hardware, i.e. Network Processors, provides mechanisms to:
- Separate Dataplane and Controlplane memory
- Protect the Dataplane memory and detect Dataplane memory corruption/configuration issues
We are proposing something similar for DPDK applications running on a GPP (General Purpose Processor).
3
Background
- DPDK applications run on a multi-threaded model: the main process spawns worker threads for the Dataplane processing.
- This is a flat memory architecture in which the control thread's and the DP worker threads' memory are part of the same address space, so it is difficult to isolate the issue when a Dataplane/Controlplane data structure is corrupted.
- We need ways to prevent and detect Controlplane/Dataplane memory corruption.
[Figure: DPDK primary process memory map. Primary process memory: text, heap, Stack-1 to Stack-n for Thread-1 to Thread-n, shared libraries. Hugepage mmap: Huge Page Segments 1 to 3, containing Mem-zone areas 1 and 2.]
4
Ways to Prevent and Detect DP/CP Memory corruption
Prevention
- Control-plane memory (stack) protection
- Dataplane memory protection
- Controlplane and Dataplane memory access control
Detection
- Dataplane memory corruption detection
- DPDK mem-zone dump
5
Prevention
- Controlplane stack protection
- Dataplane memory protection
- Controlplane and Dataplane memory access control
6
Controlplane memory Protection : Existing memory model
- Worker threads are Linux threads.
- Stack memory for the worker threads is allocated in the primary process's address space.
- Thread-level MMU protection does NOT exist.
- All worker threads of the primary process have access to the stack memory of all the other threads, so one worker thread can corrupt another worker thread's stack.
[Figure: existing memory model. Primary process memory: text, heap, Stack-1 to Stack-n for Thread-1 to Thread-n, shared libraries. Hugepage mmap: Huge Page Segments 1 to 3, containing Mem-zone areas 1 and 2.]
7
Worker thread Stack Protection: Proposed Memory Model
Mechanism:
- Allocate the worker thread's stack on a mem-zone.
- Assign the allocated mem-zone memory as the worker thread's stack address: int pthread_attr_setstack(pthread_attr_t *attr, void *stackaddr, size_t stacksize);
Advantages:
- Stricter control over any possible out-of-bound memory access for the stack variables
- Localizes the effect of stack corruption to the thread
- Ability to dump the thread's stack contents by dumping the mem-zone
[Figure: proposed memory model. Primary process memory: text, heap, shared libraries. Hugepage mmap: Huge Page Segments 1 to 3, with Stack-1 to Stack-n of Worker Thread-1 to Worker Thread-n placed in the mem-zone areas.]
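The stack assignment can be sketched in plain POSIX C. This is a minimal sketch, not the presenters' implementation: an anonymous mmap region stands in for the mem-zone memory, and `run_worker_on_region`, `worker_main`, `struct stack_probe` and `STACK_SIZE` are hypothetical names chosen for illustration. In a DPDK application the region would instead come from a reserved mem-zone.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define STACK_SIZE (256 * 1024) /* assumed size, page-aligned multiple */

/* Hypothetical probe: the worker reports where one of its locals lives. */
struct stack_probe {
    uintptr_t local_addr;
};

static void *worker_main(void *arg)
{
    struct stack_probe *probe = arg;
    int on_stack = 0;

    probe->local_addr = (uintptr_t)&on_stack;
    return NULL;
}

/*
 * Runs one worker on a caller-provided stack region.
 * Returns 1 if the worker's local variable lay inside the region,
 * 0 if it did not, and -1 on any setup error.
 */
int run_worker_on_region(void)
{
    struct stack_probe probe = { 0 };
    pthread_attr_t attr;
    pthread_t tid;
    int rc = -1;

    /* Stand-in for mem-zone memory (assumption: not a real rte_memzone). */
    void *region = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    if (pthread_attr_init(&attr) != 0)
        goto out_unmap;
    /* stackaddr is the lowest address of the region. */
    if (pthread_attr_setstack(&attr, region, STACK_SIZE) != 0)
        goto out_attr;
    if (pthread_create(&tid, &attr, worker_main, &probe) != 0)
        goto out_attr;
    pthread_join(tid, NULL);

    rc = probe.local_addr >= (uintptr_t)region &&
         probe.local_addr < (uintptr_t)region + STACK_SIZE;

out_attr:
    pthread_attr_destroy(&attr);
out_unmap:
    munmap(region, STACK_SIZE);
    return rc;
}
```

Because the application owns the region, it knows the exact stack bounds of each worker, which is what makes dumping or bounds-checking a single thread's stack practical.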
8
DP Memory Protection: Existing DPDK Dataplane Memory Layout
Mechanism: The actual tables are allocated using rte_lpm_create/rte_hash_create, which internally invoke the DPDK allocators (rte_malloc, rte_calloc, rte_zmalloc) to allocate memory from the hugepages.
Disadvantages:
- Memory allocated for the different Dataplane tables might be scattered across different memory segments/huge pages.
- There is no direct way to check for out-of-bound writes to the Dataplane tables.
- There is no way to dump the contents of all the Dataplane tables and counters in one go.
The ability to dump all the Dataplane tables in one go is important for troubleshooting packet drop issues.
[Figure: existing Dataplane memory layout. Mem-zones 1 and 2 and RTE_CONFIG sit in Huge Page Segments 1 and 2; the malloc, calloc and zmalloc regions holding the DP tables, DP caches, DP counters and other tables are spread over further huge page segments; text, heap, thread stacks and shared libraries are in the primary process memory.]
9
DP Memory Protection: Proposed DPDK DP Memory Allocation using memory zones
Mechanism:
- One mem-zone is reserved for all the dataplane tables and data structures; all the DP tables are placed on this mem-zone by user-defined wrappers.
Advantages:
- Dataplane memory separation
- The DPDK memory zone dump APIs can dump all the DP tables, for example to find the reason for a packet drop
- Eases troubleshooting of memory and configuration issues
- Guard-bands allow checking for any out-of-bound access
Disadvantages:
- Additional wrappers are needed for allocating the Dataplane tables (LPM, Hash, counters) on the mem-zones
[Figure: proposed layout. A DP-Mem-zone in the hugepage area holds DP table1, DP table2 and the DP counters, separated by guard-bands; the rte_malloc/rte_calloc regions sit in other huge page segments; text, heap, thread stacks and shared libraries remain in the primary process memory.]
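One way to picture the user-defined wrappers is a small carving allocator over a single contiguous region, writing a guard-band after every table. The sketch below is an assumption-laden illustration: `struct dp_zone`, `dp_zone_init`, `dp_zone_alloc`, `GUARD_BYTES` and `GUARD_BYTE` are hypothetical names, and the backing memory is whatever the caller passes in. In a DPDK application it would presumably be the address of a zone obtained with `rte_memzone_reserve()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUARD_BYTES 64   /* assumed guard-band size */
#define GUARD_BYTE  0xA5 /* assumed fill pattern */
#define ZONE_ALIGN  64   /* assumed table alignment */

/* Hypothetical descriptor for the single dataplane zone. */
struct dp_zone {
    uint8_t *base; /* start of the reserved region */
    size_t len;    /* total bytes in the region */
    size_t used;   /* bytes already carved out */
};

void dp_zone_init(struct dp_zone *z, void *mem, size_t len)
{
    z->base = mem;
    z->len = len;
    z->used = 0;
}

/*
 * Carves `size` bytes for one table and writes a guard-band directly
 * after it. Returns NULL when the table plus guard does not fit.
 */
void *dp_zone_alloc(struct dp_zone *z, size_t size)
{
    size_t remaining = z->len - z->used;

    if (size > remaining || GUARD_BYTES > remaining - size)
        return NULL;

    uint8_t *table = z->base + z->used;
    memset(table, 0, size);
    memset(table + size, GUARD_BYTE, GUARD_BYTES);

    /* Advance past table and guard, then round up for the next table. */
    z->used += size + GUARD_BYTES;
    z->used = (z->used + ZONE_ALIGN - 1) & ~(size_t)(ZONE_ALIGN - 1);
    if (z->used > z->len)
        z->used = z->len;
    return table;
}
```

Placing the guard immediately after the requested size, rather than after the alignment padding, means a write one byte past the table already lands in the guard-band.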
10
CP / DP Memory Access Control
- Worker threads access the dataplane-table memzones in read-only mode (the mmap call supports RO access; DPDK would have to add this support).
- In the worker threads, all dataplane pointers MUST be const-qualified to avoid accidental corruption.
- Use the "restrict" qualifier on read-only function arguments. Note that restrict only promises that the pointer is not aliased; const is what makes the compiler reject writes through it.
- Dataplane code must not change any dataplane entries except the fastpath counters.
- The Controlplane tracks the number of entries in each DP table and returns an error once it reaches MAX; this avoids overstepping into the next DP table.
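The read-only idea can be sketched outside DPDK by mapping the same backing file twice: once read-write for the Controlplane and once read-only for the Dataplane. Everything named here (`struct dp_table`, `struct dp_views`, `dp_views_create`, `cp_add_entry`, `dp_lookup`, `DP_MAX_ENTRIES`) is hypothetical, and a temporary file stands in for the hugepage file. Since all threads share one address space, the read-only view protects only code that was handed the `const` pointer; it is not a per-thread MMU barrier.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define DP_MAX_ENTRIES 4 /* assumed table capacity */

struct dp_table {
    uint32_t num_entries;
    uint32_t next_hop[DP_MAX_ENTRIES];
};

/* Two views of the same table memory. */
struct dp_views {
    struct dp_table *cp_rw;       /* Controlplane: read-write */
    const struct dp_table *dp_ro; /* Dataplane: read-only mapping */
};

int dp_views_create(struct dp_views *v)
{
    FILE *backing = tmpfile(); /* stand-in for a hugepage file */
    if (backing == NULL)
        return -1;

    int fd = fileno(backing);
    if (ftruncate(fd, sizeof(struct dp_table)) != 0) {
        fclose(backing);
        return -1;
    }

    void *rw = mmap(NULL, sizeof(struct dp_table),
                    PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void *ro = mmap(NULL, sizeof(struct dp_table),
                    PROT_READ, MAP_SHARED, fd, 0);
    fclose(backing); /* the mappings stay valid after the close */

    if (rw == MAP_FAILED || ro == MAP_FAILED) {
        if (rw != MAP_FAILED)
            munmap(rw, sizeof(struct dp_table));
        if (ro != MAP_FAILED)
            munmap(ro, sizeof(struct dp_table));
        return -1;
    }
    v->cp_rw = rw;
    v->dp_ro = ro;
    return 0;
}

/* Controlplane insert: refuses to grow past MAX. */
int cp_add_entry(struct dp_table *t, uint32_t next_hop)
{
    if (t->num_entries >= DP_MAX_ENTRIES)
        return -1;
    t->next_hop[t->num_entries++] = next_hop;
    return 0;
}

/* Dataplane lookup: const pointer, so writes do not compile. */
uint32_t dp_lookup(const struct dp_table *t, uint32_t idx)
{
    return idx < t->num_entries ? t->next_hop[idx] : UINT32_MAX;
}
```

A stray store through `dp_ro` would fault instead of silently corrupting the table, while the entry-count check in `cp_add_entry` covers the Controlplane side. Synchronization between the writer and the readers is deliberately left out of this sketch.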
11
Detection (Debugging / Troubleshooting)
- Dataplane memory corruption detection
- DPDK mem-zone dump
12
Dataplane Memory Corruption Detection
- A single DP memzone holds the dataplane objects (forwarding tables, counters, structures, etc.); a few extra rows are reserved for every DP table (Max + N), and these serve as the guard-band.
- Monitor the guard-bands between the individual DP tables to check for any out-of-bound access.
- Use DPDK utilities to dump the memzones, and with them the DP tables and counters, for debugging.
- An application test framework can check for out-of-bound access to the DP tables by monitoring the memzone.
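The Max + N guard rows can be illustrated with a table that reserves a few extra rows, fills them with a known pattern, and scans them later. This is a sketch under assumed names and sizes (`struct dp_guarded_table`, `dp_table_init`, `dp_table_check_guard`, `MAX_ROWS`, `GUARD_ROWS`, `GUARD_WORD`); a real table row would be a structure rather than a single word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ROWS   8           /* assumed usable rows (Max) */
#define GUARD_ROWS 2           /* assumed extra rows (N) */
#define GUARD_WORD 0xDEADBEEFu /* assumed guard pattern */

/* Hypothetical DP table with Max + N rows. */
struct dp_guarded_table {
    uint32_t rows[MAX_ROWS + GUARD_ROWS];
};

/* Zero the usable rows and stamp the guard rows. */
void dp_table_init(struct dp_guarded_table *t)
{
    memset(t->rows, 0, sizeof(uint32_t) * MAX_ROWS);
    for (size_t i = MAX_ROWS; i < MAX_ROWS + GUARD_ROWS; i++)
        t->rows[i] = GUARD_WORD;
}

/*
 * Returns the index of the first damaged guard row,
 * or -1 when the guard-band is intact.
 */
int dp_table_check_guard(const struct dp_guarded_table *t)
{
    for (size_t i = MAX_ROWS; i < MAX_ROWS + GUARD_ROWS; i++) {
        if (t->rows[i] != GUARD_WORD)
            return (int)i;
    }
    return -1;
}
```

A monitoring thread or a test framework could call the check periodically; the returned index tells how far past the last valid row the stray write landed.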
13
4 Stage Memzone Dumping Process
1. dpdk-procinfo -m : dump the memzone physical and virtual addresses
2. cat /proc/<pid>/maps : dump the process's virtual memory map, including the hugepage mappings
3. Identify the memzone's hugepage mapping by comparing the virtual addresses
4. hexdump /mnt/huge/rtemap_N : dump the memzone content
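The address comparison in the third stage can also be automated. The sketch below scans a maps file for the line whose range covers a given virtual address; `find_mapping` is a hypothetical helper, and it assumes the standard Linux `/proc/<pid>/maps` line format that begins with `start-end` in hexadecimal. For a memzone, the address would be the virtual address reported by the memzone dump, and the matching line would name the hugepage file to hexdump.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Finds the maps line whose [start, end) range covers `addr` and
 * copies it into `out`.
 * Returns 1 if found, 0 if no mapping covers the address,
 * and -1 if the maps file cannot be opened.
 */
int find_mapping(const char *maps_path, uintptr_t addr,
                 char *out, size_t out_len)
{
    FILE *f = fopen(maps_path, "r");
    if (f == NULL)
        return -1;

    char line[1024];
    int found = 0;

    while (fgets(line, sizeof line, f) != NULL) {
        uintmax_t start, end;

        /* Each line starts with "start-end" in hexadecimal. */
        if (sscanf(line, "%jx-%jx", &start, &end) != 2)
            continue;
        if (addr >= start && addr < end) {
            snprintf(out, out_len, "%s", line);
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Called with `/proc/self/maps` and the address of a local variable, it returns the stack mapping of the calling thread; pointing it at another process's maps file requires permission to read that file.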
14
Detection - DPDK Memzone Dump: dpdk-procinfo -m
15
cat /proc/<pid>/maps
16
Troubleshooting – DPDK Memzone Dump : hexdump /mnt/huge/rtemap_
17
Questions?