DPDK: Prevention & Detection of DP/CP memory corruption

Presentation transcript:

DPDK: Prevention & Detection of DP/CP memory corruption
Amol Patel, Sivaprasad Tumalla
9 March 2018
Quick intro: Sys App Eng Manager at Intel; Sivaprasad is part of my team. Background in Network Processor based Dataplane packet-processing applications, now transitioning to DPDK-based DP packet processing on x86 platforms. We recently ran into a DP memory corruption issue that took us 4 weeks to debug.

Background
Most custom packet-processing hardware, i.e. Network Processors, provides mechanisms to:
- separate Dataplane and Controlplane memory
- protect against and detect Dataplane memory corruption and configuration issues
We are proposing something similar for DPDK applications running on a GPP (General Purpose Processor).

Background
- DPDK applications run on a multi-threaded model: the main process spawns worker threads for the Dataplane processing.
- This is a flat memory architecture in which the control thread's and the DP worker threads' memory is part of the same address space, so it is difficult to isolate the issue when a Dataplane/Controlplane data structure is corrupted.
- We need ways to prevent and detect Controlplane/Dataplane memory corruption.
[Figure: DPDK primary process memory map. Shared libraries; hugepage mmap (Huge Page Segments 1-3, Mem-zone areas 1-2); stacks of Thread-1 to Thread-n; heap; text.]

Ways to Prevent and Detect DP/CP Memory corruption
Prevention
- Control-plane memory (stack) protection
- Dataplane memory protection
- Controlplane and Dataplane memory access control
Detection
- Dataplane Memory Corruption Detection
- DPDK Mem-zone Dump

Prevention
- Controlplane stack protection
- Dataplane memory protection
- Controlplane and Dataplane memory access control

Controlplane memory Protection: Existing memory model
- Worker threads are Linux threads.
- Stack memory for the worker threads is allocated in the primary process's address space.
- Thread-level MMU protection does NOT exist.
- Every worker thread of the primary process has access to the stack memory of all the other threads, so one worker thread can corrupt another worker thread's stack.
[Figure: primary process memory. Shared libraries; hugepage mmap (Huge Page Segments 1-3, Mem-zone areas 1-2); stacks of Thread-1 to Thread-n; heap; text.]

Worker thread Stack Protection: Proposed Memory Model
Mechanism:
- Allocate the worker thread's stack on a mem-zone.
- Assign the allocated mem-zone memory as the worker thread's stack address:
  int pthread_attr_setstack(pthread_attr_t *attr, void *stackaddr, size_t stacksize);
Advantages:
- Stricter control over any possible out-of-bound memory access for the stack variables
- Localizes the effect of a stack corruption to the thread
- Ability to dump the thread's stack contents by dumping the mem-zone
[Figure: proposed primary process memory. Shared libraries; hugepage mmap in which the stacks of Worker Thread-1 to Worker Thread-n sit inside Mem-zone areas on the Huge Page Segments; heap; text.]
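A minimal sketch of the mechanism above, under stated assumptions: the stack region is obtained with posix_memalign() as a stand-in for mem-zone memory (a DPDK application would presumably take it from a mem-zone reservation instead), and struct worker_ctx, worker_main() and run_worker_on_private_stack() are hypothetical names that do not come from the slides or from DPDK. It only shows that a thread created after pthread_attr_setstack() really runs on the caller-supplied region.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-worker bookkeeping; not a DPDK structure. */
struct worker_ctx {
    void  *stack_base;   /* lowest address of the caller-supplied stack */
    size_t stack_size;   /* size of that region in bytes */
    int    on_own_stack; /* set by the worker: 1 if its locals lie in the region */
};

static void *worker_main(void *arg)
{
    struct worker_ctx *ctx = arg;
    int local = 0;
    uintptr_t p  = (uintptr_t)&local;
    uintptr_t lo = (uintptr_t)ctx->stack_base;

    /* A local variable of this thread must fall inside the supplied region. */
    ctx->on_own_stack = (p >= lo && p < lo + ctx->stack_size);
    return NULL;
}

/* Run one worker on a stack we allocate ourselves. Returns 0 on success.
 * Assumption: posix_memalign() stands in for memory taken from a mem-zone.
 * The region is released before returning, so ctx->stack_base is only
 * informational afterwards. */
int run_worker_on_private_stack(size_t stack_size, struct worker_ctx *ctx)
{
    pthread_attr_t attr;
    pthread_t tid;
    void *stack = NULL;
    int rc;

    assert(ctx != NULL);
    if (posix_memalign(&stack, (size_t)sysconf(_SC_PAGESIZE), stack_size) != 0)
        return -1;
    ctx->stack_base   = stack;
    ctx->stack_size   = stack_size;
    ctx->on_own_stack = 0;

    rc = pthread_attr_init(&attr);
    if (rc == 0) {
        /* stackaddr is the lowest address of the region, not the initial SP. */
        rc = pthread_attr_setstack(&attr, stack, stack_size);
        if (rc == 0)
            rc = pthread_create(&tid, &attr, worker_main, ctx);
        pthread_attr_destroy(&attr);
    }
    if (rc == 0)
        rc = pthread_join(tid, NULL);
    free(stack);
    return rc == 0 ? 0 : -1;
}
```

Calling run_worker_on_private_stack(1 << 20, &ctx) should leave ctx.on_own_stack set to 1. The stack size has to be at least PTHREAD_STACK_MIN, and the region must outlive the thread, which is why it is freed only after pthread_join().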

DP Memory Protection: Existing DPDK Dataplane Memory Layout
Mechanism:
- The actual tables are allocated with rte_lpm_create/rte_hash_create, which internally invoke DPDK's rte_malloc-family allocators (malloc, calloc, zmalloc variants) to allocate memory from the hugepages.
Disadvantages:
- Memory allocated for the different Dataplane tables might be scattered across different memory segments/huge pages.
- There is no direct way to check for out-of-bound writes to the Dataplane tables.
- There is no way to dump the contents of all the Dataplane tables and counters in one go.
The ability to dump all the Dataplane tables in one go is important for troubleshooting packet drop issues.
[Figure: existing memory layout. Mem-zones 1 and 2 (including RTE_CONFIG) and the malloc/calloc/zmalloc regions holding DP tables, DP caches, DP counters and other tables are spread across Huge Page Segments 1-5, alongside the shared libraries, the stacks of Thread-1 to Thread-n, the heap and text.]

DP Memory Protection: Proposed DPDK DP Memory Allocation using memory zones
Mechanism:
- One mem-zone is reserved for all the dataplane tables and data structures; all the DP tables are placed on this mem-zone by user-defined wrappers.
Advantages:
- Dataplane memory separation
- The DPDK memory zone dump APIs can be used to dump all the DP tables, which eases troubleshooting of memory and configuration issues (for example, dumping the tables to find the reason for a packet drop).
- The guard-band allows checking for any out-of-bound access.
Disadvantages:
- Additional wrappers are needed for allocating the Dataplane tables (LPM, Hash, counters) on the mem-zones.
[Figure: proposed memory layout. A single DP-Mem-zone on the huge page segments holds DP table1, DP table2 and the DP counters, separated by guard-bands; the rte_malloc/rte_calloc regions, shared libraries, the stacks of Thread-1 to Thread-n, the heap and text lie outside it.]
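The "user-defined wrappers" idea can be sketched roughly as follows. This is only an illustration under assumptions: struct dp_zone, dp_zone_init(), dp_zone_alloc(), the 64-byte guard width and the 0xA5 fill byte are all hypothetical, and the zone is any caller-supplied buffer, where a DPDK application would pass the address of its reserved DP mem-zone. Each allocation is followed by a poisoned guard-band so that neighbouring tables never touch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUARD_BYTES   64u    /* assumed guard-band width */
#define GUARD_PATTERN 0xA5   /* assumed poison byte */

/* Hypothetical wrapper state: one contiguous region for every DP table. */
struct dp_zone {
    uint8_t *base;   /* start of the region (assumed cache-line aligned) */
    size_t   len;    /* total size of the region */
    size_t   used;   /* bytes handed out so far, guard-bands included */
};

void dp_zone_init(struct dp_zone *z, void *mem, size_t len)
{
    z->base = mem;
    z->len  = len;
    z->used = 0;
}

/* Carve `size` bytes (rounded up to 64) out of the zone and place a poisoned
 * guard-band right behind them. Returns NULL when the zone is exhausted. */
void *dp_zone_alloc(struct dp_zone *z, size_t size)
{
    size_t aligned = (size + 63u) & ~(size_t)63u;
    size_t room;
    uint8_t *obj;

    assert(z->used <= z->len);
    room = z->len - z->used;
    if (aligned < size || room < GUARD_BYTES || aligned > room - GUARD_BYTES)
        return NULL;

    obj = z->base + z->used;
    memset(obj, 0, aligned);                            /* table starts zeroed */
    memset(obj + aligned, GUARD_PATTERN, GUARD_BYTES);  /* guard-band */
    z->used += aligned + GUARD_BYTES;
    return obj;
}
```

Because every table lives in one region, dumping that region shows all tables and their guard-bands in layout order; the cost is that table libraries which allocate internally would need wrappers that build on an allocator like this.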

CP / DP Memory Access Control
- Worker threads access the dataplane-table memzones in read-only mode. (The mmap call supports read-only access; DPDK would have to add this support.)
- In the worker threads, all dataplane pointers MUST be accessed through "const"-qualified pointers to avoid any accidental corruption.
- Qualify all read-only function arguments with "const", adding "restrict" where the pointers are known not to alias ("restrict" on its own does not make an argument read-only).
- Dataplane code must not change any dataplane entries except the fastpath counters.
- The control plane tracks the number of entries in each DP table and returns an error once it reaches MAX; this avoids overstepping into the next DP table.
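Two of the rules above, const-only access from the dataplane and control-plane entry tracking, can be illustrated with a small sketch. The table type, the capacity of 4 and the function names cp_route_add() and dp_route_lookup() are hypothetical, and the lookup is a plain exact match rather than a real LPM.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DP_ROUTE_MAX 4u   /* assumed capacity, kept small for illustration */

struct dp_route {
    uint32_t prefix;
    uint32_t next_hop;
};

struct dp_route_table {
    struct dp_route rows[DP_ROUTE_MAX];
    uint32_t        count;   /* number of valid rows, owned by the control plane */
};

/* Control plane: the only writer. Refuses to go past MAX instead of
 * overstepping into whatever is laid out behind the table. */
int cp_route_add(struct dp_route_table *t, uint32_t prefix, uint32_t next_hop)
{
    assert(t != NULL);
    if (t->count >= DP_ROUTE_MAX)
        return -ENOSPC;
    t->rows[t->count].prefix   = prefix;
    t->rows[t->count].next_hop = next_hop;
    t->count++;
    return 0;
}

/* Dataplane: receives a pointer-to-const, so a store through `t` is a
 * compile-time error rather than a silent corruption. */
int dp_route_lookup(const struct dp_route_table *t, uint32_t prefix,
                    uint32_t *next_hop)
{
    for (uint32_t i = 0; i < t->count; i++) {
        if (t->rows[i].prefix == prefix) {
            *next_hop = t->rows[i].next_hop;
            return 0;
        }
    }
    return -ENOENT;
}
```

The const qualifier only stops accidental writes through that pointer; a stray write through some other pointer still needs the read-only mapping or the guard-band check to be caught.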

Detection (Debugging / troubleshooting)
- Dataplane Memory Corruption Detection
- DPDK Mem-zone Dump

Dataplane Memory Corruption Detection
- Use a single DP memzone for the dataplane objects (forwarding tables, counters, structures etc.); every DP table reserves a few extra rows (Max + N), which serve as the guard-band.
- Monitor the guard-bands between the individual DP tables to keep a check on any out-of-bound access.
- Use the DPDK utilities to dump the memzones, and with them the DP tables and counters, for debugging.
- An application test framework can check for any out-of-bound access to the DP tables by monitoring the memzone.
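A rough sketch of the Max + N guard rows and their monitoring follows. The counter table, the sizes (8 usable rows plus 2 guard rows), the 0x5A fill byte and the function names are assumptions made for illustration; a monitor thread or test framework would call the check periodically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TBL_MAX_ROWS   8u    /* assumed usable rows (Max) */
#define TBL_GUARD_ROWS 2u    /* assumed extra rows (N) acting as guard-band */
#define GUARD_BYTE     0x5A  /* assumed fill pattern */

struct counter_row {
    uint64_t pkts;
    uint64_t bytes;
};

/* The table is allocated with Max + N rows; only the first Max are valid. */
struct counter_table {
    struct counter_row rows[TBL_MAX_ROWS + TBL_GUARD_ROWS];
};

void counter_table_init(struct counter_table *t)
{
    assert(t != NULL);
    memset(t->rows, 0, TBL_MAX_ROWS * sizeof(t->rows[0]));
    memset(&t->rows[TBL_MAX_ROWS], GUARD_BYTE,
           TBL_GUARD_ROWS * sizeof(t->rows[0]));
}

/* Returns the number of guard bytes that no longer hold the pattern;
 * 0 means nothing has written past row Max - 1. */
size_t counter_table_check_guard(const struct counter_table *t)
{
    const uint8_t *g = (const uint8_t *)&t->rows[TBL_MAX_ROWS];
    size_t n = TBL_GUARD_ROWS * sizeof(t->rows[0]);
    size_t bad = 0;

    for (size_t i = 0; i < n; i++)
        if (g[i] != GUARD_BYTE)
            bad++;
    return bad;
}
```

An off-by-one index such as rows[TBL_MAX_ROWS].pkts = 1 lands in the guard rows, and the next check reports a non-zero count. A write that happens to store the pattern value itself would go unnoticed, so the pattern is best chosen to be unlikely in real table data.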

4 Stage Memzone Dumping Process
1. dpdk-procinfo -m : dump the Memzone physical and virtual addresses
2. cat /proc/<pid>/maps : dump the process's virtual memory map -> Huge_page mapping
3. Identify the Memzone -> Hugepage mapping by comparing the virtual addresses
4. hexdump /mnt/huge/rtemap_N : dump the Memzone content
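Stages 2 and 3 amount to finding the line of the maps file whose address range contains the memzone's virtual address and reading off the backing file. A minimal, Linux-only sketch of that comparison is shown here; find_mapping() is a hypothetical helper, and the memzone address would come from the stage 1 output.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Look up `addr` in a /proc/<pid>/maps style file. On success returns 0,
 * stores the bounds of the containing mapping and copies its backing path
 * (empty for anonymous mappings) into `path`. Returns -1 if not found. */
int find_mapping(const char *maps_path, uintptr_t addr,
                 uintptr_t *start, uintptr_t *end,
                 char *path, size_t path_len)
{
    char line[4352];
    FILE *f;
    int found = -1;

    assert(path != NULL && path_len > 0);
    f = fopen(maps_path, "r");
    if (f == NULL)
        return -1;

    while (found != 0 && fgets(line, sizeof line, f) != NULL) {
        uint64_t lo, hi;
        int off = 0;

        /* Line layout: start-end perms offset dev inode [pathname] */
        if (sscanf(line, "%" SCNx64 "-%" SCNx64 " %*s %*s %*s %*s %n",
                   &lo, &hi, &off) != 2 || off == 0)
            continue;
        if (addr < lo || addr >= hi)
            continue;

        line[strcspn(line, "\n")] = '\0';
        *start = (uintptr_t)lo;
        *end   = (uintptr_t)hi;
        snprintf(path, path_len, "%s", line + off);
        found = 0;
    }
    fclose(f);
    return found;
}
```

For a memzone address inside a hugepage mapping, the returned path would be the hugepage file to hand to hexdump in stage 4, and the difference between the address and the mapping start gives the offset into that file when the mapping's own file offset is zero.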

Detection - DPDK Memzone Dump: dpdk-procinfo -m

cat /proc/<pid>/maps

Troubleshooting - DPDK Memzone Dump: hexdump /mnt/huge/rtemap_

Questions?