A look at memory issues: data transfers must occur between system memory and the network interface controller.
Typical Chipset Layout [figure: block diagram of the chipset; a second build of the slide adds a block labeled DMA]. Legend: MCH = Memory Controller Hub (Northbridge); ICH = I/O Controller Hub (Southbridge); CPU = Central Processing Unit; DRAM = Dynamic Random Access Memory; NIC = Network Interface Controller; HDC = Hard Disk Controller; AC = Audio Controller. Other blocks shown: Graphics Controller, Timer, Keyboard, Mouse, Clock, Multimedia Controller, Firmware Hub, USB controller.
PCI Bus Master DMA [figure: the 82573L’s i/o-memory holds the RX and TX FIFOs (32-KB total) and the on-chip RX and TX descriptors; the host’s Dynamic Random Access Memory holds the Descriptor Queue and the packet-buffer; DMA moves data between the two].
Memory-mapped I/O: We mentioned that Intel’s x86 architecture was originally designed with two separate address-spaces, one for memory and the other for I/O ports, unlike the CPU designs of many of Intel’s competitors, in which I/O access was “memory-mapped”. The newer Intel processors can support memory-mapped I/O as well.
Address-bus widths: The Intel Core-2 Quad processors in our classroom and Lab machines could potentially address 2^36 physical memory cells (i.e., 64GB), although only 4GB of RAM is actually installed at present. Some PCI-compliant hardware devices were designed for a 32-bit address-bus, so they must be “mapped” below 4G.
Physical-address assignments [figure: the CPU’s physical address-space, showing Dynamic Random Access Memory and the devices’ registers, which must be mapped to addresses in the bottom 4G].
Virtual addresses: Software running on the x86 processor is unable to use actual memory addresses; instead it uses ‘virtual’ addresses, which are translated to physical addresses by mapping-tables that Linux defines dynamically for each process it runs. This complicates the steps software must take to arrange for DMA to take place.
Our ‘dram.c’ module: To help us confirm that our hardware-level network software is working as we intend, or to diagnose our ‘bugs’ if it isn’t, we can use an LKM we’ve written that implements a character-mode device-driver for system memory. It lets us view the contents of physical memory as if it were a file, for example by using our ‘fileview.cpp’ tool.
How to view system memory
1. Download ‘dram.c’ from the course website
2. Compile it using our ‘mmake’ utility
3. Install ‘dram.ko’ by using ‘/sbin/insmod’
4. Ensure the ‘/dev/dram’ device-node exists
5. Download ‘fileview.cpp’ from the website and compile it with ‘g++’ (or with ‘make’)
6. Execute ‘fileview /dev/dram’ and use the arrow-keys to navigate (or hit )
“Canonical” addresses [figure: the 64-bit “virtual” address space, with an analogy using 5-bit values]. The “canonical” addresses are 0x0000000000000000 … 0x00007FFFFFFFFFFF and 0xFFFF800000000000 … 0xFFFFFFFFFFFFFFFF; the addresses in between are “non-canonical” (invalid) virtual addresses.
4-Levels of mapping [figure: CR3 locates the Page Map Level-4 Table, whose entry locates a Page Directory Pointer Table, then a Page Directory, then a Page Table, then a Page Frame (4KB); the 64-bit ‘canonical’ virtual address is divided into sign-extension, PML4, PDPT, PDIR, PTBL and offset fields]. Each mapping-table contains up to 512 quadword-size entries.
4-level address-translation: The CPU examines any virtual address it encounters, subdividing it into five fields below the sign-extension: a 9-bit index into the level-4 page-map table, a 9-bit index into the page-directory pointer table, a 9-bit index into the page-directory, a 9-bit index into the page-table, and a 12-bit offset into the page-frame. Any 48-bit virtual-address is sign-extended to a 64-bit “canonical” address. Only “canonical” 64-bit virtual-addresses are legal in 64-bit mode.
Format of 64-bit table-entries
[figure: bit-layout of an entry, from high bits to low: EXB, Reserved (must be 0), Page-frame physical base-address [39..32], Page-frame physical base-address [31..12], avl, bits whose meaning varies with the table, A, PCD, PWT, U, W, P]
Legend:
P = Present (1=yes, 0=no)
W = Writable (1=yes, 0=no)
U = User-page (1=yes, 0=no)
PWT = Page Write-Through (1=yes, 0=no)
PCD = Page Cache Disable (1=yes, 0=no)
A = Accessed (1=yes, 0=no)
avl = available for user-defined purposes
EXB = Execution-disabled Bit (if EFER.NXE=1)
Our ‘mem64.c’ module: We wrote an LKM to create a pseudo-file that lets us see how virtual memory is being utilized by an application program. Download this file, compile it with ‘mmake’, and install ‘mem64.ko’ in the Linux kernel. Then view the virtual-memory mapping that is being used by the ‘cat’ program:
$ cat /proc/mem64
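The NIC’s PCI ‘resources’
[figure: the NIC’s PCI Configuration Space header, sixteen doublewords; fields listed from high-order to low-order within each doubleword]
0x00: Device ID (0x109A) | Vendor ID (0x8086)
0x04: Status Register | Command Register
0x08: Class Code (Class/SubClass/ProgIF) | Revision ID
0x0C: BIST | Header Type | Latency Timer | Cache Line Size
0x10: Base Address 0
0x14: Base Address 1
0x18: Base Address 2
0x1C: Base Address 3
0x20: Base Address 4
0x24: Base Address 5
0x28: CardBus CIS Pointer
0x2C: Subsystem Device ID | Subsystem Vendor ID
0x30: Expansion ROM Base Address
0x34: reserved | capabilities pointer
0x38: reserved
0x3C: Maximum Latency | Minimum Grant | Interrupt Pin | Interrupt Line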
Mechanisms compared [figure: the CPU’s ‘virtual’ address-space (kernel memory-space, user memory-space, and the NIC i/o-memory) beside the CPU’s ‘I/O’ address-space (an addr/data pair of ports)]. Memory-mapped: each NIC register has its own address in memory (allows one-step access). I/O-ports: access to all of the NIC’s registers is multiplexed through a pair of I/O-ports (requires multiple instructions).
‘nicstatus.c’: Here’s an LKM that creates a pseudo-file (called ‘/proc/nicstatus’) which allows a user to view the current value in our Intel 82573L Network Interface Controller’s ‘DEVICE_STATUS’ register. It uses the I/O-port interface to the NIC’s registers, rather than a ‘memory-mapped’ interface to those device-registers.
Device Status (0x0008) [figure: register bit-layout showing the fields GIO Master EN, PHYRA, ASDV, ILOS, SLU, TXOFF, Function ID, SPEED, LU and FD; a ‘?’ marks a field annotated “82573L some undocumented functionality?”]
Legend: FD = Full-Duplex; LU = Link Up; TXOFF = Transmission Paused; SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved); ASDV = Auto-negotiation Speed Detection Value; PHYRA = PHY Reset Asserted
‘82573.c’: This is a more elaborate example of an LKM. It creates a pseudo-file (i.e., ‘/proc/82573’) that we can view using the Linux ‘cat’ command and that lets us see our NIC’s PCI Configuration Space. It also implements some device-driver functions that let us view the NIC’s device registers by using our ‘fileview.cpp’ tool.
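Linux PCI helper-functions
#include <linux/pci.h>    /* pci_get_device(), pci_resource_start(), pci_resource_len() */
#include <linux/io.h>     /* ioremap_nocache() */

struct pci_dev  *devp;
unsigned int    mmio_base;
unsigned int    mmio_size;
void            *io;

/* locate the NIC by its PCI vendor and device IDs */
devp = pci_get_device( VENDOR_ID, DEVICE_ID, NULL );
if ( devp == NULL ) return -ENODEV;

/* resource 0 (Base Address 0) gives the location and size of the NIC's i/o-memory */
mmio_base = pci_resource_start( devp, 0 );
mmio_size = pci_resource_len( devp, 0 );

/* map that physical region, uncached, into the kernel's virtual address-space */
io = ioremap_nocache( mmio_base, mmio_size );
if ( io == NULL ) return -ENOSPC;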
In-class exercise: Two of the NIC’s 32-bit device registers are used to hold its 48-bit Ethernet MAC address, which will be a different value for each of the hosts in our classroom. These two registers are located at offsets 0x5400 and 0x5404 in device-memory. The six bytes occur in network byte-order. Write code to show the MAC address!