82573L Initializing our Pro/1000. Chicken-and-Egg? We want to create a Linux Kernel Module that can serve application-programs as a character-mode device-driver.


82573L Initializing our Pro/1000

Chicken-and-Egg?
We want to create a Linux Kernel Module that can serve application-programs as a character-mode device-driver for our NIC.
So, as with the UART device, we will need to implement ‘read()’ and ‘write()’ methods.
But which method should we do first? There is no way to “test” a ‘read()’ method without having a way to send packets to our NIC.

How ‘transmit’ works
[Diagram: a List of Buffer-Descriptors (descriptor0, descriptor1, descriptor2, ...) and the packet buffers (Buffer0, Buffer1, Buffer2, Buffer3), all residing in Random Access Memory]
We set up each data-packet that we want to be transmitted in a ‘Buffer’ area in RAM.
We also create a list of buffer-descriptors and inform the NIC of its location and size.
Then, when ready, we tell the NIC to ‘Go!’ (i.e., start transmitting), but to let us know when these transmissions are ‘Done’.

Registers’ Names
Memory-information registers:
  TDBA(L/H) = Transmit-Descriptor Base-Address Low/High (64-bits)
  TDLEN = Transmit-Descriptor array Length
  TDH = Transmit-Descriptor Head
  TDT = Transmit-Descriptor Tail
Transmit-engine control registers:
  TXDCTL = Transmit-Descriptor Control Register
  TCTL = Transmit Control Register
Notification timing registers:
  TIDV = Transmit Interrupt Delay Value
  TADV = Transmit-interrupt Absolute Delay Value
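As a rough illustration of how the 64-bit descriptor base-address gets divided between the TDBA Low and High registers, here is a minimal sketch. The register offsets are assumptions taken from the Linux e1000 driver's header conventions (only 0x3828 for TXDCTL and 0x0400 for TCTL appear in these slides), and the helper names ‘tdba_low()’ and ‘tdba_high()’ are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets: ASSUMED from the Linux e1000 driver headers,
   except TCTL and TXDCTL, which these slides state directly. */
enum {
    E1000_TCTL   = 0x0400,  /* Transmit Control */
    E1000_TDBAL  = 0x3800,  /* Tx-Descriptor Base-Address Low */
    E1000_TDBAH  = 0x3804,  /* Tx-Descriptor Base-Address High */
    E1000_TDLEN  = 0x3808,  /* Tx-Descriptor array Length (bytes) */
    E1000_TDH    = 0x3810,  /* Tx-Descriptor Head */
    E1000_TDT    = 0x3818,  /* Tx-Descriptor Tail */
    E1000_TIDV   = 0x3820,  /* Tx Interrupt Delay Value */
    E1000_TXDCTL = 0x3828,  /* Tx-Descriptor Control */
    E1000_TADV   = 0x382C   /* Tx-interrupt Absolute Delay Value */
};

/* Low 32 bits of the ring's physical address (goes in TDBAL). */
static inline uint32_t tdba_low(uint64_t phys)
{
    assert((phys & 0xF) == 0);  /* descriptors are 16 bytes each */
    return (uint32_t)(phys & 0xFFFFFFFFu);
}

/* High 32 bits of the ring's physical address (goes in TDBAH). */
static inline uint32_t tdba_high(uint64_t phys)
{
    return (uint32_t)(phys >> 32);
}
```

In a driver, the two halves would be written with ‘iowrite32()’ to the TDBAL and TDBAH offsets before the transmit engine is enabled.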

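Tx-Desc Ring-Buffer
[Diagram: a circular buffer (128-bytes minimum) of 16-byte descriptor slots at offsets 0x00, 0x10, 0x20, ... 0x80 from the TDBA base-address; TDLEN gives the buffer's length in bytes; TDH (head) and TDT (tail) mark positions within it; shading distinguishes slots owned by hardware (NIC) from slots owned by software (CPU)]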
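The head and tail arithmetic for such a circular buffer can be sketched as follows. This is only an illustration: it assumes the usual convention that the slots from TDH up to, but not including, TDT belong to the NIC, and the helper names are hypothetical rather than taken from any driver.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_SIZE 16u   /* bytes per legacy descriptor */

/* Number of descriptor slots in a ring of 'tdlen_bytes' bytes. */
static inline uint32_t ring_count(uint32_t tdlen_bytes)
{
    assert(tdlen_bytes >= 128 && tdlen_bytes % DESC_SIZE == 0);
    return tdlen_bytes / DESC_SIZE;
}

/* Advance a head or tail index by one slot, wrapping at the end. */
static inline uint32_t ring_next(uint32_t index, uint32_t count)
{
    assert(count > 0 && index < count);
    return (index + 1) % count;
}

/* Slots handed to the NIC and not yet processed (ASSUMED convention:
   hardware owns the slots from head up to, not including, tail). */
static inline uint32_t ring_pending(uint32_t head, uint32_t tail,
                                    uint32_t count)
{
    assert(head < count && tail < count);
    return (tail + count - head) % count;
}

/* The ring is full when advancing the tail would reach the head. */
static inline int ring_full(uint32_t head, uint32_t tail, uint32_t count)
{
    return ring_next(tail, count) == head;
}
```

Keeping one slot unused, as ‘ring_full()’ does, is a common way to tell a full ring apart from an empty one, where head equals tail.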

Tx-Descriptor Control (0x3828)
[Register diagram, high bits to low; fields shown: GRAN, WTHRESH (Writeback Threshold), HTHRESH (Host Threshold), PTHRESH (Prefetch Threshold); remaining bits =0]
Recommended for 82573: 0x (GRAN=1, WTHRESH=1)
“This register controls the fetching and write back of transmit descriptors. The three threshold values are used to determine when descriptors are read from, and written to, host memory. Their values can be in units of cache lines or of descriptors (each descriptor is 16 bytes), based on the value of the GRAN bit (0=cache lines, 1=descriptors). When GRAN = 1, all descriptors are written back (even if not requested).” --Intel manual

Transmit Control (0x0400) 82573L
[Register diagram; fields shown: MULR, TXCSCMT, UNORTX, RTLC, SWXOFF, COLD (Collision Distance, upper 6-bits and lower 4-bits), PSP, CT (Collision Threshold), EN; bits marked R are reserved =0]
EN = Transmit Enable
PSP = Pad Short Packets
CT = Collision Threshold (=0xF)
COLD = Collision Distance (=0x3F)
SWXOFF = Software XOFF Transmission
RTLC = Retransmit on Late Collision
UNORTX = Underrun No Re-Transmit
TXCSCMT = TxDescriptor Minimum Threshold
MULR = Multiple Request Support

Tx Configuration Word (0x0178) 82573L
[Register diagram; fields shown: ANE, TxConfig, TxConfigWord]
ANE = Auto-Negotiation Enable
TxConfig = Transmit Configuration Control bit
TxConfigWord = Transmit Configuration Word
This register has two meanings, depending on the state of the ANE bit (i.e., setting ANE=1 enables the hardware auto-negotiation machine). It is applicable only in SerDes mode; program it as 0 for internal-PHY mode.

Legacy Tx-Descriptor Layout
(each row is 32 bits wide, with bit 31 on the left and bit 0 on the right)
0x0: Buffer-Address low (bits 31..0)
0x4: Buffer-Address high (bits 63..32)
0x8: CMD | CSO | Packet Length (in bytes)
0xC: special | CSS | reserved =0 | status
Buffer-Address = the packet-buffer’s 64-bit address in physical memory
Packet-Length = number of bytes in the data-packet to be transmitted
CMD = Command-field
CSO/CSS = Checksum Offset/Start (in bytes)
STA = Status-field

Suggested C syntax
    typedef struct {
        unsigned long long  base_addr;
        unsigned short      pkt_length;
        unsigned char       cksum_off;
        unsigned char       desc_cmd;
        unsigned char       desc_stat;
        unsigned char       cksum_org;
        unsigned short      special;
    } tx_descriptor;
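Because the NIC reads these descriptors directly from memory, the C structure has to occupy exactly 16 bytes with each field at the offset the hardware expects. The sketch below restates the structure with fixed-width types and lets the compiler check the layout; the type name ‘tx_descriptor_t’ and the helper ‘tx_desc_fill()’ are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint64_t base_addr;   /* 0x0: packet-buffer physical address */
    uint16_t pkt_length;  /* 0x8: packet length in bytes */
    uint8_t  cksum_off;   /* 0xA: CSO, checksum offset */
    uint8_t  desc_cmd;    /* 0xB: CMD, command-field */
    uint8_t  desc_stat;   /* 0xC: STA, status-field */
    uint8_t  cksum_org;   /* 0xD: CSS, checksum start */
    uint16_t special;     /* 0xE: special field */
} tx_descriptor_t;

/* Compile-time checks that the layout matches the 16-byte format. */
_Static_assert(sizeof(tx_descriptor_t) == 16, "descriptor must be 16 bytes");
_Static_assert(offsetof(tx_descriptor_t, pkt_length) == 8, "length at 0x8");
_Static_assert(offsetof(tx_descriptor_t, desc_cmd) == 11, "CMD at 0xB");
_Static_assert(offsetof(tx_descriptor_t, desc_stat) == 12, "STA at 0xC");

/* Prepare one descriptor: zero it (which clears the status-field),
   then record the buffer address, the length and the command bits. */
static inline void tx_desc_fill(tx_descriptor_t *d, uint64_t phys,
                                uint16_t len, uint8_t cmd)
{
    assert(d != NULL);
    memset(d, 0, sizeof *d);
    d->base_addr  = phys;
    d->pkt_length = len;
    d->desc_cmd   = cmd;
}
```

If a compiler ever inserted padding between the fields, the build would fail at the ‘_Static_assert’ lines instead of the driver silently handing malformed descriptors to the hardware.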

TxDesc Command-field
Bits 7..0: IDE | VLE | DEXT | reserved =0 | RS | IC | IFCS | EOP
EOP = End Of Packet (1=yes, 0=no)
IFCS = Insert Frame CheckSum (1=yes, 0=no) – provided EOP is set
IC = Insert CheckSum (1=yes, 0=no) as indicated by CSO/CSS fields
RS = Report Status (1=yes, 0=no)
DEXT = Descriptor Extension (1=yes, 0=no); use ‘0’ for Legacy-Mode
VLE = VLAN-Packet Enable (1=yes, 0=no) – provided EOP is set
IDE = Interrupt-Delay Enable (1=yes, 0=no)

TxDesc Status field
High bits to low: reserved =0 | LC | EC | DD
DD = Descriptor Done: this bit is written back after the NIC processes the descriptor, provided the descriptor’s RS-bit was set (i.e., Report Status).
EC = Excess Collisions: indicates that the packet has experienced more than the maximum number of collisions (as defined by the TCTL.CT field) and therefore was not transmitted. (This bit is meaningful only in HALF-DUPLEX mode.)
LC = Late Collision: indicates that a Late Collision has occurred while operating in HALF-DUPLEX mode. Note that the collision window size is dependent on the SPEED: 64-bytes for 10/100-Mbps, or 512-bytes for 1000-Mbps.

Bit-mask definitions
    enum {
        // status-field bits
        DD   = (1<<0),  // Descriptor Done
        EC   = (1<<1),  // Excess Collisions
        LC   = (1<<2),  // Late Collision
        // command-field bits
        EOP  = (1<<0),  // End Of Packet
        IFCS = (1<<1),  // Insert Frame CheckSum
        IC   = (1<<2),  // Insert CheckSum as per CSO/CSS
        RS   = (1<<3),  // Report Status
        DEXT = (1<<5),  // Descriptor Extension
        VLE  = (1<<6),  // VLAN packet
        IDE  = (1<<7)   // Interrupt-Delay Enable
    };
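One way these masks might be combined in practice is sketched below: a command byte for a packet that fits in a single buffer, and tests on the status byte that the NIC writes back. The prefixed constant names and the helper functions are hypothetical; the bit positions are the ones given in the definitions above.

```c
#include <assert.h>
#include <stdint.h>

enum {  /* status-field bits */
    TXD_STAT_DD = (1 << 0),   /* Descriptor Done */
    TXD_STAT_EC = (1 << 1),   /* Excess Collisions */
    TXD_STAT_LC = (1 << 2)    /* Late Collision */
};

enum {  /* command-field bits */
    TXD_CMD_EOP  = (1 << 0),  /* End Of Packet */
    TXD_CMD_IFCS = (1 << 1),  /* Insert Frame CheckSum */
    TXD_CMD_RS   = (1 << 3)   /* Report Status */
};

/* Command byte for a whole packet held in one buffer: mark the end
   of packet, let the NIC append the frame checksum, and request a
   status write-back so that DD can be tested afterward. */
static inline uint8_t tx_cmd_single_packet(void)
{
    return (uint8_t)(TXD_CMD_EOP | TXD_CMD_IFCS | TXD_CMD_RS);
}

/* Nonzero once the NIC has finished processing the descriptor. */
static inline int tx_done(uint8_t status)
{
    return (status & TXD_STAT_DD) != 0;
}

/* Nonzero if a half-duplex collision error was reported. */
static inline int tx_collision_error(uint8_t status)
{
    assert((status & 0xF8) == 0);   /* upper bits are reserved =0 */
    return (status & (TXD_STAT_EC | TXD_STAT_LC)) != 0;
}
```

Without the RS bit in the command byte, the DD bit is not guaranteed to be written back, so a driver that polls ‘tx_done()’ would normally set RS on at least the last descriptor of each packet.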

Allocating kernel-memory
Our 82573L device-driver will need to use a segment of contiguous physical memory which is cache-aligned and non-pageable.
As explained in our LDD3 textbook, such a memory-block can be allocated using the Linux kernel’s ‘kmalloc()’ function (and it can later be deallocated using ‘kfree()’).
The maximum-size allocation is 128-KB.
You should use the ‘GFP_KERNEL’ flag.

Network MTU
Unless the ‘Large-Send’ functionality has been enabled, there will be a maximum length for your network ‘datagrams’ equal to 1536 bytes (=0x0600).
So if you reused the same Packet-Buffer for successive transmissions, you could fit your packet-buffer and a moderate-sized Descriptor-Buffer into one 4KB page-frame.

Single page-frame option
4KB Page-Frame:
  Packet-Buffer (3-KB) (reused for successive transmissions)
  Descriptor-Buffer (1-KB) (room for up to 64 sixteen-byte descriptors)

Another design-option…
4KB Page-Frame:
  16 Packet-Buffers (3968-bytes) (248-bytes per buffer)
  Descriptor-Buffer (128-bytes) (room for 8 sixteen-byte descriptors)
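The arithmetic behind both page-frame layouts can be checked with a few lines of C. This sketch assumes the packet-buffers are placed first in the page, with the descriptor area taking whatever remains; at 16 bytes per descriptor, a 1-KB area holds 64 descriptors and a 128-byte area holds 8. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_BYTES 4096u   /* one page-frame */
#define DESC_BYTES 16u     /* one legacy Tx descriptor */

/* Bytes left for descriptors after 'nbufs' buffers of 'buf_bytes'. */
static inline size_t desc_area_bytes(size_t nbufs, size_t buf_bytes)
{
    assert(nbufs * buf_bytes <= PAGE_BYTES);
    return PAGE_BYTES - nbufs * buf_bytes;
}

/* How many 16-byte descriptors fit in an area of 'area_bytes'. */
static inline size_t desc_capacity(size_t area_bytes)
{
    return area_bytes / DESC_BYTES;
}

/* Offset within the page of packet-buffer number 'i'. */
static inline size_t buffer_offset(size_t i, size_t nbufs,
                                   size_t buf_bytes)
{
    assert(i < nbufs && nbufs * buf_bytes <= PAGE_BYTES);
    return i * buf_bytes;
}
```

For the single-buffer option, ‘desc_area_bytes(1, 3072)’ leaves 1024 bytes; for the sixteen-buffer option, ‘desc_area_bytes(16, 248)’ leaves 128 bytes, which is also the minimum ring size mentioned earlier.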

Initialization
Your device-driver needs to initialize your 82573L hardware to a known state, and configure its options for your desired mode of operation.
The Device Control register has bits which let you initiate a ‘device reset’ operation.
The Device Status register has bits which inform you when a ‘reset’ has completed.

Device Status (0x0008) 82573L
[Register diagram; fields shown: GIO Master EN, PHY reset, ASDV, SPEED, TXOFF, Function ID, LU, FD; bit 31 is marked ‘?’ with the note “some undocumented functionality?”]
FD = Full-Duplex
LU = Link Up
TXOFF = Transmission Paused
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
ASDV = Auto-negotiation Speed Detection Value
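A value read from the Device Status register might be decoded as in the following sketch. The bit positions used here (FD at bit 0, LU at bit 1, SPEED at bits 7..6) are assumptions based on the layout documented for Intel's 8254x-family controllers and used in the Linux e1000 headers; the slide's diagram does not give them explicitly, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED bit positions (8254x-family STATUS layout). */
enum {
    STATUS_FD          = 1 << 0,   /* Full-Duplex */
    STATUS_LU          = 1 << 1,   /* Link Up */
    STATUS_SPEED_SHIFT = 6,
    STATUS_SPEED_MASK  = 3 << 6    /* 00=10, 01=100, 10=1000 Mbps */
};

static inline int status_full_duplex(uint32_t status)
{
    return (status & STATUS_FD) != 0;
}

static inline int status_link_up(uint32_t status)
{
    return (status & STATUS_LU) != 0;
}

/* Link speed in Mbps, or -1 for the reserved encoding (11). */
static inline int status_speed_mbps(uint32_t status)
{
    switch ((status & STATUS_SPEED_MASK) >> STATUS_SPEED_SHIFT) {
    case 0:  return 10;
    case 1:  return 100;
    case 2:  return 1000;
    default: return -1;
    }
}
```

A driver could pass the result of ‘ioread32( io + E1000_STATUS )’ to these helpers and report the outcome with ‘printk()’.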

Device Control (0x0000) 82573L
[Register diagram; fields shown: PHYRST, VME, TFCE, RFCE, RST, ADVD3WUC, D/UD status, FRCDPLX, FRCSPD, SPEED, SLU, GIOMD, FD; bits marked R are reserved]
FD = Full-Duplex
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
GIOMD = GIO Master Disable
ADVD3WUC = Advertise Cold Wake Up Capability
SLU = Set Link Up
D/UD = Dock/Undock status
RFCE = Rx Flow-Control Enable
TFCE = Tx Flow-Control Enable
FRCSPD = Force Speed
FRCDPLX = Force Duplex
RST = Device Reset
PHYRST = Phy Reset
VME = VLAN Mode Enable

Extended Control (0x0018) 82573L
[Register diagram; fields shown: ITCE, IAME, DFPAREN, PBPAREN, TxLS, TxLSFlow, PhyPwrDownEn, DMADynGE, RODIS, SPDBYPS, EERST, ASDCHK; one bit is marked ‘?’; bits marked R are reserved =0]
ASDCHK = AutoSpeed Detection Check
EERST = EEPROM Reset
SPDBYPS = Speed-selection Bypass
RODIS = Relaxed-Ordering Disable
DMADynGE = DMA Dynamic-Gating Enable
PhyPwrDownEn = Phy PowerDown Enable
TxLSFlow = Tx Large-Send Flow
TxLS = Tx Large-Send functionality
PBPAREN = Packet-Buffer Parity-Error Detect
DFPAREN = Descriptor-FIFO Parity-Error Detect
IAME = Interrupt-Acknowledge Auto-Mask Enable
ITCE = Interrupt Timers Cleared Enable

Example
    // clear STATUS bit #31
    iowrite32( 0x , io + E1000_STATUS );
    // initiate Device-Reset and Phy-Reset
    iowrite32( 0x , io + E1000_CTRL );
    // wait until STATUS bit #31 is set
    while ( ( ioread32( io + E1000_STATUS ) & (1<<31) ) == 0 );
    // program Link Up with desired operating-mode settings
    iowrite32( 0x , io + E1000_CTRL );
    // wait until LU-bit (bit #1) in STATUS is set
    while ( ( ioread32( io + E1000_STATUS ) & (1<<1) ) == 0 );
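The two ‘busy-waiting’ loops in this example spin forever if the expected bit never appears. A bounded variant could look like the sketch below, which polls through a caller-supplied read function so that it can be tried outside the kernel. Everything here is hypothetical scaffolding: ‘poll_bit_set()’, the ‘reg_read_fn’ type and the ‘mock_nic’ stand-in device are illustrative names, not part of the course's demo code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Caller-supplied register reader (in a driver this would wrap
   ioread32; here it lets the loop run against a mock device). */
typedef uint32_t (*reg_read_fn)(void *ctx, uint32_t offset);

/* Poll until (register & mask) is nonzero, giving up after
   'max_polls' reads. Returns 0 on success, -1 on timeout. */
static int poll_bit_set(reg_read_fn rd, void *ctx, uint32_t offset,
                        uint32_t mask, unsigned max_polls)
{
    assert(rd != NULL && mask != 0);
    while (max_polls-- > 0)
        if (rd(ctx, offset) & mask)
            return 0;
    return -1;
}

/* Hypothetical mock device: reads return 0 until a set number of
   reads have occurred, then return 'ready_value'. */
struct mock_nic {
    unsigned reads_until_ready;
    uint32_t ready_value;
};

static uint32_t mock_read(void *ctx, uint32_t offset)
{
    struct mock_nic *m = ctx;
    (void)offset;               /* the mock has a single register */
    if (m->reads_until_ready > 0) {
        m->reads_until_ready--;
        return 0;
    }
    return m->ready_value;
}
```

Inside a kernel module, the read function would call ‘ioread32()’, and a short delay between polls would usually replace the tight loop; on a timeout the driver can log the failure rather than hang the machine.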

Interrupt Cause Read (0x00C0)
Mechanism for NIC-event notifications
[Register diagram; fields shown: INT-Assert, ACK, SRPD, TXDLOW, MDAC, RXT0, RXO, RXDMT0, LSC, TXQE, TXDW; bits marked R are reserved =0]
TXDW = Transmit Descriptor Written back
TXQE = Transmit Queue Empty
LSC = Link Status Changed
MDAC = MDI/O Access Completed
SRPD = Small Receive Packet Detected
ACK = Receive ACK-frame detected
RXT0 = Receiver Timer Interrupt
RXO = Receiver Overrun
TXDLOW = Transmit Descriptor Low Threshold Reached
RXDMT0 = Receive Descriptor Minimum Threshold Reached
INT-Assert = Interrupt Assertion is still pending

In-Class Exercise #1
Try compiling and installing our ‘tryreset.c’ demo-module, and examine the messages put in the kernel’s log-file (use ‘dmesg’).
Then modify the module-code so that it also outputs the value in the ICR register (Interrupt Cause Read) during each pass through the two ‘busy-waiting’ loops.
    #define E1000_ICR 0x00C0

In-Class Exercise #2
Apply the same techniques we employed in our earlier ‘announce.c’ demo-module so that the ‘printk()’ statements in ‘tryreset.c’ get replaced by statements that will show the messages onscreen, or in the current desktop window, rather than writing them to the kernel’s (out-of-view) log-file.