
What’s needed to transmit?


ajay


  1. What’s needed to transmit? A look at the minimum steps required for programming our 82573L nic to send packets

  2. Typical NIC hardware
      Diagram: the CPU and main memory (holding the packet buffer) connect over the BUS to the nic, whose TX FIFO and RX FIFO feed a transceiver attached to the LAN cable.

  3. Quotation: “Many companies do an excellent job of providing information to help customers use their products... but in the end there's no substitute for real-life experiments: putting together the hardware, writing the program code, and watching what happens when the code executes. Then when the result isn't as expected -- as it often isn't -- it means trying something else or searching the documentation for clues.” -- Jan Axelson, author, Lakeview Research (1998)

  4. Thanks, Intel!☻ • Intel Corporation has kindly posted details online for programming its family of gigabit Ethernet controllers – includes our 82573L

  5. Our ‘nictx.c’ module • We’ve created an LKM which has minimal functionality – enough to be sure we know how to ‘transmit’ a raw Ethernet packet – but we do this in a forward-looking way so that our source-code can later be turned into a Linux character-mode device-driver (once we’ve also seen how to write code which allows our nic to ‘receive’ packets)

  6. Access to PRO1000 registers • Device registers are hardware-mapped to a range of addresses in physical memory • We obtain the location (and the length) of this memory-range from a BAR register in the nic device’s PCI Configuration Space • Then we ask the Linux kernel to set up an I/O ‘remapping’ of this memory-range to ‘virtual’ addresses within kernel-space
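The BAR decoding mentioned on this slide can be illustrated with a small user-space sketch. It relies only on the standard PCI layout of a Base Address Register; the helper names (`bar_is_memory`, `bar_is_64bit`, `bar_mem_base`) are hypothetical and are not taken from ‘nictx.c’. Inside a kernel module one would normally let `pci_resource_start()` and `pci_resource_len()` do this work and then pass the result to `ioremap()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of a BAR distinguishes I/O space (1) from memory space (0). */
static inline bool bar_is_memory(uint32_t bar)
{
    return (bar & 0x1u) == 0;
}

/* Bits 2:1 of a memory BAR read binary 10 when the BAR is 64 bits wide. */
static inline bool bar_is_64bit(uint32_t bar)
{
    return bar_is_memory(bar) && ((bar >> 1) & 0x3u) == 0x2u;
}

/* The low four bits of a memory BAR are flags; the rest is the base address. */
static inline uint32_t bar_mem_base(uint32_t bar)
{
    return bar & ~0xFu;
}
```

The BAR values used to exercise these helpers are made up for illustration; the real value has to be read from the device's PCI Configuration Space.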

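  7. Tx-Desc Ring-Buffer
      Diagram: a circular buffer of 16-byte descriptors at offsets 0x00, 0x10, ..., 0x80 from the TDBA base-address. TDH (head) and TDT (tail) mark positions in the ring, TDLEN gives its length in bytes (128 bytes minimum), and shading shows which descriptors are owned by hardware (nic) and which by software (cpu).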
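The ring arithmetic behind this diagram can be sketched in plain C. This is an illustrative user-space sketch, not code from ‘nictx.c’: the helper names are hypothetical, and it assumes Intel's convention that the descriptors from the head up to (but not including) the tail belong to the hardware.

```c
#include <stdint.h>

enum { TX_DESC_SIZE = 16 };   /* bytes per legacy Tx descriptor */

/* Value to program into TDLEN for a ring of `count` descriptors. */
static inline uint32_t ring_bytes(uint32_t count)
{
    return count * TX_DESC_SIZE;
}

/* Index that follows `i`, wrapping back to 0 at the end of the ring. */
static inline uint32_t ring_next(uint32_t i, uint32_t count)
{
    return (i + 1) % count;
}

/* Number of descriptors currently owned by the nic (head .. tail-1). */
static inline uint32_t ring_pending(uint32_t head, uint32_t tail, uint32_t count)
{
    return (tail + count - head) % count;
}
```

With eight descriptors, `ring_bytes(8)` gives the 128-byte minimum shown on the slide, and advancing the tail with `ring_next()` is how software hands one more descriptor to the nic.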

  8. How ‘transmit’ works
      Diagram: a list of buffer-descriptors (descriptor0 to descriptor3, followed by zeroed entries) in Random Access Memory, each descriptor pointing to its packet buffer (Buffer0 to Buffer3).
      • We set up each data-packet that we want to be transmitted in a ‘Buffer’ area in ram
      • We also create a list of buffer-descriptors and inform the NIC of its location and size
      • Then, when ready, we tell the NIC to ‘Go!’ (i.e., start transmitting), but to let us know when these transmissions are ‘Done’

  9. Allocating kernel-memory • Our 82573L device-driver will need to use a segment of contiguous physical memory which is cache-aligned and non-pageable • Such a memory-block can be allocated by using the kernel’s ‘kzalloc()’ function (and it can later be deallocated using ‘kfree()’) • You should use the ‘GFP_KERNEL’ flag (and we also used the ‘GFP_DMA’ flag)

  10. NIC registers (for transmit)
      enum {
          E1000_CTRL   = 0x0000,  // Device Control
          E1000_STATUS = 0x0008,  // Device Status
          E1000_TCTL   = 0x0400,  // Transmit Control
          E1000_TDBAL  = 0x3800,  // Tx-Descriptor Base-Address Low
          E1000_TDBAH  = 0x3804,  // Tx-Descriptor Base-Address High
          E1000_TDLEN  = 0x3808,  // Tx-Descriptor queue Length
          E1000_TDH    = 0x3810,  // Tx-Descriptor Head
          E1000_TDT    = 0x3818,  // Tx-Descriptor Tail
          E1000_TXDCTL = 0x3828,  // Tx-Descriptor Control
          E1000_RA     = 0x5400,  // Receive-address Array
      };
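To show how these offsets are used, here is a minimal sketch that treats the register window as an array of 32-bit words and programs the Tx-descriptor queue registers. The accessors `reg_read`/`reg_write` and the function `tx_ring_setup` are hypothetical names for illustration. In a real kernel module the base pointer would come from the I/O remapping, and accesses would typically go through helpers such as `readl()` and `writel()` rather than plain pointer dereferences.

```c
#include <stdint.h>

enum {
    E1000_TDBAL = 0x3800,   /* Tx-Descriptor Base-Address Low  */
    E1000_TDBAH = 0x3804,   /* Tx-Descriptor Base-Address High */
    E1000_TDLEN = 0x3808,   /* Tx-Descriptor queue Length      */
    E1000_TDH   = 0x3810,   /* Tx-Descriptor Head              */
    E1000_TDT   = 0x3818,   /* Tx-Descriptor Tail              */
};

/* Store a 32-bit value at byte offset `offset` from the register base. */
static inline void reg_write(volatile uint32_t *base, uint32_t offset, uint32_t value)
{
    base[offset / 4] = value;
}

/* Load the 32-bit value at byte offset `offset` from the register base. */
static inline uint32_t reg_read(volatile uint32_t *base, uint32_t offset)
{
    return base[offset / 4];
}

/* Tell the nic where the descriptor ring lives and start with it empty. */
static inline void tx_ring_setup(volatile uint32_t *base, uint64_t ring_phys,
                                 uint32_t ring_len)
{
    reg_write(base, E1000_TDBAL, (uint32_t)(ring_phys & 0xFFFFFFFFu));
    reg_write(base, E1000_TDBAH, (uint32_t)(ring_phys >> 32));
    reg_write(base, E1000_TDLEN, ring_len);
    reg_write(base, E1000_TDH, 0);
    reg_write(base, E1000_TDT, 0);
}
```

The 64-bit physical address of the ring is split across the TDBAL and TDBAH registers, which is why the enum lists a low and a high half.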

  11. Device Control (0x0000), 82573L
      Bit layout (R = reserved, shown with its required value):
        31     PHYRST    = Phy Reset
        30     VME       = VLAN Mode Enable
        29     R =0
        28     TFCE      = Tx Flow-Control Enable
        27     RFCE      = Rx Flow-Control Enable
        26     RST       = Device Reset
        25-21  R =0
        20     ADVD3WUC  = Advertise Cold Wake Up Capability
        19     R =0
        18     D/UD      = Dock/Undock status
        17-13  R =0
        12     FRCDPLX   = Force Duplex
        11     FRCSPD    = Force Speed
        10     R =0
        9-8    SPEED     (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
        7      R =0
        6      SLU       = Set Link Up
        5-4    R =0
        3      R =1
        2      GIOMD     = GIO Master Disable
        1      R =0
        0      FD        = Full-Duplex
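A few of the Device Control bits can be written as masks, as in the sketch below. The macro and function names are hypothetical, and the slides do not say which options ‘nictx.c’ actually selects, so the two helpers only illustrate how a value read from the register would be modified before being written back.

```c
#include <stdint.h>

#define CTRL_FD      (UINT32_C(1) << 0)    /* Full-Duplex        */
#define CTRL_GIOMD   (UINT32_C(1) << 2)    /* GIO Master Disable */
#define CTRL_SLU     (UINT32_C(1) << 6)    /* Set Link Up        */
#define CTRL_FRCSPD  (UINT32_C(1) << 11)   /* Force Speed        */
#define CTRL_FRCDPLX (UINT32_C(1) << 12)   /* Force Duplex       */
#define CTRL_RST     (UINT32_C(1) << 26)   /* Device Reset       */
#define CTRL_PHYRST  (UINT32_C(1) << 31)   /* Phy Reset          */

/* Value to write in order to request a device reset (sets bit 26). */
static inline uint32_t ctrl_request_reset(uint32_t ctrl)
{
    return ctrl | CTRL_RST;
}

/* Value to write in order to ask the nic to bring the link up. */
static inline uint32_t ctrl_set_link_up(uint32_t ctrl)
{
    return ctrl | CTRL_SLU;
}
```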

  12. Device Status (0x0008), 82573L
      Bit layout (all bits not listed are shown as 0):
        31     ?         (some undocumented functionality?)
        19     GIO Master EN
        10     PHYRA     = PHY Reset Asserted
        9-8    ASDV      = Auto-negotiation Speed Detection Value
        7-6    SPEED     (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
        4      TXOFF     = Transmission Paused
        3-2    Function ID
        1      LU        = Link Up
        0      FD        = Full-Duplex
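Decoding a value read from the Device Status register might look like the following sketch. The names are hypothetical, and the bit positions are the ones given on this slide.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_FD          (UINT32_C(1) << 0)   /* Full-Duplex         */
#define STATUS_LU          (UINT32_C(1) << 1)   /* Link Up             */
#define STATUS_TXOFF       (UINT32_C(1) << 4)   /* Transmission Paused */
#define STATUS_SPEED_SHIFT 6
#define STATUS_SPEED_MASK  UINT32_C(0x3)

static inline bool status_link_up(uint32_t status)
{
    return (status & STATUS_LU) != 0;
}

static inline bool status_full_duplex(uint32_t status)
{
    return (status & STATUS_FD) != 0;
}

/* Link speed in Mbps, or 0 for the reserved encoding (binary 11). */
static inline unsigned status_speed_mbps(uint32_t status)
{
    switch ((status >> STATUS_SPEED_SHIFT) & STATUS_SPEED_MASK) {
    case 0:  return 10;
    case 1:  return 100;
    case 2:  return 1000;
    default: return 0;
    }
}
```

For example, a status value of 0x83 has FD and LU set and SPEED equal to binary 10, so it decodes as a full-duplex link that is up at 1000 Mbps.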

  13. Transmit Control (0x0400), 82573L
      Bit layout (R = reserved):
        31-29  R =0
        28     MULR      = Multiple Request Support
        27-26  TXCSCMT   = TxDescriptor Minimum Threshold
        25     UNORTX    = Underrun No Re-Transmit
        24     RTLC      = Retransmit on Late Collision
        23     R =0
        22     SWXOFF    = Software XOFF Transmission
        21-12  COLD      = Collision Distance (=0x3F)
        11-4   CT        = Collision Threshold (=0xF)
        3      PSP       = Pad Short Packets
        2      R =0
        1      EN        = Transmit Enable
        0      R =0
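Composing a Transmit Control value from these fields can be sketched as follows. The helper name `tctl_value` is hypothetical, and enabling PSP together with EN is an assumption made for illustration; the slide only gives the CT and COLD values.

```c
#include <stdint.h>

#define TCTL_EN         (UINT32_C(1) << 1)   /* Transmit Enable          */
#define TCTL_PSP        (UINT32_C(1) << 3)   /* Pad Short Packets        */
#define TCTL_CT_SHIFT   4                    /* Collision Threshold 11:4 */
#define TCTL_COLD_SHIFT 12                   /* Collision Distance 21:12 */

/* Build a TCTL value with transmit enabled and short packets padded. */
static inline uint32_t tctl_value(uint32_t ct, uint32_t cold)
{
    return TCTL_EN | TCTL_PSP
         | ((ct   & 0xFFu)  << TCTL_CT_SHIFT)
         | ((cold & 0x3FFu) << TCTL_COLD_SHIFT);
}
```

With the slide's values, `tctl_value(0xF, 0x3F)` evaluates to 0x0003F0FA.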

  14. Tx-Descriptor Control (0x3828)
      Bit layout (all bits not listed are 0):
        24     GRAN      = Granularity
        21-16  WTHRESH   = Writeback Threshold
        13-8   HTHRESH   = Host Threshold
        5-0    PTHRESH   = Prefetch Threshold
      “This register controls the fetching and write back of transmit descriptors. The three threshold values are used to determine when descriptors are read from, and written to, host memory. Their values can be in units of cache lines or of descriptors (each descriptor is 16 bytes), based on the value of the GRAN bit (0=cache lines, 1=descriptors). When GRAN = 1, all descriptors are written back (even if not requested).” --Intel manual
      Recommended for 82573: 0x01010000 (GRAN=1, WTHRESH=1)
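The recommended value can be reproduced by packing the four fields, as in this sketch (the helper name `txdctl_value` is hypothetical):

```c
#include <stdint.h>

/* Pack GRAN (bit 24) and the three 6-bit thresholds into a TXDCTL value. */
static inline uint32_t txdctl_value(uint32_t gran, uint32_t wthresh,
                                    uint32_t hthresh, uint32_t pthresh)
{
    return ((gran    & 0x1u)  << 24)
         | ((wthresh & 0x3Fu) << 16)
         | ((hthresh & 0x3Fu) << 8)
         |  (pthresh & 0x3Fu);
}
```

Calling `txdctl_value(1, 1, 0, 0)` yields 0x01010000, matching GRAN=1 and WTHRESH=1.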

  15. An observation • We notice that the 82573L device retains the values in many of its internal registers • This fact reduces the programming steps that will be required to operate our nic on the anchor cluster machines, since Intel’s own Linux device driver (‘e1000e.ko’) has already initialized many nic registers • But we MAY need to bring ‘eth1’ down!

  16. Using ‘/sbin/ifconfig’ • You can use the ‘/sbin/ifconfig’ command to find out whether the ‘eth1’ interface has been brought ‘down’: $ /sbin/ifconfig eth1 • If it is still operating, you can turn it off with the (privileged) command: $ sudo /sbin/ifconfig eth1 down

  17. Programming steps
      • Detect the presence of the 82573L network controller (VENDOR_ID, DEVICE_ID)
      • Obtain the physical address-range where the nic’s device-registers are mapped
      • Ask the kernel to map this address range into the kernel’s virtual address-space
      • Copy the network controller’s MAC-address into a 6-byte array for future access
      • Allocate a block of kernel memory large enough for our descriptors and buffers
      • Ensure that the network controller’s ‘Bus Master’ capability has been enabled
      • Select our desired configuration-options for the DEVICE CONTROL register
      • Perform a nic ‘reset’ operation (by toggling bit 26), then delay until reset completes
      • Select our desired configuration-options for the TRANSMIT CONTROL register
      • Initialize our array of Transmit Descriptors with the physical addresses of buffers
      • Initialize the Transmit Engine’s registers (for Tx-Descriptor Queue and Control)
      • Set up the buffer-contents for an Ethernet packet we want to be transmitted
      • Enable the Transmit Engine
      • Give ‘ownership’ of a Tx-Descriptor to the network controller
      • Install our ‘/proc/nictx’ pseudo-file (for user-diagnostic purposes)

  18. Legacy Tx-Descriptor Layout
      Offset  Contents (bits 31..0)
      0x0     Buffer-Address low (bits 31..0)
      0x4     Buffer-Address high (bits 63..32)
      0x8     CMD (31..24) | CSO (23..16) | Packet Length in bytes (15..0)
      0xC     special (31..16) | CSS (15..8) | reserved =0 (7..4) | status (3..0)
      Buffer-Address = the packet-buffer’s 64-bit address in physical memory
      Packet-Length = number of bytes in the data-packet to be transmitted
      CMD = Command-field
      CSO/CSS = Checksum Offset/Start (in bytes)
      STA = Status-field

  19. Suggested C syntax
      typedef struct {
          unsigned long long  base_address;
          unsigned short      packet_length;
          unsigned char       cksum_offset;
          unsigned char       desc_command;
          unsigned char       desc_status;
          unsigned char       cksum_origin;
          unsigned short      special_info;
      } TX_DESCRIPTOR;
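Because the nic reads this structure directly from memory, its size and field offsets must match the 16-byte hardware layout exactly. The sketch below restates the struct with fixed-width types from `<stdint.h>` (a substitution, not the slide's spelling) and adds compile-time checks, assuming a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t base_address;    /* offset 0x0: buffer's physical address */
    uint16_t packet_length;   /* offset 0x8: bytes to transmit         */
    uint8_t  cksum_offset;    /* offset 0xA: CSO                       */
    uint8_t  desc_command;    /* offset 0xB: CMD                       */
    uint8_t  desc_status;     /* offset 0xC: status (low 4 bits)       */
    uint8_t  cksum_origin;    /* offset 0xD: CSS                       */
    uint16_t special_info;    /* offset 0xE: special                   */
} TX_DESCRIPTOR;

/* Fail the build if the compiler lays the struct out differently. */
static_assert(sizeof(TX_DESCRIPTOR) == 16, "descriptor must be 16 bytes");
static_assert(offsetof(TX_DESCRIPTOR, packet_length) == 8, "length at 0x8");
static_assert(offsetof(TX_DESCRIPTOR, desc_command) == 11, "CMD at 0xB");
static_assert(offsetof(TX_DESCRIPTOR, desc_status) == 12, "status at 0xC");
static_assert(offsetof(TX_DESCRIPTOR, special_info) == 14, "special at 0xE");
```

The fields happen to be naturally aligned, so no packing attribute is needed for the layout to come out as 16 bytes.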

  20. TxDesc Command-field
        Bit 7  IDE   = Interrupt-Delay Enable (1=yes, 0=no)
        Bit 6  VLE   = VLAN-Packet Enable (1=yes, 0=no) – provided EOP is set
        Bit 5  DEXT  = Descriptor Extension (1=yes, 0=no); use ‘0’ for Legacy-Mode
        Bit 4  reserved =0
        Bit 3  RS    = Report Status (1=yes, 0=no)
        Bit 2  IC    = Insert CheckSum (1=yes, 0=no) as indicated by CSO/CSS fields
        Bit 1  IFCS  = Insert Frame CheckSum (1=yes, 0=no) – provided EOP is set
        Bit 0  EOP   = End Of Packet (1=yes, 0=no)

  21. TxDesc Status field
        Bit 3  reserved =0
        Bit 2  LC = Late Collision: indicates that a Late Collision has occurred while operating in HALF-DUPLEX mode. Note that the collision window size depends on the SPEED: 64 bytes for 10/100 Mbps, or 512 bytes for 1000 Mbps.
        Bit 1  EC = Excess Collisions: indicates that the packet has experienced more than the maximum number of collisions (as defined by the TCTL.CT field) and therefore was not transmitted. (This bit is meaningful only in HALF-DUPLEX mode.)
        Bit 0  DD = Descriptor Done: this bit is written back after the NIC processes the descriptor, provided the descriptor’s RS-bit was set (i.e., Report Status)

  22. Bit-mask definitions
      enum {
          DD   = (1<<0),  // Descriptor Done
          EC   = (1<<1),  // Excess Collisions
          LC   = (1<<2),  // Late Collision
          EOP  = (1<<0),  // End Of Packet
          IFCS = (1<<1),  // Insert Frame CheckSum
          IC   = (1<<2),  // Insert CheckSum as per CSO/CSS
          RS   = (1<<3),  // Report Status
          DEXT = (1<<5),  // Descriptor Extension
          VLE  = (1<<6),  // VLAN packet
          IDE  = (1<<7)   // Interrupt-Delay Enable
      };
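These masks are typically combined when a descriptor is filled in and tested when software checks for completion. The sketch below is one plausible way to do that; the function names are hypothetical, and the choice of EOP, IFCS and RS as the command bits is an assumption rather than something the slides state about ‘nictx.c’.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    DD   = (1 << 0),   /* status:  Descriptor Done       */
    EOP  = (1 << 0),   /* command: End Of Packet         */
    IFCS = (1 << 1),   /* command: Insert Frame CheckSum */
    RS   = (1 << 3),   /* command: Report Status         */
};

typedef struct {
    uint64_t base_address;
    uint16_t packet_length;
    uint8_t  cksum_offset;
    uint8_t  desc_command;
    uint8_t  desc_status;
    uint8_t  cksum_origin;
    uint16_t special_info;
} TX_DESCRIPTOR;

/* Describe one complete packet and ask the nic to report when it is done. */
static inline void tx_desc_prepare(TX_DESCRIPTOR *d, uint64_t buf_phys, uint16_t len)
{
    d->base_address  = buf_phys;
    d->packet_length = len;
    d->cksum_offset  = 0;
    d->desc_command  = EOP | IFCS | RS;
    d->desc_status   = 0;    /* clear DD so a stale value is not misread */
    d->cksum_origin  = 0;
    d->special_info  = 0;
}

/* True once the nic has written back the Descriptor Done bit. */
static inline bool tx_desc_done(const volatile TX_DESCRIPTOR *d)
{
    return (d->desc_status & DD) != 0;
}
```

Setting RS matters here: without it the nic is not required to write DD back, so polling `tx_desc_done()` would never succeed.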

  23. Ethernet packet layout • Total size normally can vary from 64 bytes up to 1518 bytes (unless ‘jumbo’ packets and/or ‘undersized’ packets are enabled) • The NIC expects a 14-byte packet ‘header’ and it appends a 4-byte CRC check-sum
        Offset 0:   destination MAC address (6 bytes)
        Offset 6:   source MAC address (6 bytes)
        Offset 12:  Type/length (2 bytes)
        Offset 14:  the packet’s data ‘payload’ goes here (usually varies from 46 to 1500 bytes)
        After the payload: Cyclic Redundancy Checksum (4 bytes)
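Filling a transmit buffer according to this layout can be sketched as follows. The function `build_frame` is a hypothetical helper, not part of ‘nictx.c’. It writes only the header and payload, since the CRC is appended by the nic, and it assumes the caller's buffer is large enough.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAC_LEN = 6, HDR_LEN = 14 };

/* Write header and payload into `buf`; return the number of bytes used. */
static inline size_t build_frame(uint8_t *buf,
                                 const uint8_t dst[MAC_LEN],
                                 const uint8_t src[MAC_LEN],
                                 uint16_t type,
                                 const void *payload, size_t payload_len)
{
    memcpy(buf, dst, MAC_LEN);              /* offset 0: destination MAC */
    memcpy(buf + MAC_LEN, src, MAC_LEN);    /* offset 6: source MAC      */
    buf[12] = (uint8_t)(type >> 8);         /* Type/length is stored     */
    buf[13] = (uint8_t)(type & 0xFF);       /* most-significant first    */
    memcpy(buf + HDR_LEN, payload, payload_len);
    return HDR_LEN + payload_len;
}
```

The returned length is what would go into the descriptor's packet-length field. A frame shorter than the minimum is left as-is here, on the assumption that the PSP bit in Transmit Control lets the nic pad it.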

  24. In-class exercises • Modify the code in our ‘nictx.c’ module so that it will transmit more than just one raw packet when you install it into the kernel • Can you also modify the ‘module_exit()’ function so that it will transmit a packet before it disables the ‘Transmit Engine’?
