180 likes | 310 Views
This document outlines common design flaws found in NIC device drivers after prolonged use and provides effective fixes. It addresses critical problems, including buffer overflow due to mismatched packet sizes, data loss from insufficient read operations, and the management of transmit and receive buffers. Each issue is accompanied by detailed code examples showing how to declare variables as static, implement wait-queues, and dynamically adjust receive buffer ownership to ensure data integrity and prevent corruption. The solutions aim to enhance driver reliability and performance.
E N D
Fixing some driver problems Most software is discovered to have some ‘design-flaws’ after it has been put into use for awhile
Problem 1 • Statistics registers are ‘clear-on-read’, but ‘get_info()’ function can be entered twice, so if a statistics register is re-read upon a second entry to this function, a zero-value ‘overwrites’ the original statistic-value int my_get_info( char *buf, char **start, off_t off, int count ) { int n_xmit_packets = ioread32( io + E1000_TPT ); int len = 0; len = sprintf( buf+len, “ packets sent = %d \n”, n_xmit_packets );
Fix for problem 1 • We can declare any variables that will be used to store statistics to be ‘static’, then add any new inputs to their prior values int my_get_info( char *buf, char **start, off_t off, int count ) { static int n_xmit_packets = 0; int len = 0; n_xmit_packets += ioread32( io + E1000_TPT ); len = sprintf( buf+len, “ packets sent = %d \n”, n_xmit_packets );
Problem 2 • Our ‘nic.c’ device-driver programmed the hardware to use 2KB buffers for received packets, but our software only allocated 1536-bytes for each of our receive buffers, so if an oversized packet gets transmitted to our NIC, it can cause ‘buffer overflow’ (i.e., possible data-corruption – or even a system-crash!) data-corruption A 2048-byte packet can “overflow” a 1536-byte buffer system-crash
Fix for problem 2 • Insure our driver programs the software in a manner consistent with its programming of the Network Interface hardware! 30 27 25 17 16 RCTL: BSIZE (Buffer-Size) BSEX (Buffer-Size Extension bit) FLEXBUF (size of all receive-buffers in KB, if nonzero) Possible sizes for hardware receive-buffers: If FLEXBUF=0 and BSEX=0: 2048-bytes, 1024-bytes, 512-bytes, 256-bytes If FLEXBUF=0 and BSEX=1: 32768-bytes, 16384-bytes. 8192-bytes, 4096-bytes Othewise: 15K, 14K, 13K, 12K, 11K, 10K, 9K, 8K, 7K, 6K, 5K, 4K, 3K, 2K, 1K
Problem 3 • If an application tries to ‘read’ fewer bytes than are contained in a received packet, the extra bytes get discarded (i.e., ‘lost’), which is NOT how a character-device is supposed to work! excess bytes never do get returned to the application User-space: application-program’s buffer copy_to_user() Kernel-space: device-driver’s packet-buffer
Fix for problem 3 • We have introduced a new static variable (called ‘pickup’) that keeps track of where the ‘extra’ data begins -- so next time the application tries to ‘read()’, our driver can ‘pick up’ from where it had left off before ssize_t my_read( struct file *file, char *buf, size_t len, loff_t *pos ) { static int rxhead = 0; // the current rxring[] array-index static int pickup = 0; // count of bytes returned so far copy_to_user( buf, cp+pickup, len ); pickup += len; if ( pickup >= rxring[ rxhead ].packet_length ) { rxhead = (1 + rxhead) % N_RX_DESC; pickup = 0; }
Problem 4 • A program might call our driver’s ‘write()’ procedure more rapidly than the hardware can transmit previously written packets, so ‘old’ packets not yet sent get ‘overwritten’ by ‘new’ packets – thus data gets ‘lost’! TDH txring: TDT NOTE: the NIC ‘stalls’ when TDT == TDH User’s next packet goes here, but the hardware still isn’t ‘done’ with transmitting previous data put there
Fix for problem 4 • We created an additional ‘wait-queue’ so our driver’s ‘write()’ routine can put a task to sleep until the hardware is ‘done’ with the descriptor indexed by the TDT value wait_quete_head_t wq_xmit; ssize_t my_write( struct file *file, const char *buf, size_t len, loff_t *pos ) { int txtail = ioread32( io + E1000_TDT ); if ( txring[ txtail ].desc_status == 0 ) wait_event_interruptible( wq_xmit, txring[ txtail ].desc_status );
Our ISR’s modification • Our driver’s Interrupt Service Routine has the duty of ‘awakening’ a sleeping writer when the NIC generates a TXDW interrupt (i.e., for Transmit-Descriptor Writeback) irqreturn_t my_isr( int irq, void *dev_id ) { int intr_cause = ioread32( io + E1000_ICR ); if ( intr_cause & (1<<0) // TXDW has occurred wake_up_interruptible( &wq_xmit );
Problem 5 • The hardware might receive packets at a faster rate than the application program desires to read them – causing our ring-buffer to ‘fill up’ with newer data before being adequately drained of its older data RDH rxring: RDT Because we allowed all of our receive-buffers to be ‘owned’ by the network controller, it continues round-and-round receiving everything!
Fix for problem 5 • We only grant ‘ownership’ to some of the receive-buffers at any given time – but we arrange for our interrupt-handler to adjust the RDT value dynamically whenever the the number owned by the NIC falls below a certain threshold (signaled by RXDMT0) 9 8 RCTL: RDMTS RDMTS (Receive Descriptors Minimum Threshold Size): 00=one-half, 01=one-fourth, 10=one-eighth. 11=one-sixteenth
Our ISR’s modification • Our driver’s Interrupt Service Routine has the duty of advancing the RDT register’s value whenever the ‘Minimum Threshold Reached’ event is signaled irqreturn_t my_isr( int irq, void *dev_id ) { int intr_cause = ioread32( io + E1000_ICR ); if ( intr_cause & (1<<4) // RXDMT0 has occurred { int rxtail = ioread32( io + E1000_RDT ); rxtail = (8 + rxtail) % N_RX_DESC; iowrite32( rxtail, io + E1000_RDT ); }
Discussion question • Should our ‘fix’ for problem 5 be modified to employ the controller’s ‘flow control’? • What will happen if an application program stops reading, but the NIC’s link-partner keep on sending out more data? • Does this suggest a use drivers can make of the SWXOFF-bit in register TCTL? 22 TCTL: SW XOFF SWXOFF (Software XOFF) writing ‘1’ causes NIC to send a PAUSE frame
Problem 6 • When we all are doing development of device-drivers for the 82573L controller using our ‘anchor-cluster’ network, any broadcast-packets sent by one driver cause interference with others’ work switched hub
Fix for problem 6 • We implemented VLAN capabilities in our ‘nic2.c’ revised character-mode driver, so students can employ VLAN identification-numbers in their outgoing packets that will cause those packets to be ‘filtered out’ by receiving drivers with different VLAN tags switched hub VLAN 1 VLAN 1 VLAN 2 VLAN 3 VLAN 4
Modification to IOCTL • We needed a convenient way to let user-programs change their driver’s VLAN tag int my_ioctl( struct inode *inode, struct file *file, unsigned int request, unsigned long address ) { switch ( request ) { case 0: // SET destination MAC-address case 1: // GET destination MAC-address case 2: // SET a revised VLAN identifier case 3: // GET the current VLAN identifier }
In-class exercise • Try using our ‘joinvlan.cpp’ demo-program which lets a user change the current VLAN identification in the running ‘nic2.c’ driver • How shall we arrange a scheme for every student to have their own unique VLAN id?