Advanced Character Driver Operations

Presentation Transcript

  1. Advanced Character Driver Operations Ted Baker  Andy Wang COP 5641 / CIS 4930

  2. Topics • Managing ioctl command numbers • Blocking/unblocking a process • Seeking on a device • Access control

  3. ioctl • For operations beyond simple data transfers • Eject the media • Report error information • Change hardware settings • Self destruct • Alternatives • Embedded commands in the data stream • Driver-specific file systems

  4. ioctl • User-level interface int ioctl(int fd, unsigned long cmd, ...); • ... • Variable number of arguments • Problematic for the system call interface • In this context, it is meant to pass a single optional argument • Just a way to bypass the type checking • Difficult to audit ioctl calls • E.g., 32-bit vs. 64-bit modes • Currently uses lock_kernel(), or the global kernel lock • See vfs_ioctl() in /fs/ioctl.c

  5. ioctl • Driver-level interface int (*ioctl) (struct inode *inode, struct file *filp, unsigned int cmd, unsigned long arg); • cmd is passed from the user unchanged • arg can be an integer or a pointer • Compiler does not type check

  6. Choosing the ioctl Commands • Need a numbering scheme to avoid mistakes • E.g., issuing a command to the wrong device (changing the baud rate of an audio device) • Check include/asm/ioctl.h and Documentation/ioctl/ioctl-decoding.txt

  7. Choosing the ioctl Commands • A command number uses four bitfields • Defined in <linux/ioctl.h> • <direction, type, number, size> • direction: direction of data transfer • _IOC_NONE • _IOC_READ • _IOC_WRITE • _IOC_READ | _IOC_WRITE

  8. Choosing the ioctl Commands • type (ioctl device type) • 8-bit (_IOC_TYPEBITS) magic number • Associated with the device • number • 8-bit (_IOC_NRBITS) sequential number • Unique within device • size: size of user data involved • The width is either 13 or 14 bits (_IOC_SIZEBITS)

  9. Choosing the ioctl Commands • Useful macros to create ioctl command numbers • _IO(type, nr) • _IOR(type, nr, datatype) • _IOW(type, nr, datatype) • _IOWR(type, nr, datatype) • Example • cmd = _IOWR('k', 1, struct foo) • The macro works out the size field itself: size = sizeof(datatype)

  10. Choosing the ioctl Commands • Useful macros to decode ioctl command numbers • _IOC_DIR(nr) • _IOC_TYPE(nr) • _IOC_NR(nr) • _IOC_SIZE(nr)
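The encode and decode macros from the two slides above can be tried out from user space, because on Linux glibc's <sys/ioctl.h> pulls in the same definitions. The following is a minimal sketch under that assumption. The layout of struct foo and the helper names make_demo_cmd and describe_cmd are made up for illustration and are not part of the original slides.

```c
#include <stdio.h>
#include <sys/ioctl.h>   /* on Linux: _IOWR, _IOC_DIR, _IOC_TYPE, ... */

/* Hypothetical payload type; only its size matters for the encoding. */
struct foo {
    int a;
    int b;
};

/* Build a read/write command for magic 'k', sequence number 1. */
unsigned int make_demo_cmd(void)
{
    return (unsigned int)_IOWR('k', 1, struct foo);
}

/* Print the four bitfields packed into a command number. */
void describe_cmd(unsigned int cmd)
{
    printf("dir=%u type=%c nr=%u size=%u\n",
           (unsigned int)_IOC_DIR(cmd), (char)_IOC_TYPE(cmd),
           (unsigned int)_IOC_NR(cmd), (unsigned int)_IOC_SIZE(cmd));
}
```

Calling describe_cmd(make_demo_cmd()) prints the decoded fields. The size field should equal sizeof(struct foo), which is how a driver can learn the payload size from the command number alone.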

  11. Choosing the ioctl Commands • The scull example
      /* Use 'k' as magic number */
      #define SCULL_IOC_MAGIC 'k'
      /* Please use a different 8-bit number in your code */
      #define SCULL_IOCRESET _IO(SCULL_IOC_MAGIC, 0)

  12. Choosing the ioctl Commands • The scull example
      /*
       * S means "Set" through a ptr,
       * T means "Tell" directly with the argument value
       * G means "Get": reply by setting through a pointer
       * Q means "Query": response is on the return value
       * X means "eXchange": switch G and S atomically
       * H means "sHift": switch T and Q atomically
       */
      #define SCULL_IOCSQUANTUM _IOW(SCULL_IOC_MAGIC, 1, int)
      #define SCULL_IOCSQSET    _IOW(SCULL_IOC_MAGIC, 2, int)
      #define SCULL_IOCTQUANTUM _IO(SCULL_IOC_MAGIC, 3)
      #define SCULL_IOCTQSET    _IO(SCULL_IOC_MAGIC, 4)
      #define SCULL_IOCGQUANTUM _IOR(SCULL_IOC_MAGIC, 5, int)
      • eXchange/sHift: set new value and return the old value

  13. Choosing the ioctl Commands • The scull example
      #define SCULL_IOCGQSET    _IOR(SCULL_IOC_MAGIC, 6, int)
      #define SCULL_IOCQQUANTUM _IO(SCULL_IOC_MAGIC, 7)
      #define SCULL_IOCQQSET    _IO(SCULL_IOC_MAGIC, 8)
      #define SCULL_IOCXQUANTUM _IOWR(SCULL_IOC_MAGIC, 9, int)
      #define SCULL_IOCXQSET    _IOWR(SCULL_IOC_MAGIC, 10, int)
      #define SCULL_IOCHQUANTUM _IO(SCULL_IOC_MAGIC, 11)
      #define SCULL_IOCHQSET    _IO(SCULL_IOC_MAGIC, 12)
      #define SCULL_IOC_MAXNR 14

  14. The Return Value • When the command number is not supported • Return -EINVAL • Or -ENOTTY (according to the POSIX standard)
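From the application side, an unsupported command shows up as ioctl() returning -1 with errno set. The sketch below, which assumes a Linux system, issues a terminal-only command (TIOCGWINSZ) on a pipe, which has no handler for it. The helper name errno_for_unsupported_ioctl is hypothetical.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Issue a terminal-only command on a pipe and report what happened.
 * Returns the errno observed, 0 if the call unexpectedly succeeded,
 * or -1 if the pipe could not be created.
 */
int errno_for_unsupported_ioctl(void)
{
    int fds[2];
    struct winsize ws;
    int saved = 0;

    if (pipe(fds) != 0)
        return -1;
    if (ioctl(fds[0], TIOCGWINSZ, &ws) == -1)
        saved = errno;   /* capture before close() can overwrite it */
    close(fds[0]);
    close(fds[1]);
    return saved;
}
```

On Linux the value returned is expected to be ENOTTY, matching the POSIX-style choice mentioned on the slide.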

  15. The Predefined Commands • Handled by the kernel first • Will not be passed down to device drivers • Three groups • For any file (regular, device, FIFO, socket) • Magic number: 'T' • For regular files only • Specific to the file system type

  16. Using the ioctl Argument • If it is an integer, just use it directly • If it is a pointer • Need to check for valid user address int access_ok(int type, const void *addr, unsigned long size); • type: either VERIFY_READ or VERIFY_WRITE • Returns 1 for success, 0 for failure • On failure, the driver returns -EFAULT to the caller • Defined in <asm/uaccess.h> • Mostly called by memory-access routines

  17. Using the ioctl Argument • The scull example
      int scull_ioctl(struct inode *inode, struct file *filp,
                      unsigned int cmd, unsigned long arg)
      {
          int err = 0, tmp;
          int retval = 0;
          /* check the magic number and whether the command is defined */
          if (_IOC_TYPE(cmd) != SCULL_IOC_MAGIC) {
              return -ENOTTY;
          }
          if (_IOC_NR(cmd) > SCULL_IOC_MAXNR) {
              return -ENOTTY;
          }
          …

  18. Using the ioctl Argument • The scull example
          …
          /* the concept of "read" and "write" is reversed here */
          if (_IOC_DIR(cmd) & _IOC_READ) {
              err = !access_ok(VERIFY_WRITE, (void __user *) arg, _IOC_SIZE(cmd));
          } else if (_IOC_DIR(cmd) & _IOC_WRITE) {
              err = !access_ok(VERIFY_READ, (void __user *) arg, _IOC_SIZE(cmd));
          }
          if (err)
              return -EFAULT;
          …

  19. Using the ioctl Argument • put_user and get_user: data transfer functions optimized for the most used data sizes (1, 2, 4, and 8 bytes) • If the size does not match one of these • Cryptic compiler error message: • Conversion to non-scalar type requested • Use copy_to_user and copy_from_user instead • #include <asm/uaccess.h> • put_user(datum, ptr) • Writes to a user-space address • Calls access_ok() • Returns 0 on success, -EFAULT on error

  20. Using the ioctl Argument • __put_user(datum, ptr) • Does not check access_ok() • Can still fail if the user-space memory is not writable • get_user(local, ptr) • Reads from a user-space address • Calls access_ok() • Stores the retrieved value in local • Returns 0 on success, -EFAULT on error • __get_user(local, ptr) • Does not check access_ok() • Can still fail if the user-space memory is not readable

  21. Capabilities and Restricted Operations • Limit certain ioctl operations to privileged users • See <linux/capability.h> for the full set of capabilities • To check a certain capability call int capable(int capability); • In the scull example if (!capable(CAP_SYS_ADMIN)) { return -EPERM; } • CAP_SYS_ADMIN is a catch-all capability for many system administration operations

  22. The Implementation of the ioctl Commands • A giant switch statement
          …
          switch(cmd) {
          case SCULL_IOCRESET:
              scull_quantum = SCULL_QUANTUM;
              scull_qset = SCULL_QSET;
              break;
          case SCULL_IOCSQUANTUM: /* Set: arg points to the value */
              if (!capable(CAP_SYS_ADMIN)) {
                  return -EPERM;
              }
              retval = __get_user(scull_quantum, (int __user *)arg);
              break;
          …

  23. The Implementation of the ioctl Commands
          …
          case SCULL_IOCTQUANTUM: /* Tell: arg is the value */
              if (!capable(CAP_SYS_ADMIN)) {
                  return -EPERM;
              }
              scull_quantum = arg;
              break;
          case SCULL_IOCGQUANTUM: /* Get: arg is pointer to result */
              retval = __put_user(scull_quantum, (int __user *) arg);
              break;
          case SCULL_IOCQQUANTUM: /* Query: return it (> 0) */
              return scull_quantum;
          …

  24. The Implementation of the ioctl Commands
          …
          case SCULL_IOCXQUANTUM: /* eXchange: use arg as pointer */
              if (!capable(CAP_SYS_ADMIN)) {
                  return -EPERM;
              }
              tmp = scull_quantum;
              retval = __get_user(scull_quantum, (int __user *) arg);
              if (retval == 0) {
                  retval = __put_user(tmp, (int __user *) arg);
              }
              break;
          …

  25. The Implementation of the ioctl Commands
          …
          case SCULL_IOCHQUANTUM: /* sHift: like Tell + Query */
              if (!capable(CAP_SYS_ADMIN)) {
                  return -EPERM;
              }
              tmp = scull_quantum;
              scull_quantum = arg;
              return tmp;
          default: /* redundant, as cmd was checked against MAXNR */
              return -ENOTTY;
          } /* switch */
          return retval;
      } /* scull_ioctl */

  26. The Implementation of the ioctl Commands • Six ways to pass and receive arguments from the user space • Need to know command number
      int quantum;
      ioctl(fd, SCULL_IOCSQUANTUM, &quantum);            /* Set by pointer */
      ioctl(fd, SCULL_IOCTQUANTUM, quantum);             /* Set by value */
      ioctl(fd, SCULL_IOCGQUANTUM, &quantum);            /* Get by pointer */
      quantum = ioctl(fd, SCULL_IOCQQUANTUM);            /* Get by return value */
      ioctl(fd, SCULL_IOCXQUANTUM, &quantum);            /* Exchange by pointer */
      quantum = ioctl(fd, SCULL_IOCHQUANTUM, quantum);   /* Exchange by value */
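The six calling conventions can be tried without loading a module by routing them through an ordinary function. The sketch below is a user-space mock, not the scull driver: mock_ioctl, mock_quantum, and the MOCK_* names are hypothetical, plain pointer dereferences stand in for __get_user/__put_user, and the capability checks are omitted. It assumes Linux, where <sys/ioctl.h> supplies the _IO* macros.

```c
#include <errno.h>
#include <sys/ioctl.h>

#define MOCK_IOC_MAGIC   'k'
#define MOCK_IOCSQUANTUM _IOW(MOCK_IOC_MAGIC, 1, int)
#define MOCK_IOCTQUANTUM _IO(MOCK_IOC_MAGIC, 3)
#define MOCK_IOCGQUANTUM _IOR(MOCK_IOC_MAGIC, 5, int)
#define MOCK_IOCQQUANTUM _IO(MOCK_IOC_MAGIC, 7)
#define MOCK_IOCXQUANTUM _IOWR(MOCK_IOC_MAGIC, 9, int)
#define MOCK_IOCHQUANTUM _IO(MOCK_IOC_MAGIC, 11)

static int mock_quantum = 4000;   /* stand-in for scull_quantum */

/* User-space imitation of the driver's switch statement. */
long mock_ioctl(unsigned int cmd, unsigned long arg)
{
    int tmp;

    if (_IOC_TYPE(cmd) != MOCK_IOC_MAGIC)
        return -ENOTTY;

    switch (cmd) {
    case MOCK_IOCSQUANTUM:            /* Set: arg points to the value */
        mock_quantum = *(int *)arg;
        return 0;
    case MOCK_IOCTQUANTUM:            /* Tell: arg is the value */
        mock_quantum = (int)arg;
        return 0;
    case MOCK_IOCGQUANTUM:            /* Get: arg points to the result */
        *(int *)arg = mock_quantum;
        return 0;
    case MOCK_IOCQQUANTUM:            /* Query: value is the return value */
        return mock_quantum;
    case MOCK_IOCXQUANTUM:            /* eXchange: Set and Get via pointer */
        tmp = mock_quantum;
        mock_quantum = *(int *)arg;
        *(int *)arg = tmp;
        return 0;
    case MOCK_IOCHQUANTUM:            /* sHift: Tell and Query */
        tmp = mock_quantum;
        mock_quantum = (int)arg;
        return tmp;
    default:
        return -ENOTTY;
    }
}
```

A pointer-style call looks like mock_ioctl(MOCK_IOCSQUANTUM, (unsigned long)&quantum), while a value-style call passes the integer itself in arg. This is the distinction the S/G/X and T/Q/H command families encode.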

  27. Device Control Without ioctl • Writing control sequences into the data stream itself • Example: console escape sequences • Advantages: • No need to implement ioctl methods • Disadvantages: • Need to make sure that escape sequences do not appear in the normal data stream (e.g., cat a binary file) • Need to parse the data stream
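To make the parsing cost concrete, here is a small user-space sketch of an in-band control scheme. Everything in it is an assumption chosen for illustration, not something the slides specify: a single escape byte (CTL_ESC), one-byte commands following it, and a doubled escape byte standing for a literal escape in the data.

```c
#include <stddef.h>

#define CTL_ESC 0x1b   /* hypothetical escape byte */

struct ctl_split {
    size_t data_len;   /* payload bytes written to data[] */
    size_t cmd_len;    /* command bytes written to cmds[] */
};

/*
 * Separate payload from control commands. data[] and cmds[] must each
 * have room for n bytes.
 */
struct ctl_split ctl_parse(const unsigned char *in, size_t n,
                           unsigned char *data, unsigned char *cmds)
{
    struct ctl_split out = { 0, 0 };
    size_t i = 0;

    while (i < n) {
        if (in[i] != CTL_ESC) {
            data[out.data_len++] = in[i++];      /* ordinary payload */
        } else if (i + 1 >= n) {
            break;   /* dangling escape at end of input: ignored here */
        } else if (in[i + 1] == CTL_ESC) {
            data[out.data_len++] = CTL_ESC;      /* escaped literal */
            i += 2;
        } else {
            cmds[out.cmd_len++] = in[i + 1];     /* control command */
            i += 2;
        }
    }
    return out;
}
```

The doubled-escape rule is what keeps binary data from being misread as commands, which is the disadvantage the slide points out. The writer has to apply that rule to every byte it sends.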

  28. Blocking I/O • Needed when no data is available for reads • When the device is not ready to accept data • Output buffer is full

  29. Introduction to Sleeping

  30. Introduction to Sleeping • A process is removed from the scheduler’s run queue • Certain rules • Never sleep when running in an atomic context • Multiple steps must be performed without concurrent accesses • Not while holding a spinlock, seqlock, or RCU lock • Not while disabling interrupts

  31. Introduction to Sleeping • Okay to sleep while holding a semaphore • Other threads waiting for the semaphore will also sleep • Need to keep it short • Make sure that it is not blocking the process that will wake it up • After waking up • Make no assumptions about the state of the system • The resource one is waiting for might be gone again • Must check the wait condition again

  32. Introduction to Sleeping • Wait queue: contains a list of processes waiting for a specific event • #include <linux/wait.h> • To initialize statically, call DECLARE_WAIT_QUEUE_HEAD(my_queue); • To initialize dynamically, call wait_queue_head_t my_queue; init_waitqueue_head(&my_queue);

  33. Simple Sleeping • Call variants of wait_event macros • wait_event(queue, condition) • queue = wait queue head • Passed by value • Waits until the boolean condition becomes true • Puts into an uninterruptible sleep • Usually is not what you want • wait_event_interruptible(queue, condition) • Can be interrupted by any signals • Returns nonzero if sleep was interrupted • Your driver should return -ERESTARTSYS

  34. Simple Sleeping • wait_event_killable(queue, condition) • Can be interrupted only by fatal signals • wait_event_timeout(queue, condition, timeout) • Wait for a limited time (in jiffies) • Returns 0 if the timeout expires, regardless of how the condition evaluates • wait_event_interruptible_timeout(queue, condition, timeout)

  35. Simple Sleeping • To wake up, call variants of wake_up functions void wake_up(wait_queue_head_t *queue); • Wakes up all processes waiting on the queue void wake_up_interruptible(wait_queue_head_t *queue); • Wakes up processes that perform an interruptible sleep

  36. Simple Sleeping • Example module: sleepy
      static DECLARE_WAIT_QUEUE_HEAD(wq);
      static int flag = 0;
      ssize_t sleepy_read(struct file *filp, char __user *buf,
                          size_t count, loff_t *pos)
      {
          printk(KERN_DEBUG "process %i (%s) going to sleep\n",
                 current->pid, current->comm);
          wait_event_interruptible(wq, flag != 0);
          flag = 0;
          printk(KERN_DEBUG "awoken %i (%s)\n", current->pid, current->comm);
          return 0; /* EOF */
      }
      • Multiple threads can wake up at the point where wait_event_interruptible() returns, before flag is reset

  37. Simple Sleeping • Example module: sleepy
      ssize_t sleepy_write(struct file *filp, const char __user *buf,
                           size_t count, loff_t *pos)
      {
          printk(KERN_DEBUG "process %i (%s) awakening the readers...\n",
                 current->pid, current->comm);
          flag = 1;
          wake_up_interruptible(&wq);
          return count; /* succeed, to avoid retrial */
      }

  38. Blocking and Nonblocking Operations • By default, operations block • If no data is available for reads • If no space is available for writes • Non-blocking I/O is indicated by the O_NONBLOCK flag in filp->f_flags • Defined in <linux/fcntl.h> • Only open, read, and write calls are affected • Returns -EAGAIN immediately instead of blocking • Applications need to distinguish non-blocking returns vs. EOFs
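The last bullet is easiest to see from the application side. The sketch below uses an ordinary pipe rather than a scull device, and the helper names classify_read and make_nonblocking_pipe are hypothetical. It shows that "no data yet" arrives as -1 with errno set to EAGAIN, while end-of-file arrives as a return value of 0.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum read_outcome { READ_DATA, READ_WOULD_BLOCK, READ_EOF, READ_ERROR };

/* Try to read one byte and classify the result. */
enum read_outcome classify_read(int fd)
{
    char byte;
    ssize_t n = read(fd, &byte, 1);

    if (n > 0)
        return READ_DATA;
    if (n == 0)
        return READ_EOF;                 /* all writers closed, nothing left */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return READ_WOULD_BLOCK;         /* non-blocking: try again later */
    return READ_ERROR;
}

/* Create a pipe whose read end has O_NONBLOCK set. Returns 0 on success. */
int make_nonblocking_pipe(int fds[2])
{
    int flags;

    if (pipe(fds) != 0)
        return -1;
    flags = fcntl(fds[0], F_GETFL);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    return 0;
}
```

Reading from the empty pipe while the write end is still open gives READ_WOULD_BLOCK. Once the write end is closed and the pipe is drained, the same call gives READ_EOF.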

  39. A Blocking I/O Example • scullpipe • A read process • Blocks when no data is available • Wakes a blocking write when buffer space becomes available • A write process • Blocks when no buffer space is available • Wakes a blocking read process when data arrives

  40. A Blocking I/O Example • scullpipe data structure
      struct scull_pipe {
          wait_queue_head_t inq, outq;       /* read and write queues */
          char *buffer, *end;                /* begin of buf, end of buf */
          int buffersize;                    /* used in pointer arithmetic */
          char *rp, *wp;                     /* where to read, where to write */
          int nreaders, nwriters;            /* number of openings for r/w */
          struct fasync_struct *async_queue; /* asynchronous readers */
          struct semaphore sem;              /* mutual exclusion semaphore */
          struct cdev cdev;                  /* Char device structure */
      };

  41. A Blocking I/O Example
      static ssize_t scull_p_read(struct file *filp, char __user *buf,
                                  size_t count, loff_t *f_pos)
      {
          struct scull_pipe *dev = filp->private_data;
          if (down_interruptible(&dev->sem))
              return -ERESTARTSYS;
          while (dev->rp == dev->wp) { /* nothing to read */
              up(&dev->sem); /* release the lock */
              if (filp->f_flags & O_NONBLOCK)
                  return -EAGAIN;
              if (wait_event_interruptible(dev->inq, (dev->rp != dev->wp)))
                  return -ERESTARTSYS;
              if (down_interruptible(&dev->sem))
                  return -ERESTARTSYS;
          }

  42. A Blocking I/O Example
          if (dev->wp > dev->rp)
              count = min(count, (size_t)(dev->wp - dev->rp));
          else /* the write pointer has wrapped */
              count = min(count, (size_t)(dev->end - dev->rp));
          if (copy_to_user(buf, dev->rp, count)) {
              up(&dev->sem);
              return -EFAULT;
          }
          dev->rp += count;
          if (dev->rp == dev->end)
              dev->rp = dev->buffer; /* wrapped */
          up(&dev->sem);
          /* finally, awake any writers and return */
          wake_up_interruptible(&dev->outq);
          return count;
      }

  43. Advanced Sleeping

  44. Advanced Sleeping • Uses low-level functions to effect a sleep • How a process sleeps 1. Allocate and initialize a wait_queue_t structure (the queue element) DEFINE_WAIT(my_wait); • Or wait_queue_t my_wait; init_wait(&my_wait);

  45. Advanced Sleeping 2. Add to the proper wait queue and mark a process as being asleep • TASK_RUNNING → TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE • Call void prepare_to_wait(wait_queue_head_t *queue, wait_queue_t *wait, int state);

  46. Advanced Sleeping 3. Give up the processor • Double check the sleeping condition before going to sleep • The wakeup thread might have changed the condition between steps 1 and 2 if (/* sleeping condition */) { schedule(); /* yield the CPU */ }

  47. Advanced Sleeping 4. Return from sleep Remove the process from the wait queue if schedule() was not called void finish_wait(wait_queue_head_t *queue, wait_queue_t *wait);

  48. Advanced Sleeping • scullpipe write method
      /* How much space is free? */
      static int spacefree(struct scull_pipe *dev)
      {
          if (dev->rp == dev->wp)
              return dev->buffersize - 1;
          return ((dev->rp + dev->buffersize - dev->wp) % dev->buffersize) - 1;
      }
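The arithmetic in spacefree() is easier to check with plain offsets than with kernel pointers. The sketch below restates the same formula over a hypothetical struct ring_pos, where rp and wp are offsets into a buffer of size bytes. One slot is always left unused so that rp == wp can only mean "empty".

```c
#include <stddef.h>

/* Hypothetical stand-in for the rp/wp/buffersize fields of scull_pipe. */
struct ring_pos {
    size_t rp;     /* read offset */
    size_t wp;     /* write offset */
    size_t size;   /* total buffer size */
};

/* Same formula as spacefree(), written over offsets. */
size_t ring_spacefree(const struct ring_pos *r)
{
    if (r->rp == r->wp)
        return r->size - 1;                      /* empty buffer */
    return ((r->rp + r->size - r->wp) % r->size) - 1;
}
```

With size 8, an empty buffer reports 7 free bytes. With rp = 3 and wp = 2 it reports 0, because writing one more byte would make the full buffer look empty.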

  49. Advanced Sleeping
      static ssize_t scull_p_write(struct file *filp, const char __user *buf,
                                   size_t count, loff_t *f_pos)
      {
          struct scull_pipe *dev = filp->private_data;
          int result;
          if (down_interruptible(&dev->sem))
              return -ERESTARTSYS;
          /* Wait for space for writing */
          result = scull_getwritespace(dev, filp);
          if (result)
              return result; /* scull_getwritespace called up(&dev->sem) */
          /* ok, space is there, accept something */
          count = min(count, (size_t)spacefree(dev));

  50. Advanced Sleeping
          if (dev->wp >= dev->rp)
              count = min(count, (size_t)(dev->end - dev->wp));
          else /* the write pointer has wrapped, fill up to rp - 1 */
              count = min(count, (size_t)(dev->rp - dev->wp - 1));
          if (copy_from_user(dev->wp, buf, count)) {
              up(&dev->sem);
              return -EFAULT;
          }
          dev->wp += count;
          if (dev->wp == dev->end)
              dev->wp = dev->buffer; /* wrapped */
          up(&dev->sem);
          wake_up_interruptible(&dev->inq);
          if (dev->async_queue)
              kill_fasync(&dev->async_queue, SIGIO, POLL_IN);
          return count;
      }
      • kill_fasync() notifies asynchronous readers who are waiting
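The clamping rules in scull_p_read and scull_p_write mean that a single call moves at most one contiguous chunk, so short reads and writes are normal. The user-space sketch below mirrors only that pointer arithmetic. It has no locking, sleeping, or user-copy checks, and the names struct ring, ring_write, ring_read, and RING_SIZE are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define RING_SIZE 8   /* arbitrary small size so wrapping is easy to see */

struct ring {
    char buffer[RING_SIZE];
    size_t rp, wp;    /* read and write offsets into buffer */
};

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Free space, keeping one slot unused as in spacefree(). */
size_t ring_free(const struct ring *r)
{
    if (r->rp == r->wp)
        return RING_SIZE - 1;
    return ((r->rp + RING_SIZE - r->wp) % RING_SIZE) - 1;
}

/* Copy in at most one contiguous chunk; may accept fewer bytes than asked. */
size_t ring_write(struct ring *r, const char *src, size_t count)
{
    count = min_sz(count, ring_free(r));
    if (r->wp >= r->rp)
        count = min_sz(count, RING_SIZE - r->wp);   /* up to end of buffer */
    else
        count = min_sz(count, r->rp - r->wp - 1);   /* up to rp - 1 */
    memcpy(r->buffer + r->wp, src, count);
    r->wp += count;
    if (r->wp == RING_SIZE)
        r->wp = 0;                                  /* wrapped */
    return count;
}

/* Copy out at most one contiguous chunk; returns 0 when empty. */
size_t ring_read(struct ring *r, char *dst, size_t count)
{
    if (r->rp == r->wp)
        return 0;
    if (r->wp > r->rp)
        count = min_sz(count, r->wp - r->rp);
    else                                            /* write offset wrapped */
        count = min_sz(count, RING_SIZE - r->rp);
    memcpy(dst, r->buffer + r->rp, count);
    r->rp += count;
    if (r->rp == RING_SIZE)
        r->rp = 0;                                  /* wrapped */
    return count;
}
```

A caller that needs to move a whole message has to loop until all bytes are accepted, which is what ordinary applications already do when they handle short read() and write() results.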