CSC 660: Advanced OS

CSC 660: Advanced OS. System Calls. A Different Kind of C. No access to C library. ISO C99 + GNU C extensions. No memory protection. Small fixed-size (8KB) stack. Limited floating point support. Concurrency and synchronization. Portability. Coding style and idioms. Debugging.

Presentation Transcript


  1. CSC 660: Advanced OS System Calls CSC 660: Advanced Operating Systems

  2. A Different Kind of C
     • No access to C library.
     • ISO C99 + GNU C extensions.
     • No memory protection.
     • Small fixed-size (8KB) stack.
     • Limited floating point support.
     • Concurrency and synchronization.
     • Portability.
     • Coding style and idioms.
     • Debugging.

  3. No access to C library
     Why not?
     • Bootstrapping (the C library itself uses system calls…)
     • Performance and size.
     Kernel equivalent functions
     • Use lib/string.c for string operations.
     • Use printk() instead of printf().

  4. ISO C99
     Inline functions
         static inline void dog(int tail)
     Struct assignment (designated initializers)
         struct file_operations fops = {
                 .read    = device_read,
                 .write   = device_write,
                 .open    = device_open,
                 .release = device_release
         };

  5. GNU C
     Inline assembly (asm or __asm__ keyword)
         asm ( assembler template
             : output operands
             : input operands
             : list of clobbered registers
         );
     Example from arch/i386/signal.c:
         __asm__("movl %%gs,%0" : "=r"(tmp): "0"(tmp));
     Branch annotation
     • Optimize a branch for the most likely decision.
     • likely() and unlikely() macros.

  6. GNU C
     asmlinkage
     • Function attribute that allows C functions to be called from assembly language (prevents parameters from being placed in registers).
     volatile
     • Warns the compiler that a variable may be changed asynchronously by other threads (prevents the compiler from optimizing away reads).
     static inline
     • Inline function expansion to improve speed.

  7. No Memory Protection
     Kernel traps illegal memory access for users
     • Sends SIGSEGV to kill the offending process.
     No one to look out for the kernel.
     • Memory violations result in a kernel oops.
     Kernel memory is not pageable.
     • Uses physical memory, not swap space.

  8. Small Fixed Stack
     Kernel stack is two 4KB pages (8KB in total).
     • Cannot create many local variables.
     • No deep recursion.

  9. Floating Point
     Floating point used to be handled by a separate FPU.
     • Integrated into the CPU with the 80486DX.
     • Still performed with ESCAPE instructions.
     FPU has its own FP registers.
     • Shared with the MMX unit.
     • Not saved by default on context switch.
     Must use FP carefully in the kernel
     • Call kernel_fpu_begin() before using the FPU.
     • Call kernel_fpu_end() after using the FPU.

  10. Concurrency
      Asynchronous interrupts
      • Interrupt handlers may access resources at the same time as your function.
      Multiprocessing
      • Another processor may be executing the same function at the same time.
      Preemptive kernel
      • Scheduler can preempt your kernel thread in favor of another thread.
      Synchronization solutions
      • Spinlocks
      • Semaphores

  11. Portability
      Kernel runs on 22 architectures.
      • Different endianness.
      • Different word sizes.
      • Different page sizes.
      Kernel code must be
      • Endian neutral.
      • 64-bit clean.
      • Free of assumptions about word or page size.

  12. Portability
      • A char is always 8 bits (may be signed or unsigned).
      • A short is currently 16 bits on all archs.
      • An int is currently 32 bits on all archs.
      • A long may be 32 or 64 bits.
      • A pointer may be 32 or 64 bits.
      Use explicitly sized types when necessary:
      • s8, u8, s16, u16, s32, u32, s64, u64
      Use opaque types for portability:
      • atomic_t, pid_t

  13. Coding Style
      Indentation
      • Tabs that are 8 characters wide.
      Braces
      • Conditionals/loops: initial { at end of statement
            if (foo) {
                    …
            } else {
                    …
            }
      • Functions: { on a separate line
            int foo()
            {
                    …
            }

  14. Coding Style
      Naming
      • Lower case, words separated by underscores.
      • Use descriptive names, especially for globals.
      Functions
      • No longer than 2 screens of text.
      • Fewer than 10 local variables.
      Comments
      • Describe what and why, not how your code works.
      Ifdefs
      • Restrict them to include (.h) files.

  15. Idioms
      do { stmt1; stmt2; } while (0)
      • Found in macros.
      • Allows multi-statement macros in if/else.
      Heavy use of bit operators
      • and (&), or (|), xor (^), not (~)
      Heavy use of goto
      • Often used to exit control structures on error.

  16. Kernel Debugging: Oops
      An oops is a major kernel failure.
      • Ex: dereferencing a null pointer.
      • If the kernel cannot recover, a panic results.
      Information sent to console
      • Text description
      • Register contents
      • Stack backtrace

  17. Kernel Debugging: Oops
      Unable to handle kernel NULL pointer dereference at virtual address 00000000
      c0203c18
      EIP: 0060:[<c0203c18>] Not tainted
      Using defaults from ksymoops -t elf32-i386 -a i386
      EFLAGS: 00010086
      eax: c137a800 ebx: c0e80200 ecx: c1379050 edx: 00000000
      esi: c137a800 edi: c13d0000 ebp: 00000246 esp: c13d1f2c
      ds: 007b es: 007b ss: 0068
      Stack: c1379050 00000002 c137a800 00000008 00000000 c137a800 c02060b3 c137a800
             0001221e 00000000 c030b004 c030b000 c13fdc10 c02037c0 c137a800 00000293
             c0125b6d 00000000 c13fdc28 c13fdc20 c13d0000 c13d0000 c13d0000 00000000
      Call Trace:
      [<c02060b3>] is_complete+0x2c3/0x310
      [<c02037c0>] run+0x30/0x40
      [<c0125b6d>] worker_thread+0x1bd/0x2b0
      [<c0203790>] run+0x0/0x40
      [<c0113b10>] default_wake_function+0x0/0x20
      [<c0108fd6>] ret_from_fork+0x6/0x20
      [<c0113b10>] default_wake_function+0x0/0x20
      [<c01259b0>] worker_thread+0x0/0x2b0

  18. printk()
      Robust and callable except early in boot
      • Enable early_printk() option for that.
      Sends output to klog circular log buffer
      • klogd reads /proc/kmsg.
      • syslogd gets data from klogd and writes it to a file under /var/log.
      • Can also access with dmesg.
      Message priorities
      • 0 (high) .. 7 (low)
      • Named: KERN_EMERG, KERN_ALERT, KERN_CRIT, KERN_ERR, KERN_WARNING, KERN_NOTICE, KERN_INFO, KERN_DEBUG

  19. Printing Debugging Information
      printk()
      Assertions
      • BUG_ON(bad_condition) causes an oops.
      Panics
          if (terrible_condition)
                  panic("Terrible condition!");
      Stack traces
          if (!debug_check) {
                  printk(KERN_DEBUG "Check x failed\n");
                  dump_stack();
          }

  20. System Calls
      System calls provide the interface between user programs and the kernel.
      1. Abstracted hardware interface.
      2. Security and stability.
      3. Allows virtualization.
      Programmers typically don’t invoke system calls directly, but rather use libc library calls.

  21. System call users
      • malloc() free() exec() fork() printf() fopen() fputc() fclose() socket()
      Non-system call users
      • asin() log() sin() strcmp() strcpy() atoi() bsearch() qsort() rand()
      libc: C standard library/POSIX API

  22. User to Kernel Mode Transition

  23. Hello World
      > cat >hello.c
      #include <stdio.h>
      int main(int argc, char *argv[])
      {
              printf("Hello world!\n");
              return 0;
      }
      > gcc -o hello hello.c
      > ltrace ./hello
      __libc_start_main(0x8048394, 1, 0xbffff914, 0x80483b8, 0x8048400 <unfinished ...>
      printf("Hello world!\n"Hello world!
      ) = 13
      +++ exited (status 0) +++

  24. Hello World
      > strace ./hello
      execve("./hello", ["./hello"], [/* 40 vars */]) = 0
      uname({sys="Linux", node="tara", ...}) = 0
      brk(0) = 0x804a000
      access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
      old_mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fe9000
      open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
      open("/etc/ld.so.cache", O_RDONLY) = 3
      fstat64(3, {st_mode=S_IFREG|0644, st_size=50648, ...}) = 0
      old_mmap(NULL, 50648, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7fdc000
      close(3) = 0
      access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
      open("/lib/tls/i686/cmov/libc.so.6", O_RDONLY) = 3
      read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\215Y\1"..., 512) = 512
      fstat64(3, {st_mode=S_IFREG|0644, st_size=1222116, ...}) = 0

  25. Hello World
      old_mmap(NULL, 1232428, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xb7eaf000
      old_mmap(0xb7fd1000, 36864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x121000) = 0xb7fd1000
      old_mmap(0xb7fda000, 7724, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7fda000
      close(3) = 0
      old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7eae000
      set_thread_area({entry_number:-1 -> 6, base_addr:0xb7eae080, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
      munmap(0xb7fdc000, 50648) = 0
      fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
      mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fe8000
      write(1, "Hello world!\n", 13Hello world!
      ) = 13
      munmap(0xb7fe8000, 4096) = 0
      exit_group(0) = ?

  26. Using a System Call
      Application
      • Calls printf().
      C library (glibc)
      • printf() function issues write() system call.
      Kernel
      • write() system call manages output.
      C library (glibc)
      • Sets global errno variable if an error occurs.
      • Returns to user application.

  27. Making a System Call
      Software interrupt
      • Historically: int $0x80
      • Modern: sysenter
      System call number
      • Put in %eax register before interrupt.
      • sys_call_table in arch/i386/kernel/entry.S
      Parameters
      • 1-5 args: %ebx, %ecx, %edx, %esi, %edi
      • 6+ args: one register has pointer to user space params
      Returning
      • Return from software interrupt: iret or sysexit
      • Return value stored in %eax register.

  28. System Call Handler
      Invoked on all system calls. Functionality:
      • Saves register contents.
      • Reads syscall number from %EAX.
      • Invokes system call service routine found at sys_call_table + 4 * %EAX.
      • Stores syscall return value over the saved %EAX on the stack.
      • Restores registers (moving return value to %EAX).
      • Switches from Kernel Mode to User Mode.

  29. Kernel System Call Handler (arch/i386/kernel/entry.S)
      ENTRY(system_call)
              pushl %eax                      # save orig_eax
              SAVE_ALL
              GET_THREAD_INFO(%ebp)
              # system call tracing in operation
              testb $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT),TI_flags(%ebp)
              jnz syscall_trace_entry
              cmpl $(nr_syscalls), %eax
              jae syscall_badsys
      syscall_call:
              call *sys_call_table(,%eax,4)
              movl %eax,EAX(%esp)             # store return value
      syscall_exit:
              cli
              movl TI_flags(%ebp), %ecx
              testw $_TIF_ALLWORK_MASK, %cx   # current->work
              jne syscall_exit_work
      restore_all:
              RESTORE_ALL

  30. System Call Parameters
      • Must use registers instead of stack
          • System call runs in kernel mode.
          • User process doesn’t have access to kernel stack.
          • Copying user stack to kernel stack is slow.
      • Register limitations
          • Parameter size <= register size (32 bits)
          • x86 only has a few registers, so #params limited.
      • Solutions
          • Pass large parameters by reference.
          • If >6 params needed, use reference to params in memory.
      • System call handler saves registers to stack before calling system call service routine, allowing service routine to use parameters like a normal C function.

  31. Verifying Parameters
      • Must ensure users cannot access files, processes, or memory that they don’t have permission to access.
      • Before accessing a user pointer, must ensure
          • Pointer points to user, not kernel memory.
          • Pointer points to a region of memory in the process’s address space.
          • Access is permitted by memory access restrictions (read, write, execute).

  32. Accessing User Memory
      • To copy between user memory and kernel memory with the appropriate safety checks, use
          • copy_from_user(kern_buf, user_buf, len)
          • copy_to_user(user_buf, kern_buf, len)
      • Both functions return the number of bytes they failed to copy on error, 0 on success.
      • Syscall returns -EFAULT on such an error.

  33. System Call Errors
      • System calls return errors as -ESYMBOL
          • Error #s in include/asm-generic/errno-base.h
          • ENOSYS: No such system call.
          • EPERM: Operation not permitted.
          • EAGAIN: Try again.
          • EIO: I/O error.
      • User program API returns a -1 error value.
          • Actual error # stored in errno variable.

  34. Adding a System Call
      • Write system call function sys_mycall.
      • Add entry to end of sys_call_table
          In arch/i386/kernel/entry.S add
              .long sys_mycall
      • Define system call number for user
          In include/asm-i386/unistd.h
              #define __NR_mycall 289
      • Update max # of system calls.
      • Compile kernel.

  35. Defining a System Call
      System call name: getpid()
      System call function: sys_getpid()
          asmlinkage long sys_getpid(void)
          {
                  return current->tgid;
          }

  36. Invoking System Calls
      • Standard syscalls called indirectly via libc.
      • What if you’ve created a new system call?
          • Manually write assembly to create a software interrupt and pass parameters in registers.
          • Or use _syscall macros in <linux/unistd.h> to automatically generate a function that calls your new system call.

  37. System Call Declaration Macros (include/asm-i386/unistd.h)
      _syscall0(int, fork)
      • fork is the system call to be invoked.
      • int is the type of the return value.
          #define _syscall0(type,name) \
          type name(void) \
          { \
          long __res; \
          __asm__ volatile ("int $0x80" \
                  : "=a" (__res) \
                  : "0" (__NR_##name)); \
          __syscall_return(type,__res); \
          }

  38. System Call Declaration Macros (include/asm-i386/unistd.h)
      _syscall3(int,write,int,fd,const char *,buf,unsigned int,count)
      • write is the system call with 3 arguments to be called.
      • The 3 parameters are fd, buf, and count.
          #define _syscall3(type,name,type1,arg1,type2,arg2,type3,arg3) \
          type name(type1 arg1,type2 arg2,type3 arg3) \
          { \
          long __res; \
          __asm__ volatile ("int $0x80" \
                  : "=a" (__res) \
                  : "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
                    "d" ((long)(arg3))); \
          __syscall_return(type,__res); \
          }

  39. Calling your new syscall
      #include <linux/unistd.h>
      #define __NR_current_time 289
      _syscall0(long, current_time)
      #include <stdio.h>
      int main()
      {
              long retval = 1;
              retval = current_time();
              printf("The return value is %ld\n", retval);
              return 0;
      }

  40. References
      • Daniel P. Bovet and Marco Cesati, Understanding the Linux Kernel, 3rd edition, O’Reilly, 2005.
      • GNU, GNU C Library Manual, http://www.gnu.org/software/libc/manual/, 2003.
      • Robert Love, Linux Kernel Development, 2nd edition, Prentice-Hall, 2005.
      • Claudia Rodriguez et al., The Linux Kernel Primer, Prentice-Hall, 2005.
      • Peter Salzman et al., Linux Kernel Module Programming Guide, version 2.6.1, 2005.
      • Andrew S. Tanenbaum, Modern Operating Systems, 2nd edition, Prentice-Hall, 2001.
