
Reverse Engineered Architecture of the Linux Kernel

Reverse Engineered Architecture of the Linux Kernel, by Kristof De Vos. Why the Linux kernel? Linux is a Unix-like operating system and a large one (800 KLOC). It is open source, so there are no barriers to discussing the details of its implementation, and it has no fully documented architecture.


Presentation Transcript


  1. Reverse Engineered Architecture of the Linux Kernel
     Kristof De Vos

  2. Why Linux Kernel?
     • Linux is a Unix-like operating system
     • Large system: 800 KLOC
     • Open source
        • no barriers to discussing the details of the system implementation
     • Has no fully documented architecture

  3. Conceptual vs. Concrete Architecture
     • Conceptual:
        • how developers think about the system
        • meaningful relationships
     • Concrete:
        • as-built (as in the implementation)
        • might include dependencies for debugging, ...

  4. 6 Steps
     1. Examine existing documentation
     2. Form conceptual architecture
     3. Group source files into subsystems based on:
        • directory structure
        • naming conventions
        • source code comments
        • examining source code

  5. 6 Steps
     4. Extract relations between source files
     5. Use relations between source files to determine relations between subsystems
     6. Form concrete architecture
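Steps 3 to 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the directory-to-subsystem mapping and the file-level relations below are hypothetical examples, not the assignment or the data used in the presentation.

```python
# Hypothetical step 3: assign files to subsystems by directory prefix.
# The mapping itself is an assumption made for this sketch.
PREFIX_TO_SUBSYSTEM = {
    "kernel/": "Process Scheduler",
    "mm/": "Memory Manager",
    "fs/": "File System",
    "net/": "Network Interface",
    "ipc/": "IPC",
    "init/": "Initialization",
    "lib/": "Library",
}


def subsystem_of(path):
    """Return the subsystem owning `path`, or None if no prefix matches."""
    for prefix, subsystem in PREFIX_TO_SUBSYSTEM.items():
        if path.startswith(prefix):
            return subsystem
    return None


def lift_relations(file_relations):
    """Step 5: lift (source_file, target_file) pairs to subsystem pairs.

    Relations inside one subsystem and relations touching unassigned
    files are dropped.
    """
    lifted = set()
    for src, dst in file_relations:
        a, b = subsystem_of(src), subsystem_of(dst)
        if a is not None and b is not None and a != b:
            lifted.add((a, b))
    return lifted


# Hypothetical step 4 output: file-level "uses" relations.
file_relations = [
    ("kernel/sched.c", "mm/memory.c"),
    ("fs/buffer.c", "mm/memory.c"),
    ("fs/inode.c", "fs/buffer.c"),  # internal to File System, dropped
    ("net/socket.c", "fs/inode.c"),
]

subsystem_relations = lift_relations(file_relations)
for a, b in sorted(subsystem_relations):
    print(f"{a} -> {b}")
```

The resulting set of subsystem pairs is the raw material for step 6: it is what gets drawn as the concrete architecture.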

  6. Conceptual Architecture
     • Sources used: descriptions of related operating systems and existing Linux documentation

  7. 7 major subsystems
     • Process Scheduler
        • responsible for multitasking
     • Memory Manager
        • separates memory spaces for each process
        • uses swapping to support more processes
     • File System
        • access to hardware devices

  8. 7 major subsystems
     • Network Interface
        • access to network devices
     • Inter-Process Communication (IPC)
        • allows communication between processes on the same processor
     • Initialization
        • responsible for initialization of the rest of the kernel
     • Library
        • routines used by the whole kernel

  9. File System sub-architecture
     • Extracted roles:
        • provides access to a variety of hardware devices
        • supports several logical file system formats
        • allows programs to be stored in several executable formats
     • Further investigations: Facade design pattern
        • subsystems are accessible through a single interface
        • subsystem interdependency is reduced
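The Facade idea on this slide can be illustrated with a small Python sketch. The class names (`Ext2`, `Msdos`, `FileSystemFacade`) and the mount-point routing are assumptions made for the example; they are not the kernel's actual interfaces.

```python
class Ext2:
    """Hypothetical logical file system backend."""

    def read(self, path):
        return f"ext2:{path}"


class Msdos:
    """A second hypothetical backend with the same read() method."""

    def read(self, path):
        return f"msdos:{path}"


class FileSystemFacade:
    """Single entry point: callers never talk to a backend directly."""

    def __init__(self):
        self._mounts = {}

    def mount(self, mount_point, backend):
        self._mounts[mount_point] = backend

    def read(self, path):
        # Route to the backend with the longest matching mount point.
        matches = (m for m in self._mounts if path.startswith(m))
        best = max(matches, key=len, default=None)
        if best is None:
            raise FileNotFoundError(path)
        return self._mounts[best].read(path[len(best):])


vfs = FileSystemFacade()
vfs.mount("/", Ext2())
vfs.mount("/dos/", Msdos())
print(vfs.read("/etc/passwd"))
print(vfs.read("/dos/a.txt"))
```

Callers depend only on `FileSystemFacade`, so adding or replacing a backend does not change any caller. That is the reduced interdependency the slide refers to.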

  10. File System subsystems

  11. File System subsystems
     • Main roles are implemented in 5 subsystems:
     • Device Drivers
        • performs all communication with hardware devices
     • Logical File Systems
        • implements several logical file systems
        • allows interoperability with different OS
        • encryption, compression, high performance, ...

  12. File System subsystems
     • Executable File Formats
        • allows execution of different executables
     • File Quota
        • limits amount of file storage for individual users
     • Buffer Cache
        • memory buffers for I/O operations
     • 2 other subsystems define facade interfaces
     • all information is extracted from other documentation

  13. Extraction
     • Manual examination too costly (800 KLOC)
     • Automated tools (grok):
        • manually define the subsystem hierarchy
        • manually assign source files to subsystems
        • let the beast loose:
           • grok examines all source files
           • finds relations between subsystems
        • output is not readable for humans
        • lsedit visually shows relations
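To give a feel for automated extraction, here is a minimal Python sketch that pulls one kind of file-level relation, `#include` directives, out of C source text. It is not how grok works and does not use grok's input or output format; the sample sources are invented, and a real extractor would also consider function calls and variable references.

```python
import re

# Matches lines such as: #include <linux/mm.h>  or  # include "internal.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def extract_includes(source_text):
    """Return the header names included by one source file."""
    return INCLUDE_RE.findall(source_text)


def extract_relations(sources):
    """Map {file name: source text} to sorted (file, header) pairs."""
    relations = []
    for name, text in sources.items():
        for header in extract_includes(text):
            relations.append((name, header))
    return sorted(relations)


# Invented sample input standing in for real kernel sources.
sources = {
    "fs/buffer.c": '#include <linux/mm.h>\n#include "internal.h"\nint x;\n',
    "kernel/sched.c": "  # include <linux/mm.h>\n",
}

relations = extract_relations(sources)
for src, header in relations:
    print(f"{src} includes {header}")
```

Even on a toy input the output is a flat list of pairs, which hints at why raw extractor output is hard for humans to read and why a visual tool is used on top.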

  14. Partial subsystem hierarchy

  15. Concrete Architecture
     • Combination of
        • conceptual architecture
        • subsystem hierarchy
        • results of automated tools

  16. Concrete Architecture
     • Same subsystems, but different dependencies
     • 19 vs. 37 inter-subsystem dependencies
     • reasons:
        • efficiency
        • exploration, maybe not really needed
        • possibly faulty
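Comparing the two architectures comes down to comparing two sets of dependency edges. The sketch below uses invented edges purely for illustration; they are not the 19 and 37 dependencies counted in the presentation.

```python
# Hypothetical dependency edges (from_subsystem, to_subsystem).
conceptual = {
    ("File System", "Memory Manager"),
    ("Process Scheduler", "Memory Manager"),
    ("Network Interface", "File System"),
}

concrete = {
    ("File System", "Memory Manager"),
    ("Process Scheduler", "Memory Manager"),
    ("Network Interface", "File System"),
    ("Memory Manager", "File System"),      # only visible in the code
    ("Process Scheduler", "File System"),   # only visible in the code
}

# Edges found in the implementation but absent from the developers' model.
unexpected = concrete - conceptual
# Edges the model predicts but the implementation does not show.
missing = conceptual - concrete

print(f"conceptual: {len(conceptual)}, concrete: {len(concrete)}")
for a, b in sorted(unexpected):
    print(f"unexpected: {a} -> {b}")
for a, b in sorted(missing):
    print(f"missing: {a} -> {b}")
```

Each unexpected edge is then a candidate for one of the explanations on the slide: an efficiency shortcut, an exploratory dependency, or a possible fault.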
