Section 10.4.Parameter PassingSection10.5.KernelWrapperRoutinesChapter11.SignalsSection 11.1.The Role of SignalsSection 11.2.Generating a SignalSection 11.3. Delivering a SignalSection11.4.SystemCalls Related to Signal HandlingChapter 12.The Virtual FilesystemSection 12.1.The Role of the Virtual Filesystem (VFS)Section12.2.VFSData StructuresSection 12.3.Filesystem TypesSection12.4.Filesystem HandlingSection12.5.PathnameLookupSection12.6.ImplementationsofVFSSystemCallsSection 12.7.File LockingChapter13.I/OArchitectureand DeviceDriversSection13.1.I/OArchitectureSection 13.2. The Device Driver ModeSection 13.3.Device FilesSection 13.4.Device DriversSection13.5.CharacterDeviceDriversChapter14.BlockDeviceDriversSection 14.1.Block Devices HandlingSection 14.2.TheGeneric Block LayerSection 14.3.The I/O SchedulerSection 14.4.Block DeviceDriversSection14.5.OpeningaBlockDeviceFileChapter15.ThePageCacheSection15.1.ThePageCacheSection 15.2.Storing Blocks in the Page CacheSection 15.3.Writing Dirty Pages to DiskSection 15.4. The sync(),fsync(),and fdatasync()System CallsChapter16.Accessing FilesSection 16.1.Reading and Writing a FileSection16.2.Memory MappingSection16.3.DirectVOTransfersSection16.4.AsynchronousV/OChapter17.PageFrameReclaiming6
6 Section 10.4. Parameter Passing Section 10.5. Kernel Wrapper Routines Chapter 11. Signals Section 11.1. The Role of Signals Section 11.2. Generating a Signal Section 11.3. Delivering a Signal Section 11.4. System Calls Related to Signal Handling Chapter 12. The Virtual Filesystem Section 12.1. The Role of the Virtual Filesystem (VFS) Section 12.2. VFS Data Structures Section 12.3. Filesystem Types Section 12.4. Filesystem Handling Section 12.5. Pathname Lookup Section 12.6. Implementations of VFS System Calls Section 12.7. File Locking Chapter 13. I/O Architecture and Device Drivers Section 13.1. I/O Architecture Section 13.2. The Device Driver Model Section 13.3. Device Files Section 13.4. Device Drivers Section 13.5. Character Device Drivers Chapter 14. Block Device Drivers Section 14.1. Block Devices Handling Section 14.2. The Generic Block Layer Section 14.3. The I/O Scheduler Section 14.4. Block Device Drivers Section 14.5. Opening a Block Device File Chapter 15. The Page Cache Section 15.1. The Page Cache Section 15.2. Storing Blocks in the Page Cache Section 15.3. Writing Dirty Pages to Disk Section 15.4. The sync( ), fsync( ), and fdatasync( ) System Calls Chapter 16. Accessing Files Section 16.1. Reading and Writing a File Section 16.2. Memory Mapping Section 16.3. Direct I/O Transfers Section 16.4. Asynchronous I/O Chapter 17. Page Frame Reclaiming
Section17.1.ThePageFrameReclaiming AlgorithmSection17.2.ReverseMappingSection17.3.ImplementingthePFRASection17.4.SwappingChapter18.TheExt2andExt3FilesystemsSection 18.1.General Characteristics of Ext2Section18.2.Ext2DiskDataStructuresSection18.3.Ext2MemoryDataStructuresSection 18.4.Creating theExt2 FilesystemSection 18.5. Ext2 MethodsSection18.6.ManagingExt2DiskSpaceSection18.7.TheExt3FilesystemChapter19.ProcessCommunicationSection 19.1.PipesSection 19.2. FIFOsSection19.3.SystemVIPCSection19.4.POSIXMessageQueuesChapter20.ProgramExZecutionSection 20.1. Executable FilesSection 20.2.ExecutableFormatsSection 20.3.Execution DomainsSection20.4.TheexecFunctionsAppendixA.SystemStartupSection A.1. Prehistoric Age:the BIOSSection A2. Ancient Age: the Boot LoaderSection A.3. Middle Ages: the setup()FunctionSectionA.4.Renaissance:thestartup 32()FunctionsSection A.5.ModernAge:the start kernel()Function_Appendix B. ModulesSectionB.1.ToBe(aModule)orNottoBe?Section B.2.Module ImplementationSection B.3.Linking and Unlinking ModulesSection B.4.Linking Modules on DemandBibliographyBookson UnixKernelsBooksontheLinuxKernelBooksonPCArchitectureandTechnical Manualson Intel MicroprocessorsOtherOnlineDocumentation Sources7
7 Section 17.1. The Page Frame Reclaiming Algorithm Section 17.2. Reverse Mapping Section 17.3. Implementing the PFRA Section 17.4. Swapping Chapter 18. The Ext2 and Ext3 Filesystems Section 18.1. General Characteristics of Ext2 Section 18.2. Ext2 Disk Data Structures Section 18.3. Ext2 Memory Data Structures Section 18.4. Creating the Ext2 Filesystem Section 18.5. Ext2 Methods Section 18.6. Managing Ext2 Disk Space Section 18.7. The Ext3 Filesystem Chapter 19. Process Communication Section 19.1. Pipes Section 19.2. FIFOs Section 19.3. System V IPC Section 19.4. POSIX Message Queues Chapter 20. Program ExZecution Section 20.1. Executable Files Section 20.2. Executable Formats Section 20.3. Execution Domains Section 20.4. The exec Functions Appendix A. System Startup Section A.1. Prehistoric Age: the BIOS Section A.2. Ancient Age: the Boot Loader Section A.3. Middle Ages: the setup( ) Function Section A.4. Renaissance: the startup_32( ) Functions Section A.5. Modern Age: the start_kernel( ) Function Appendix B. Modules Section B.1. To Be (a Module) or Not to Be? Section B.2. Module Implementation Section B.3. Linking and Unlinking Modules Section B.4. Linking Modules on Demand Bibliography Books on Unix Kernels Books on the Linux Kernel Books on PC Architecture and Technical Manuals on Intel Microprocessors Other Online Documentation Sources
ResearchPapersRelatedtoLinuxDevelopmentAbout the AuthorsColophonIndexUnderstanding the Linux Kernel,Third EditionbyDanielP.BovetandMarcoCesatiCopyright2006O'ReillyMedia,Inc.All rightsreserved.PrintedintheUnitedStatesofAmericaPublishedbyO'ReillyMedia,Inc.,1005GravensteinHighwayNorth,Sebastopol,CA95472O'Reilly booksmaybe purchased for educational,business,orsales promotional use.Onlineeditionsarealsoavailableformosttitles(safari.oreilly.com).Formoreinformation,contactourcorporate/institutional salesdepartment:(800)998-9938orcorporate@oreilly.com.Editor:Andy OramDarren KellyProductionEditor:ProductionServices:Amy ParkerEdie FreedmanCover Designer:InteriorDesigner:David FutatoPrinting History:November2000:First Edition.December 2002:Second Edition.November2005:Third Edition.NutshellHandbook,theNutshell Handbooklogo,andtheO'ReillylogoareregisteredtrademarksofO'ReillyMedia, Inc.The Linux series designations,Understanding the Linux Kernel,Third Edition,the image of a man with a bubble, and related trade dress are trademarks of O'Reilly Media, Inc.Manyofthedesignationsusedbymanufacturersandsellersto distinguishtheirproductsareclaimedastrademarks.Wherethosedesignationsappearinthisbook,andO'ReillyMedia,Inc.was8
8 Research Papers Related to Linux Development About the Authors Colophon Index Understanding the Linux Kernel, Third Edition by Daniel P. Bovet and Marco Cesati Copyright © 2006 O'Reilly Media, Inc. All rights reserved. Printed in the United States of America. Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472. O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (safari.oreilly.com). For more information, contact our corporate/institutional sales department: (800) 998-9938 or corporate@oreilly.com. Editor: Andy Oram Production Editor: Darren Kelly Production Services: Amy Parker Cover Designer: Edie Freedman Interior Designer: David Futato Printing History: November 2000: First Edition. December 2002: Second Edition. November 2005: Third Edition. Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly logo are registered trademarks of O'Reilly Media, Inc. The Linux series designations, Understanding the Linux Kernel, Third Edition, the image of a man with a bubble, and related trade dress are trademarks of O'Reilly Media, Inc. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O'Reilly Media, Inc. was
aware of atrademark claim,the designationshavebeenprinted in caps or initial capsWhileeveryprecautionhas beentaken inthepreparationof this book,thepublisherand authorsassume no responsibilityfor errors oromissions,orfordamages resulting from the useof theinformationcontainedherein.ISBN:0-596-00565-2[M]6
9 aware of a trademark claim, the designations have been printed in caps or initial caps. While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein. ISBN: 0-596-00565-2 [M]
PrefaceIn the spring semester of 1997,we taught a course on operating systems based on Linux 2.0.Theidea was to encourage students to read the source code.To achieve this, we assigned termprojects consisting of making changes to the kernel and performing tests on the modified version.We also wrote course notes for our students about a few critical features of Linux such as taskswitching andtask scheduling.Outofthisworkandwithalotof supportfromourO'ReillyeditorAndyOramcamethefirsteditionof Understanding the Linux Kernel at the end of 2000,which covered Linux 2.2 with a fewanticipations on Linux 2.4. The success encountered by this book encouraged us to continue alongthis line.At the end of 2o02, we came out witha second edition covering Linux2.4.Youare nowlookingatthethirdedition,whichcoversLinux2.6.As in our previous experiences,wereadthousands of lines of code, trying to make sense of them.After all this work,we can say that it was worth the effort.Welearned a lot of things you don't findin books,andwehopewehavesucceededinconveying someofthisinformationinthefollowingpages.TheAudienceforThisBookAll people curious about how Linux works and why it is so efficient will find answers here. Afterreading the book, you will find your way through the many thousands of lines of code,distinguishing between crucial data structuresand secondary onesin short,becoming a true Linuxhacker.Our work might be considered a guided tour of the Linux kernel:most of the significant datastructuresandmanyalgorithmsandprogrammingtricksusedinthekernelarediscussed.Inmanycases,therelevantfragmentsof codearediscussed linebyline.Ofcourse,youshouldhavetheLinux source code on hand and should be willing to expend some effort deciphering some of thefunctionsthatarenot,forsakeofbrevity,fullydescribed.Onanotherlevel,thebookprovidesvaluableinsighttopeoplewhowanttoknowmoreaboutthecritical design issues in a modern operating system.It is not specifically addressed to systemadministratorsorprogrammers;itismostlyforpeoplewhowantto understandhowthingsreallywork insidethemachine!Aswithanygoodguide,wetrytogobeyondsuperficialfeatures.Weofferabackground,suchasthehistoryofmajorfeaturesandthereasonswhytheywereused.10
10 Preface In the spring semester of 1997, we taught a course on operating systems based on Linux 2.0. The idea was to encourage students to read the source code. To achieve this, we assigned term projects consisting of making changes to the kernel and performing tests on the modified version. We also wrote course notes for our students about a few critical features of Linux such as task switching and task scheduling. Out of this work and with a lot of support from our O'Reilly editor Andy Oram came the first edition of Understanding the Linux Kernel at the end of 2000, which covered Linux 2.2 with a few anticipations on Linux 2.4. The success encountered by this book encouraged us to continue along this line. At the end of 2002, we came out with a second edition covering Linux 2.4. You are now looking at the third edition, which covers Linux 2.6. As in our previous experiences, we read thousands of lines of code, trying to make sense of them. After all this work, we can say that it was worth the effort. We learned a lot of things you don't find in books, and we hope we have succeeded in conveying some of this information in the following pages. The Audience for This Book All people curious about how Linux works and why it is so efficient will find answers here. After reading the book, you will find your way through the many thousands of lines of code, distinguishing between crucial data structures and secondary onesin short, becoming a true Linux hacker. Our work might be considered a guided tour of the Linux kernel: most of the significant data structures and many algorithms and programming tricks used in the kernel are discussed. In many cases, the relevant fragments of code are discussed line by line. Of course, you should have the Linux source code on hand and should be willing to expend some effort deciphering some of the functions that are not, for sake of brevity, fully described. On another level, the book provides valuable insight to people who want to know more about the critical design issues in a modern operating system. It is not specifically addressed to system administrators or programmers; it is mostly for people who want to understand how things really work inside the machine! As with any good guide, we try to go beyond superficial features. We offer a background, such as the history of major features and the reasons why they were used