This is the Title of the Book, eMatter Edition Copyright © 2005 O’Reilly & Associates, Inc. All rights reserved.

Chapter 1: An Introduction to Device Drivers

In addition to device drivers, other functionalities, both hardware and software, are modularized in the kernel. One common example is filesystems. A filesystem type determines how information is organized on a block device in order to represent a tree of directories and files. Such an entity is not a device driver, in that there’s no explicit device associated with the way the information is laid down; the filesystem type is instead a software driver, because it maps the low-level data structures to high-level data structures. It is the filesystem that determines how long a filename can be and what information about each file is stored in a directory entry. The filesystem module must implement the lowest level of the system calls that access directories and files, by mapping filenames and paths (as well as other information, such as access modes) to data structures stored in data blocks. Such an interface is completely independent of the actual data transfer to and from the disk (or other medium), which is accomplished by a block device driver.

If you think of how strongly a Unix system depends on the underlying filesystem, you’ll realize that such a software concept is vital to system operation. The ability to decode filesystem information stays at the lowest level of the kernel hierarchy and is of utmost importance; even if you write a block driver for your new CD-ROM, it is useless if you are not able to run ls or cp on the data it hosts. Linux supports the concept of a filesystem module, whose software interface declares the different operations that can be performed on a filesystem inode, directory, file, and superblock. It’s quite unusual for a programmer to actually need to write a filesystem module, because the official kernel already includes code for the most important filesystem types.
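The idea of a filesystem type as a "software driver" that declares a set of operations can be sketched in a few lines of user-space C. This is only an illustration under stated assumptions: every name here (toy_dirent, toy_fs_ops, toy_lookup, TOY_NAME_MAX) is hypothetical, and the real kernel interface is different and much richer. The sketch shows a filesystem type fixing the maximum filename length, deciding what a directory entry stores, and mapping a name to on-device data through a table of function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; NOT the kernel's real filesystem interface. */

#define TOY_NAME_MAX 8            /* the filesystem type decides the name limit */

struct toy_dirent {
    char name[TOY_NAME_MAX + 1];  /* filename stored in the directory entry */
    unsigned int first_block;     /* where the file's data starts on the device */
};

/* Table of operations a filesystem type offers to the layer above it. */
struct toy_fs_ops {
    size_t max_name_len;
    /* Map a filename to its directory entry; returns NULL if not found. */
    const struct toy_dirent *(*lookup)(const struct toy_dirent *dir,
                                       size_t nentries, const char *name);
};

static const struct toy_dirent *toy_lookup(const struct toy_dirent *dir,
                                           size_t nentries, const char *name)
{
    size_t i;

    assert(dir != NULL && name != NULL);
    if (strlen(name) > TOY_NAME_MAX)
        return NULL;              /* longer than this filesystem allows */
    for (i = 0; i < nentries; i++)
        if (strcmp(dir[i].name, name) == 0)
            return &dir[i];
    return NULL;
}

/* The "filesystem module": just a filled-in operations table. */
static const struct toy_fs_ops toy_ops = {
    .max_name_len = TOY_NAME_MAX,
    .lookup = toy_lookup,
};
```

A caller that only knows about toy_ops never touches the block numbers directly, which mirrors the separation between the filesystem and the block driver that moves the data.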
Security Issues

Security is an increasingly important concern in modern times. We will discuss security-related issues as they come up throughout the book. There are a few general concepts, however, that are worth mentioning now.

Any security check in the system is enforced by kernel code. If the kernel has security holes, then the system as a whole has holes. In the official kernel distribution, only an authorized user can load modules; the system call init_module checks if the invoking process is authorized to load a module into the kernel. Thus, when running an official kernel, only the superuser,* or an intruder who has succeeded in becoming privileged, can exploit the power of privileged code.

* Technically, only somebody with the CAP_SYS_MODULE capability can perform this operation. We discuss capabilities in Chapter 6.

When possible, driver writers should avoid encoding security policy in their code. Security is a policy issue that is often best handled at higher levels within the kernel, under the control of the system administrator. There are always exceptions, however.
As a device driver writer, you should be aware of situations in which some types of device access could adversely affect the system as a whole and should provide adequate controls. For example, device operations that affect global resources (such as setting an interrupt line), which could damage the hardware (loading firmware, for example), or that could affect other users (such as setting a default block size on a tape drive), are usually only available to sufficiently privileged users, and this check must be made in the driver itself.

Driver writers must also be careful, of course, to avoid introducing security bugs. The C programming language makes it easy to make several types of errors. Many current security problems are created, for example, by buffer overrun errors, in which the programmer forgets to check how much data is written to a buffer, and data ends up written beyond the end of the buffer, thus overwriting unrelated data. Such errors can compromise the entire system and must be avoided. Fortunately, avoiding these errors is usually relatively easy in the device driver context, in which the interface to the user is narrowly defined and highly controlled.

Some other general security ideas are worth keeping in mind. Any input received from user processes should be treated with great suspicion; never trust it unless you can verify it. Be careful with uninitialized memory; any memory obtained from the kernel should be zeroed or otherwise initialized before being made available to a user process or device. Otherwise, information leakage (disclosure of data, passwords, etc.) could result. If your device interprets data sent to it, be sure the user cannot send anything that could compromise the system.
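Two of the points above, checking how much data is written to a buffer and initializing memory before anyone else can see it, can be illustrated with a small user-space C sketch. It is not driver code: the toy_dev structure, DEV_BUF_SIZE, and both functions are hypothetical names chosen for the example, and plain memcpy stands in for the user-space access helper (such as copy_from_user) that a real driver would use.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEV_BUF_SIZE 16   /* hypothetical size of the driver's internal buffer */

struct toy_dev {
    unsigned char buf[DEV_BUF_SIZE];
    size_t len;           /* number of valid bytes in buf */
};

/* Zero the whole structure so that no stale data can leak out later. */
static void toy_dev_init(struct toy_dev *dev)
{
    assert(dev != NULL);
    memset(dev, 0, sizeof(*dev));
}

/*
 * Store count bytes from src in the device buffer.
 * Returns the number of bytes stored, or -EINVAL if the request is
 * malformed. The caller's length is never trusted: anything larger
 * than the buffer is rejected before a single byte is copied.
 */
static long toy_dev_write(struct toy_dev *dev, const void *src, size_t count)
{
    assert(dev != NULL);
    if (src == NULL || count > sizeof(dev->buf))
        return -EINVAL;
    memcpy(dev->buf, src, count);
    dev->len = count;
    return (long)count;
}
```

Rejecting an oversized request is one possible policy; truncating to the buffer size and returning the shorter count would be another. Either way, the length check happens before the copy.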
Finally, think about the possible effect of device operations; if there are specific operations (e.g., reloading the firmware on an adapter board or formatting a disk) that could affect the system, those operations should almost certainly be restricted to privileged users.

Be careful, also, when receiving software from third parties, especially when the kernel is concerned: because everybody has access to the source code, everybody can break and recompile things. Although you can usually trust precompiled kernels found in your distribution, you should avoid running kernels compiled by an untrusted friend; if you wouldn’t run a precompiled binary as root, then you’d better not run a precompiled kernel. For example, a maliciously modified kernel could allow anyone to load a module, thus opening an unexpected back door via init_module.

Note that the Linux kernel can be compiled to have no module support whatsoever, thus closing any module-related security holes. In this case, of course, all needed drivers must be built directly into the kernel itself. It is also possible, with 2.2 and later kernels, to disable the loading of kernel modules after system boot via the capability mechanism.
Version Numbering

Before digging into programming, we should comment on the version numbering scheme used in Linux and which versions are covered by this book.

First of all, note that every software package used in a Linux system has its own release number, and there are often interdependencies across them: you need a particular version of one package to run a particular version of another package. The creators of Linux distributions usually handle the messy problem of matching packages, and the user who installs from a prepackaged distribution doesn’t need to deal with version numbers. Those who replace and upgrade system software, on the other hand, are on their own in this regard. Fortunately, almost all modern distributions support the upgrade of single packages by checking interpackage dependencies; the distribution’s package manager generally does not allow an upgrade until the dependencies are satisfied.

To run the examples we introduce during the discussion, you won’t need particular versions of any tool beyond what the 2.6 kernel requires; any recent Linux distribution can be used to run our examples. We won’t detail specific requirements, because the file Documentation/Changes in your kernel sources is the best source of such information if you experience any problems.

As far as the kernel is concerned, the even-numbered kernel versions (those whose second number is even, such as 2.6.x) are the stable ones that are intended for general distribution. The odd versions (such as 2.7.x), on the contrary, are development snapshots and are quite ephemeral; the latest of them represents the current status of development, but becomes obsolete in a few days or so.

This book covers Version 2.6 of the kernel. Our focus has been to show all the features available to device driver writers in 2.6.10, the current version at the time we are writing. This edition of the book does not cover prior versions of the kernel. For those of you who are interested, the second edition covered Versions 2.0 through 2.4 in detail. That edition is still available online at http://lwn.net/Kernel/LDD2/.

Kernel programmers should be aware that the development process changed with 2.6. The 2.6 series is now accepting changes that previously would have been considered too large for a “stable” kernel. Among other things, that means that internal kernel programming interfaces can change, thus potentially obsoleting parts of this book; for this reason, the sample code accompanying the text is known to work with 2.6.10, but some modules don’t compile under earlier versions. Programmers wanting to keep up with kernel programming changes are encouraged to join the mailing lists and to make use of the web sites listed in the bibliography. There is also a web page maintained at http://lwn.net/Articles/2.6-kernel-api/, which contains information about API changes that have happened since this book was published.
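The even/odd rule described in this section is simple enough to express as a short user-space C function. This is only a sketch of the scheme as the text states it; the enum and the function name classify_version are made up for the example, and the parser accepts only the three-part "major.minor.patch" form.

```c
#include <assert.h>
#include <stdio.h>

/* Result of classifying a version string under the even/odd scheme. */
enum series { SERIES_INVALID, SERIES_STABLE, SERIES_DEVELOPMENT };

/*
 * Parse "major.minor.patch" and classify by the parity of the minor
 * number: even means a stable series, odd means a development series.
 * Strings that do not contain three numeric fields are rejected.
 */
static enum series classify_version(const char *version)
{
    unsigned int major, minor, patch;

    assert(version != NULL);
    if (sscanf(version, "%u.%u.%u", &major, &minor, &patch) != 3)
        return SERIES_INVALID;
    return (minor % 2 == 0) ? SERIES_STABLE : SERIES_DEVELOPMENT;
}
```

Under this rule, "2.6.10" is classified as stable and "2.7.3" as development, matching the examples given in the text.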
This text doesn’t talk specifically about odd-numbered kernel versions. General users never have a reason to run development kernels. Developers experimenting with new features, however, want to be running the latest development release. They usually keep upgrading to the most recent version to pick up bug fixes and new implementations of features. Note, however, that there’s no guarantee on experimental kernels,* and nobody helps you if you have problems due to a bug in a noncurrent odd-numbered kernel. Those who run odd-numbered versions of the kernel are usually skilled enough to dig in the code without the need for a textbook, which is another reason why we don’t talk about development kernels here.

* Note that there’s no guarantee on even-numbered kernels either, unless you rely on a commercial provider that grants its own warranty.

Another feature of Linux is that it is a platform-independent operating system, not just “a Unix clone for PC clones” anymore: it currently supports some 20 architectures. This book is platform independent as far as possible, and all the code samples have been tested on at least the x86 and x86-64 platforms. Because the code has been tested on both 32-bit and 64-bit processors, it should compile and run on all other platforms. As you might expect, the code samples that rely on particular hardware don’t work on all the supported platforms, but this is always stated in the source code.

License Terms

Linux is licensed under Version 2 of the GNU General Public License (GPL), a document devised for the GNU project by the Free Software Foundation. The GPL allows anybody to redistribute, and even sell, a product covered by the GPL, as long as the recipient has access to the source and is able to exercise the same rights. Additionally, any software product derived from a product covered by the GPL must, if it is redistributed at all, be released under the GPL.

The main goal of such a license is to allow the growth of knowledge by permitting everybody to modify programs at will; at the same time, people selling software to the public can still do their job. Despite this simple objective, there’s a never-ending discussion about the GPL and its use. If you want to read the license, you can find it in several places in your system, including the top directory of your kernel source tree in the COPYING file.

Vendors often ask whether they can distribute kernel modules in binary form only. The answer to that question has been deliberately left ambiguous. Distribution of binary modules, as long as they adhere to the published kernel interface, has been tolerated so far. But the copyrights on the kernel are held by many developers, and not all of them agree that kernel modules are not derived products. If you or your employer wish to distribute kernel modules under a nonfree license, you really need
to discuss the situation with your legal counsel. Please note also that the kernel developers have no qualms about breaking binary modules between kernel releases, even in the middle of a stable kernel series. If it is at all possible, both you and your users are better off if you release your module as free software.

If you want your code to go into the mainline kernel, or if your code requires patches to the kernel, you must use a GPL-compatible license as soon as you release the code. Although personal use of your changes doesn’t force the GPL on you, if you distribute your code, you must include the source code in the distribution; people acquiring your package must be allowed to rebuild the binary at will.

As far as this book is concerned, most of the code is freely redistributable, either in source or binary form, and neither we nor O’Reilly retain any right on any derived works. All the programs are available at ftp://ftp.ora.com/pub/examples/linux/drivers/, and the exact license terms are stated in the LICENSE file in the same directory.

Joining the Kernel Development Community

As you begin writing modules for the Linux kernel, you become part of a larger community of developers. Within that community, you can find not only people engaged in similar work, but also a group of highly committed engineers working toward making Linux a better system. These people can be a source of help, ideas, and critical review as well; they will likely be the first people you turn to when you are looking for testers for a new driver.

The central gathering point for Linux kernel developers is the linux-kernel mailing list. All major kernel developers, from Linus Torvalds on down, subscribe to this list. Please note that the list is not for the faint of heart: traffic as of this writing can run up to 200 messages per day or more. Nonetheless, following this list is essential for those who are interested in kernel development; it also can be a top-quality resource for those in need of kernel development help.

To join the linux-kernel list, follow the instructions found in the linux-kernel mailing list FAQ: http://www.tux.org/lkml. Read the rest of the FAQ while you are at it; there is a great deal of useful information there. Linux kernel developers are busy people, and they are much more inclined to help people who have clearly done their homework first.

Overview of the Book

From here on, we enter the world of kernel programming. Chapter 2 introduces modularization, explaining the secrets of the art and showing the code for running modules. Chapter 3 talks about char drivers and shows the complete code for a