All the nodes of the tree, except the leaves, denote directory names. A directory node contains information about the files and directories just beneath it. A file or directory name consists of a sequence of arbitrary ASCII characters,[*] with the exception of / and of the null character \0. Most filesystems place a limit on the length of a filename, typically no more than 255 characters. The directory corresponding to the root of the tree is called the root directory. By convention, its name is a slash (/). Names must be different within the same directory, but the same name may be used in different directories.

[*] Some operating systems allow filenames to be expressed in many different alphabets, based on 16-bit extended coding of graphical characters such as Unicode.

Unix associates a current working directory with each process (see the section "The Process/Kernel Model" later in this chapter); it belongs to the process execution context, and it identifies the directory currently used by the process. To identify a specific file, the process uses a pathname, which consists of slashes alternating with a sequence of directory names that lead to the file. If the first item in the pathname is a slash, the pathname is said to be absolute, because its starting point is the root directory. Otherwise, if the first item is a directory name or filename, the pathname is said to be relative, because its starting point is the process's current directory.

While specifying filenames, the notations "." and ".." are also used. They denote the current working directory and its parent directory, respectively. If the current working directory is the root directory, "." and ".." coincide.

1.5.2. Hard and Soft Links

A filename included in a directory is called a file hard link, or more simply, a link. The same file may have several links included in the same directory or in different ones, so it may have several filenames.
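To make "one file, several filenames" concrete, here is a minimal sketch in C that uses the POSIX link() and stat() calls. The helper names create_empty() and link_and_compare() are hypothetical and exist only for this illustration. The sketch assumes both pathnames lie in the same filesystem, and it compares the device ID and inode number (attributes described later in this chapter) to check that the two names lead to the same file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create an empty regular file; returns 0 on success. */
int create_empty(const char *path)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    return close(fd);
}

/*
 * Hypothetical helper: give the file named p1 a second name p2 with
 * link(), then check whether both names lead to the same file.
 * Returns 1 if they do, 0 if they do not, -1 on error. When both
 * stat() calls succeed, *nlinks receives the hard-link count of p1.
 */
int link_and_compare(const char *p1, const char *p2, long *nlinks)
{
    struct stat s1, s2;

    assert(p1 != NULL && p2 != NULL && nlinks != NULL);

    if (link(p1, p2) == -1)          /* add a second directory entry */
        return -1;
    if (stat(p1, &s1) == -1 || stat(p2, &s2) == -1)
        return -1;

    *nlinks = (long)s1.st_nlink;
    /* Same device and same inode number: the two names are one file. */
    return s1.st_dev == s2.st_dev && s1.st_ino == s2.st_ino;
}
```

Calling create_empty() on a fresh pathname and then link_and_compare() on it with a second pathname in the same directory should report a match and a link count of 2, because the file is now reachable under two names.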
The Unix command:

    $ ln p1 p2

is used to create a new hard link that has the pathname p2 for a file identified by the pathname p1.

Hard links have two limitations:

- It is not possible to create hard links for directories. Doing so might transform the directory tree into a graph with cycles, thus making it impossible to locate a file according to its name.
- Links can be created only among files included in the same filesystem. This is a serious limitation, because modern Unix systems may include several filesystems located on
different disks and/or partitions, and users may be unaware of the physical divisions between them.

To overcome these limitations, soft links (also called symbolic links) were introduced a long time ago. Symbolic links are short files that contain an arbitrary pathname of another file. The pathname may refer to any file or directory located in any filesystem; it may even refer to a nonexistent file.

The Unix command:

    $ ln -s p1 p2

creates a new soft link with pathname p2 that refers to pathname p1. When this command is executed, the filesystem extracts the directory part of p2 and creates a new entry in that directory of type symbolic link, with the name indicated by p2. This new file contains the name indicated by pathname p1. This way, each reference to p2 can be translated automatically into a reference to p1.

1.5.3. File Types

Unix files may have one of the following types:

- Regular file
- Directory
- Symbolic link
- Block-oriented device file
- Character-oriented device file
- Pipe and named pipe (also called FIFO)
- Socket

The first three file types are constituents of any Unix filesystem. Their implementation is described in detail in Chapter 18.

Device files are related both to I/O devices and to device drivers integrated into the kernel. For example, when a program accesses a device file, it acts directly on the I/O device associated with that file (see Chapter 13).

Pipes and sockets are special files used for interprocess communication (see the section "Synchronization and Critical Regions" later in this chapter; also see Chapter 19).

1.5.4. File Descriptor and Inode
Unix makes a clear distinction between the contents of a file and the information about a file. With the exception of device files and files of special filesystems, each file consists of a sequence of bytes. The file does not include any control information, such as its length or an end-of-file (EOF) delimiter.

All information needed by the filesystem to handle a file is included in a data structure called an inode. Each file has its own inode, which the filesystem uses to identify the file.

While filesystems and the kernel functions handling them can vary widely from one Unix system to another, they must always provide at least the following attributes, which are specified in the POSIX standard:

- File type (see the previous section)
- Number of hard links associated with the file
- File length in bytes
- Device ID (i.e., an identifier of the device containing the file)
- Inode number that identifies the file within the filesystem
- UID of the file owner
- User group ID of the file
- Several timestamps that specify the inode status change time, the last access time, and the last modification time
- Access rights and file mode (see the next section)

1.5.5. Access Rights and File Mode

The potential users of a file fall into three classes:

- The user who is the owner of the file
- The users who belong to the same group as the file, not including the owner
- All remaining users (others)

There are three types of access rights (read, write, and execute) for each of these three classes. Thus, the set of access rights associated with a file consists of nine different binary flags. Three additional flags, called suid (Set User ID), sgid (Set Group ID), and sticky, define the file mode. These flags have the following meanings when applied to executable files:

suid
    A process executing a file normally keeps the User ID (UID) of the process owner. However, if the executable file has the suid flag set, the process gets the UID of the file owner.
sgid
    A process executing a file keeps the user group ID of the process group. However, if the executable file has the sgid flag set, the process gets the user group ID of the file.

sticky
    An executable file with the sticky flag set corresponds to a request to the kernel to keep the program in memory after its execution terminates.[*]

[*] This flag has become obsolete; other approaches based on sharing of code pages are now used (see Chapter 9).

When a file is created by a process, its owner ID is the UID of the process. Its owner user group ID can be either the process group ID of the creator process or the user group ID of the parent directory, depending on the value of the sgid flag of the parent directory.

1.5.6. File-Handling System Calls

When a user accesses the contents of either a regular file or a directory, he actually accesses some data stored in a hardware block device. In this sense, a filesystem is a user-level view of the physical organization of a hard disk partition. Because a process in User Mode cannot directly interact with the low-level hardware components, each actual file operation must be performed in Kernel Mode. Therefore, the Unix operating system defines several system calls related to file handling.

All Unix kernels devote great attention to the efficient handling of hardware block devices to achieve good overall system performance. In the chapters that follow, we will describe topics related to file handling in Linux and specifically how the kernel reacts to file-related system calls. To understand those descriptions, you will need to know how the main file-handling system calls are used; these are described in the next section.

1.5.6.1. Opening a file

Processes can access only "opened" files. To open a file, the process invokes the system call:

    fd = open(path, flag, mode)

The three parameters have the following meanings:

path
    Denotes the pathname (relative or absolute) of the file to be opened.

flag
    Specifies how the file must be opened (e.g., read, write, read/write, append). It also can specify whether a nonexisting file should be created.

mode
    Specifies the access rights of a newly created file.

This system call creates an "open file" object and returns an identifier called a file descriptor. An open file object contains:

- Some file-handling data structures, such as a set of flags specifying how the file has been opened, an offset field that denotes the current position in the file from which the next operation will take place (the so-called file pointer), and so on.
- Some pointers to kernel functions that the process can invoke. The set of permitted functions depends on the value of the flag parameter.

We discuss open file objects in detail in Chapter 12. Let's limit ourselves here to describing some general properties specified by the POSIX semantics.

- A file descriptor represents an interaction between a process and an opened file, while an open file object contains data related to that interaction. The same open file object may be identified by several file descriptors in the same process.
- Several processes may concurrently open the same file. In this case, the filesystem assigns a separate file descriptor to each process, along with a separate open file object. When this occurs, the Unix filesystem does not provide any kind of synchronization among the I/O operations issued by the processes on the same file. However, several system calls such as flock() are available to allow processes to synchronize themselves on the entire file or on portions of it (see Chapter 12).

To create a new file, the process also may invoke the creat() system call, which is handled by the kernel exactly like open().

1.5.6.2. Accessing an opened file

Regular Unix files can be addressed either sequentially or randomly, while device files and named pipes are usually accessed sequentially.
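The relationship between file descriptors, open file objects, and the file pointer can be sketched in C as follows. This is only an illustration under stated assumptions: the helper names write_file() and compare_offsets() are hypothetical, and a second open() in the same process stands in for a second process opening the file. The sketch also uses the standard dup() and lseek() calls, which the text has not introduced at this point; lseek(fd, 0, SEEK_CUR) simply reports the current file pointer without moving it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` holding `len` bytes of `data`. */
int write_file(const char *path, const char *data, size_t len)
{
    /* flag: write-only, create if missing, truncate if present;
       mode: rw-r--r-- (before the process umask is applied). */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    ssize_t n = write(fd, data, len);
    if (close(fd) == -1 || n != (ssize_t)len)
        return -1;
    return 0;
}

/*
 * Hypothetical helper: read 4 bytes through one descriptor, then report
 * the file pointer seen through a duplicate of that descriptor (*shared)
 * and through an independently opened descriptor (*separate).
 * Returns 0 on success, -1 on error (including a file shorter than 4 bytes).
 */
int compare_offsets(const char *path, off_t *shared, off_t *separate)
{
    char buf[4];
    int rc = -1;

    assert(path != NULL && shared != NULL && separate != NULL);

    int fd1 = open(path, O_RDONLY);   /* creates an open file object */
    if (fd1 == -1)
        return -1;
    int fd2 = dup(fd1);               /* second descriptor, same object */
    int fd3 = open(path, O_RDONLY);   /* a separate open file object */

    if (fd2 != -1 && fd3 != -1 &&
        read(fd1, buf, sizeof buf) == (ssize_t)sizeof buf) {
        *shared = lseek(fd2, 0, SEEK_CUR);    /* follows the read on fd1 */
        *separate = lseek(fd3, 0, SEEK_CUR);  /* unaffected by that read */
        rc = 0;
    }

    if (fd3 != -1)
        close(fd3);
    if (fd2 != -1)
        close(fd2);
    close(fd1);
    return rc;
}
```

For a file of at least 4 bytes, the duplicated descriptor should report a file pointer of 4 while the independently opened one still reports 0, since only descriptors that identify the same open file object share its offset field.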
In both kinds of access, the kernel stores the file pointer