xvi Contents 9.11 Common Memory-Related Bugs in C Progiams 870. 9.11.1 Dereferencing Bad Pointers 870 9.11.2 Reading Uninitialized Memory 871 9.11.3 Allowing Stack Buffer Overflows 871 9.11.4 Assuming That Pointers and the Objects They Point to Are the Same Size 872 9.11.5 Making Off-by-One Errors 872 9.11.6 Referencing a Pointer Instead of the Object It Points To 873 9.11.7 Misunderstanding Pointer Arithmetic 873 9.11.8 Referencing Nonexistent Variables 874 9.11.9 Referencing Data in Free Heap Blocks 874 9.11.10 Introducing Memory Leaks 875 9.12 Summary 875 Bibliographic Notes 876 Homework Problems 876 Solutions to Practice Problems 880 Part Ill Interaction and Communication between Programs 10 System-Level I/O 889 10.1Unix/O890 10.2 Files 891 10.3 Opening and Closing Files 893 10.4 Reading and Writing Files 895 10.5 Robust Reading and Writing with the Rio Package 897 10.5.1 RIo Unbuffered Input and Output Functions 897 10.5.2 RIo Buffered Input Functions 898 10.6 Reading File Metadata 903 10.7 Reading Directory Contents 905 10.8 Sharing Files 906 10.9 1/O Redirection 909 10.10 Standard I/O 911 10.11 Putting It Together:Which I/O Functions Should I Use?911 10.12 Summary 913 Bibliographic Notes 914 Homework Problems 914 Solutions to Practice Problems 915
xvi Conterits 9.11 Common Memory-Related Bugs in-CPrqgfams 870. 9.11.1 Dereferencing Bad Pointers 870 9.11.2 Reading Uninitialized Memory 871 9.11.3 Allowing Stack Buffer Overflows 871 9.11.4 Assuming That Pointers and the Objects They Point to Are the Same Size 872 9.11.5 Making Off-by-One Errors 872 9.11.6 Referencing a Pointer Instead of the Object It Points f'o 873 9.11.7 'Misunderstanding Pointer Arithmetic &73 9.11.8 Referencing Nonexistent Variables 874 9.11.9 Referencing Data in Free Heap Blocks 874 9.11.10 Introducing Memory Leaks 875. 9.12 Summary 875 Bibliographic Notes 876 Homework Problems 876 Solutions to Practice Pr6blems 880' Part Ill Interaction and Communicatiqn between Programs 10 System-Level I/O 889 10.1 Unix I/O S90 10.2 Files 891 10.3 Opening and Closing Files 893 10.4 Reading and Writing Files 895 10.5 Robust Reading and Writing with the Rm Pack~ge 897_ 10.5.1 Rm Unbuffered Input and Output Functions 897 10.5.2 Rio Buffered Input Function§ 898 10.6 Reading File Metadata 903 10.7 Reading Directory Contents 905 10.8 Sharing Files 206 10.9 I/O Redirection 909 10.10 Standard I/O 911 10.11 Putting It Together: Which I/O Functions Shoul,d I Use? 911 10.12 Summary 913 Bibliographic Notes 914 Homework Problems 914 Solutions to Practice Problems 915
Contents xvii 11 Network Programming 917 11.1 The Client-Server Programming Model 918 11.2 Networks 919 11.3 The Global IP Internet 924 11.3.1 IP Addresses 925 11.3.2 Internet Domain Names 927 11.3.3 Internet Connections 929 F/ 11.4 The Sockets Interface 932 11.4.1 Socket Address Structures 933 11.4.2 The socket Function 934 11.4.3 The connect Function 934 11.4.4 The bind Function,935 11.4.5 The listen Function 935 11.4.6 The accept Function 936' 11.4.7 Host and Service Conversion 937 11.4.8 Helper Functions for the Sockets Interface 942 11.4.9 Example Echo Client and Server 944 11.5 Web Servers 948 11.5.1 Web Basics 948 11.5.2 Web Content 949 11.5.3 HTTP Transactions 950 11.5.4 Serving Dynamic Content 953 11.6 Putting It Together:The TINY Web Server 956 11.7 Summary 964 Bibliographic Notes 965 Homework Problems 965 Solutions to Practice Problems 966 12 Concurrent Programming 971 12.1 Concurrent Programming with Processes 973 12.1.1 A Concurrent Server Based on Processes 974 12.1.2 Pros and Cons of Processes 975 12.2 Concurrent Programming with I/O Multiplexing 977 12.2.1 A Concurrent Event-Driven Server Based on 1/O Multiplexing 980 12.2.2 Pros and Cons of I/O Multiplexing 985 12.3 Concurrent Programming with Threads 985 12.3.1 Thread Execution Model 986
11 Network Programming 917 11.l The Client-Server Programming M9del 918 11.2 Networks 919 11.3 The Global IP Inten;iet 924 11.3.1 IP .:\dc!resses 925 11.3.2 Internet Domain N'ames 927 11.3.3 Internet Connections 929 11.4 The Sockets Interface 932 11.4.1 Socket Addre,ss Structures 933 11.4.2 The socket Function 934 11.4.3 The connect Function 934 11.4.4 The bind Function 935 r ·~ f 11.4.5 The listen Function 93~ r . 11.4.6 The acc-:pt Function 936 11.4.7 Host and Service Conversion 937 £ I 11.4.8 Helper Functions for the Socket~ Ip.terface 942 11.4.9 Example Echo Client and Server ·944 11.S Web Servers 948 11.5.1 Web Basics 948 11.5.7 Web Content 949 11.5.3 HTIP Transactions 950 11.5.4 Serving Dynamic Content 953 11.6 Putting It Together: The TINY Web Server 956 11.7 Summary 964 Bibliographic Notes 965 Homework Problems 965 Solutions to Practice Problems 966 12 Concurrent Programming 971 12.l Concurrent Programming with Processes 973 12.1.1 A Concurrent Server Based on Processes 974 12.1.2 Pros and Cons of Processes 975 12.2 Concurrent Programming with I/O Multiplexing 977 12.2.l A Concurrent Event-Driven Server Based on I/O Multiplexing 980 12.2.2 Pros and Cons of I/O Multiplexing 985 12.3 Concurrent Programming with Threads 985 12.3.l Thread Execution Model 986 Contents xvii
山 xviii Contents 12.3.2 Posix Threads 987 12.3.3 Creating Threads 988 12.3.4 Terminating Threads 988 12.3.5 Reaping Terminated Threads 989 12.3.6 Detaching Threads 989 12.3.7 Initializing Threads 990 12.3.8 A Concurrent Server Based on Threads 991 12.4 Shared Variables in Threaded Programs 992 12.4.1 Threads Memory Model 993 12.4.2 Mapping Variables to Memory 994 12.4.3 Shared Variables 995 12.5 Synchronizing Threads with Semaphores 995 12.5.1 Progress Graphs 999 12.5.2 Semaphores 1001 12.5.3 Using Semaphores for Mutual Exclusion 1002 12.5.4 Using Semaphores to Schedule Shared Resources.1004 12.5.5 Putting It Together:A Concurrent Server Based on Prethreading 1008 12.6 Using Threads for Parallelism 1013 12.7 Other Concurrency Issues 1020 12.7.1 Thread Safety 1020 12.7.2 Reentrancy 1023 12.7.3 Using Existing Library Functions in Threaded Prograis 1024 12.7.4 Races1025 12.7.5 Deadlocks 1027 12.8 Summary 1030 Bibliographic Notes 1030 Homework Problems 1031 Solutions to Practice Problems 1036 A Error Handling 1041 A.1 Error Handling in Unix Systems 1042 A.2 Error-Handling Wrappers 1043 References 1047 Index 1053
xviii Contents 12.3.2 Posix Threads 987 12.3.3 Creating Threads 988 12.3.4 Terminating Threads 988 12.3.5 Reaping Terminated Threads 9~9 12.3.6 Detaching Threads 989 12.3.7 Initializing Threads 990 12.3.8 A Concurrent Server Based on Threads 9~1 12.4 Shared Variables in Threaded Programs 992 12.4.1 Threads Memory Model 993 12.5 12.4.2 Mapping Variables to Memory 994 12.4.3 Shared Variables 995 Synchronizing Threads with Semaphores 995 12.5.1 Progress Graphs 999 12.5.2 Semaphores 1001 12.5.3 Using Semaphores for Mutual Exclusion lp02 12.5.4 Using Semaphores to Scqedule Shared Respurces. 1004 12.5.5 Putting It Together: A Concurrent Server Based on 12.6 12.7 Prethreading 1008 Using Threads for Parallelism 1013 Other Concurrency Issues 1020 12.7.1 Thread Safety 1020 12.7.2 Reentrancy 1023 " 12.7.3 Using Existing Library Functions in Threaded Programs 12.7.4 Races 1025 12.7.5 Deadlocks 1027 12.8 Summary 1030 A Bibliographic Notes 1030 Homework Problems 1031 Solutions to Practice Problems 1036 Error Handling 1041: A.1 Error Handling in Unix Systems 1042 A.2 Error-Handling Wrappers 1043 References 1047 Index 1053 - 1024
Preface This book(known as CS:APP)is for computer scientists,computer engineers,and others who want to be able to write better programs by learning what is going on "under the hood'"of a computer system. Our aim'is to explain the enduring concepts underlying all computer systems, and to show you the concrete ways that these ideas affect the correctness,'perfor- mance,and utility of your application programs.Many systems bobks are'written from a builder's perspective,describing how to implement the hardware or the sys- tems software,including the operating system,compiler,and network intefface. This book is written from'a programimer's perspective,'describing how application programmers can use their knowledge of a system to write better programs.'Of course,learning what a system is supposed to do provides a good first step in learn- ing how to build one,'so this book also serves as a'valuable introductioni to those who go on to implement systems hardware and software.Most systers books also tend to focus on just one aspect of the system,for example,the hardware archi- tecture;the operating system,the compiler,or the network.This book'spans all of-these aspects;with the unifying theme of a programmer's perspective. If you-study and learir.the concepts in-this book,you will be"on your way to becoming the rare power programmer'who knows how things work'and how to fix them when they break.You will-be able to write programs that make better use of the'capabilities provided by the operating system and systems software, that operate correctly across a wide'range of operating conditions and run-time parameters;that run faster,and that avoid the flaws that make pfograms vulner- able to cybefattack.You will be prepared to delve deeper into'advariced topics such as compilers,'computer architecture,operating systems,embedded systems, networking,and cybersecurity. Assumptions about the Reader's Background This book focuses on systems that execute x86-64 machine code.x86-64 is the latest in an evolutionary path followed by Intel and its competitors that started with the 8086 microprocessor in 1978.Due to the naming conventions used by Intel for its microprocessor line,this class of microprocessors is referred to colloquially as "x86."As semiconductor technology has evolved to allow more transistors to be integrated onto a single.chip,these processors have progressed greatly in their computing.power,and theit memory capacity.As part of this progression,they have gone from pperating on16-bit words,to 32-bit.words with the introduction of IA32 processors,and most recently to 64-bit words with x86-64. We consider how these machines execute C programs on Linux.Linux is,one of a number'of operating systems'having their heritage in the Unix operating system developed originally by Bell Laboratories.Other members.of this class xix
Preface This book (known as CS:APP) is for computer scientists, computer engineers, and others who want to be able to write better programs by learning what is going on "under the hood'" of.a computer systefn. Our aim'is to explain the enduring concepts underlying all computer systenls, and to show you the cohcrete ways that these ideas affect the correctn'ess;perfSrmance,'lmd utility of your application programs.'Mlmy systems bob ks are1 \vritten from a builder's perspective, describing how to implement the hardware or th'e systems software, inC!uding the operating system, compiler, and network·iiltefface. Thisbook is wrihen from"a progr'dmme""' pefspective,'describing how application programmers can use their knowledge of a system to write better programs. 'Of course, learning what a system i§ supposed to do provide&.a good first step in learning how to build one,'so this book also serves as a· valuable introduction to those who go on to iinplemeht systems hardware and soffware. Most systems books also tend to focus on just one aspect of the system, for example, the•hardware architecture: the operating system, the compiler, or the "network. Tiiis book" spans all of·these aspects,' with !lie unifying theme of a progranfmer's perspective. If you·study ancl-learh>.the-concepts-in-this-!:mok~you-.wiU1:le"on·yoni-wzj>'tcr-- - becoming the 'rare power progranimer'who knows how things work'and how io fix them when tbey'bteak. You will· be able to write programs that·make•better use of the'caP,abiij(ies provided by theloperating systeni'and systellis software, that operate 'correctly across'1i wicte' range of operating conditibhs and run'.-t:llne parameters,' that run faster, and that avoid the flaws that make·ptogramS vu!Ilerable t'6 cybefatt'ack-. You will be prepared to delve deeper into 'advanced topics sucn as conipilers;'c6mputer architecture, 'operating sy~tems, embedded systems, networking, and cybersecurity. Assumptions' abou't the Reader's Backgrourid " This book focuses on systems that execute x86-64 machine code. x86-64 is the latest in an evolutionary path followed by Intel and its competitors .that started with the 8086 microprocessor in 1978. Due to the naming conventions used by Intel for its microprocessor line, this class of microprocessors is referr~d tq coll9ql'ially as "x86." As semiconductor technology has evolved to allow more transistors to be integrated onto a single. chip, these processors have progressed greatly in their computing.ppwer. and theii:.mel)lory capacity. 'As.part of, this ptogression, (hey have gone from pp,erating on· 16-bit 'XQW, to.32-bit.words with the; introduction of IA32 processors, and most recently to 64-bit words with x86-64. We consider how these machines execute C programs on Linux. Linux is.one ~ of a number·of operating systems·having their heritage in the Unix operating system developed originally by Bell Laboratories. Other members-of this class xix
XX Preface Nwvicon the progtamming languae .To help readers whose background in programming is weak (or nonexistent),we have also included these special notes to highlight features that are especially impoftant in C.We assume you are familiar w起 of operating systems include Solaris,FreeBSD,and MacOS X.In recent years, these operating systems have maintained a high level of compatibility through the efforts of the Posix and Standard Unix Specification standardization efforts.Thus, the material in this book applies almost directly to these "Unix-like"operating systems. The,text contains numerous programming examples that have been compiled and run on Linux systems.We assume that you have access to such a machine,and are able to log in and do simple things such as listing files and changing directo- ries.If your computer runs Microsoft Windows,we recommend that you install one of the many different virtual machine environments(such as VirtualBox or VMWare)that allow programs written for one operating system(the guest OS) to run under another (the host OS). We also assume that you have some familiarity with C or C++.If your only prior experience is with Java,the transition will require more effort on your part, but we will help you.Java and C share similar syntax and control statements. However,there are aspects of C(particularly pointers,explicit dynamic memory allocation,and formatted I/O)that do not exist in Java.Fortunately,C is a small language,and it is clearly and beautifully described in the classic "K&R"text by Brian Kernighan and Dennis Ritchie [61].Regardless of your programming background,consider K&R an essential part of your personal systems library.If your prior experience is with an interpreted language,such as Python,Ruby,or Perl,you will definitely want to devote some time to learning C before you attempt to use this book. Several of the early chapters in the book explore the interactions between C programs and their machine-language counterparts.The machine-language exam- ples were all generated by the GNU Gcc compiler running on x86-64 processors. We do not assume any prior experience with hardware,machine language,or assembly-language programming. How to Read the Book Learning how computer systems work from a programmer's perspective is great fun,mainly because you can do it actively.Whenever you learn something new, you can try it out right away and see the result firsthand.In fact,we believe that the only way to learn systems is to do systems,either working concrete problems or writing and running programs on real systems. This theme pervades the entire book.When a new concept is introduced,it is followed in the text by one or more practice problems that you should work
xx Preface l I II t of operating systems include Solaris, FreeBSD, and MacOS X. In recent years, these operating systems have maintained a high level of compatibility through the efforts of the Posix and Standard Unix Specification standardization efforts. Thus, the material in this book applies almost directly to these "Unix-like" operating systems. The, text contains numerous programming examples that have been compiled and run on Linux systems. We assume that you have access to such a machine, and are able to log in and do simple things such as listing files and changing directories. If your computer runs Microsoft Windows, we recommend that you install one of the many different virtual machine environments (such as Virtua!Box or VMWare) that allow programs written for one operating system (the guest OS) to run under another (the host OS). We also assume that you have some familiarity with C or C++. If your only prior experience is with Java, the transition will require more effort on your part, but we will help you. Java and C share similar syntax and control statements. However, there are aspects of C (particularly pointers, explicit dynamic memory allocation, and formatted I/O) that do not exist in Java. Fortunately, C is a small language, and it is clearly and beautifully described in the classic "K&R" text by J?rian Kernighan and Dennis Ritchie [61]. Regardless of your programming background, consider K&R an essential part of your personal systems library. If your prior experience \s with an interpreted language, such as Python, Ruby, or Perl, you will definitely want to devote some ti_me to learning C before you attempt to use this book. Several of the early chapters in the book explore the interactions between C programs and their machine-language counterparts. The machine-language examples were all generated by the GNU ace compiler running on x86-64 processors. We do not assume any prior experience with hardware, machine language, or assembly-language programming. How to Read the Book Learning how computer systems work from a programmer's perspective is great fun, mainly because you can do it actively. Whenever you learn somethihg new, you can try it out right away and see the result firsthand. In fact, we believe that the only way to learn systems is to do systems, either working concrete problems or writing and running programs on real systems. This theme pervades the entire book. When a new concept is introduced, it is followed in the text by one or more practice problems that you should work