About the Authors

David R. O'Hallaron is a professor of computer science and electrical and computer engineering at Carnegie Mellon University. He received his PhD from the University of Virginia. He served as the director of Intel Labs, Pittsburgh, from 2007 to 2010.

He has taught computer systems courses at the undergraduate and graduate levels for 20 years on such topics as computer architecture, introductory computer systems, parallel processor design, and Internet services. Together with Professor Bryant, he developed the course at Carnegie Mellon that led to this book. In 2004, he was awarded the Herbert Simon Award for Teaching Excellence by the CMU School of Computer Science, an award for which the winner is chosen based on a poll of the students.

Professor O'Hallaron works in the area of computer systems, with specific interests in software systems for scientific computing, data-intensive computing, and virtualization. The best-known example of his work is the Quake project, an endeavor involving a group of computer scientists, civil engineers, and seismologists who have developed the ability to predict the motion of the ground during strong earthquakes. In 2003, Professor O'Hallaron and the other members of the Quake team won the Gordon Bell Prize, the top international prize in high-performance computing. His current work focuses on the notion of autograding, that is, programs that evaluate the quality of other programs.
CHAPTER 1
A Tour of Computer Systems

1.1 Information Is Bits + Context
1.2 Programs Are Translated by Other Programs into Different Forms
1.3 It Pays to Understand How Compilation Systems Work
1.4 Processors Read and Interpret Instructions Stored in Memory
1.5 Caches Matter
1.6 Storage Devices Form a Hierarchy
1.7 The Operating System Manages the Hardware
1.8 Systems Communicate with Other Systems Using Networks
1.9 Important Themes
1.10 Summary
Bibliographic Notes
Solutions to Practice Problems
A computer system consists of hardware and systems software that work together to run application programs. Specific implementations of systems change over time, but the underlying concepts do not. All computer systems have similar hardware and software components that perform similar functions. This book is written for programmers who want to get better at their craft by understanding how these components work and how they affect the correctness and performance of their programs.

You are poised for an exciting journey. If you dedicate yourself to learning the concepts in this book, then you will be on your way to becoming a rare "power programmer," enlightened by an understanding of the underlying computer system and its impact on your application programs.

You are going to learn practical skills such as how to avoid strange numerical errors caused by the way that computers represent numbers. You will learn how to optimize your C code by using clever tricks that exploit the designs of modern processors and memory systems. You will learn how the compiler implements procedure calls and how to use this knowledge to avoid the security holes from buffer overflow vulnerabilities that plague network and Internet software. You will learn how to recognize and avoid the nasty errors during linking that confound the average programmer. You will learn how to write your own Unix shell, your own dynamic storage allocation package, and even your own Web server. You will learn the promises and pitfalls of concurrency, a topic of increasing importance as multiple processor cores are integrated onto single chips.

In their classic text on the C programming language [61], Kernighan and Ritchie introduce readers to C using the hello program shown in Figure 1.1. Although hello is a very simple program, every major part of the system must work in concert in order for it to run to completion. In a sense, the goal of this book is to help you understand what happens and why when you run hello on your system.

We begin our study of systems by tracing the lifetime of the hello program, from the time it is created by a programmer, until it runs on a system, prints its simple message, and terminates. As we follow the lifetime of the program, we will briefly introduce the key concepts, terminology, and components that come into play. Later chapters will expand on these ideas.

code/intro/hello.c
#include <stdio.h>

int main()
{
    printf("hello, world\n");
    return 0;
}
code/intro/hello.c

Figure 1.1 The hello program. (Source: [60])
1.1 Information Is Bits + Context

Our hello program begins life as a source program (or source file) that the programmer creates with an editor and saves in a text file called hello.c. The source program is a sequence of bits, each with a value of 0 or 1, organized in 8-bit chunks called bytes. Each byte represents some text character in the program.

Most computer systems represent text characters using the ASCII standard that represents each character with a unique byte-size integer value.¹ For example, Figure 1.2 shows the ASCII representation of the hello.c program.

   #    i    n    c    l    u    d    e   SP    <    s    t    d    i    o    .
  35  105  110   99  108  117  100  101   32   60  115  116  100  105  111   46

   h    >   \n   \n    i    n    t   SP    m    a    i    n    (    )   \n    {
 104   62   10   10  105  110  116   32  109   97  105  110   40   41   10  123

  \n   SP   SP   SP   SP    p    r    i    n    t    f    (    "    h    e    l
  10   32   32   32   32  112  114  105  110  116  102   40   34  104  101  108

   l    o    ,   SP    w    o    r    l    d    \    n    "    )    ;   \n   SP
 108  111   44   32  119  111  114  108  100   92  110   34   41   59   10   32

  SP   SP   SP    r    e    t    u    r    n   SP    0    ;   \n    }   \n
  32   32   32  114  101  116  117  114  110   32   48   59   10  125   10

Figure 1.2 The ASCII text representation of hello.c.

The hello.c program is stored in a file as a sequence of bytes. Each byte has an integer value that corresponds to some character. For example, the first byte has the integer value 35, which corresponds to the character '#'. The second byte has the integer value 105, which corresponds to the character 'i', and so on. Notice that each text line is terminated by the invisible newline character '\n', which is represented by the integer value 10. Files such as hello.c that consist exclusively of ASCII characters are known as text files. All other files are known as binary files.

The representation of hello.c illustrates a fundamental idea: All information in a system, including disk files, programs stored in memory, user data stored in memory, and data transferred across a network, is represented as a bunch of bits. The only thing that distinguishes different data objects is the context in which we view them. For example, in different contexts, the same sequence of bytes might represent an integer, floating-point number, character string, or machine instruction.

As programmers, we need to understand machine representations of numbers because they are not the same as integers and real numbers. They are finite approximations that can behave in unexpected ways. This fundamental idea is explored in detail in Chapter 2.

1. Other encoding methods are used to represent text in non-English languages. See the aside on page 50 for a discussion on this.
Aside  Origins of the C programming language

C was developed from 1969 to 1973 by Dennis Ritchie of Bell Laboratories. The American National Standards Institute (ANSI) ratified the ANSI C standard in 1989, and this standardization later became the responsibility of the International Standards Organization (ISO). The standards define the C language and a set of library functions known as the C standard library. Kernighan and Ritchie describe ANSI C in their classic book, which is known affectionately as "K&R" [61]. In Ritchie's words [92], C is "quirky, flawed, and an enormous success." So why the success?

• C was closely tied with the Unix operating system. C was developed from the beginning as the system programming language for Unix. Most of the Unix kernel (the core part of the operating system), and all of its supporting tools and libraries, were written in C. As Unix became popular in universities in the late 1970s and early 1980s, many people were exposed to C and found that they liked it. Since Unix was written almost entirely in C, it could be easily ported to new machines, which created an even wider audience for both C and Unix.

• C is a small, simple language. The design was controlled by a single person, rather than a committee, and the result was a clean, consistent design with little baggage. The K&R book describes the complete language and standard library, with numerous examples and exercises, in only 261 pages. The simplicity of C made it relatively easy to learn and to port to different computers.

• C was designed for a practical purpose. C was designed to implement the Unix operating system. Later, other people found that they could write the programs they wanted, without the language getting in the way.

C is the language of choice for system-level programming, and there is a huge installed base of application-level programs as well. However, it is not perfect for all programmers and all situations. C pointers are a common source of confusion and programming errors. C also lacks explicit support for useful abstractions such as classes, objects, and exceptions. Newer languages such as C++ and Java address these issues for application-level programs.

1.2 Programs Are Translated by Other Programs into Different Forms

The hello program begins life as a high-level C program because it can be read and understood by human beings in that form. However, in order to run hello.c on the system, the individual C statements must be translated by other programs into a sequence of low-level machine-language instructions. These instructions are then packaged in a form called an executable object program and stored as a binary disk file. Object programs are also referred to as executable object files.

On a Unix system, the translation from source file to object file is performed by a compiler driver: