1 Hello, world of concurrency in C++!

This chapter covers
■ What is meant by concurrency and multithreading
■ Why you might want to use concurrency and multithreading in your applications
■ Some of the history of the support for concurrency in C++
■ What a simple multithreaded C++ program looks like

These are exciting times for C++ users. Thirteen years after the original C++ Standard was published in 1998, the C++ Standards Committee is giving the language and its supporting library a major overhaul. The new C++ Standard (referred to as C++11 or C++0x) was published in 2011 and brings with it a whole swathe of changes that will make working with C++ easier and more productive.

One of the most significant new features in the C++11 Standard is the support of multithreaded programs. For the first time, the C++ Standard will acknowledge the existence of multithreaded applications in the language and provide components in the library for writing multithreaded applications. This will make it possible to write
multithreaded C++ programs without relying on platform-specific extensions and thus allow writing portable multithreaded code with guaranteed behavior. It also comes at a time when programmers are increasingly looking to concurrency in general, and multithreaded programming in particular, to improve application performance.

This book is about writing programs in C++ using multiple threads for concurrency and the C++ language features and library facilities that make that possible. I'll start by explaining what I mean by concurrency and multithreading and why you would want to use concurrency in your applications. After a quick detour into why you might not want to use it in your applications, I'll give an overview of the concurrency support in C++, and I'll round off this chapter with a simple example of C++ concurrency in action. Readers experienced with developing multithreaded applications may wish to skip the early sections. In subsequent chapters I'll cover more extensive examples and look at the library facilities in more depth. The book will finish with an in-depth reference to all the C++ Standard Library facilities for multithreading and concurrency.

So, what do I mean by concurrency and multithreading?

1.1 What is concurrency?

At the simplest and most basic level, concurrency is about two or more separate activities happening at the same time. We encounter concurrency as a natural part of life; we can walk and talk at the same time or perform different actions with each hand, and of course we each go about our lives independently of each other—you can watch football while I go swimming, and so on.

1.1.1 Concurrency in computer systems

When we talk about concurrency in terms of computers, we mean a single system performing multiple independent activities in parallel, rather than sequentially, or one after the other.
It isn’t a new phenomenon: multitasking operating systems that allow a single computer to run multiple applications at the same time through task switching have been commonplace for many years, and high-end server machines with multiple processors that enable genuine concurrency have been available for even longer. What is new is the increased prevalence of computers that can genuinely run multiple tasks in parallel rather than just giving the illusion of doing so.

Historically, most computers have had one processor, with a single processing unit or core, and this remains true for many desktop machines today. Such a machine can really only perform one task at a time, but it can switch between tasks many times per second. By doing a bit of one task and then a bit of another and so on, it appears that the tasks are happening concurrently. This is called task switching. We still talk about concurrency with such systems; because the task switches are so fast, you can’t tell at which point a task may be suspended as the processor switches to another one. The task switching provides an illusion of concurrency to both the user and the applications themselves. Because there is only an illusion of concurrency, the
behavior of applications may be subtly different when executing in a single-processor task-switching environment compared to when executing in an environment with true concurrency. In particular, incorrect assumptions about the memory model (covered in chapter 5) may not show up in such an environment. This is discussed in more depth in chapter 10.

Computers containing multiple processors have been used for servers and high-performance computing tasks for a number of years, and now computers based on processors with more than one core on a single chip (multicore processors) are becoming increasingly common as desktop machines too. Whether they have multiple processors or multiple cores within a processor (or both), these computers are capable of genuinely running more than one task in parallel. We call this hardware concurrency.

Figure 1.1 shows an idealized scenario of a computer with precisely two tasks to do, each divided into 10 equal-size chunks. On a dual-core machine (which has two processing cores), each task can execute on its own core. On a single-core machine doing task switching, the chunks from each task are interleaved. But they are also spaced out a bit (in the diagram this is shown by the gray bars separating the chunks being thicker than the separator bars shown for the dual-core machine); in order to do the interleaving, the system has to perform a context switch every time it changes from one task to another, and this takes time. In order to perform a context switch, the OS has to save the CPU state and instruction pointer for the currently running task, work out which task to switch to, and reload the CPU state for the task being switched to. The CPU will then potentially have to load the memory for the instructions and data for the new task into cache, which can prevent the CPU from executing any instructions, causing further delay.
Figure 1.1 Two approaches to concurrency: parallel execution on a dual-core machine versus task switching on a single-core machine

Though the availability of concurrency in the hardware is most obvious with multiprocessor or multicore systems, some processors can execute multiple threads on a single core. The important factor to consider is really the number of hardware threads: the measure of how many independent tasks the hardware can genuinely run concurrently. Even with a system that has genuine hardware concurrency, it’s easy to have more tasks than the hardware can run in parallel, so task switching is still used in these cases. For example, on a typical desktop computer there may be hundreds of tasks
running, performing background operations, even when the computer is nominally idle. It’s the task switching that allows these background tasks to run and allows you to run your word processor, compiler, editor, and web browser (or any combination of applications) all at once. Figure 1.2 shows task switching among four tasks on a dual-core machine, again for an idealized scenario with the tasks divided neatly into equal-size chunks. In practice, many issues will make the divisions uneven and the scheduling irregular. Some of these issues are covered in chapter 8 when we look at factors affecting the performance of concurrent code.

All the techniques, functions, and classes covered in this book can be used whether your application is running on a machine with one single-core processor or on a machine with many multicore processors and are not affected by whether the concurrency is achieved through task switching or by genuine hardware concurrency. But as you may imagine, how you make use of concurrency in your application may well depend on the amount of hardware concurrency available. This is covered in chapter 8, where I cover the issues involved with designing concurrent code in C++.

1.1.2 Approaches to concurrency

Imagine for a moment a pair of programmers working together on a software project. If your developers are in separate offices, they can go about their work peacefully, without being disturbed by each other, and they each have their own set of reference manuals. However, communication is not straightforward; rather than just turning around and talking to each other, they have to use the phone or email or get up and walk to each other’s office. Also, you have the overhead of two offices to manage and multiple copies of reference manuals to purchase.

Now imagine that you move your developers into the same office.
They can now talk to each other freely to discuss the design of the application, and they can easily draw diagrams on paper or on a whiteboard to help with design ideas or explanations. You now have only one office to manage, and one set of resources will often suffice. On the negative side, they might find it harder to concentrate, and there may be issues with sharing resources (“Where’s the reference manual gone now?”).

These two ways of organizing your developers illustrate the two basic approaches to concurrency. Each developer represents a thread, and each office represents a process. The first approach is to have multiple single-threaded processes, which is similar to having each developer in their own office, and the second approach is to have multiple threads in a single process, which is like having two developers in the same office.

Figure 1.2 Task switching of four tasks on two cores
You can combine these in an arbitrary fashion and have multiple processes, some of which are multithreaded and some of which are single-threaded, but the principles are the same. Let’s now have a brief look at these two approaches to concurrency in an application.

CONCURRENCY WITH MULTIPLE PROCESSES

The first way to make use of concurrency within an application is to divide the application into multiple, separate, single-threaded processes that are run at the same time, much as you can run your web browser and word processor at the same time. These separate processes can then pass messages to each other through all the normal interprocess communication channels (signals, sockets, files, pipes, and so on), as shown in figure 1.3. One downside is that such communication between processes is often either complicated to set up or slow or both, because operating systems typically provide a lot of protection between processes to avoid one process accidentally modifying data belonging to another process. Another downside is that there’s an inherent overhead in running multiple processes: it takes time to start a process, the operating system must devote internal resources to managing the process, and so forth.

Of course, it’s not all downside: the added protection operating systems typically provide between processes and the higher-level communication mechanisms mean that it can be easier to write safe concurrent code with processes rather than threads. Indeed, environments such as that provided for the Erlang programming language use processes as the fundamental building block of concurrency to great effect.

Using separate processes for concurrency also has an additional advantage—you can run the separate processes on distinct machines connected over a network. Though this increases the communication cost, on a carefully designed system it can be a cost-effective way of increasing the available parallelism and improving performance.
Figure 1.3 Communication between a pair of processes running concurrently

CONCURRENCY WITH MULTIPLE THREADS

The alternative approach to concurrency is to run multiple threads in a single process. Threads are much like lightweight processes: each thread runs independently of the others, and each thread may run a different sequence of instructions. But all threads in a process share the same address space, and most of the data can be accessed directly from all threads—global variables remain global, and pointers or references to objects or data can be passed around among threads. Although it’s often possible to share memory among processes, this is complicated to set up and often hard to manage, because memory addresses of the same data aren’t necessarily the same in different processes. Figure 1.4 shows two threads within a process communicating through shared memory.