
"Computer Programming" teaching reference literature (C++ programming books): Addison Wesley - Efficient C++ Programming Techniques


Document contents (about 205 pages)

Imagine that sharedCounter was an integer variable accessible to multiple threads and needing serialization. We provided mutual exclusion by inserting a lock object into the local scope:

    {
        MutexLock myLock(theKey, LogSource(__FILE__, __LINE__));
        sharedCounter++;
    }

The creation of the MutexLock and LogSource objects triggered the invocations of their respective base classes as well. This short fragment invoked a number of constructors:

• BaseLogSource
• LogSource
• BaseLock
• MutexLock

After the sharedCounter variable was incremented, we encountered the end of the scope that triggers the four corresponding destructors:

• MutexLock
• BaseLock
• LogSource
• BaseLogSource

All told, the protection of the shared resource had cost us eight constructors and destructors. The tension between reuse and performance is a topic that keeps popping up. It would be interesting to find out what the cost would be if we abandoned all these objects and developed a hand-crafted version that would narrow down by doing exactly what we need and nothing else. Namely, it will just lock around the sharedCounter update:

    {
        pthread_mutex_lock(&theKey);
        sharedCounter++;
        pthread_mutex_unlock(&theKey);
    }

By inspection alone, you can tell that the latter version is more efficient than the former one. Our object-based design had cost us additional instructions. Those instructions were entirely dedicated to construction and destruction of objects. Should we worry about those instructions? That depends on the context; if we are in a performance-critical flow, we might. In particular, additional instructions become significant if the total cost of the computation is small and the fragment that executes those instructions is called often enough. It is the ratio of instructions wasted divided by the total instruction count of the overall computation that we care about.
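The class definitions themselves are not shown in this excerpt, so the following is only a minimal sketch of what such a hierarchy might look like, instrumented to count the constructor and destructor calls discussed above. The callTrace vector, the member names, and the two wrapper functions are assumptions made for illustration. One caveat: in standard C++ the temporary LogSource is destroyed at the end of the declaration statement rather than at the closing brace, so the trace interleaves a little differently from the listing above, but the total is still eight calls.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>
#include <vector>

// Illustration-only instrumentation: records each constructor/destructor call.
std::vector<std::string> callTrace;

class BaseLogSource {
public:
    BaseLogSource() { callTrace.push_back("BaseLogSource()"); }
    virtual ~BaseLogSource() { callTrace.push_back("~BaseLogSource()"); }
};

class LogSource : public BaseLogSource {
public:
    LogSource(const char* file, int line) : file_(file), line_(line) {
        callTrace.push_back("LogSource()");
    }
    ~LogSource() { callTrace.push_back("~LogSource()"); }
    const char* file() const { return file_; }
    int line() const { return line_; }
private:
    const char* file_;
    int line_;
};

class BaseLock {
public:
    // Assumed signature: the base contributes no data members of its own.
    BaseLock(pthread_mutex_t&, const LogSource&) { callTrace.push_back("BaseLock()"); }
    virtual ~BaseLock() { callTrace.push_back("~BaseLock()"); }
};

class MutexLock : public BaseLock {
public:
    MutexLock(pthread_mutex_t& key, const LogSource& src)
        : BaseLock(key, src), key_(key) {
        int rc = pthread_mutex_lock(&key_);
        assert(rc == 0);
        (void)rc;
        callTrace.push_back("MutexLock()");
    }
    ~MutexLock() {
        pthread_mutex_unlock(&key_);
        callTrace.push_back("~MutexLock()");
    }
private:
    pthread_mutex_t& key_;
};

pthread_mutex_t theKey = PTHREAD_MUTEX_INITIALIZER;
int sharedCounter = 0;

// Object-based version: eight constructor/destructor calls per increment.
void incrementWithLockObject() {
    MutexLock myLock(theKey, LogSource(__FILE__, __LINE__));
    sharedCounter++;
}

// Hand-crafted version: no objects, just lock, update, unlock.
void incrementHandCrafted() {
    pthread_mutex_lock(&theKey);
    sharedCounter++;
    pthread_mutex_unlock(&theKey);
}
```

Calling incrementWithLockObject() once appends eight entries to callTrace, while incrementHandCrafted() appends none. On most systems the sketch builds with the -pthread compiler flag.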
The code sample we just described was taken out of a gateway implementation that routed data packets from one communication adapter to another. It was a critical path that consisted of roughly 5,000 instructions. The MutexLock object was used a few times on that path. That amounted to enough instruction overhead to make up 10% of the overall cost, which was significant. If we are going to use C++ and OO in a performance-critical application, we cannot afford such luxury.

Before we present a C++ based fix, we would quickly like to point out an obvious design overkill. If the critical section is as simple as a one-statement integer increment, why do we need all this object machinery? The advantages to using lock objects are

• Maintenance of complex routines containing multiple return points.
• Recovery from exceptions.
• Polymorphism in locking.
• Polymorphism in logging.

All those advantages were not extremely important in our case. The critical section had a clearly defined single exit point and the integer increment operation was not going to throw an exception. The polymorphism in locking and logging was also something we could easily live without. Interestingly, as this code segment reveals, developers are actually doing this in practice, which indicates that the cost of object construction and destruction is seriously overlooked.

So what about a complex routine where the use of the lock object actually makes sense? We would still like to reduce its cost. First let's consider the LogSource object. That piece of information had cost us four function calls: base and derived class constructors and destructors. This is a luxury we cannot afford in this context. Often, when C++ performance is discussed, inlining is offered as a cure. Although inlining could help here, it does not eliminate the problem. In the best-case scenario, inlining will eliminate the function call overhead for all four constructors and destructors. Even then, the LogSource object still imposes some performance overhead. First, it is an extra argument to the MutexLock constructor. Second, there is the assignment of the LogSource pointer member of MutexLock. Furthermore, when the LogSource object is created, some additional instructions are required to set up its virtual table pointer.

In a critical performance path, a common sense trade-off is called for. You trade away marginal functionality for valuable performance. The LogSource object has to go. In a constructor, the assignment of a member data field costs a small number of instructions even in the case of a built-in type. The cost per member data field may not be much but it adds up. It grows with the number of data members that are initialized by the constructor.
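For the complex-routine case mentioned above, where a lock object does earn its keep, a rough sketch may help show the first two advantages (multiple return points and recovery from exceptions). Everything here is hypothetical and not taken from the gateway code: the ScopedLock class, the takePacket routine, the queueKey mutex, and the locksHeld counter, which exists only so the sketch can check that the lock was released on every exit path.

```cpp
#include <cassert>
#include <pthread.h>
#include <stdexcept>

pthread_mutex_t queueKey = PTHREAD_MUTEX_INITIALIZER;  // hypothetical shared lock
int queuedPackets = 0;                                 // hypothetical shared state
int locksHeld = 0;                                     // instrumentation for this sketch only

// Minimal scoped lock: acquires in the constructor, releases in the destructor.
class ScopedLock {
public:
    explicit ScopedLock(pthread_mutex_t& key) : key_(key) {
        int rc = pthread_mutex_lock(&key_);
        assert(rc == 0);
        (void)rc;
        ++locksHeld;
    }
    ~ScopedLock() {
        --locksHeld;
        pthread_mutex_unlock(&key_);
    }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;
private:
    pthread_mutex_t& key_;
};

// A routine with several exits. The destructor of guard runs on every one of
// them, so no path can leave the mutex locked.
bool takePacket(int requested) {
    ScopedLock guard(queueKey);
    if (requested < 0)
        throw std::invalid_argument("negative request");  // exit by exception
    if (requested == 0)
        return false;                                      // early return: nothing to do
    if (queuedPackets < requested)
        return false;                                      // early return: not enough data
    queuedPackets -= requested;
    return true;                                           // normal return
}
```

With the hand-coded pthread_mutex_lock and pthread_mutex_unlock pair, each of those four exits would need its own unlock call, and the exception path would need a try block as well. That maintenance burden is what the lock object buys back, at the construction and destruction cost discussed above.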
The fact that the code using the LogSource object was enclosed in an #ifdef DEBUG bracket provides further evidence that using this object was not essential. The DEBUG compile flag was used only during development test; the code that was shipped to customers was compiled with DEBUG turned off. When executing in a production environment, we paid the price imposed by the LogSource object, but never actually used it. This was pure overhead. The LogSource should have been completely eliminated by careful #ifdef of all remnants of it. That would include elimination of the pointer member of MutexLock as well as the constructor argument. The partial #ifdef of the LogSource object was an example of sloppy development. This is not terribly unusual; it is just that your chances of getting away with sloppy programming in C++ are slim.

The next step is to eliminate the BaseLock root of the lock class hierarchy. In the case of BaseLock, it doesn't contribute any data members and, with the exception of the constructor signature, does not provide any meaningful interface. The contribution of BaseLock to the overall class design is debatable. Even if inlining takes care of the call overhead, the virtual destructor of BaseLock imposes the cost of setting the virtual table pointer in the MutexLock object. Saving a single assignment may not be much, but every little bit helps. Inlining the remaining MutexLock constructor and destructor will eliminate the remaining two function calls.

The combination of eliminating the LogSource class, the BaseLock class, and inlining MutexLock constructor and destructor will significantly cut down the instruction count. It will generate code that is almost as efficient as hand-coded C. The compiler-generated code with the inlined MutexLock will be equivalent to something like the following pseudocode:

    {
        MutexLock::theKey = key;
        pthread_mutex_lock(&MutexLock::theKey);
        sharedCounter++;
        pthread_mutex_unlock(&MutexLock::theKey);
    }
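The slimmed-down class is not listed in this excerpt, so what follows is one possible sketch of it under stated assumptions: no BaseLock root, no virtual functions, constructor and destructor defined in the class body so they are implicitly inline, and every remnant of LogSource (class, constructor argument, data members) confined to DEBUG builds. The MUTEX_LOCK macro, the key_ member name, and the incrementSharedCounter wrapper are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <pthread.h>
#include <type_traits>

#ifdef DEBUG
// Debug-only source location; absent from production builds altogether.
struct LogSource {
    LogSource(const char* f, int l) : file(f), line(l) {}
    const char* file;
    int line;
};
#endif

class MutexLock {
public:
#ifdef DEBUG
    MutexLock(pthread_mutex_t& key, const LogSource& src)
        : key_(key), file_(src.file), line_(src.line) { lock(); }
#else
    explicit MutexLock(pthread_mutex_t& key) : key_(key) { lock(); }
#endif
    ~MutexLock() { pthread_mutex_unlock(&key_); }  // non-virtual: no vptr to set up
    MutexLock(const MutexLock&) = delete;
    MutexLock& operator=(const MutexLock&) = delete;
private:
    void lock() {
        int rc = pthread_mutex_lock(&key_);
        assert(rc == 0);
        (void)rc;
    }
    pthread_mutex_t& key_;
#ifdef DEBUG
    const char* file_;  // copied by value, so no pointer to a temporary is kept
    int line_;
#endif
};

// With no virtual functions left, the class carries no virtual table pointer.
static_assert(!std::is_polymorphic<MutexLock>::value, "MutexLock should not be polymorphic");

// Hypothetical convenience macro so call sites read the same in both builds.
#ifdef DEBUG
#define MUTEX_LOCK(name, key) MutexLock name((key), LogSource(__FILE__, __LINE__))
#else
#define MUTEX_LOCK(name, key) MutexLock name((key))
#endif

pthread_mutex_t theKey = PTHREAD_MUTEX_INITIALIZER;
int sharedCounter = 0;

void incrementSharedCounter() {
    MUTEX_LOCK(myLock, theKey);
    sharedCounter++;
}
```

In a production build the only work left in the constructor is storing the reference and calling pthread_mutex_lock, which is close to the pseudocode above. Whether the compiler actually inlines the two calls depends on the optimization settings, so the instruction count is best confirmed by inspecting the generated code.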
