DATABASE MANAGEMENT SYSTEMS The book can be used in several kinds of introductory or second courses by choosing topics appropriately, or in a two-course sequence by supplementing the material with some advanced readings in the second course. Examples of appropriate introductory courses include courses on file organizations and introduction to database management ystems, especially if the course focuses on relational database design or implementa- tion. Advanced courses can be built around the later chapters, which contain detailed bibliographies with ample pointers for further study. Supplementary Material Each chapter contains several exercises designed to test and expand the readers un derstanding of the material. Students can obtain solutions to odd-numbered chapter exercises and a set of lecture slides for each chapter through the Web in Postscript and Adobe pdf formats The following material is available online to instructors Lecture slides for all chapters in MS Powerpoint, Postscript, and PDF formats. 2. Solutions to all chapter exercises. 3. SQL queries and programming assignments with solutions.(This is new for the second edition. 4. Supplementary project software(Minibase) with sample assignments and solu- tions, as described in Appendix B. The text itself does not refer to the project software, however, and can be used independently in a course that presents the rinciples of database management sys a practical perspective, bu out a project componen The supplementary material on SQL is new for the second edition. The remaining material has been extensively revised from the first edition versions For more information The home page for this book is at URL http://www.cs.wisc.edu/dbbook This page is frequently updated and contains a link to all known errors in the book, the accompanying slides, and the supplements. Instructors should visit this site periodically or register at this site to be notified of important changes by email
xxvi Database Management Systems The book can be used in several kinds of introductory or second courses by choosing topics appropriately, or in a two-course sequence by supplementing the material with some advanced readings in the second course. Examples of appropriate introductory courses include courses on file organizations and introduction to database management systems, especially if the course focuses on relational database design or implementation. Advanced courses can be built around the later chapters, which contain detailed bibliographies with ample pointers for further study. Supplementary Material Each chapter contains several exercises designed to test and expand the reader’s understanding of the material. Students can obtain solutions to odd-numbered chapter exercises and a set of lecture slides for each chapter through the Web in Postscript and Adobe PDF formats. The following material is available online to instructors: 1. Lecture slides for all chapters in MS Powerpoint, Postscript, and PDF formats. 2. Solutions to all chapter exercises. 3. SQL queries and programming assignments with solutions. (This is new for the second edition.) 4. Supplementary project software (Minibase) with sample assignments and solutions, as described in Appendix B. The text itself does not refer to the project software, however, and can be used independently in a course that presents the principles of database management systems from a practical perspective, but without a project component. The supplementary material on SQL is new for the second edition. The remaining material has been extensively revised from the first edition versions. For More Information The home page for this book is at URL: http://www.cs.wisc.edu/˜dbbook This page is frequently updated and contains a link to all known errors in the book, the accompanying slides, and the supplements. Instructors should visit this site periodically or register at this site to be notified of important changes by email
Preface Acknowledgments This book grew out of lecture notes for CS564, the introductory(senior/graduate level) database course at Uw-Madison. David De witt developed this course and the minirel project, in which students wrote several well-chosen parts of a relational DBMS. My thinking about this material was shaped by teaching CS564, and Minirel was the inspiration for Minibase, which is more comprehensive(e.g, it has a query optimizer and includes visualization software) but tries to retain the spirit of Minirel. Mike Carey and I jointly designed much of Minibase. My lecture notes(and in turn this book) were influenced by Mike's lecture notes and by Yannis loannidis's lecture slides Joe Hellerstein used the beta edition of the book at Berkeley and provided invaluable feedback, assistance on slides, and hilarious quotes. Writing the chapter on object database systems with Joe was a lot of fun C. Mohan provided invaluable assistance, patiently answering a number of questions about implementation techniques used in various commercial systems, in particular in- dexing, concurrency control, and recovery algorithms. Moshe Aloof answered numerous questions about QBE semantics and commercial systems based on QBE. Ron Fagin, Krishna Kulkarni, Len Shapiro, Jim Melton, Dennis Shasha, and Dirk Van gucht re- viewed the book and provided detailed feedback, greatly improving the content and presentation. Michael Goldweber at Beloit College, Matthew Haines at Wyoming, Michael Kifer at SUNY Stony Brook, Jeff Naughton at Wisconsin, Praveen Seshadri at Cornell and Stan Zdonik at Brown also used the beta edition in their database courses d offered feedback and bug reports. In particular, Michael Kifer pointed out an er ror in the(old)algorithm for computing a minimal cover and suggested covering some SQL features in Chapter 2 to improve modularity. Gio Wiederhold's bibliography converted to Latex format by S Sudarshan, and Michael Ley's online bibliography on databases and logic programming were a great help while compiling the chapter bibli ographies. Shaun Flisakowski and Uri Shaft helped me frequently in my never-ending battles with latex I owe a special thanks to the many, many students who have contributed to the mini base software. Emmanuel Ackaouy, Jim Pruyne, Lee Schumacher, and Michael Lee worked with me when I developed the first version of Minibase(much of which was subsequently discarded, but which influenced the next version). Emmanuel Ackaouy and Bryan So were my TAs when I taught CS564 using this version and went well be yond the limits of a TAship in their efforts to refine the project. Paul Aoki struggled with a version of Minibase and offered lots of useful comments as a TA at Berkeley. An entire class of CS764 students(our graduate database course) developed much of the current version of Minibase in a large class project that was led and coordinated by like Carey and me. Amit Shukla and Michael Lee were my TAs when I first taught CS564 using this version of Minibase and developed the software further
Preface xxvii Acknowledgments This book grew out of lecture notes for CS564, the introductory (senior/graduate level) database course at UW-Madison. David DeWitt developed this course and the Minirel project, in which students wrote several well-chosen parts of a relational DBMS. My thinking about this material was shaped by teaching CS564, and Minirel was the inspiration for Minibase, which is more comprehensive (e.g., it has a query optimizer and includes visualization software) but tries to retain the spirit of Minirel. Mike Carey and I jointly designed much of Minibase. My lecture notes (and in turn this book) were influenced by Mike’s lecture notes and by Yannis Ioannidis’s lecture slides. Joe Hellerstein used the beta edition of the book at Berkeley and provided invaluable feedback, assistance on slides, and hilarious quotes. Writing the chapter on objectdatabase systems with Joe was a lot of fun. C. Mohan provided invaluable assistance, patiently answering a number of questions about implementation techniques used in various commercial systems, in particular indexing, concurrency control, and recovery algorithms. Moshe Zloof answered numerous questions about QBE semantics and commercial systems based on QBE. Ron Fagin, Krishna Kulkarni, Len Shapiro, Jim Melton, Dennis Shasha, and Dirk Van Gucht reviewed the book and provided detailed feedback, greatly improving the content and presentation. Michael Goldweber at Beloit College, Matthew Haines at Wyoming, Michael Kifer at SUNY StonyBrook, Jeff Naughton at Wisconsin, Praveen Seshadri at Cornell, and Stan Zdonik at Brown also used the beta edition in their database courses and offered feedback and bug reports. In particular, Michael Kifer pointed out an error in the (old) algorithm for computing a minimal cover and suggested covering some SQL features in Chapter 2 to improve modularity. Gio Wiederhold’s bibliography, converted to Latex format by S. Sudarshan, and Michael Ley’s online bibliography on databases and logic programming were a great help while compiling the chapter bibliographies. Shaun Flisakowski and Uri Shaft helped me frequently in my never-ending battles with Latex. I owe a special thanks to the many, many students who have contributed to the Minibase software. Emmanuel Ackaouy, Jim Pruyne, Lee Schumacher, and Michael Lee worked with me when I developed the first version of Minibase (much of which was subsequently discarded, but which influenced the next version). Emmanuel Ackaouy and Bryan So were my TAs when I taught CS564 using this version and went well beyond the limits of a TAship in their efforts to refine the project. Paul Aoki struggled with a version of Minibase and offered lots of useful comments as a TA at Berkeley. An entire class of CS764 students (our graduate database course) developed much of the current version of Minibase in a large class project that was led and coordinated by Mike Carey and me. Amit Shukla and Michael Lee were my TAs when I first taught CS564 using this version of Minibase and developed the software further
DATABASE MANAGEMENT SYSTEMS Several students worked with me on independent projects, over a long period of to develop Minibase components. These include visualization packages for the b manager and B+ trees(Huseyin Bektas, Harry Stavropoulos, and Weiqing Huang):a query optimizer and visualizer(Stephen Harris, Michael Lee, and Donko Donjerkovic) an ER diagram tool based on the Opossum schema editor(Eben Haber); and a gUI based tool for normalization(Andrew Prock and Andy Therber). In addition, Bill Kimmel worked to integrate and fix a large body of code(storage manager, buffer manager, files and access methods, relational operators, and the query plan executor) produced by the CS764 class project. Ranjani Ramamurty considerably extended Bills work on cleaning up and integrating the various modules. Luke blanshard, Uri Shaft, and Shaun Flisakowski worked on putting together the release version of the code and developed test suites and exercises based on the Minibase software. Krishna Kunchithapadam tested the optimizer and developed part of the Minibase gUi Clearly, the Minibase software would not exist without the contributions of a great many talented people. With this software available freely in the public domain, I hope that more instructors will be able to teach a systems-oriented database course with a blend of implementation and experimentation to complement the lecture material Id like to thank the many students who helped in developing and checking the solu tions to the exercises and provided useful feedback on draft versions of the book. In alphabetical order: X. Bao, S. Biao, M. Chakrabarti, C. Chan, W. Chen, N. Cheung D. Colwell, C. Fritz, V Ganti, J Gehrke, G. Glass, V. Gopalakrishnan, M. Higgins, Jasmin, M. Krishnaprasad, Y. Lin, C. Liu, M. Lusignan, H. Modi, S. Narayanan, D. Randolph, A. Ranganathan, J. Reminga, A. Therber, M. Thomas, Q. Wang, R. Wang, Z. Wang, and J. Yuan. Arcady grenader, James Harrington, and Martin Reames at Wisconsin and Nina Tang at Berkeley provided especially detailed feedback Charlie Fischer, Avi Silberschatz, and Jeff Ullman gave me invaluable advice on work ing with a publisher. My editors at McGraw-Hill, Betsy Jones and Eric Munson really helped me to stay on top of thinl ms cm. K obtained extensive reviews and guided this book in its early stages. Emily gray and Brad Kosirog were there whenever probler pped up. At Wisconsin, Ginny Werner Finally, this book was a thief of time, and in many ways it was harder on my family than on me. My sons expressed themselves forthrightly. From my(then) five-yea old, Ketan: Dad, stop working on that silly book. You don't have any time for me. Two-year-old Vivek: You working boook? No nono come play basketball me! All the seasons of their discontent were visited upon my wife, and Apu nonetheless cheerfully kept the family going in its usual chaotic, happy way all the many evenings nd weekends I was wrapped up in this book.(Not to mention the days when I was wrapped up in being a faculty member! ) As in all things, I can trace my parents hand in much of this: my father, with his love of learning, and my mother, with her love of us, shaped me. My brother Kartik's contributions to this book consisted chiefly of
xxviii Database Management Systems Several students worked with me on independent projects, over a long period of time, to develop Minibase components. These include visualization packages for the buffer manager and B+ trees (Huseyin Bektas, Harry Stavropoulos, and Weiqing Huang); a query optimizer and visualizer (Stephen Harris, Michael Lee, and Donko Donjerkovic); an ER diagram tool based on the Opossum schema editor (Eben Haber); and a GUIbased tool for normalization (Andrew Prock and Andy Therber). In addition, Bill Kimmel worked to integrate and fix a large body of code (storage manager, buffer manager, files and access methods, relational operators, and the query plan executor) produced by the CS764 class project. Ranjani Ramamurty considerably extended Bill’s work on cleaning up and integrating the various modules. Luke Blanshard, Uri Shaft, and Shaun Flisakowski worked on putting together the release version of the code and developed test suites and exercises based on the Minibase software. Krishna Kunchithapadam tested the optimizer and developed part of the Minibase GUI. Clearly, the Minibase software would not exist without the contributions of a great many talented people. With this software available freely in the public domain, I hope that more instructors will be able to teach a systems-oriented database course with a blend of implementation and experimentation to complement the lecture material. I’d like to thank the many students who helped in developing and checking the solutions to the exercises and provided useful feedback on draft versions of the book. In alphabetical order: X. Bao, S. Biao, M. Chakrabarti, C. Chan, W. Chen, N. Cheung, D. Colwell, C. Fritz, V. Ganti, J. Gehrke, G. Glass, V. Gopalakrishnan, M. Higgins, T. Jasmin, M. Krishnaprasad, Y. Lin, C. Liu, M. Lusignan, H. Modi, S. Narayanan, D. Randolph, A. Ranganathan, J. Reminga, A. Therber, M. Thomas, Q. Wang, R. Wang, Z. Wang, and J. Yuan. Arcady Grenader, James Harrington, and Martin Reames at Wisconsin and Nina Tang at Berkeley provided especially detailed feedback. Charlie Fischer, Avi Silberschatz, and Jeff Ullman gave me invaluable advice on working with a publisher. My editors at McGraw-Hill, Betsy Jones and Eric Munson, obtained extensive reviews and guided this book in its early stages. Emily Gray and Brad Kosirog were there whenever problems cropped up. At Wisconsin, Ginny Werner really helped me to stay on top of things. Finally, this book was a thief of time, and in many ways it was harder on my family than on me. My sons expressed themselves forthrightly. From my (then) five-yearold, Ketan: “Dad, stop working on that silly book. You don’t have any time for me.” Two-year-old Vivek: “You working boook? No no no come play basketball me!” All the seasons of their discontent were visited upon my wife, and Apu nonetheless cheerfully kept the family going in its usual chaotic, happy way all the many evenings and weekends I was wrapped up in this book. (Not to mention the days when I was wrapped up in being a faculty member!) As in all things, I can trace my parents’ hand in much of this; my father, with his love of learning, and my mother, with her love of us, shaped me. My brother Kartik’s contributions to this book consisted chiefly of
Preface XXIX phone calls in which he kept me from working, but if I dont acknowledge liable to be annoyed. I'd like to thank my family for being there and giving to everything I do.(There! I knew I'd find a legitimate reason to thank Kartik Acknowledgments for the second edition Emily gray and Betsy Jones at Mc Graw-Hill obtained extensive reviews and provided guidance and support as we prepared the second edition. Jonathan Goldstein helped with the bibliography for spatial databases. The following reviewers provided valuable feedback on content and organization: Liming Cai at Ohio University, Costas Tsat- soulis at University of Kansas, Kwok-Bun Yue at University of Houston, Clear Lake William Grosky at Wayne State University, Sang H. Son at University of Vi James M. Slack at Minnesota State University, Mankato, Herman Balsters at rersity of Twente, Netherlands, Karen C. Davis at University of Cincinnati, Joachim Hammer at University of Florida, Fred Petry at Tulane University, Gregory Speegle at Baylor University, Salih Yurttas at Texas A&M University, and David Chao at San Francisco State University A number of people reported bugs in the first edition. In particular, we wish to thank the following: Joseph Albert at Portland State University, Han-yin Chen at University of Wisconsin, Lois Delcambre at Oregon Graduate Institute, Maggie Eich at South ern Methodist University, Raj Gopalan at Curtin University of Technology, Davood Thomasian at University of Connecticut, and Scott Vandenberg at Siena College ex Rafiei at University of Toronto, Michael Schrefl at University of South Australia, A special thanks to the many people who answered a detailed survey about how com- mercial systems support various features: At IBM, Mike Carey, Bruce Lindsay, C Mohan. and James Teng: at Informix. M. Muralikrishna and Michael ubel: at Mi- rosoft, David Campbell, Goetz Graefe, and Peter Spiro: at Oracle, Hakan Jacobsson Jonathan D. Klein, Muralidhar Krishnaprasad, and M. Ziauddin: and at Sybase, Marc Chanliau, Lucien Dimino, Sangeeta Doraiswamy, Hanuma Kodavalla, Roger MacNicol nd Tirumanjanam Rengarajan After reading about himself in the acknowledgment to the first edition, Ketan(now 8) question: "How come you didnt dedicate the book to us? mom Ketan, I took care of this inexplicable oversight. Vivek(now 5) was more concerned about the extent of his fame: "Daddy, is my name in evvy copy of your book? D hey have it in evuy compooter science department in the world?" Vivek, I hope so Finally, this revision would not have made it without Apus and Keiko s support
Preface xxix phone calls in which he kept me from working, but if I don’t acknowledge him, he’s liable to be annoyed. I’d like to thank my family for being there and giving meaning to everything I do. (There! I knew I’d find a legitimate reason to thank Kartik.) Acknowledgments for the Second Edition Emily Gray and Betsy Jones at McGraw-Hill obtained extensive reviews and provided guidance and support as we prepared the second edition. Jonathan Goldstein helped with the bibliography for spatial databases. The following reviewers provided valuable feedback on content and organization: Liming Cai at Ohio University, Costas Tsatsoulis at University of Kansas, Kwok-Bun Yue at University of Houston, Clear Lake, William Grosky at Wayne State University, Sang H. Son at University of Virginia, James M. Slack at Minnesota State University, Mankato, Herman Balsters at University of Twente, Netherlands, Karen C. Davis at University of Cincinnati, Joachim Hammer at University of Florida, Fred Petry at Tulane University, Gregory Speegle at Baylor University, Salih Yurttas at Texas A&M University, and David Chao at San Francisco State University. A number of people reported bugs in the first edition. In particular, we wish to thank the following: Joseph Albert at Portland State University, Han-yin Chen at University of Wisconsin, Lois Delcambre at Oregon Graduate Institute, Maggie Eich at Southern Methodist University, Raj Gopalan at Curtin University of Technology, Davood Rafiei at University of Toronto, Michael Schrefl at University of South Australia, Alex Thomasian at University of Connecticut, and Scott Vandenberg at Siena College. A special thanks to the many people who answered a detailed survey about how commercial systems support various features: At IBM, Mike Carey, Bruce Lindsay, C. Mohan, and James Teng; at Informix, M. Muralikrishna and Michael Ubell; at Microsoft, David Campbell, Goetz Graefe, and Peter Spiro; at Oracle, Hakan Jacobsson, Jonathan D. Klein, Muralidhar Krishnaprasad, and M. Ziauddin; and at Sybase, Marc Chanliau, Lucien Dimino, Sangeeta Doraiswamy, Hanuma Kodavalla, Roger MacNicol, and Tirumanjanam Rengarajan. After reading about himself in the acknowledgment to the first edition, Ketan (now 8) had a simple question: “How come you didn’t dedicate the book to us? Why mom?” Ketan, I took care of this inexplicable oversight. Vivek (now 5) was more concerned about the extent of his fame: “Daddy, is my name in evvy copy of your book? Do they have it in evvy compooter science department in the world?” Vivek, I hope so. Finally, this revision would not have made it without Apu’s and Keiko’s support
PART BASICS
PART I BASICS