DATABASE MANAGEMENT SYSTEMS 19.1 Lock-Based Concurrency Control Revisited 19.1.1 2PL, Serializability, and Recoverability 19. 1.2 View Serializability 19.2 Lock Management 033 19.2.1 Implementing Lock and Unlock Request 19. 2.2 Deadlock 46 19.2.3 Performance of Lock-Based Concurrency Control 19.3 Specialized Locking Techniques 19.3.1 Dynamic Databases and the Phantom Problem 5905 19.3.2 Concurrency Control in B+ Trees 19.3.3 Multiple-Granularity Locking 19.4 Transaction Support in SQL-92* 9. 4.1 Transaction Characteristics 19. 4.2 Transactions and Constraints 19.5 Concurrency Control without Locking 19.5.1 Optimistic Concurrency Control 19.5.2 Timestamp-Based Concurrency Control 19.5.3 Multiversion Concurrency Control 19.6 Points to review 20 CRASH RECOVERY 0.1 Introduction to arIes 20.1.1 The log 573 20.1.2 Other Recovery-Related Data Structures 576 20.1.3 The Write-Ahead Log Protocol 20. 1.4 Checkpointing 20.2 Recovering from a System Crash 20.2.1 Analysis Phase 20.2.2 20.23 Undo phase media Recovery Other Algorithms and Interaction with Concurrency Control 20.5 Points to review Part VIl ADVANCED TOPICS 595 21 PARALLEL AND DISTRIBUTED DATABASES 21.1 Architectures for parallel databases 21.2 Parallel Query Evaluation 21.2.1 Data Partitioning 21.2.2 Parallelizing Sequential Operator Evaluation Code 21.3 Parallelizing Individual Operations 21.3.1 Bulk Loading and Scanning
xvi Database Management Systems 19.1 Lock-Based Concurrency Control Revisited 540 19.1.1 2PL, Serializability, and Recoverability 540 19.1.2 View Serializability 543 19.2 Lock Management 543 19.2.1 Implementing Lock and Unlock Requests 544 19.2.2 Deadlocks 546 19.2.3 Performance of Lock-Based Concurrency Control 548 19.3 Specialized Locking Techniques 549 19.3.1 Dynamic Databases and the Phantom Problem 550 19.3.2 Concurrency Control in B+ Trees 551 19.3.3 Multiple-Granularity Locking 554 19.4 Transaction Support in SQL-92 * 555 19.4.1 Transaction Characteristics 556 19.4.2 Transactions and Constraints 558 19.5 Concurrency Control without Locking 559 19.5.1 Optimistic Concurrency Control 559 19.5.2 Timestamp-Based Concurrency Control 561 19.5.3 Multiversion Concurrency Control 563 19.6 Points to Review 564 20 CRASH RECOVERY 571 20.1 Introduction to ARIES 571 20.1.1 The Log 573 20.1.2 Other Recovery-Related Data Structures 576 20.1.3 The Write-Ahead Log Protocol 577 20.1.4 Checkpointing 578 20.2 Recovering from a System Crash 578 20.2.1 Analysis Phase 579 20.2.2 Redo Phase 581 20.2.3 Undo Phase 583 20.3 Media Recovery 586 20.4 Other Algorithms and Interaction with Concurrency Control 587 20.5 Points to Review 588 Part VII ADVANCED TOPICS 595 21 PARALLEL AND DISTRIBUTED DATABASES 597 21.1 Architectures for Parallel Databases 598 21.2 Parallel Query Evaluation 600 21.2.1 Data Partitioning 601 21.2.2 Parallelizing Sequential Operator Evaluation Code 601 21.3 Parallelizing Individual Operations 602 21.3.1 Bulk Loading and Scanning 602
ntents XVIl 21.3.2 Sorting 213.3 Joins 21.4 Parallel Query Optimization 21.5 Introduction to Distributed Databases 21.5. 1 Types of Distributed Databases 21.6 Distributed DBMS Architectures 21.6.1 Client-Server Systems 21.6.2 Collaborating Server Systems 21.6.3 Middleware Systems 21.7 Storing Data in a Distributed DBMS 610 21.7.1 Fragmentation 21.7.2 Replication 21.8 Distributed Catalog Management 21.8.1 Naming Objects 612 21.8.2 Catalog Structure 21.8.3 Distributed Data Independence 613 21.9 Distributed Query Processing 614 21.9.1 Nonjoin Queries in a Distributed DBMs 614 21.9.2 Joins in a Distributed DBMS 21.9.3 Cost-Based Query Optimization 619 21.10 Updating Distributed Data 21.10. 1 Synchronous Replication 21.10.2 Asynchronous Replication 21.11 Introduction to Distributed Transactions 21.12 Distributed Concurrency Control 625 21.12.1 Distributed Deadlock 21.13 Distributed Recovery 21 13.1 Normal Execution and Commit Protocols 21.13.2 Restart after a Failure 21 13.3 Two-Phase Commit Revisited 21 13. 4 Three-Phase Commit 21.14 Points to review 22 INTERNET DATABASES 642 22.1 The World wide Web 22.1.1 Introduction to hTML 22.1.2 Databases and the Web 2 Architecture 22.2.1 Application Servers and Server-Side Java 3 Beyond HTML 22. 3.1 Introduction to XML 22.3.2 XML DTDs 22.3.3 Domain-Specific DTDs 22.3.4 XML-QL: Querying XML Data 659
Contents xvii 21.3.2 Sorting 602 21.3.3 Joins 603 21.4 Parallel Query Optimization 606 21.5 Introduction to Distributed Databases 607 21.5.1 Types of Distributed Databases 607 21.6 Distributed DBMS Architectures 608 21.6.1 Client-Server Systems 608 21.6.2 Collaborating Server Systems 609 21.6.3 Middleware Systems 609 21.7 Storing Data in a Distributed DBMS 610 21.7.1 Fragmentation 610 21.7.2 Replication 611 21.8 Distributed Catalog Management 611 21.8.1 Naming Objects 612 21.8.2 Catalog Structure 612 21.8.3 Distributed Data Independence 613 21.9 Distributed Query Processing 614 21.9.1 Nonjoin Queries in a Distributed DBMS 614 21.9.2 Joins in a Distributed DBMS 615 21.9.3 Cost-Based Query Optimization 619 21.10 Updating Distributed Data 619 21.10.1 Synchronous Replication 620 21.10.2 Asynchronous Replication 621 21.11 Introduction to Distributed Transactions 624 21.12 Distributed Concurrency Control 625 21.12.1 Distributed Deadlock 625 21.13 Distributed Recovery 627 21.13.1 Normal Execution and Commit Protocols 628 21.13.2 Restart after a Failure 629 21.13.3 Two-Phase Commit Revisited 630 21.13.4 Three-Phase Commit 632 21.14 Points to Review 632 22 INTERNET DATABASES 642 22.1 The World Wide Web 643 22.1.1 Introduction to HTML 643 22.1.2 Databases and the Web 645 22.2 Architecture 645 22.2.1 Application Servers and Server-Side Java 647 22.3 Beyond HTML 651 22.3.1 Introduction to XML 652 22.3.2 XML DTDs 654 22.3.3 Domain-Specific DTDs 657 22.3.4 XML-QL: Querying XML Data 659
DATABASE MANAGEMENT SYSTEMS 22.3.5 The Semistructured Data Model 22.3.6 Implementation Issues for Semistructured Data 22.4 Indexing for Text Search 22.4.1 Inverted Files 22.4.2 Signat 22.5 Ranked Keyword Searches on the Web 22.5. 1 An Algorithm for Ranking Web Pages 22.6 Points to review 23 DECISION SUPPORT 677 23.1 Introduction to Decision Support 678 23.2 Data Warehousing 23.2.1 Creating and Maintaining a Warehouse 23.3 OLAP 2 3.3.1 Multidimensional data model 682 3.2 OLAP Queries 23. 3.3 Database Design for OLAP 23.4 Implementation Techniques for OLAP 23.4.1 Bitmap Indexes 23.4.3 File Organizations 693 23.4.4 Additional OLAP Implementation Issues 23.5 Views and Decision Support 23.5.1 Views, OLAP, and Warehousing 23.5.2 Query Modification 23.5.3 View Materialization versus Computing on Demand 23. 5.4 Issues in View Materialization 23.6 Finding Answers Quickly 23.6. 1 Top N Queries 23.6.2 Online Aggregation 23.7 Points to review 24 DATA MINING 707 24.1 Introduction to Data minin 24.2 Counting Co-occurrences 24.2.2 Iceberg Queries 24.3 Mining for Rules 243.1 Association rules 714 4.3.2 An Algorithm for Finding Association rules 24.3.3 Association rules and isa hierarchies 15 24.3.4 Generalized Association rules 716 4.3.5 Sequential Patterns
xviii Database Management Systems 22.3.5 The Semistructured Data Model 661 22.3.6 Implementation Issues for Semistructured Data 663 22.4 Indexing for Text Search 663 22.4.1 Inverted Files 665 22.4.2 Signature Files 666 22.5 Ranked Keyword Searches on the Web 667 22.5.1 An Algorithm for Ranking Web Pages 668 22.6 Points to Review 671 23 DECISION SUPPORT 677 23.1 Introduction to Decision Support 678 23.2 Data Warehousing 679 23.2.1 Creating and Maintaining a Warehouse 680 23.3 OLAP 682 23.3.1 Multidimensional Data Model 682 23.3.2 OLAP Queries 685 23.3.3 Database Design for OLAP 689 23.4 Implementation Techniques for OLAP 690 23.4.1 Bitmap Indexes 691 23.4.2 Join Indexes 692 23.4.3 File Organizations 693 23.4.4 Additional OLAP Implementation Issues 693 23.5 Views and Decision Support 694 23.5.1 Views, OLAP, and Warehousing 694 23.5.2 Query Modification 695 23.5.3 View Materialization versus Computing on Demand 696 23.5.4 Issues in View Materialization 698 23.6 Finding Answers Quickly 699 23.6.1 Top N Queries 700 23.6.2 Online Aggregation 701 23.7 Points to Review 702 24 DATA MINING 707 24.1 Introduction to Data Mining 707 24.2 Counting Co-occurrences 708 24.2.1 Frequent Itemsets 709 24.2.2 Iceberg Queries 711 24.3 Mining for Rules 713 24.3.1 Association Rules 714 24.3.2 An Algorithm for Finding Association Rules 714 24.3.3 Association Rules and ISA Hierarchies 715 24.3.4 Generalized Association Rules 716 24.3.5 Sequential Patterns 717
Contents XIX 24.3.6 The Use of Association Rules for Prediction 2 4.3.7 Bavesian Networks 719 24.3.8 Classification and Regression Rules 24.4 Tree-Structured rules 24.41 Decision trees 4.4.2 An Algorithm to Build Decision Trees 24.5 Clustering 4.5. 1 A Clustering Algorithm 28 24.6 Similarity Search over Sequence 24.6.1 An Algorithm to Find Similar Sequences 24.7 Additional Data Mining Tasks 24.8 Points to review 25 OBJECT-DATABASE SYSTEMS 25.1 Motivating Example 25.1.1 New Data Types 25. 1.2 Manipulating the New Kinds of Data 39 25.2 User-Defined Abstract Data Types 25.2.1 Defining Methods of an ADT 25.3 Structured Types 3.1 Manipulating Data of Structured Types 25.4 Objects, Object Identity, and Reference Types 748 25.4.1 Notions of Equality 25. 4.2 Dereferencing Reference Type 25.5 Inheritance 25.5.1 Defining Types with Inheritance 25.5.2 Binding of methods 25.5.3 Collection Hierarchies, Type Extents, and Queries 25.6 Database Design for an ORDBMS 25.6.1 Structured Types and ADTs 753 25.6.2 Object Identity 25.6.3 Extending the ER Model 25.6.4 Using Nested Collections 5.7 New Challenges in Implementing an ORDBMS 759 25.7.1 Storage and Access Methods 25.7.2 Query Processing 25.7.3 Query Optimization 5. 8 OODBMS 258.1 The odmg data model and odl 25.8.2OQL 768 25.9 Comparing RDBMS with OODBMS and ORDBMS 769 25.91 RDBMS versus ORDBMS 25.9.2 OODBMS versus ORDBMS: Similarities 25.9. 3 OODBMS versus ORDBMS: Differences
Contents xix 24.3.6 The Use of Association Rules for Prediction 718 24.3.7 Bayesian Networks 719 24.3.8 Classification and Regression Rules 720 24.4 Tree-Structured Rules 722 24.4.1 Decision Trees 723 24.4.2 An Algorithm to Build Decision Trees 725 24.5 Clustering 726 24.5.1 A Clustering Algorithm 728 24.6 Similarity Search over Sequences 729 24.6.1 An Algorithm to Find Similar Sequences 730 24.7 Additional Data Mining Tasks 731 24.8 Points to Review 732 25 OBJECT-DATABASE SYSTEMS 736 25.1 Motivating Example 737 25.1.1 New Data Types 738 25.1.2 Manipulating the New Kinds of Data 739 25.2 User-Defined Abstract Data Types 742 25.2.1 Defining Methods of an ADT 743 25.3 Structured Types 744 25.3.1 Manipulating Data of Structured Types 745 25.4 Objects, Object Identity, and Reference Types 748 25.4.1 Notions of Equality 749 25.4.2 Dereferencing Reference Types 750 25.5 Inheritance 750 25.5.1 Defining Types with Inheritance 751 25.5.2 Binding of Methods 751 25.5.3 Collection Hierarchies, Type Extents, and Queries 752 25.6 Database Design for an ORDBMS 753 25.6.1 Structured Types and ADTs 753 25.6.2 Object Identity 756 25.6.3 Extending the ER Model 757 25.6.4 Using Nested Collections 758 25.7 New Challenges in Implementing an ORDBMS 759 25.7.1 Storage and Access Methods 760 25.7.2 Query Processing 761 25.7.3 Query Optimization 763 25.8 OODBMS 765 25.8.1 The ODMG Data Model and ODL 765 25.8.2 OQL 768 25.9 Comparing RDBMS with OODBMS and ORDBMS 769 25.9.1 RDBMS versus ORDBMS 769 25.9.2 OODBMS versus ORDBMS: Similarities 770 25.9.3 OODBMS versus ORDBMS: Differences 770
DATABASE MANAGEMENT SYSTEMS 25.10 Points to review 26 SPATIAL DATA MANAGEMENT 26.1 Types of Spatial Data and Queries 6.2 Applications Involving Spatial Data 26.3 Introduction to Spatial Indexes 26. 3.1 Overview of Proposed Index Structures 26.4 Indexing Based on Space-Filling Curves 26.4.1 Region Quad Trees and Z-Ordering: Region Data 26.4.2 Spatial Queries Using Z-Ordering 26.5 Grid Files 786 26.5.1 Adapting Grid Files to Handle Regions 26.6 R Trees: Point and Region Data 789 6.6.1 Queries 26.6.2 Insert and Delete Operations 792 6.6.3 Concurrency Control 793 26. 6.4 Generalized Search Trees 26.7 Issues in High-Dimensional Indexing 795 26.8 points to review 795 27 DEDUCTIVE DATABASES 799 27.1 Introduction to Recursive Queries 27.1.1 Datalog 27.2 Theoretical Foundations 27.2.1 Least Model semantics 27.2.2 Safe Datalog Programs 27 2.3 The Fixpoint Operator 27. 2.4 Least Model Least Fixpoint 27.3 Recursive Queries with Negation 7.3.1 Range-Restriction and Negation 27.32 Stratification 7.3.3 Aggregate Operations 27.4 Efficient Evaluation of Recursive Queries 7.4.1 Fixpoint Evaluation without Repeated Inferences 27. 4.2 Pushing Selections to Avoid Irrelevant Inferences 816 27.5 Points to review 818 28 ADDITIONAL TOPICS 28.1 Advanced Transaction Processing 1.1 Transaction Processing Monitors 8. 1. 2 New Transaction models 1 3 Real-Time DBMSs 8.2 Integrated Access to Multiple Data Sources
xx Database Management Systems 25.10 Points to Review 771 26 SPATIAL DATA MANAGEMENT 777 26.1 Types of Spatial Data and Queries 777 26.2 Applications Involving Spatial Data 779 26.3 Introduction to Spatial Indexes 781 26.3.1 Overview of Proposed Index Structures 782 26.4 Indexing Based on Space-Filling Curves 783 26.4.1 Region Quad Trees and Z-Ordering: Region Data 784 26.4.2 Spatial Queries Using Z-Ordering 785 26.5 Grid Files 786 26.5.1 Adapting Grid Files to Handle Regions 789 26.6 R Trees: Point and Region Data 789 26.6.1 Queries 790 26.6.2 Insert and Delete Operations 792 26.6.3 Concurrency Control 793 26.6.4 Generalized Search Trees 794 26.7 Issues in High-Dimensional Indexing 795 26.8 Points to Review 795 27 DEDUCTIVE DATABASES 799 27.1 Introduction to Recursive Queries 800 27.1.1 Datalog 801 27.2 Theoretical Foundations 803 27.2.1 Least Model Semantics 804 27.2.2 Safe Datalog Programs 805 27.2.3 The Fixpoint Operator 806 27.2.4 Least Model = Least Fixpoint 807 27.3 Recursive Queries with Negation 808 27.3.1 Range-Restriction and Negation 809 27.3.2 Stratification 809 27.3.3 Aggregate Operations 812 27.4 Efficient Evaluation of Recursive Queries 813 27.4.1 Fixpoint Evaluation without Repeated Inferences 814 27.4.2 Pushing Selections to Avoid Irrelevant Inferences 816 27.5 Points to Review 818 28 ADDITIONAL TOPICS 822 28.1 Advanced Transaction Processing 822 28.1.1 Transaction Processing Monitors 822 28.1.2 New Transaction Models 823 28.1.3 Real-Time DBMSs 824 28.2 Integrated Access to Multiple Data Sources 824