ntents 7. 7.1 Fixed-Length records 7.7.2 Variable-Length Records 7. 8 Points to review 8 FILE ORGANIZATIONS AND INDEXES 230 8. 1 Cost model 8.2 Comparison of Three File Organizations 232 8.2.1 Heap Files Sorted files Hashed files Choosing a File Organization 8. 3 Overview of Indexes 8. 3.1 Alternatives for Data Entries in an Index 4 Properties of Indexes 8. 4.1 Clustered versus Unclustered Indexe 8.4.2 Dense versus Sparse Indexes 994 8.4.3 Primary and Secondary Indexes 242 8.4.4 Indexes Using Composite Search Keys 8.5 Index Specification in SQL-g 8. 6 Points to review 9 TREE-STRUCTURED INDEXING 247 9.1 Indexed Sequential Access Method(ISAM) 248 9.2 B+ Trees: A Dynamic Index Structure 9.3 Format of a Node 9.5 Insert 9.6 Delete s 9.7 Duplicates 9.8 B+ Trees in Practice 9.8.1 Key Compression 9.8.2 Bulk-Loading a B+Tree The Order Concept he Effect of Inserts and Deletes on rids 9.9 Points to Review 10 HASH-BASED INDEXING 10.1 Static Hashing 23 .0.1.1 Notation and Conventions 10.2 Extendible Hashing 10.4 Extendible Hashing versus Linear Hashing 10.5 Points to review
Contents xi 7.7.1 Fixed-Length Records 222 7.7.2 Variable-Length Records 222 7.8 Points to Review 224 8 FILE ORGANIZATIONS AND INDEXES 230 8.1 Cost Model 231 8.2 Comparison of Three File Organizations 232 8.2.1 Heap Files 232 8.2.2 Sorted Files 233 8.2.3 Hashed Files 235 8.2.4 Choosing a File Organization 236 8.3 Overview of Indexes 237 8.3.1 Alternatives for Data Entries in an Index 238 8.4 Properties of Indexes 239 8.4.1 Clustered versus Unclustered Indexes 239 8.4.2 Dense versus Sparse Indexes 241 8.4.3 Primary and Secondary Indexes 242 8.4.4 Indexes Using Composite Search Keys 243 8.5 Index Specification in SQL-92 244 8.6 Points to Review 244 9 TREE-STRUCTURED INDEXING 247 9.1 Indexed Sequential Access Method (ISAM) 248 9.2 B+ Trees: A Dynamic Index Structure 253 9.3 Format of a Node 254 9.4 Search 255 9.5 Insert 257 9.6 Delete * 260 9.7 Duplicates * 265 9.8 B+ Trees in Practice * 266 9.8.1 Key Compression 266 9.8.2 Bulk-Loading a B+ Tree 268 9.8.3 The Order Concept 271 9.8.4 The Effect of Inserts and Deletes on Rids 272 9.9 Points to Review 272 10 HASH-BASED INDEXING 278 10.1 Static Hashing 278 10.1.1 Notation and Conventions 280 10.2 Extendible Hashing * 280 10.3 Linear Hashing * 286 10.4 Extendible Hashing versus Linear Hashing * 291 10.5 Points to Review 292
DATABASE MANAGEMENT SYSTEMS Part Iv QUERY EVALUATION 11 EXTERNAL SORTING 01 11.1 A Simple Two-Way Merge Sort 11.2 External Merge Sort 11.2.1 Minimizing the Number of runs 11.3 Minimizing I/O Cost versus Number of I/Os 11.3.1 Blocked I/O 310 11.3.2 Double Buffering 1.4 Using B+ Trees for Sorting 312 11.4.1 Clustered Index 312 11.4.2 Unclustered index 11.5 Points to review 315 12 EVALUATION OF RELATIONAL OPERATORS 319 12.1 Introduction to Query Processing 2.1.1 Access paths 12.1.2 Preliminaries: Examples and Cost Calculations 12.2 The Selection Operation 12.2.1 No Index. Unsorted Data 12.2.2 No Index Sorted Data 12.2.3 B+ Tree Index 12.2.4 Hash Index, Equality Selection 12.3 General Selection Conditions 12.3.1 CNF and Index Matching 325 12.3.2 Evaluating Selections without Disjunction 12.3.3 Selections with Disjunction 12.4 The Projection Operation 12.4.1 Projection Based on Sorting 7990 12.4.2 Projection Based on Hashing 12.4.3 Sorting versus Hashing for Projections 12.4.4 Use of Indexes for Projection 12.5 The Join Operation 12.5.1 Nested Loops join 12.5.2 Sort-Merge Join 12.5.3 Hash Join 12. 5. 4 General Join Conditions 12.6 The Set Operations 12.6.1 Sorting for Union and Difference 38990 12.6.2 Hashing for Union and Difference Aggregate Operations 12.7.1 Implementing Aggregation by Using an Index The Impact of Buffering 352
xii Database Management Systems Part IV QUERY EVALUATION 299 11 EXTERNAL SORTING 301 11.1 A Simple Two-Way Merge Sort 302 11.2 External Merge Sort 305 11.2.1 Minimizing the Number of Runs * 308 11.3 Minimizing I/O Cost versus Number of I/Os 309 11.3.1 Blocked I/O 310 11.3.2 Double Buffering 311 11.4 Using B+ Trees for Sorting 312 11.4.1 Clustered Index 312 11.4.2 Unclustered Index 313 11.5 Points to Review 315 12 EVALUATION OF RELATIONAL OPERATORS 319 12.1 Introduction to Query Processing 320 12.1.1 Access Paths 320 12.1.2 Preliminaries: Examples and Cost Calculations 321 12.2 The Selection Operation 321 12.2.1 No Index, Unsorted Data 322 12.2.2 No Index, Sorted Data 322 12.2.3 B+ Tree Index 323 12.2.4 Hash Index, Equality Selection 324 12.3 General Selection Conditions * 325 12.3.1 CNF and Index Matching 325 12.3.2 Evaluating Selections without Disjunction 326 12.3.3 Selections with Disjunction 327 12.4 The Projection Operation 329 12.4.1 Projection Based on Sorting 329 12.4.2 Projection Based on Hashing * 330 12.4.3 Sorting versus Hashing for Projections * 332 12.4.4 Use of Indexes for Projections * 333 12.5 The Join Operation 333 12.5.1 Nested Loops Join 334 12.5.2 Sort-Merge Join * 339 12.5.3 Hash Join * 343 12.5.4 General Join Conditions * 348 12.6 The Set Operations * 349 12.6.1 Sorting for Union and Difference 349 12.6.2 Hashing for Union and Difference 350 12.7 Aggregate Operations * 350 12.7.1 Implementing Aggregation by Using an Index 351 12.8 The Impact of Buffering * 352
Contents 12.9 Points to review 13 INTRODUCTION TO QUERY OPTIMIZATION 359 3.1 Overview of Relational Query Optimization 13.1.1 Query Evaluation Plans 13.1.2 Pipelined Evaluation 13.1.3 The Iterator Interface for Operators and Access Methods 13.1.4 The System R Optimizer 3.2 System Catalog in a Relational DBMS 365 3.2.1 Information Stored in the System Catalog 3.3 Alternative Plans: A Motivating Example 13.3.1 Pushing selections 13.3.2 Using Indexes 13.4 Points to review 14 A TYPICAL RELATIONAL QUERY OPTIMIZER 37 14.1 Translating SQL Queries into Algebra 14.1.1 Decomposition of a Query into Blocks 375 14.1.2 A Query Block as a Relational Algebra Expression 14.2 Estimating the Cost of a Plan 378 14.3 Relational Algebra Equivalences 383 14.3.1 Selections 14.3.2 Projections 14.3.3 Cross-Products and joins 384 14.3.4 Selects, Projects, and Joins 14.3.5 Other Equivalences 387 14.4 Enumeration of Alternative plans 14.4.1 Single-Relation Queries 14.4.2 Multiple-Relation Queries 14.5 Nested Subqueries 14.6 Other Approaches to Query Optimization 14.7 Points to review Part V DATABASE DESIGN 415 15 SCHEMA REFINEMENT AND NORMAL FORMS 417 15.1.1 Problems Caused by Redundancy 418 15.1.2 Use of Decompositions 5.1.3 Problems Related to Decomposition 15.2 Functional Dependencies 15.3 Examples Motivating Schema Refinement
Contents xiii 12.9 Points to Review 353 13 INTRODUCTION TO QUERY OPTIMIZATION 359 13.1 Overview of Relational Query Optimization 360 13.1.1 Query Evaluation Plans 361 13.1.2 Pipelined Evaluation 362 13.1.3 The Iterator Interface for Operators and Access Methods 363 13.1.4 The System R Optimizer 364 13.2 System Catalog in a Relational DBMS 365 13.2.1 Information Stored in the System Catalog 365 13.3 Alternative Plans: A Motivating Example 368 13.3.1 Pushing Selections 368 13.3.2 Using Indexes 370 13.4 Points to Review 373 14 A TYPICAL RELATIONAL QUERY OPTIMIZER 374 14.1 Translating SQL Queries into Algebra 375 14.1.1 Decomposition of a Query into Blocks 375 14.1.2 A Query Block as a Relational Algebra Expression 376 14.2 Estimating the Cost of a Plan 378 14.2.1 Estimating Result Sizes 378 14.3 Relational Algebra Equivalences 383 14.3.1 Selections 383 14.3.2 Projections 384 14.3.3 Cross-Products and Joins 384 14.3.4 Selects, Projects, and Joins 385 14.3.5 Other Equivalences 387 14.4 Enumeration of Alternative Plans 387 14.4.1 Single-Relation Queries 387 14.4.2 Multiple-Relation Queries 392 14.5 Nested Subqueries 399 14.6 Other Approaches to Query Optimization 402 14.7 Points to Review 403 Part V DATABASE DESIGN 415 15 SCHEMA REFINEMENT AND NORMAL FORMS 417 15.1 Introduction to Schema Refinement 418 15.1.1 Problems Caused by Redundancy 418 15.1.2 Use of Decompositions 420 15.1.3 Problems Related to Decomposition 421 15.2 Functional Dependencies 422 15.3 Examples Motivating Schema Refinement 423
DATABASE MANAGEMENT SYSTEMS 5.3.1 Constraints on an Entity Set 15.3.2 Constraints on a Relationship Set 15.3.3 Identifying Attributes of Entitie 15.3.4 Identifying Entity Sets 15.4 Reasoning about Functional Dependencies 15.4.1 Closure of a Set of FDs 15.4.2 Attribute Closure 15.5 Normal Forms 15.5.1 Boyce-Codd Normal Form 15.5.2 Third Normal Form 15.6 Decompositions 15.6.1 Lossless-Join Decomposition 15.6.2 Dependency-Preserving Decomposition 15.7 Normalization 15.7.1 Decomposition into BCNF 15.7.2 Decomposition into 3NF 440 15.8 Other Kinds of Dependencies 15.8.1 Multivalued Dependencies 15.8.2 Fourth Normal Form 15.8.3 Join Dependencies 449 15. 8.4 Fifth normal form 15.8.5 Inclusion Dependencies 449 15.9 Points to review 16 PHYSICAL DATABASE DESIGN AND TUNING 16.1 Introduction to Physical Database Desig 16.1.1 Database workloads 16.1.2 Physical Design and Tuning Decisions 459 16. 1.3 Need for Database Tuning 16.2 Guidelines for Index selection 16.3 Basic Examples of Index Selection 16.4 Clustering and Indexing 16.4.1 Co-clustering Two Relations 16.5 Indexes on Multiple-Attribute Search Keys 16.6 Indexes that Enable Index-Only Plans 16.7 Overview of Database Tuning 474 16.7.1 Tuning Indexes 16.7. 2 Tuning the Conceptual Schema 16.7.3 Tuning Queries and Views 16.8 Choices in Tuning the Conceptual Schema 16.8.1 Settling for a Weaker Normal Form 16.8.2 Denormalization 16.8.3 Choice of Decompositions 16.8.4 Vertical Decomposition
xiv Database Management Systems 15.3.1 Constraints on an Entity Set 423 15.3.2 Constraints on a Relationship Set 424 15.3.3 Identifying Attributes of Entities 424 15.3.4 Identifying Entity Sets 426 15.4 Reasoning about Functional Dependencies 427 15.4.1 Closure of a Set of FDs 427 15.4.2 Attribute Closure 429 15.5 Normal Forms 430 15.5.1 Boyce-Codd Normal Form 430 15.5.2 Third Normal Form 432 15.6 Decompositions 434 15.6.1 Lossless-Join Decomposition 435 15.6.2 Dependency-Preserving Decomposition 436 15.7 Normalization 438 15.7.1 Decomposition into BCNF 438 15.7.2 Decomposition into 3NF * 440 15.8 Other Kinds of Dependencies * 444 15.8.1 Multivalued Dependencies 445 15.8.2 Fourth Normal Form 447 15.8.3 Join Dependencies 449 15.8.4 Fifth Normal Form 449 15.8.5 Inclusion Dependencies 449 15.9 Points to Review 450 16 PHYSICAL DATABASE DESIGN AND TUNING 457 16.1 Introduction to Physical Database Design 458 16.1.1 Database Workloads 458 16.1.2 Physical Design and Tuning Decisions 459 16.1.3 Need for Database Tuning 460 16.2 Guidelines for Index Selection 460 16.3 Basic Examples of Index Selection 463 16.4 Clustering and Indexing * 465 16.4.1 Co-clustering Two Relations 468 16.5 Indexes on Multiple-Attribute Search Keys * 470 16.6 Indexes that Enable Index-Only Plans * 471 16.7 Overview of Database Tuning 474 16.7.1 Tuning Indexes 474 16.7.2 Tuning the Conceptual Schema 475 16.7.3 Tuning Queries and Views 476 16.8 Choices in Tuning the Conceptual Schema * 477 16.8.1 Settling for a Weaker Normal Form 478 16.8.2 Denormalization 478 16.8.3 Choice of Decompositions 479 16.8.4 Vertical Decomposition 480
Contents XV 16.8.5 Horizontal Decomposition 16.9 Choices in Tuning Queries and Views 16.10 Impact of Concurrency 16. 11 DBMS Benchmarking 16.11. 1 Well-Known DBMS Benchmarks 16. 11.2 Using a Benchmark 16. 12 Points to review 487 17 SECURITY 17.1 Introduction to Database Security 17.2 Access Control 17.3 Discretionary Access Control 17.3.1 Grant and Revoke on Views and Int Constraints 17.4 Mandatory Access Control 17.4.1 Multilevel Relations and Polyinstantiation 17.4.2 Covert Channels, DoD Security Levels 17.5 Additional Issues Related to Security 17.5.1 Role of the database administrator 17.5.2 Security in Statistical Databases 8012234 17.5.3 Encryptic 17.6 Points to review Part VI TRANSACTION MANAGEMENT 1 8 TRANSACTION MANAGEMENT OVERVIEW 18.1 The Concept of a Transaction 18.1.1 Consistency and Isolation 18.1.2 Atomicity and Durability 18.2 Transactions and Schedules 18.3 Concurrent Execution of Transactions 567 18.3.1 Motivation for Concurrent Execution 18.3.2 Serializability 18.3.3 Some Anomalies Associated with Interleaved Execution 18.3.4 Schedules Involving Aborted Transactions 18.4 Lock-Based Concurrency Control 18.4.1 Strict Two-Phase Locking(Strict 2PL 18.5 Introduction to Crash Recovery 18.5.1 Stealing Frames and Forcing Pages 8.5.2 Recovery-Related Steps during Normal Execution 18.5.3 Overview of ArIEs 18.6 Points to review 1 9 CONCURRENCY CONTROL
Contents xv 16.8.5 Horizontal Decomposition 481 16.9 Choices in Tuning Queries and Views * 482 16.10 Impact of Concurrency * 484 16.11 DBMS Benchmarking * 485 16.11.1 Well-Known DBMS Benchmarks 486 16.11.2 Using a Benchmark 486 16.12 Points to Review 487 17 SECURITY 497 17.1 Introduction to Database Security 497 17.2 Access Control 498 17.3 Discretionary Access Control 499 17.3.1 Grant and Revoke on Views and Integrity Constraints * 506 17.4 Mandatory Access Control * 508 17.4.1 Multilevel Relations and Polyinstantiation 510 17.4.2 Covert Channels, DoD Security Levels 511 17.5 Additional Issues Related to Security * 512 17.5.1 Role of the Database Administrator 512 17.5.2 Security in Statistical Databases 513 17.5.3 Encryption 514 17.6 Points to Review 517 Part VI TRANSACTION MANAGEMENT 521 18 TRANSACTION MANAGEMENT OVERVIEW 523 18.1 The Concept of a Transaction 523 18.1.1 Consistency and Isolation 525 18.1.2 Atomicity and Durability 525 18.2 Transactions and Schedules 526 18.3 Concurrent Execution of Transactions 527 18.3.1 Motivation for Concurrent Execution 527 18.3.2 Serializability 528 18.3.3 Some Anomalies Associated with Interleaved Execution 528 18.3.4 Schedules Involving Aborted Transactions 531 18.4 Lock-Based Concurrency Control 532 18.4.1 Strict Two-Phase Locking (Strict 2PL) 532 18.5 Introduction to Crash Recovery 533 18.5.1 Stealing Frames and Forcing Pages 535 18.5.2 Recovery-Related Steps during Normal Execution 536 18.5.3 Overview of ARIES 537 18.6 Points to Review 537 19 CONCURRENCY CONTROL 540