Tuning of Hardware Even well-tuned transactions typically require a few I/O operations Typical disk supports about 100 random I/O operations per second Suppose each transaction requires just 2 random I/O operations. Then to support n transactions per second,we need to stripe data across n/50 disks(ignoring skew) Number of I/O operations per transaction can be reduced by keeping more data in memory If all data is in memory,I/O needed only for writes Keeping frequently used data in memory reduces disk accesses, reducing number of disks required,but has a memory cost Database System Concepts,5th Ed. 23.7 @Silberschatz,Korth and Sudarshan
Database System Concepts, 5 23.7 ©Silberschatz, Korth and Sudarshan th Ed. 7 Tuning of Hardware Even well-tuned transactions typically require a few I/O operations Typical disk supports about 100 random I/O operations per second Suppose each transaction requires just 2 random I/O operations. Then to support n transactions per second, we need to stripe data across n/50 disks (ignoring skew) Number of I/O operations per transaction can be reduced by keeping more data in memory If all data is in memory, I/O needed only for writes Keeping frequently used data in memory reduces disk accesses, reducing number of disks required, but has a memory cost
Hardware Tuning:Five-Minute Rule Question:which data to keep in memory: If a page is accessed n times per second,keeping it in memory saves n* price-per-disk-drive accesses-per-second-per-disk Cost of keeping page in memory price-per-MB-of-memory ages-per-MB-of-memory Break-even point:value of n for which above costs are equal If accesses are more then saving is greater than cost Solving above equation with current disk and memory prices leads to: 5-minute rule:if a page that is randomly accessed is used more frequently than once in 5 minutes it should be kept in memory (by buying sufficient memory!) 8 Database System Concepts,5th Ed. 23.8 ©Silberschat乜,Korth and Sudarshan
Database System Concepts, 5 23.8 ©Silberschatz, Korth and Sudarshan th Ed. 8 Hardware Tuning: Five-Minute Rule Question: which data to keep in memory: If a page is accessed n times per second, keeping it in memory saves n * price-per-disk-drive accesses-per-second-per-disk Cost of keeping page in memory price-per-MB-of-memory ages-per-MB-of-memory Break-even point: value of n for which above costs are equal If accesses are more then saving is greater than cost Solving above equation with current disk and memory prices leads to: 5-minute rule: if a page that is randomly accessed is used more frequently than once in 5 minutes it should be kept in memory (by buying sufficient memory!)
Hardware Tuning:One-Minute Rule For sequentially accessed data,more pages can be read per second. Assuming sequential reads of 1MB of data at a time: 1-minute rule:sequentially accessed data that is accessed once or more in a minute should be kept in memory Prices of disk and memory have changed greatly over the years,but the ratios have not changed much so rules remain as 5 minute and 1 minute rules,not 1 hour or 1 second rules! Database System Concepts,5th Ed. 23.9 ©Silberschat乜,Korth and Sudarshan
Database System Concepts, 5 23.9 ©Silberschatz, Korth and Sudarshan th Ed. 9 Hardware Tuning: One-Minute Rule For sequentially accessed data, more pages can be read per second. Assuming sequential reads of 1MB of data at a time: 1-minute rule: sequentially accessed data that is accessed once or more in a minute should be kept in memory Prices of disk and memory have changed greatly over the years, but the ratios have not changed much so rules remain as 5 minute and 1 minute rules, not 1 hour or 1 second rules!
Hardware Tuning:Choice of RAID Level To use RAID 1 or RAID 5? Depends on ratio of reads and writes RAID 5 requires 2 block reads and 2 block writes to write out one data block If an application requires r reads and w writes per second RAID 1 requires r+2w l/O operations per second RAID 5 requires:r+4w l/O operations per second For reasonably large r and w,this requires lots of disks to handle workload RAID 5 may require more disks than RAID 1 to handle load! Apparent saving of number of disks by RAID 5(by using parity,as opposed to the mirroring done by RAID 1)may be illusory! Thumb rule:RAID 5 is fine when writes are rare and data is very large, but RAID 1 is preferable otherwise If you need more disks to handle I/O load,just mirror them since disk capacities these days are enormous! 10 Database System Concepts,5th Ed. 23.10 @Silberschatz,Korth and Sudarshan
Database System Concepts, 5 23.10 ©Silberschatz, Korth and Sudarshan th Ed. 10 Hardware Tuning: Choice of RAID Level To use RAID 1 or RAID 5? Depends on ratio of reads and writes RAID 5 requires 2 block reads and 2 block writes to write out one data block If an application requires r reads and w writes per second RAID 1 requires r + 2w I/O operations per second RAID 5 requires: r + 4w I/O operations per second For reasonably large r and w, this requires lots of disks to handle workload RAID 5 may require more disks than RAID 1 to handle load! Apparent saving of number of disks by RAID 5 (by using parity, as opposed to the mirroring done by RAID 1) may be illusory! Thumb rule: RAID 5 is fine when writes are rare and data is very large, but RAID 1 is preferable otherwise If you need more disks to handle I/O load, just mirror them since disk capacities these days are enormous!
Tuning the Database Design Schema tuning Vertically partition relations to isolate the data that is accessed most often--only fetch needed information. 。 E.g.,split account into two,(account-number,branch-name)and (account-number,balance). Branch-name need not be fetched unless required Improve performance by storing a denormalized relation E.g.,store join of account and depositor;branch-name and balance information is repeated for each holder of an account,but join need not be computed repeatedly. Price paid:more space and more work for programmer to keep relation consistent on updates better to use materialized views (more on this later.. Cluster together on the same disk page records that would match in a frequently required join, compute join very efficiently when required. 11 Database System Concepts,5th Ed. 23.11 ©Silberschat乜,Korth and Sudarshan
Database System Concepts, 5 23.11 ©Silberschatz, Korth and Sudarshan th Ed. 11 Tuning the Database Design Schema tuning Vertically partition relations to isolate the data that is accessed most often -- only fetch needed information. • E.g., split account into two, (account-number, branch-name) and (account-number, balance). • Branch-name need not be fetched unless required Improve performance by storing a denormalized relation • E.g., store join of account and depositor; branch-name and balance information is repeated for each holder of an account, but join need not be computed repeatedly. • Price paid: more space and more work for programmer to keep relation consistent on updates • better to use materialized views (more on this later..) Cluster together on the same disk page records that would match in a frequently required join, compute join very efficiently when required