Database System Concepts, 6th Edition. ©Silberschatz, Korth and Sudarshan

Measures of Query Cost

Cost is generally measured as total elapsed time for answering a query
  Many factors contribute to time cost: disk accesses, CPU, or even network communication
Typically disk access is the predominant cost, and is also relatively easy to estimate. Measured by taking into account:
  Number of seeks * average-seek-cost
  Number of blocks read * average-block-read-cost
  Number of blocks written * average-block-write-cost
Cost to write a block is greater than cost to read a block
  Data is read back after being written to ensure that the write was successful
Measures of Query Cost (Cont.)

For simplicity we just use the number of block transfers from disk and the number of seeks as the cost measures
  t_T: time to transfer one block
  t_S: time for one seek
  Cost for b block transfers plus S seeks: b * t_T + S * t_S
We ignore CPU costs for simplicity
  Real systems do take CPU cost into account
We do not include the cost of writing output to disk in our cost formulae
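The simplified cost measure above can be sketched in a few lines of Python. This is only an illustration: the function name `io_cost` and the device timings (100 microseconds per block transfer, 4000 microseconds per seek) are assumptions chosen for the example, not values from the slides.

```python
def io_cost(block_transfers: int, seeks: int, t_transfer: int, t_seek: int) -> int:
    """Estimated I/O time: b block transfers plus S seeks.

    CPU cost and the cost of writing the final output are ignored,
    matching the simplification stated on the slide.
    """
    return block_transfers * t_transfer + seeks * t_seek


# Hypothetical device parameters in microseconds (assumed for illustration).
T_TRANSFER_US = 100
T_SEEK_US = 4000

# 100 block transfers and 2 seeks: 100 * 100 + 2 * 4000 microseconds.
estimate_us = io_cost(block_transfers=100, seeks=2,
                      t_transfer=T_TRANSFER_US, t_seek=T_SEEK_US)
print(estimate_us)  # 18000
```

With timings like these a single seek costs as much as 40 block transfers, which is why the two counts are tracked separately instead of being folded into one "number of I/Os" figure.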
Measures of Query Cost (Cont.)

Several algorithms can reduce disk I/O by using extra buffer space
  Amount of real memory available to buffer depends on other concurrent queries and OS processes, known only during execution
  We often use worst-case estimates, assuming only the minimum amount of memory needed for the operation is available
Required data may be buffer resident already, avoiding disk I/O
  But hard to take into account for cost estimation
Selection Operation

File scan
Algorithm A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.
  Cost estimate = b_r block transfers + 1 seek
    b_r denotes number of blocks containing records from relation r
  If selection is on a key attribute, can stop on finding record
    cost (on average) = (b_r / 2) block transfers + 1 seek
  Linear search can be applied regardless of selection condition, ordering of records in the file, or availability of indices
Note: binary search generally does not make sense since data is not stored consecutively
  except when there is an index available,
  and binary search requires more seeks than index search
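As a rough sketch of the A1 estimates, the function below returns the linear-search cost for both cases. The function name, the `key_equality` flag, and the timings are hypothetical choices for this example; only the two formulas come from the slide.

```python
def linear_search_cost(b_r: int, t_transfer: int, t_seek: int,
                       key_equality: bool = False) -> float:
    """Estimated cost of algorithm A1 (linear search) over b_r blocks.

    Full scan: b_r block transfers + 1 seek.
    Equality on a key attribute: the scan can stop at the first match,
    so about half the blocks are read: (b_r / 2) transfers + 1 seek.
    """
    transfers = b_r / 2 if key_equality else b_r
    return transfers * t_transfer + 1 * t_seek


# Hypothetical timings in microseconds (assumed for illustration).
T_TRANSFER_US = 100
T_SEEK_US = 4000

full_scan = linear_search_cost(1000, T_TRANSFER_US, T_SEEK_US)
key_lookup = linear_search_cost(1000, T_TRANSFER_US, T_SEEK_US, key_equality=True)
print(full_scan, key_lookup)
```

Only one seek is charged in either case because the scan reads the file's blocks sequentially after positioning once at the start.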
Selections Using Indices

Index scan: search algorithms that use an index
  Selection condition must be on search-key of index.
A2 (primary index, equality on key). Retrieve a single record that satisfies the corresponding equality condition
  Cost = (h_i + 1) * (t_T + t_S)
A3 (primary index, equality on nonkey). Retrieve multiple records.
  Records will be on consecutive blocks
  Let b = number of blocks containing matching records
  Cost = h_i * (t_T + t_S) + t_S + t_T * b
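The two primary-index formulas can be compared with a small sketch. It assumes h_i is the height of the index (the slide does not define it), so each level costs one seek plus one block transfer. The function names and timings are hypothetical.

```python
def a2_cost(h_i: int, t_transfer: int, t_seek: int) -> int:
    """A2 (primary index, equality on key).

    One seek + one transfer per index level (h_i levels, assumed to be
    the index height), plus one more seek + transfer to fetch the record.
    """
    return (h_i + 1) * (t_transfer + t_seek)


def a3_cost(h_i: int, b: int, t_transfer: int, t_seek: int) -> int:
    """A3 (primary index, equality on nonkey).

    Index traversal as in A2, then a single seek to the first matching
    block and b consecutive block transfers.
    """
    return h_i * (t_transfer + t_seek) + t_seek + t_transfer * b


# Hypothetical timings in microseconds (assumed for illustration).
T_TRANSFER_US = 100
T_SEEK_US = 4000

print(a2_cost(h_i=3, t_transfer=T_TRANSFER_US, t_seek=T_SEEK_US))
print(a3_cost(h_i=3, b=5, t_transfer=T_TRANSFER_US, t_seek=T_SEEK_US))
```

With b = 1 the A3 formula reduces to the A2 formula, which is a quick consistency check on the two expressions.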