Range Partitioning Redistribute using partitioning vector: 100,175,300,500,800 0-100) [100-175)[175-300)[300-500)[500-800)[800-1000) Database System Concepts-7th Edition 22.7 @Silberschatz,Korth and Sudarshan
Database System Concepts - 7 22.7 ©Silberschatz, Korth and Sudarshan th Edition Range Partitioning
Range-Partitioning Parallel Sort 1)Redistribute using partitioning vector: 100.175,300,500.800 0-100) [100-175) [175-300)[300-500)[500-800)[800-1000) 2)(External)sort locally at each node 3)Merge if output required at one node Database System Concepts-7th Edition 22.8 @Silberschatz,Korth and Sudarshan
Database System Concepts - 7 22.8 ©Silberschatz, Korth and Sudarshan th Edition Range-Partitioning Parallel Sort
Parallel Sort Local Sort Local Sort Merge r Local Sort Local Sort Merge Local Sort Local Sort Merge : Local Sort Local Sort Merge Im 1.Range Partition 2.Local Sort 1.Local Sort 2.Range Partition and Merge (a)Range Partitioning Sort (b)Parallel External Sort-Merge Database System Concepts-7th Edition 22.9 @Silberschatz,Korth and Sudarshan
Database System Concepts - 7 22.9 ©Silberschatz, Korth and Sudarshan th Edition Parallel Sort
Parallel Sort Range-Partitioning Sort ■ Choose nodes N,...,Nmm,where m n-1 to do sorting. Create range-partition vector with m-1 entries,on the sorting attributes Redistribute the relation using range partitioning Each node N sorts its partition of the relation locally. Example of data parallelism:each node executes same operation in parallel with other nodes,without any interaction with the others. Final merge operation is trivial:range-partitioning ensures that, if i<j,all key values in node Nare all less than all key values in N Database System Concepts-7th Edition 22.10 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 22.10 ©Silberschatz, Korth and Sudarshan th Edition Parallel Sort Range-Partitioning Sort ▪ Choose nodes N1 , ..., Nm, where m n -1 to do sorting. ▪ Create range-partition vector with m-1 entries, on the sorting attributes ▪ Redistribute the relation using range partitioning ▪ Each node Ni sorts its partition of the relation locally. • Example of data parallelism: each node executes same operation in parallel with other nodes, without any interaction with the others. ▪ Final merge operation is trivial: range-partitioning ensures that, if i < j, all key values in node Ni are all less than all key values in Nj
Parallel Sort (Cont.) Parallel External Sort-Merge Assume the relation has already been partitioned among nodes N1,...N (in whatever manner). Each node N locally sorts the data (using local disk as required) The sorted runs on each node are then merged in parallel: The sorted partitions at each node Ni are range-partitioned across the processors N1,...,N. Each node Ni performs a merge on the streams as they are received,to get a single sorted run. The sorted runs on nodes N1,...,N are concatenated to get the final result. Algorithm as described vulnerable to execution skew all nodes send to node 1,then all nodes send data to node 2, Can be modified so each node sends data to all other nodes in parallel (block at a time) Database System Concepts-7th Edition 22.11 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 22.11 ©Silberschatz, Korth and Sudarshan th Edition Parallel Sort (Cont.) Parallel External Sort-Merge ▪ Assume the relation has already been partitioned among nodes N1 , ..., Nn (in whatever manner). ▪ Each node Ni locally sorts the data (using local disk as required) ▪ The sorted runs on each node are then merged in parallel: • The sorted partitions at each node Ni are range-partitioned across the processors N1 , ..., Nn . • Each node Ni performs a merge on the streams as they are received, to get a single sorted run. • The sorted runs on nodes N1 , ..., Nn are concatenated to get the final result. ▪ Algorithm as described vulnerable to execution skew • all nodes send to node 1, then all nodes send data to node 2, … • Can be modified so each node sends data to all other nodes in parallel (block at a time)