Partitioned Parallel Join Partition using range or hash partitioning,on join attributes n ☒ 52 ☒ S3 S3 ☒ Step 1:Partition r Step 2:Partition s Step 3:Each node Ni computes ri Database System Concepts-7th Edition 22.12 @Silberschatz,Korth and Sudarshan
Database System Concepts - 7 22.12 ©Silberschatz, Korth and Sudarshan th Edition Partitioned Parallel Join Partition using range or hash partitioning, on join attributes
Partitioned Parallel Join (Cont.) For equi-joins and natural joins,it is possible to partition the two input relations across the processors,and compute the join locally at each processor. ■( Can use either range partitioning or hash partitioning. r and s must be partitioned on their join attributes r.A and s.B),using the same range-partitioning vector or hash function. Join can be computed at each site using any of Hash join,leading to partitioned parallel hash join Merge join,leading to partitioned parallel merge join Nested loops join,leading to partitioned parallel nested-loops join or partitioned parallel index nested-loops join Database System Concepts-7th Edition 22.13 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 22.13 ©Silberschatz, Korth and Sudarshan th Edition Partitioned Parallel Join (Cont.) ▪ For equi-joins and natural joins, it is possible to partition the two input relations across the processors, and compute the join locally at each processor. ▪ Can use either range partitioning or hash partitioning. ▪ r and s must be partitioned on their join attributes r.A and s.B), using the same range-partitioning vector or hash function. ▪ Join can be computed at each site using any of • Hash join, leading to partitioned parallel hash join • Merge join, leading to partitioned parallel merge join • Nested loops join, leading to partitioned parallel nested-loops join or partitioned parallel index nested-loops join
Partitioned Parallel Hash-Join Parallelizing partitioned hash join: A hash function h takes the join attribute value of each tuple in s and maps this tuple to one of the n nodes. As tuples of relation s are received at the destination nodes,they are partitioned further using another hash function,h2,which is used to compute the hash-join locally. Repeat above for each tuple in r. Each node N executes the build and probe phases of the hash-join algorithm on the local partitions r;and s;of r and s to produce a partition of the final result of the hash-join. Note:Hash-join optimizations can be applied to the parallel case e.g.,the hybrid hash-join algorithm can be used to cache some of the incoming tuples in memory and avoid the cost of writing them and reading them back in. Database System Concepts-7th Edition 22.14 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 22.14 ©Silberschatz, Korth and Sudarshan th Edition Partitioned Parallel Hash-Join Parallelizing partitioned hash join: ▪ A hash function h1 takes the join attribute value of each tuple in s and maps this tuple to one of the n nodes. ▪ As tuples of relation s are received at the destination nodes, they are partitioned further using another hash function, h2 , which is used to compute the hash-join locally. ▪ Repeat above for each tuple in r. ▪ Each node Ni executes the build and probe phases of the hash-join algorithm on the local partitions ri and si of r and s to produce a partition of the final result of the hash-join. ▪ Note: Hash-join optimizations can be applied to the parallel case • e.g., the hybrid hash-join algorithm can be used to cache some of the incoming tuples in memory and avoid the cost of writing them and reading them back in
Fragment-and-Replicate Joins Asymmetric and Symmetric Fragment-and-Replicate Joins S1 S2 S3 S4 Sm 1 N N N12 N13 N14 《 N2 r2 21 N2,2 N23 12 N3 5 N31 N32 '3 N YA Database System Concepts-7th Edition 22.15 @Silberschatz,Korth and Sudarshan
Database System Concepts - 7 22.15 ©Silberschatz, Korth and Sudarshan th Edition Fragment-and-Replicate Joins Asymmetric and Symmetric Fragment-and-Replicate Joins
Fragment-and-Replicate Join Partitioning not possible for some join conditions E.g.,non-equijoin conditions,such as r.A>s.B. For joins were partitioning is not applicable,parallelization can be accomplished by fragment and replicate technique ■ Special case-asymmetric fragment-and-replicate: One of the relations,say r,is partitioned;any partitioning technique can be used. The other relation,s,is replicated across all the processors. Node Ni then locally computes the join of ri with all of s using any join technique. Also referred to as broadcastjoin Database System Concepts-7th Edition 22.16 ©Silberscha乜,Korth and Sudarshan
Database System Concepts - 7 22.16 ©Silberschatz, Korth and Sudarshan th Edition Fragment-and-Replicate Join ▪ Partitioning not possible for some join conditions • E.g., non-equijoin conditions, such as r.A > s.B. ▪ For joins were partitioning is not applicable, parallelization can be accomplished by fragment and replicate technique ▪ Special case – asymmetric fragment-and-replicate: • One of the relations, say r, is partitioned; any partitioning technique can be used. • The other relation, s, is replicated across all the processors. • Node Ni then locally computes the join of ri with all of s using any join technique. • Also referred to as broadcast join