Database System Concepts: original textbook teaching resources (7th Edition, PPT lecture slides, English version), Chapter 22 Parallel and Distributed Query Processing

▪ Overview ▪ Parallel Sort ▪ Parallel Join ▪ Other Operations ▪ Parallel Evaluation of Query Plans ▪ Query Processing on Shared Memory ▪ Query Optimization ▪ Distributed Query Processing


Database System Concepts, 7th Edition. ©Silberschatz, Korth and Sudarshan

Fragment-and-Replicate Join (Cont.)
▪ Both versions of fragment-and-replicate work with any join condition, since every tuple in r can be tested with every tuple in s.
▪ Usually has a higher cost than partitioning, since one of the relations (for asymmetric fragment-and-replicate) or both relations (for general fragment-and-replicate) have to be replicated.
▪ Sometimes asymmetric fragment-and-replicate is preferable even though partitioning could be used.
  • E.g., if s is small and r is large, and r is already partitioned, it may be cheaper to replicate s across all nodes, rather than repartition r and s on the join attributes.
▪ Question: how do you implement left outer join using above join techniques?
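The asymmetric case described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: nodes are modelled as in-memory lists, each node runs a plain nested-loop join, and the function name `asymmetric_fragment_replicate_join` and the sample data are hypothetical, not taken from the slides.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, ...]


def asymmetric_fragment_replicate_join(
    r_partitions: List[List[Row]],
    s: List[Row],
    condition: Callable[[Row, Row], bool],
) -> List[Row]:
    """Join r (already partitioned across nodes) with s by replicating s everywhere."""
    result = []
    for r_local in r_partitions:      # each iteration models the work of one node
        s_replica = list(s)           # the full copy of s shipped to this node
        for tr in r_local:            # r is never moved or repartitioned
            for ts in s_replica:
                if condition(tr, ts):  # any join condition works, not only equality
                    result.append(tr + ts)
    return result


# r is large and already partitioned over two nodes; s is small.
r_partitions = [[(1,), (5,)], [(3,), (8,)]]
s = [(2,), (4,)]

# A non-equi join condition: r.a < s.b
joined = asymmetric_fragment_replicate_join(
    r_partitions, s, lambda tr, ts: tr[0] < ts[0]
)
print(sorted(joined))
```

Because every node holds all of s, each tuple of r is compared with every tuple of s exactly once, which is why the condition does not need to be an equality on partitioning attributes.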

Handling Skew
▪ Skew can significantly slow down parallel join
▪ Join skew avoidance
  • Balanced partitioning vector
  • Virtual node partitioning
▪ Dynamic handling of join skew
  • Detect overloaded physical nodes
  • If a physical node has no remaining work, take on a waiting task (virtual node) currently assigned to a different physical node that is overloaded
  • Example of work stealing
▪ Cheaper to implement in shared memory system, but can be used even in shared nothing/shared disk system
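The dynamic scheme above can be illustrated with a small simulation. The sketch below is an assumption-laden model, not the slides' algorithm: each virtual node is reduced to a single cost number, an idle physical node steals from whichever node has the most waiting work, and the cost of moving a virtual node's data is ignored. The name `makespan` and the sample assignment are hypothetical.

```python
import heapq
from collections import deque
from typing import List


def makespan(assignment: List[List[int]], steal: bool) -> int:
    """Return the time at which the last task finishes.

    assignment[n] lists the costs of the virtual-node tasks initially
    assigned to physical node n.
    """
    queues = [deque(tasks) for tasks in assignment]   # waiting tasks per node
    ready = [(0, node) for node in range(len(queues))]  # (time node is free, node)
    heapq.heapify(ready)
    finish = 0
    while ready:
        now, node = heapq.heappop(ready)
        if queues[node]:
            cost = queues[node].popleft()             # run own next task
        elif steal:
            # Pick the most overloaded node, measured by waiting work.
            victim = max(range(len(queues)), key=lambda n: sum(queues[n]))
            if not queues[victim]:
                continue                              # nothing is waiting anywhere
            cost = queues[victim].pop()               # steal a waiting task
        else:
            continue                                  # idle node simply stops
        finish = max(finish, now + cost)
        heapq.heappush(ready, (now + cost, node))
    return finish


# Skewed assignment: node 0 holds four heavy virtual nodes, the others one light one.
assignment = [[4, 4, 4, 4], [1], [1]]
print(makespan(assignment, steal=False))  # 16: node 0 does all heavy work alone
print(makespan(assignment, steal=True))   # 8: idle nodes take over waiting tasks
```

Only waiting tasks are stolen; a task that has already started stays where it is. In a shared-nothing or shared-disk system the stolen virtual node's data would also have to be shipped, which this sketch does not charge for.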

Other Relational Operations
Selection σ_θ(r)
▪ If θ is of the form a_i = v, where a_i is an attribute and v a value.
  • If r is partitioned on a_i the selection is performed at a single node.
▪ If θ is of the form l <= a_i <= u (i.e., θ is a range selection) and the relation has been range-partitioned on a_i
  • Selection is performed at each node whose partition overlaps with the specified range of values.
▪ In all other cases: the selection is performed in parallel at all the nodes.
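The three cases above amount to a routing decision, sketched below for a range-partitioned relation only. Assumptions to note: the partitioning vector [v0, v1, ...] sends values below v0 to node 0, values in [v0, v1) to node 1, and so on; a point selection is expressed as a range with low equal to high; and the function name `nodes_for_selection` is hypothetical. Point selections on a hash-partitioned relation, which the slide also covers, are not modelled here.

```python
from bisect import bisect_right
from typing import List, Optional


def nodes_for_selection(
    vector: List[int],
    partition_attr: str,
    attr: str,
    low: Optional[int] = None,
    high: Optional[int] = None,
) -> List[int]:
    """Return the nodes that must evaluate the selection low <= attr <= high."""
    n_nodes = len(vector) + 1
    # Selection on another attribute, or not a range: every node takes part.
    if attr != partition_attr or low is None or high is None:
        return list(range(n_nodes))
    first = bisect_right(vector, low)    # node whose partition holds the value low
    last = bisect_right(vector, high)    # node whose partition holds the value high
    return list(range(first, last + 1))  # only partitions overlapping [low, high]


vector = [10, 20, 30]                    # four nodes, range-partitioned on "a"
print(nodes_for_selection(vector, "a", "a", 15, 15))  # point selection a = 15
print(nodes_for_selection(vector, "a", "a", 5, 25))   # range selection 5 <= a <= 25
print(nodes_for_selection(vector, "a", "b", 5, 25))   # selection on another attribute
```

The sketch assumes low <= high; it does not validate the bounds.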

Other Relational Operations (Cont.)
▪ Duplicate elimination
  • Perform by using either of the parallel sort techniques
    ▪ eliminate duplicates as soon as they are found during sorting.
  • Can also partition the tuples (using either range- or hash-partitioning) and perform duplicate elimination locally at each node.
▪ Projection
  • Projection without duplicate elimination can be performed as tuples are read from disk, in parallel.
  • If duplicate elimination is required, any of the above duplicate elimination techniques can be used.
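The partition-then-deduplicate variant above can be sketched as follows. This is a minimal in-memory model, assuming hash partitioning on the whole tuple and Python's built-in `hash` as the partitioning function; the name `parallel_distinct` and the sample fragments are hypothetical.

```python
from typing import List, Tuple

Row = Tuple[int, str]


def parallel_distinct(fragments: List[List[Row]], n_nodes: int) -> List[List[Row]]:
    """Hash-partition tuples on all attributes, then deduplicate locally per node."""
    inboxes: List[List[Row]] = [[] for _ in range(n_nodes)]
    # Phase 1: every node routes each of its tuples by hashing the whole tuple,
    # so all copies of the same tuple arrive at the same node.
    for fragment in fragments:
        for row in fragment:
            inboxes[hash(row) % n_nodes].append(row)
    # Phase 2: each node removes duplicates among the tuples it received.
    return [sorted(set(inbox)) for inbox in inboxes]


fragments = [
    [(1, "a"), (2, "b")],
    [(1, "a"), (3, "c")],
    [(2, "b")],
]
per_node = parallel_distinct(fragments, n_nodes=2)
distinct = sorted(row for node_rows in per_node for row in node_rows)
print(distinct)
```

No merge step is needed at the end, since identical tuples can never land on different nodes.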

Grouping/Aggregation
▪ Step 1: Partition the relation on the grouping attributes
▪ Step 2: Compute the aggregate values locally at each node.
▪ Optimization: Can reduce cost of transferring tuples during partitioning by partial aggregation before partitioning
  • For distributive aggregate
  • Can be done as part of run generation
  • Consider the sum aggregation operation:
    ▪ Perform aggregation operation at each node Ni on those tuples stored on its local disk
      • results in tuples with partial sums at each node.
    ▪ Result of the local aggregation is partitioned on the grouping attributes, and the aggregation performed again at each node Ni to get the final result.
  • Fewer tuples need to be sent to other nodes during partitioning.
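The effect of partial aggregation on the number of tuples moved can be seen in a small sketch for sum. It assumes in-memory lists as node-local data, Python's built-in `hash` on the grouping key as the partitioning function, and counts every routed tuple as "shipped" even when it happens to stay on its own node. The names `local_sum` and `parallel_sum` and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, int]  # (grouping key, value)


def local_sum(rows: List[Row]) -> Dict[str, int]:
    """Sum values per grouping key on a single node."""
    sums: Dict[str, int] = defaultdict(int)
    for key, value in rows:
        sums[key] += value
    return dict(sums)


def parallel_sum(
    fragments: List[List[Row]], n_nodes: int, pre_aggregate: bool
) -> Tuple[Dict[str, int], int]:
    """Return the per-group sums and the number of tuples routed during partitioning."""
    inboxes: List[List[Row]] = [[] for _ in range(n_nodes)]
    shipped = 0
    for fragment in fragments:
        # Optional partial aggregation: one partial sum per group per node.
        rows = list(local_sum(fragment).items()) if pre_aggregate else fragment
        for key, value in rows:
            inboxes[hash(key) % n_nodes].append((key, value))  # partition on key
            shipped += 1
    result: Dict[str, int] = {}
    for inbox in inboxes:
        result.update(local_sum(inbox))  # final aggregation at each node
    return result, shipped


fragments = [
    [("a", 1), ("a", 2), ("b", 3)],
    [("a", 4), ("b", 5), ("b", 6)],
]
plain, shipped_plain = parallel_sum(fragments, n_nodes=2, pre_aggregate=False)
partial, shipped_partial = parallel_sum(fragments, n_nodes=2, pre_aggregate=True)
print(plain, shipped_plain)
print(partial, shipped_partial)
```

Both runs produce the same sums; the pre-aggregated run routes at most one tuple per group per node. This works because sum is distributive: adding partial sums gives the same result as adding the original values.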
