
Database System Concepts: original textbook teaching resources (7th edition, PPT lecture slides, English), Chapter 10 Big Data


Database System Concepts - 7th Edition ©Silberschatz, Korth and Sudarshan

Hadoop Distributed File System (HDFS)
▪ NameNode
  • Maps a filename to list of Block IDs
  • Maps each Block ID to DataNodes containing a replica of the block
▪ DataNode: Maps a Block ID to a physical location on disk
▪ Data Coherency
  • Write-once-read-many access model
  • Client can only append to existing files
▪ Distributed file systems good for millions of large files
  • But have very high overheads and poor performance with billions of smaller tuples
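The two metadata mappings on this slide can be sketched as plain dictionaries. This is a toy model only, not the HDFS API: the class names, the `locate` and `store` helpers, the node names, and the `/data/...` path layout are all assumptions made for illustration.

```python
# Toy model of the HDFS metadata mappings; every name here is hypothetical.
from dataclasses import dataclass, field


@dataclass
class DataNode:
    name: str
    # Block ID -> physical location on this node's disk (modelled as a path)
    block_locations: dict[int, str] = field(default_factory=dict)

    def store(self, block_id: int) -> None:
        self.block_locations[block_id] = f"/data/{self.name}/blk_{block_id}"


@dataclass
class NameNode:
    # filename -> ordered list of Block IDs
    file_blocks: dict[str, list[int]] = field(default_factory=dict)
    # Block ID -> names of the DataNodes holding a replica of the block
    block_replicas: dict[int, list[str]] = field(default_factory=dict)

    def locate(self, filename: str) -> list[tuple[int, list[str]]]:
        """Return (block_id, replica_nodes) for each block of the file."""
        return [(b, self.block_replicas[b]) for b in self.file_blocks[filename]]


nodes = {n: DataNode(n) for n in ("dn1", "dn2", "dn3")}
namenode = NameNode()

# Register a two-block file, each block replicated on two DataNodes.
namenode.file_blocks["/logs/a.txt"] = [101, 102]
for block_id, replicas in {101: ["dn1", "dn2"], 102: ["dn2", "dn3"]}.items():
    namenode.block_replicas[block_id] = replicas
    for name in replicas:
        nodes[name].store(block_id)

layout = namenode.locate("/logs/a.txt")
first_block_path = nodes["dn1"].block_locations[101]
```

A client would first ask the NameNode for `layout`, then contact one of the listed DataNodes per block. Only the DataNode knows where the block sits on its own disk.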

Sharding
▪ Sharding: partition data across multiple databases
▪ Partitioning usually done on some partitioning attributes (also known as partitioning keys or shard keys), e.g., user ID
  • E.g., records with key values from 1 to 100,000 on database 1, records with key values from 100,001 to 200,000 on database 2, etc.
▪ Application must track which records are on which database and send queries/updates to that database
▪ Positives: scales well, easy to implement
▪ Drawbacks:
  • Not transparent: application has to deal with routing of queries, queries that span multiple databases
  • When a database is overloaded, moving part of its load out is not easy
  • Chance of failure more with more databases
    ▪ Need to keep replicas to ensure availability, which is more work for application
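The application-side routing described above can be sketched as a lookup over range boundaries. The first two ranges follow the slide's example. The third range, the database names, and the function names are assumptions added for illustration.

```python
import bisect

# Inclusive upper bound of the key range held by each database.
# The first two ranges follow the slide; the third is a hypothetical extension.
UPPER_BOUNDS = [100_000, 200_000, 300_000]
DATABASES = ["database1", "database2", "database3"]


def shard_index(key: int) -> int:
    """Index of the database whose key range contains `key`."""
    if not 1 <= key <= UPPER_BOUNDS[-1]:
        raise KeyError(f"key {key} is outside every shard's range")
    # First position whose upper bound is >= key.
    return bisect.bisect_left(UPPER_BOUNDS, key)


def shard_for(key: int) -> str:
    """Database that a single-key query or update must be sent to."""
    return DATABASES[shard_index(key)]


def shards_for_range(low: int, high: int) -> list[str]:
    """Databases that a range query over [low, high] must be sent to."""
    return DATABASES[shard_index(low): shard_index(high) + 1]
```

A query for keys 99,990 to 100,010 spans two databases here, so the application has to issue it to both and merge the results itself. That is the "not transparent" drawback listed on the slide.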

Key Value Storage Systems
▪ Key-value storage systems store large numbers (billions or even more) of small (KB-MB) sized records
▪ Records are partitioned across multiple machines, and queries are routed by the system to the appropriate machine
▪ Records are also replicated across multiple machines, to ensure availability even if a machine fails
  • Key-value stores ensure that updates are applied to all replicas, to ensure that their values are consistent
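A minimal in-memory sketch of partitioning plus replication follows. The slide does not say how records are assigned to machines, so hash partitioning is an assumption here, as are the class name `TinyKVCluster`, its methods, and the default of two replicas. Real systems also have to handle concurrent updates and machine recovery, which this sketch ignores.

```python
import hashlib


class TinyKVCluster:
    """Toy key-value store: hash-partitioned and replicated (hypothetical)."""

    def __init__(self, machine_names, replicas=2):
        self.names = list(machine_names)
        self.replicas = replicas  # assumed to be <= number of machines
        self.storage = {name: {} for name in self.names}

    def replica_set(self, key: str) -> list[str]:
        """Machines holding the record: a hashed start, then its successors."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.names)
        return [self.names[(start + i) % len(self.names)]
                for i in range(self.replicas)]

    def put(self, key: str, value: bytes) -> None:
        # Apply the update to every replica so their values stay consistent.
        for name in self.replica_set(key):
            self.storage[name][key] = value

    def get(self, key: str, failed=frozenset()) -> bytes:
        # The system, not the application, routes the query to a live replica.
        for name in self.replica_set(key):
            if name not in failed:
                return self.storage[name][key]
        raise RuntimeError("all replicas of this key are unavailable")


cluster = TinyKVCluster(["m1", "m2", "m3"])
cluster.put("user:42", b"alice")
owners = cluster.replica_set("user:42")
value_after_failure = cluster.get("user:42", failed={owners[0]})
```

The read still succeeds with the first owner marked as failed, because the second replica holds the same value.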

Key Value Storage Systems
▪ Key-value stores may store
  • Uninterpreted bytes, with an associated key
    ▪ E.g., Amazon S3, Amazon Dynamo
  • Wide-table (can have arbitrarily many attribute names) with associated key
    ▪ Google BigTable, Apache Cassandra, Apache HBase, Amazon DynamoDB
    ▪ Allows some operations (e.g., filtering) to execute on storage node
  • JSON
    ▪ MongoDB, CouchDB (document model)
▪ Document stores store semi-structured data, typically JSON
▪ Some key-value stores support multiple versions of data, with timestamps/version numbers

Data Representation
▪ An example of a JSON object is:
    {
      "ID": "22222",
      "name": {
        "firstname": "Albert",
        "lastname": "Einstein"
      },
      "deptname": "Physics",
      "children": [
        { "firstname": "Hans", "lastname": "Einstein" },
        { "firstname": "Eduard", "lastname": "Einstein" }
      ]
    }
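As a small illustration, the slide's object (with the quotes around "firstname" and "lastname" closed) can be parsed with Python's standard `json` module. The variable names are arbitrary. The nested object becomes a dictionary and the array becomes a list.

```python
import json

# The slide's example object, with the key quoting repaired.
text = """
{
  "ID": "22222",
  "name": { "firstname": "Albert", "lastname": "Einstein" },
  "deptname": "Physics",
  "children": [
    { "firstname": "Hans", "lastname": "Einstein" },
    { "firstname": "Eduard", "lastname": "Einstein" }
  ]
}
"""

record = json.loads(text)

# Nested attributes are reached by chaining keys; arrays are iterated.
full_name = f'{record["name"]["firstname"]} {record["name"]["lastname"]}'
child_names = [child["firstname"] for child in record["children"]]
```

Note that "ID" is a JSON string here, not a number, so `record["ID"]` compares equal to `"22222"` and not to `22222`.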
