A generic solution for the stripe size

Idea
- disentangle "stripe size" from "object size"
- "stripe size" is the size of one slice of data
- "object size" is the size of one block of data on disk
- several stripes are put together into one bigger object
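As a rough sketch of this size relationship (the concrete numbers are made up for illustration, not taken from the slides): with a 64 KiB stripe and a 4 MiB object, 64 consecutive slices end up in one bigger on-disk object. How the slices are spread over several objects is what the Ceph striping layout below shows.

# Illustrative sizes only; neither value comes from the slides.
stripe_size = 64 * 1024        # "stripe size": one slice of data (64 KiB)
object_size = 4 * 1024 * 1024  # "object size": one block of data on disk (4 MiB)

stripes_per_object = object_size // stripe_size
print(stripes_per_object)  # 64 stripes are put together into one bigger object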
Ceph striping

[Figure: a file is cut into stripe units 0-29; units 0-4 form the first stripe and go round-robin to objects 0-4, units 5-9 form stripe 2, and so on; once objects 0-4 are full they make up object set 1 and writing continues on objects 5-9; each object is stored on one disk (e.g. disk 3).]
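A minimal sketch of the round-robin layout in the figure, mapping a stripe-unit index to the object that holds it. The parameters mirror the drawing (5 objects per object set, 3 units per object); they are illustrative, not Ceph defaults.

STRIPE_COUNT = 5       # objects written round-robin within one object set
UNITS_PER_OBJECT = 3   # object size divided by stripe-unit size

def locate(unit: int) -> tuple[int, int]:
    """Return (object number, position of the unit inside that object)."""
    stripe = unit // STRIPE_COUNT            # which stripe the unit belongs to
    column = unit % STRIPE_COUNT             # which object of the set it lands on
    object_set = stripe // UNITS_PER_OBJECT  # sets of objects already filled
    obj = object_set * STRIPE_COUNT + column
    return obj, stripe % UNITS_PER_OBJECT

# Matches the drawing: unit 7 sits in object 2, unit 16 in object 6.
assert locate(7) == (2, 1)
assert locate(16) == (6, 0)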
Practical striping - number of disks

Why have many
- to increase parallelism
- to get better performance

Why have few
- to limit the risk of losing files
- losing a disk now means losing all files of all disks
- if p is the probability to lose one disk, the probability to lose one disk out of n is p_n = n·p·(1−p)^(n−1) ≈ n·p
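A quick numeric check of that approximation, using a made-up per-disk failure probability p = 0.001; it holds as long as n·p stays small.

# Probability of losing exactly one disk out of n, versus the n*p approximation.
p = 0.001  # made-up per-disk failure probability, for illustration only
for n in (1, 5, 20, 100):
    exact = n * p * (1 - p) ** (n - 1)  # p_n = n p (1 - p)^(n-1)
    approx = n * p                      # ~ n p while n*p stays small
    print(f"n={n:3d}  exact={exact:.4f}  approx={approx:.4f}")
# Striping over 100 disks makes a loss roughly 100 times more likely
# than with a single disk.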
A generic solution for the number of disks

Idea
- disentangle "nb disks" from "nb stripes"
- do not use all disks for all files
- adapt the number of disks to each file
  - more disks for high-performance files
  - fewer disks for more safety

[Figure: across disks 1-5, a high-performance file is striped over all five disks (HP.1-HP.5), while two safer files use only two disks each (Safe1.1/Safe1.2 and Safe2.1/Safe2.2).]
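A minimal sketch of such a per-file policy. The class names, stripe widths and placement below are invented for illustration; they are not a scheme defined in the slides.

# Hypothetical policy: the file class decides over how many disks it is striped.
N_DISKS = 5
WIDTH = {"high-performance": N_DISKS, "safe": 2}  # invented stripe widths

def place(name: str, file_class: str, first_disk: int = 0) -> dict[int, str]:
    """Return {disk index: stripe name} for one file."""
    width = WIDTH[file_class]
    return {(first_disk + i) % N_DISKS: f"{name}.{i + 1}" for i in range(width)}

print(place("HP", "high-performance"))       # HP.1..HP.5 over all five disks
print(place("Safe1", "safe", first_disk=0))  # Safe1.1, Safe1.2 on two disks
print(place("Safe2", "safe", first_disk=2))  # Safe2.1, Safe2.2 on two other disks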
Going further: Map/Reduce

What do we have with striping?
- striping distributes the server I/O over several devices
- but the client still faces the total I/O
- and the CPU is not distributed

Map/Reduce Idea
- send the computation to the data nodes
- "the most efficient network I/O is the one you don't do"
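A toy illustration of the idea, with invented sample data and helper names: each data node runs the "map" step locally on the stripe it already stores, so only small partial results cross the network to be "reduced".

from functools import reduce

# Toy map/reduce: count lines containing "ERROR" in a striped log file.
# Each entry of `stripes` stands for the data one node holds locally.
stripes = [
    ["ok", "ERROR disk 3", "ok"],    # data on node 1
    ["ERROR timeout", "ok"],         # data on node 2
    ["ok", "ok", "ERROR checksum"],  # data on node 3
]

def map_on_node(lines: list[str]) -> int:
    """Runs where the data lives: scan the local stripe, return one small count."""
    return sum("ERROR" in line for line in lines)

def reduce_counts(partials: list[int]) -> int:
    """Runs anywhere: only the tiny partial counts had to travel."""
    return reduce(lambda a, b: a + b, partials, 0)

partial_counts = [map_on_node(s) for s in stripes]  # computation goes to the data
print(reduce_counts(partial_counts))                # 3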