Data storage and preservation, S. Ponce - CERN

Slide 9 / 62: Parallelizing through striping

Main idea
  - use several devices in parallel for a single stream
  - push the limits up by summing the performance of the devices

Basic striping: divide and conquer for storage
  - split data into chunks (aka stripes) placed on different devices
  - access the chunks in parallel

Example layout of three files (A, B, C) striped over five disks:

  Disk 1     Disk 2     Disk 3     Disk 4     Disk 5
  File A.1   File A.2   File A.3   File A.4   File B.1
  File B.2   File B.3   File C.1   File C.2   File C.3
  File C.4   File C.5   File C.6
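The round-robin placement behind the five-disk example on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `stripe` and the `(file name, number of chunks)` input format are assumptions made for the example, and real striping layers work on fixed-size byte ranges rather than named chunks.

```python
from collections import defaultdict


def stripe(file_chunks, n_disks):
    """Place chunks round-robin on n_disks.

    file_chunks: list of (file name, number of chunks) pairs (hypothetical format).
    Returns {disk index: [chunk names]} with 0-based disk indices.
    """
    layout = defaultdict(list)
    slot = 0  # global counter: a new file continues on the disk after the previous chunk
    for name, n_chunks in file_chunks:
        for i in range(1, n_chunks + 1):
            layout[slot % n_disks].append(f"{name}.{i}")
            slot += 1
    return dict(layout)


# Files A, B and C with 4, 3 and 6 chunks, spread over 5 disks.
layout = stripe([("A", 4), ("B", 3), ("C", 6)], n_disks=5)
for disk in sorted(layout):
    print(f"Disk {disk + 1}: {layout[disk]}")
```

Because consecutive chunks of one file land on different disks, a reader of that file can fetch them in parallel, which is where the summed throughput comes from.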
Slide 10 / 62: RAID 0

RAID
  - stands for "Redundant Array of Inexpensive Disks"
  - set of configurations that employ the techniques of striping, mirroring, or parity to create large reliable data stores from multiple general-purpose computer hard disk drives (Wikipedia)

Useful RAID levels
  - RAID 0: striping
  - RAID 1: mirroring
  - RAID 5: parity
  - RAID 6: double parity
  See Data Safety part

Can be implemented in hardware or software
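As a rough illustration of the parity technique named in the list of RAID levels, the sketch below computes an XOR parity chunk over three data chunks and rebuilds one lost chunk from the survivors. The helper name `xor_blocks` and the sample chunk contents are assumptions made for the example. It shows single parity only: RAID 5 additionally rotates the parity chunk across the disks, which is not modelled here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data chunks, one per disk (sample contents, chosen arbitrarily).
data = [b"AAAA", b"BBBB", b"CCCC"]

# Parity chunk, stored on a fourth disk.
parity = xor_blocks(data)

# Simulate losing the second disk: XOR of the surviving chunks and the
# parity gives back the missing chunk.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

The same XOR that produces the parity also performs the recovery, because XOR-ing a value twice cancels it out.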
Slide 11 / 62: RAID versus RAIN

RAIN
  - Redundant Array of Inexpensive Nodes
  - similar to RAID but across nodes

Main interest
  - also tackles the network limitations
  - when used for redundancy, improves reliability
  - more on this in a subsequent lecture