Many ways to store data

Sébastien Ponce
sebastien.ponce@cern.ch
CERN
Thematic CERN School of Computing 2018

Overall Course Structure

Many ways to Store Data
  - Storage devices and their specificities
  - Distributing and parallelizing storage
Preserving data
  - Data consistency
  - Data safety
Key ingredients to achieve efficient I/O
  - Synchronous vs asynchronous I/O
  - I/O optimizations and caching

Outline

1 Storage devices
  - Existing devices
  - Hierarchical storage
2 Distributed storage
  - Data distribution
  - Data federation
3 Parallelizing files' storage
  - Striping
  - Introduction to Map/Reduce
4 Conclusion

Storage devices

1 Storage devices
  - Existing devices
  - Hierarchical storage

A variety of storage devices

Main differences
  - Capacities from 1 GB to 10 TB per unit
  - Prices from 1 to 300 for the same capacity
  - Very different reliability
  - Very different speeds

Typical numbers in 2018

Device  Capacity per unit  Latency  $/TB  Speed     Reliability
RAM     16 GB              5 ns     9000  10 GB/s   volatile
SSD     500 GB             10 µs    300   550 MB/s  poor
HD      6 TB               3 ms     25    150 MB/s  average
Tape    10 TB              100 s    20    500 MB/s  good
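
To see how latency and speed combine, the sketch below estimates the time for one sequential read as latency + size / throughput, using the 2018 figures from the table above. The model and the names `DEVICES` and `access_time` are assumptions made for illustration, not part of the course material. It ignores caching, seek patterns and parallel access.

```python
# Figures copied from the "Typical numbers in 2018" table.
# name: (latency in seconds, throughput in bytes per second)
DEVICES = {
    "RAM":  (5e-9,  10e9),
    "SSD":  (10e-6, 550e6),
    "HD":   (3e-3,  150e6),
    "Tape": (100.0, 500e6),
}


def access_time(device, size_bytes):
    """Estimated time in seconds for one sequential read.

    Simplified model (an assumption): latency + size / throughput.
    """
    latency, throughput = DEVICES[device]
    return latency + size_bytes / throughput


if __name__ == "__main__":
    # Compare a small read (latency-dominated) with a large one
    # (throughput-dominated).
    for size, label in [(4e3, "4 kB"), (1e12, "1 TB")]:
        for name in DEVICES:
            seconds = access_time(name, size)
            print(f"{label:>5} from {name:<4}: {seconds:.6g} s")
```

Under this simple model, a small read is dominated by latency, so tape is by far the slowest device. For a 1 TB read the ordering changes: once positioned, tape streams faster than a single hard disk.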