Many ways to store data (S. Ponce, CERN)

Tape efficiency

[Figure: tape efficiency as a function of size (GB). Size axis marked at 10 MB, 1 GB, 200 GB and 1 TB; efficiency axis marked at 1%, 10% and 80%.]

No mount for less than 100 GB!
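One way to read this curve is that every mount costs a roughly fixed amount of time, so efficiency is the share of drive time spent actually moving data. The sketch below is a toy model only: the 120 s overhead and the 300 MB/s drive rate are illustrative assumptions, not figures from the slides.

```python
# Toy model of tape drive efficiency: the fraction of drive time spent
# transferring data. Both constants are illustrative assumptions, not
# measurements from the slides.
MOUNT_OVERHEAD_S = 120.0   # assumed: mount + positioning + unmount time
DRIVE_RATE_MB_S = 300.0    # assumed: sustained streaming rate


def tape_efficiency(size_mb: float) -> float:
    """Return transfer_time / (transfer_time + fixed overhead)."""
    transfer_s = size_mb / DRIVE_RATE_MB_S
    return transfer_s / (transfer_s + MOUNT_OVERHEAD_S)


if __name__ == "__main__":
    for label, size_mb in [("10 MB", 10), ("1 GB", 1_000),
                           ("200 GB", 200_000), ("1 TB", 1_000_000)]:
        print(f"{label:>7}: {tape_efficiency(size_mb):.2%}")
```

Under these assumed numbers, small transfers leave the drive idle almost all of the time, which is the kind of behaviour a minimum size per mount is meant to avoid.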
Hierarchical storage

Layers
  - tape as primary storage
  - disks used as a cache in front of the tape system
  - SSDs used as a cache in front of the disks (or inside them)

CERN's case (2018)
  - tape capacity: 330 PB
  - tape usage: 240 PB
  - disk raw capacity: 250 PB
  - disk usage: 182 PB for 91 PB of data
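The ratios implied by these figures can be worked out directly. The snippet below uses only the numbers quoted above; the variable names are chosen here for illustration.

```python
# Figures quoted on the slide (CERN, 2018), in PB.
tape_capacity, tape_used = 330, 240
disk_raw_capacity, disk_used, disk_data = 250, 182, 91

tape_fill = tape_used / tape_capacity        # fraction of tape in use
disk_fill = disk_used / disk_raw_capacity    # fraction of raw disk in use
disk_overhead = disk_used / disk_data        # raw PB consumed per PB of data

print(f"tape fill:     {tape_fill:.1%}")        # 72.7%
print(f"disk fill:     {disk_fill:.1%}")        # 72.8%
print(f"disk overhead: {disk_overhead:.1f}x")   # 2.0x
```

The disk layer consumes twice the volume of the data it holds. That would be consistent with keeping two copies of each file, although the slide does not state the cause.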
Some consequences

Disk cache management is needed
  - the disk cache needs garbage collection
  - different algorithms are used depending on usage
      - FIFO: First In, First Out
      - LRU: Least Recently Used

Users need to adapt their usage
  - data need to be prefetched from tape before access, preferably in bulk
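The difference between the two eviction policies can be shown with a toy cache. This is a minimal sketch, not the garbage collector of any real storage system: the `DiskCache` class and its method names are hypothetical, and the cache counts files rather than bytes.

```python
from collections import OrderedDict


class DiskCache:
    """Toy cache of file names with FIFO or LRU eviction (illustrative only)."""

    def __init__(self, capacity: int, policy: str = "LRU"):
        if policy not in ("FIFO", "LRU"):
            raise ValueError("policy must be 'FIFO' or 'LRU'")
        self.capacity = capacity
        self.policy = policy
        self.files = OrderedDict()  # next eviction candidate comes first

    def access(self, name: str) -> bool:
        """Access a file; return True on a cache hit, False on a miss."""
        if name in self.files:
            if self.policy == "LRU":
                # A hit refreshes the entry under LRU; FIFO ignores hits.
                self.files.move_to_end(name)
            return True
        # Miss: the file would be recalled from tape, then cached.
        if len(self.files) >= self.capacity:
            self.files.popitem(last=False)  # evict the entry at the front
        self.files[name] = None
        return False


if __name__ == "__main__":
    for policy in ("FIFO", "LRU"):
        cache = DiskCache(capacity=2, policy=policy)
        for name in ["a", "b", "a", "c"]:
            cache.access(name)
        print(policy, list(cache.files))
```

With the access sequence a, b, a, c and room for two files, FIFO evicts `a` (the oldest arrival) while LRU evicts `b` (the least recently used), so the two policies end up holding different files.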
Distributed storage

1 Storage devices
2 Distributed storage
    - Data distribution
    - Data federation
3 Parallelizing files' storage
4 Conclusion
Handling large distributed storage

Standard issues
  - failures are very common
  - data distribution is hard to balance
  - congestion is frequent