Structuring data for efficient I/O

Sébastien Ponce
sebastien.ponce@cern.ch
CERN

Thematic CERN School of Computing 2015
Overall Course Structure

Structuring Data for efficient I/O
  - Data formats, data compression
  - Data addressing
Many ways to Store Data
  - Storage devices and their specificities
  - Distributing and parallelizing storage
Preserving data
  - Data consistency
  - Data safety
Key ingredients to achieve efficient I/O
  - Synchronous vs asynchronous I/O
  - I/O optimizations and caching
Outline

1 Data format
  - Row vs Column
2 Compressing data
  - Compression algorithms
  - Efficiency and use cases
3 Data addressing
  - Hierarchical namespaces
  - Limitations
  - Flat namespaces
4 Stateful interfaces
  - POSIX
  - Limitations
  - Stateless interfaces
5 Conclusion
Data format

1 Data format
  - Row vs Column
2 Compressing data
3 Data addressing
4 Stateful interfaces
5 Conclusion
Data structure by example - scenario

Scenario
  - You are measuring temperatures within a piece of a detector
  - You have 10K sensors and take one measurement every minute
  - After a 30-day month, you have 432M measurements
  - That is about 1.6 GiB (1.7 GB) if you store them as single-precision floats (32 bits)
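
The figures in this scenario can be checked with a few lines of arithmetic. The sketch below is a minimal check, assuming a 30-day month and 4 bytes per single-precision float; neither assumption is stated on the slide, and the variable names are illustrative only.

```python
# Back-of-the-envelope check of the scenario figures.
# Assumptions (not stated on the slide): a 30-day month, 4 bytes per float32.
SENSORS = 10_000
MINUTES_PER_DAY = 60 * 24
DAYS = 30
BYTES_PER_FLOAT32 = 4

# One measurement per sensor per minute over the whole month.
measurements = SENSORS * MINUTES_PER_DAY * DAYS
total_bytes = measurements * BYTES_PER_FLOAT32

print(f"{measurements / 1e6:.0f}M measurements")
print(f"{total_bytes / 1e9:.2f} GB (decimal), {total_bytes / 2**30:.2f} GiB (binary)")
```

Under these assumptions the count comes out at 432M measurements, and the 1.6 figure quoted for the data volume corresponds to binary units (GiB); in decimal units the same data is roughly 1.7 GB.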