Structuring data for efficient I/O
S. Ponce - CERN (slide 7 / 42)

Data structure by example - access

Find out overheated devices at a given time
- find the offset of that time in the file
- read 10K numbers
- apply simple filter

[Diagram: one seek to the offset, followed by one contiguous read]

Cost
- one seek
- one read of 10K ints

This is efficient!
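The three steps above (find the offset, read one row, filter) can be sketched in a few lines. This is only an illustration under assumptions not stated on the slide: readings are 4-byte little-endian ints, the file has no header, and rows are stored one per minute. The function name `overheated_devices` and the threshold value are hypothetical.

```python
import io
import struct

INT_SIZE = 4  # assumption: each reading is a 4-byte little-endian int


def overheated_devices(f, minute, n_devices, threshold):
    """Return the indices of devices whose reading at `minute` exceeds `threshold`."""
    f.seek(minute * n_devices * INT_SIZE)   # one seek to the start of the row
    raw = f.read(n_devices * INT_SIZE)      # one contiguous read of the whole row
    readings = struct.unpack(f"<{n_devices}i", raw)
    return [i for i, t in enumerate(readings) if t > threshold]


# Tiny in-memory stand-in for the file: 3 minutes, 4 devices, row-major.
rows = [[20, 21, 22, 23], [20, 95, 22, 99], [20, 21, 22, 23]]
demo = io.BytesIO(b"".join(struct.pack("<4i", *r) for r in rows))

print(overheated_devices(demo, minute=1, n_devices=4, threshold=90))  # [1, 3]
```

With 10K devices the read is 40K bytes in one piece, which is why the cost stays at one seek and one read.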
Data structure by example - access (2) (slide 8 / 42)

Graph the temperature evolution of a given device
- read 43.2K numbers from the file, one every 40K bytes
- graph them

[Diagram: seek, read, seek, read, seek, read, ...]

Cost
- 43.2K reads of 4 bytes and 43.2K seeks!
- on top of that, the typical block size in a filesystem is 8K, so you will probably read effectively 20% of the file!
- actually, reading the whole file will be more efficient

Here the structure of our data is a killer.
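The figures on this slide can be checked with a short sketch. It assumes the same hypothetical layout as before (4-byte little-endian ints, no header, one row of 10K devices per minute) and takes the 8K block size to mean 8192 bytes. The function name `device_history_row_major` is illustrative only.

```python
import io
import struct

INT_SIZE = 4            # assumption: 4-byte little-endian ints
N_DEVICES = 10_000      # "10K numbers" per row
N_MINUTES = 43_200      # "43.2K numbers" per device
BLOCK_SIZE = 8 * 1024   # assumption: "8K" block means 8192 bytes


def device_history_row_major(f, device, n_devices, n_minutes):
    """Read one device's readings from a row-major file: one seek + one tiny read per minute."""
    values = []
    for minute in range(n_minutes):
        f.seek((minute * n_devices + device) * INT_SIZE)
        (value,) = struct.unpack("<i", f.read(INT_SIZE))
        values.append(value)
    return values


# Cost estimate for the full-size file.
row_bytes = N_DEVICES * INT_SIZE        # stride between two readings of one device
file_bytes = row_bytes * N_MINUTES      # total file size
useful_bytes = N_MINUTES * INT_SIZE     # bytes we actually want
touched_bytes = N_MINUTES * BLOCK_SIZE  # each 4-byte read pulls in one whole block
fraction = touched_bytes / file_bytes

print(row_bytes)           # 40000
print(round(fraction, 4))  # 0.2048

# Tiny in-memory check of the access pattern: 3 minutes, 4 devices.
rows = [[20, 21, 22, 23], [20, 95, 22, 99], [20, 21, 22, 23]]
demo = io.BytesIO(b"".join(struct.pack("<4i", *r) for r in rows))
print(device_history_row_major(demo, device=1, n_devices=4, n_minutes=3))  # [21, 95, 21]
```

Because the stride (40,000 bytes) is larger than a block, every read lands in a different block, so roughly 8192 / 40,000, about 20% of the file, goes through the disk to deliver 4 useful bytes per block.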
Column storage (slide 9 / 42)

Time (mn)   Captor 1   Captor 2   ...   Captor c
0           a0         b0         ...   z0
1           a1         b1         ...   z1
...         ...        ...        ...   ...
n           an         bn         ...   zn

File content
a0 a1 ... an b0 b1 ... bn ... z0 z1 ... zn

Back to efficient read
[Diagram: one seek, followed by one contiguous read]
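A minimal sketch of the column layout shown above, again assuming 4-byte little-endian ints and no file header. The helper names `to_column_layout` and `device_history_column` are hypothetical and not taken from the slides.

```python
import io
import struct

INT_SIZE = 4  # assumption: each reading is a 4-byte little-endian int


def to_column_layout(rows):
    """Serialize a table so that all readings of device 0 come first, then device 1, and so on."""
    n_minutes = len(rows)
    columns = zip(*rows)  # transpose: one tuple per device
    return b"".join(struct.pack(f"<{n_minutes}i", *col) for col in columns)


def device_history_column(f, device, n_minutes):
    """Read one device's full history with one seek and one contiguous read."""
    f.seek(device * n_minutes * INT_SIZE)
    raw = f.read(n_minutes * INT_SIZE)
    return list(struct.unpack(f"<{n_minutes}i", raw))


# Tiny demo: 3 minutes, 4 devices (rows are minutes, columns are devices).
rows = [[20, 21, 22, 23], [20, 95, 22, 99], [20, 21, 22, 23]]
demo = io.BytesIO(to_column_layout(rows))

print(device_history_column(demo, device=1, n_minutes=3))  # [21, 95, 21]
```

Note that with this layout the earlier query (all devices at one given time) becomes the strided one, so the choice of layout depends on which access pattern dominates.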