Preserving data (S. Ponce, CERN)

Most basic checksum: data size

Computation
  Data: a1 a2 a3 a4 ... ai ... an (n blocks of b = 8 bits each)
  Checksum width: w = 64 bits
  $CS = n$

Pros and cons
  Pros:
    - easy to compute
    - detects erasures and additions
  Cons:
    - does not detect any corruption
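A minimal sketch of the size-only checksum may make the trade-off concrete. The function name `size_checksum` and the sample byte strings are assumptions made for illustration; they do not come from the slides.

```python
def size_checksum(data: bytes) -> int:
    """Return the data size n, truncated to a w = 64 bit field."""
    return len(data) & 0xFFFFFFFFFFFFFFFF


original = b"preserving data"
truncated = original[:-1]          # one byte erased
corrupted = b"Preserving data"     # same length, first byte changed

# An erasure changes the size, so the checksum differs.
print("erasure detected:   ", size_checksum(original) != size_checksum(truncated))
# A corruption keeps the size, so the checksum is identical.
print("corruption detected:", size_checksum(original) != size_checksum(corrupted))
```

The second line of output shows the weakness named on the slide: any change that preserves the length goes unnoticed.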
Basic checksum: sum/xor

Computation
  Data: a1 a2 a3 a4 ... ai ... an (n blocks of b bits each)
  Checksum width: w = b
  $CS = \sum_{i=1}^{n} a_i$ (sum or XOR of all blocks)

Pros and cons
  Pros:
    - easy to compute
    - detects most corruptions
  Cons:
    - does not detect any inversions/change of order
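The sum and XOR variants can be sketched as follows. The function names and the choice of b = w = 8 bits are assumptions for illustration; the sum is reduced modulo 2^w so that it fits the checksum width.

```python
from functools import reduce


def sum_checksum(data: bytes, w: int = 8) -> int:
    """Sum of all blocks, reduced modulo 2**w to fit the checksum width."""
    return sum(data) % (1 << w)


def xor_checksum(data: bytes) -> int:
    """XOR of all blocks; the result always fits in one block."""
    return reduce(lambda acc, block: acc ^ block, data, 0)


original = b"abcd"
corrupted = b"abce"   # last block changed
swapped = b"abdc"     # last two blocks exchanged

for name, fn in (("sum", sum_checksum), ("xor", xor_checksum)):
    print(name, "detects corruption:", fn(original) != fn(corrupted))
    print(name, "detects reordering:", fn(original) != fn(swapped))
```

Both addition and XOR are commutative, which is why exchanging two blocks leaves the checksum unchanged.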
Adler-like checksums

Computation
  Data: a1 a2 a3 a4 ... ai ... an (n blocks of b = 8 bits each)
  Checksum width: w = 32 bits
  $CS_{high} = \sum_{i=1}^{n} a_i$
  $CS_{low} = \sum_{i=1}^{n} i \, a_i$

Pros and cons
  Pros:
    - easy to compute
    - detects most corruptions and inversions
  Cons:
    - weak for small files
    - easy to fake in case of intentional corruption
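A sketch of the two-sum scheme as written on the slide, with the plain sum in the high half and the position-weighted sum in the low half. The name `adler_like` and the plain 16-bit wraparound are assumptions for illustration; the real Adler-32 differs in detail (it starts the byte sum at 1 and reduces both sums modulo 65521).

```python
def adler_like(data: bytes) -> int:
    """Pack a plain sum and a position-weighted sum into a 32-bit value."""
    mask = 0xFFFF  # each half gets 16 of the w = 32 bits (assumed wraparound)
    cs_high = sum(data) & mask
    cs_low = sum(i * a for i, a in enumerate(data, start=1)) & mask
    return (cs_high << 16) | cs_low


original = b"abcd"
swapped = b"abdc"  # last two blocks exchanged

print(f"{adler_like(original):#010x}")
print(f"{adler_like(swapped):#010x}")
print("detects reordering:", adler_like(original) != adler_like(swapped))
```

The plain sums of the two inputs are equal, but the weighted sum depends on each block's position, so the exchange is detected.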
(Crypt)analysis of Adler

Weaknesses
  - 32 bits is short
      one corruption in about 4 billion will go through undetected
  - it is actually worse for small files
      not all bits of the sum are even used for files of less than 256 bytes
  - the checksum can be easily bypassed
      one can easily change the last 16 bytes and reach any checksum
  - so intentional corruptions are not covered
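The small-file weakness can be observed with the Adler-32 implementation in Python's standard `zlib` module. The helper name `adler32_halves` is an assumption for illustration. Note that zlib's Adler-32 keeps the byte sum in the low 16 bits and the running sum of sums in the high 16 bits. An all-0xFF input gives the largest possible byte sum for a given length, so the bit counts printed are upper bounds for that length.

```python
import zlib


def adler32_halves(data: bytes) -> tuple[int, int]:
    """Split zlib's Adler-32 value into its (high, low) 16-bit halves."""
    value = zlib.adler32(data)
    return value >> 16, value & 0xFFFF


for n in (1, 4, 16):
    high, low = adler32_halves(b"\xff" * n)
    print(f"n = {n:2d}: low half uses {low.bit_length():2d} bits, "
          f"high half uses {high.bit_length():2d} bits")
```

For very short inputs the upper bits of both halves stay at zero, so far fewer than 2^32 distinct checksum values are reachable.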
Cryptographic checksums

What is it?
  - checksums that cannot be faked (easily)
  - they are based on non-reversible cryptographic functions

Most used ones
  md5     1991, 128 bits, by Rivest. Not considered secure anymore, as complete collisions have been discovered.
  sha1    1995, 160 bits, by NSA. Collision in $2^{61}$ operations
  sha256  2001, 256 bits, by NSA. Collision in $2^{128}$ operations
  sha512  2001, 512 bits, by NSA. Collision in $2^{256}$ operations

Drawback
  - more costly to compute
  - although modern processors have dedicated instructions
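The four functions listed above are available through Python's standard `hashlib` module. The sketch below prints each digest width and value; the helper name `digest_bits` and the sample input are assumptions for illustration.

```python
import hashlib


def digest_bits(name: str) -> int:
    """Return the digest width in bits for a hashlib algorithm name."""
    return hashlib.new(name).digest_size * 8


data = b"preserving data"
for name in ("md5", "sha1", "sha256", "sha512"):
    digest = hashlib.new(name, data).hexdigest()
    print(f"{name:7s} {digest_bits(name):4d} bits  {digest}")
```

Some hardened Python builds disable md5, in which case `hashlib.new("md5")` raises a `ValueError`.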