Introduction to Algorithms 6.046J/18.401J/SMA5503 Lecture 8 Prof charles e. leiserson
Introduction to Algorithms 6.046J/18.401J/SMA5503 Lecture 8 Prof. Charles E. Leiserson
A weakness of hashing Problem: For any hash function h, a set of keys exists that can cause the average access time of a hash table to skyrocket An adversary can pick all keys from kEU: h(k)=i for some slot i IDEA: Choose the hash function at random independently of the keys Even if an adversary can see your code, he or she cannot find a bad set of keys since he or she doesn ' t know exactly which hash function will be chosen o 2001 by Charles E Leiserson Introduction to Algorithms Day 12 L8.2
© 2001 by Charles E. Leiserson Introduction to Algorithms Day 12 L8.2 A weakness of hashing Problem: For any hash function h, a set of keys exists that can cause the average access time of a hash table to skyrocket. IDEA: Choose the hash function at random, independently of the keys. • Even if an adversary can see your code, he or she cannot find a bad set of keys, since he or she doesn’t know exactly which hash function will be chosen. • An adversary can pick all keys from {k ∈ U : h(k) = i} for some slot i
Universal hashing Definition. let u a universe of keys, and let be a finite collection of hash functions each mapping ( to (0,1,., m-1. We say 升 is universal if for allx,y∈ where x≠ we have{h∈升:h(x)=h(y)}=1H|m That is the chance of a collision th: h(x)=ho)) between x and y is 1/m, if we choose h randomly from H 升 o 2001 by Charles E Leiserson Introduction to Algorithms Day 12 L8.3
© 2001 by Charles E. Leiserson Introduction to Algorithms Day 12 L8.3 Universal hashing Definition. Let U be a universe of keys, and let H be a finite collection of hash functions, each mapping U to {0, 1, …, m–1}. We say H is universal if for all x, y ∈ U, where x ≠ y, we have |{h ∈ H : h(x) = h(y)}| = |H|/m. That is, the chance of a collision between x and y is 1/m if we choose h randomly from H. {h : h(x) = h(y)} H |H| m
Universality is good Theorem. let h be a hash function chosen uniformly ) at random from a universal set H of hash functions. Suppose h is used to hash n arbitrary keys into the m slots of a table T. Then, for a given key x, we have E#collisions with x <n/m o 2001 by Charles E Leiserson Introduction to Algorithms Day 12 L8. 4
© 2001 by Charles E. Leiserson Introduction to Algorithms Day 12 L8.4 Universality is good Theorem. Let h be a hash function chosen (uniformly) at random from a universal set H of hash functions. Suppose h is used to hash n arbitrary keys into the m slots of a table T. Then, for a given key x, we have E[#collisions with x] < n/m
Proof of theorem Proof. Let Cr be the random variable denoting the total number of collisions of keys in Twith x and let 1 if h(x)=h(v) 0 otherwise Note:Ecn]=1 /m and c=∑ Xy y∈7-{x} o 2001 by Charles E Leiserson Introduction to Algorithms Day 12 L8.5
© 2001 by Charles E. Leiserson Introduction to Algorithms Day 12 L8.5 Proof of theorem Proof. Let Cx be the random variable denoting the total number of collisions of keys in T with x, and let cxy = 1 if h(x) = h(y), 0 otherwise. Note: E[cxy] = 1/m and ∑ ∈ − = y T {x} x xy C c