clMSc5706TopicsinTheoreticalcomputerScience Week 3:Streaming and Sketching Instructor:Shengyu Zhang 1
Instructor: Shengyu Zhang 1
Map Motivations and model Problem 1:Missing numbers Problem 2:Count-Min sketch Lower bounds 0( Communication complexity 2
Map ◼ Motivations and model ◼ Problem 1: Missing numbers ◼ Problem 2: Count-Min sketch ◼ Lower bounds ❑ Communication complexity 2
Motivations Big mass of data. Data comes as a stream. Cannot see future data Relatively small space."sketch" Cannot store past data Need to process each item fast. Quick update time. Examples:Phone calls,Internet packets, satellite pictures,.. 3
Motivations ◼ Big mass of data. ◼ Data comes as a stream. ❑ Cannot see future data. ◼ Relatively small space. “sketch” ❑ Cannot store past data ◼ Need to process each item fast. ❑ Quick update time. ◼ Examples: Phone calls, Internet packets, satellite pictures, … 3
Problem 1:Missing numbers A set of numbers S={1,2,...,n} n-1 of them come in a stream x1,x2,...,one number is missing. 3,25,6,19,1,10,… Task:identify which one is missing. ▣Using small space
Problem 1: Missing numbers ◼ A set of numbers 𝑆 = {1,2, … , 𝑛} ◼ 𝑛 − 1 of them come in a stream 𝑥1, 𝑥2, …, 𝑥𝑛−1; one number is missing. ◼ Task: identify which one is missing. ❑ Using small space. 4 3, 25, 6, 19,1,10,…
A simple algorithm Maintain the sum of the input numbers. sum =0 ■fori=1ton-1 sumsum xi return n(n+1 2 .-sum 5
A simple algorithm ◼ Maintain the sum of the input numbers. ◼ 𝑠𝑢𝑚 = 0 ◼ for 𝑖 = 1 to 𝑛 − 1 𝑠𝑢𝑚 = 𝑠𝑢𝑚 + 𝑥𝑖 ◼ return 𝑛 𝑛+1 2 − 𝑠𝑢𝑚 5