How about Map+Reduce+Join+…?
• “Uniform” stages aren’t really uniform
Graph complexity composes
• Non-trees common
• E.g. data-dependent re-partitioning
  – Combine this with merge trees etc.
[Figure labels, in dataflow order: Randomly partitioned inputs → Sample to estimate histogram → Distribute to equal-sized ranges]
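
The three stages named in the figure can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the system the slides describe: the function names (`sample_partitions`, `estimate_boundaries`, `distribute`), the 5% sampling rate, and the Gaussian test keys are all assumptions made for the example.

```python
import bisect
import random

def sample_partitions(partitions, rate, rng):
    """Take a small random sample from every input partition."""
    sample = []
    for part in partitions:
        k = max(1, int(len(part) * rate))
        sample.extend(rng.sample(part, min(k, len(part))))
    return sample

def estimate_boundaries(sample, num_ranges):
    """Pick num_ranges - 1 split keys at evenly spaced quantiles of the sample."""
    ordered = sorted(sample)
    return [ordered[(i * len(ordered)) // num_ranges] for i in range(1, num_ranges)]

def distribute(partitions, boundaries):
    """Route every key to the range whose key interval contains it."""
    ranges = [[] for _ in range(len(boundaries) + 1)]
    for part in partitions:
        for key in part:
            ranges[bisect.bisect_right(boundaries, key)].append(key)
    return ranges

# Hypothetical input: skewed (Gaussian) keys spread over 4 random partitions.
rng = random.Random(0)
keys = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
rng.shuffle(keys)
partitions = [keys[i::4] for i in range(4)]

boundaries = estimate_boundaries(sample_partitions(partitions, 0.05, rng), 4)
ranges = distribute(partitions, boundaries)
print([len(r) for r in ranges])  # sizes should be roughly equal despite the skew
```

The split keys depend on the data, so the distribute stage cannot be wired up until the sampling stage has run. That dependence is what makes the resulting graph a non-tree.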
Scheduler state machine
• Scheduling is independent of semantics
  – Vertex can run anywhere once all its inputs are ready
    • Constraints/hints place it near its inputs
  – Fault tolerance
    • If A fails, run it again
    • If A’s inputs are gone, run upstream vertices again (recursively)
    • If A is slow, run another copy elsewhere and use output from whichever finishes first
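
The fault-tolerance rules above can be illustrated with a small in-process sketch. Everything here is hypothetical and simplified: `Vertex`, `ensure_output`, `flaky`, and `first_to_finish` are names invented for the example, failures are simulated with exceptions, and "another machine" is just another thread. It assumes vertices are deterministic, so re-running one or running two copies gives the same output.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

MISSING = object()  # marks an output that was never produced or has been lost

class Vertex:
    def __init__(self, name, fn, inputs=()):
        self.name, self.fn, self.inputs = name, fn, list(inputs)
        self.output = MISSING
        self.runs = 0

def ensure_output(vertex, max_attempts=3):
    """Return the vertex's output, re-running it (and, recursively, any
    upstream vertex whose output is missing) as needed."""
    if vertex.output is not MISSING:
        return vertex.output
    # A vertex may run only once all of its inputs are ready.
    args = [ensure_output(up, max_attempts) for up in vertex.inputs]
    for _ in range(max_attempts):
        vertex.runs += 1
        try:
            vertex.output = vertex.fn(*args)
            return vertex.output
        except RuntimeError:
            pass  # "If A fails, run it again"
    raise RuntimeError(f"vertex {vertex.name} failed {max_attempts} times")

def flaky(fn, failures):
    """Wrap fn so its first `failures` calls raise, simulating a crashed vertex."""
    remaining = [failures]
    def wrapped(*args):
        if remaining[0] > 0:
            remaining[0] -= 1
            raise RuntimeError("simulated vertex failure")
        return fn(*args)
    return wrapped

def first_to_finish(copies):
    """Run duplicate copies of one vertex and use whichever finishes first."""
    with ThreadPoolExecutor(max_workers=len(copies)) as pool:
        futures = [pool.submit(copy) for copy in copies]
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        # Leaving the `with` block still waits for the slow copy;
        # a real scheduler would kill it instead.
        return next(iter(done)).result()

def copy_on(machine, delay):
    def run():
        time.sleep(delay)
        return (machine, 42)
    return run

a = Vertex("a", lambda: [3, 1, 2])
b = Vertex("b", flaky(sorted, 1), [a])   # fails once, then succeeds
c = Vertex("c", lambda xs: xs[0], [b])

first = ensure_output(c)
runs_after_first = (a.runs, b.runs, c.runs)

# Simulate losing every stored output: asking for c re-runs the chain upstream.
a.output = b.output = c.output = MISSING
again = ensure_output(c)
runs_after_loss = (a.runs, b.runs, c.runs)

winner = first_to_finish([copy_on("slow machine", 0.3), copy_on("fast machine", 0.01)])
```

Nothing in `ensure_output` inspects what a vertex computes; it looks only at whether outputs exist. That is the sense in which scheduling is independent of semantics.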