Steps in the query

    // Go through logs and keep only lines that are not comments.
    // Parse each line into a LogEntry object.
    var logentries =
        from line in logs
        where !line.StartsWith("#")
        select new LogEntry(line);

    // Go through logentries and keep only entries that are accesses by ulfar.
    var user =
        from access in logentries
        where access.user.EndsWith(@"\ulfar")
        select access;

    // Group ulfar’s accesses according to what page they correspond to.
    // For each page, count the occurrences.
    var accesses =
        from access in user
        group access by access.page into pages
        select new UserPageCount("ulfar", pages.Key, pages.Count());

    // Sort the pages ulfar has accessed according to access frequency.
    var htmAccesses =
        from access in accesses
        where access.page.EndsWith(".htm")
        orderby access.count descending
        select access;
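The query on this slide can be made runnable with a small amount of scaffolding. The sketch below is one possible version, not the speaker's code: the slides never show `LogEntry` or `UserPageCount`, so both class definitions are hypothetical, as are the `LogQuery.HtmAccessesByUlfar` wrapper and the assumed log line format of `<user> <page>` separated by a single space.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: the slides use LogEntry but do not define it.
public class LogEntry
{
    public readonly string user;
    public readonly string page;

    // Assumed line format (not from the slides): "<user> <page>".
    public LogEntry(string line)
    {
        string[] fields = line.Split(' ');
        user = fields[0];
        page = fields[1];
    }
}

// Hypothetical type: the slides use UserPageCount but do not define it.
public class UserPageCount
{
    public readonly string user;
    public readonly string page;
    public readonly int count;

    public UserPageCount(string user, string page, int count)
    {
        this.user = user;
        this.page = page;
        this.count = count;
    }
}

public static class LogQuery
{
    // Wraps the four query steps from the slide and forces evaluation.
    public static List<UserPageCount> HtmAccessesByUlfar(IEnumerable<string> logs)
    {
        var logentries = from line in logs
                         where !line.StartsWith("#")
                         select new LogEntry(line);

        var user = from access in logentries
                   where access.user.EndsWith(@"\ulfar")
                   select access;

        var accesses = from access in user
                       group access by access.page into pages
                       select new UserPageCount("ulfar", pages.Key, pages.Count());

        var htmAccesses = from access in accesses
                          where access.page.EndsWith(".htm")
                          orderby access.count descending
                          select access;

        // LINQ queries are lazy; ToList() runs the whole pipeline.
        return htmAccesses.ToList();
    }
}
```

Calling `LogQuery.HtmAccessesByUlfar(lines)` on an in-memory collection of strings runs all four steps. Nothing executes until `ToList()` is called, and that deferred evaluation is what leaves a runtime free to choose how the steps are carried out.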
Serial execution

    // For each line in logs, do…
    var logentries =
        from line in logs
        where !line.StartsWith("#")
        select new LogEntry(line);

    // For each entry in logentries, do…
    var user =
        from access in logentries
        where access.user.EndsWith(@"\ulfar")
        select access;

    // Sort entries in user by page. Then iterate over the sorted list,
    // counting the occurrences of each page as you go.
    var accesses =
        from access in user
        group access by access.page into pages
        select new UserPageCount("ulfar", pages.Key, pages.Count());

    // Re-sort entries in accesses by page frequency.
    var htmAccesses =
        from access in accesses
        where access.page.EndsWith(".htm")
        orderby access.count descending
        select access;
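The grouping and re-sorting steps of the serial plan can be written out as explicit loops. This is a minimal sketch of the sort-then-scan strategy the slide describes, not how LINQ itself implements `group by`. The `SerialPlan.CountBySortAndScan` name is hypothetical, and the input is assumed to be the list of pages from ulfar's accesses, already filtered.

```csharp
using System;
using System.Collections.Generic;

public static class SerialPlan
{
    // Counts occurrences of each page by sorting and scanning once,
    // then re-sorts the (page, count) pairs by descending count.
    public static List<KeyValuePair<string, int>> CountBySortAndScan(IEnumerable<string> pages)
    {
        // Step 1: sort by page so equal pages become adjacent.
        var sorted = new List<string>(pages);
        sorted.Sort(StringComparer.Ordinal);

        // Step 2: one pass over the sorted list, counting each run of equal pages.
        var counts = new List<KeyValuePair<string, int>>();
        int i = 0;
        while (i < sorted.Count)
        {
            int j = i;
            while (j < sorted.Count && sorted[j] == sorted[i]) j++;
            counts.Add(new KeyValuePair<string, int>(sorted[i], j - i));
            i = j;
        }

        // Step 3: re-sort by frequency, most accessed first.
        // List<T>.Sort is not stable, so the order among ties is unspecified.
        counts.Sort((a, b) => b.Value.CompareTo(a.Value));
        return counts;
    }
}
```

Both sorts touch the whole data set on one machine, which is the cost the parallel plan aims to spread out.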
Parallel execution

    var logentries =
        from line in logs
        where !line.StartsWith("#")
        select new LogEntry(line);

    var user =
        from access in logentries
        where access.user.EndsWith(@"\ulfar")
        select access;

    var accesses =
        from access in user
        group access by access.page into pages
        select new UserPageCount("ulfar", pages.Key, pages.Count());

    var htmAccesses =
        from access in accesses
        where access.page.EndsWith(".htm")
        orderby access.count descending
        select access;
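One way to picture a parallel plan for this query is to split the logs into partitions, count within each partition independently, and then merge the partial counts. The sketch below illustrates that shape on local threads using PLINQ. It is not Dryad's API and not the plan Dryad would necessarily generate. The `ParallelPlan` class, the partitioning into in-memory collections, and the `<user> <page>` line format are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParallelPlan
{
    // Per-partition work: skip comments, parse, keep ulfar's .htm accesses,
    // and count them locally. Applying the .htm filter before counting gives
    // the same result as filtering after grouping, since it only looks at the page.
    static Dictionary<string, int> CountPartition(IEnumerable<string> lines)
    {
        var counts = new Dictionary<string, int>();
        foreach (string line in lines)
        {
            if (line.StartsWith("#")) continue;
            string[] fields = line.Split(' ');   // assumed "<user> <page>" format
            string user = fields[0], page = fields[1];
            if (!user.EndsWith(@"\ulfar") || !page.EndsWith(".htm")) continue;
            counts.TryGetValue(page, out int n); // n is 0 when the page is new
            counts[page] = n + 1;
        }
        return counts;
    }

    public static List<KeyValuePair<string, int>> Run(IEnumerable<IEnumerable<string>> partitions)
    {
        // Partitions are independent, so they can be processed concurrently.
        List<Dictionary<string, int>> partials =
            partitions.AsParallel().Select(CountPartition).ToList();

        // Merge step: sum the partial counts for each page.
        var total = new Dictionary<string, int>();
        foreach (var partial in partials)
        {
            foreach (var kv in partial)
            {
                total.TryGetValue(kv.Key, out int n);
                total[kv.Key] = n + kv.Value;
            }
        }

        // Final sort by access frequency, most accessed first.
        return total.OrderByDescending(kv => kv.Value).ToList();
    }
}
```

The merge and final sort run on a single thread here. In a distributed setting those stages could themselves be partitioned, for example by hashing on the page name.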
How does Dryad fit in?
• Many programs can be represented as a distributed execution graph
  – The programmer may not have to know this
• “SQL-like” queries: LINQ
  – Spark (NSDI’12) utilizes the same idea
• Dryad will run them for you
Talk outline
• Computational model
• Dryad architecture
• Some case studies
• DryadLINQ overview
• Summary