
Apache Spark: Intro to Spark (Lightning-fast cluster computing)

A Brief History · Spark Deconstructed · Spark Essential · Simple Spark Demo · Spark SQL


Document contents (about 100 pages)

A Brief History: Benefits of Spark
• Speed: run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.
• Ease of use: write applications quickly in Java, Scala or Python.
• Generality: combine SQL, streaming, and complex analytics (Spark SQL, Spark Streaming, MLlib for machine learning, GraphX for graphs).

A Brief History: Key distinctions for Spark vs. MapReduce
• handles batch, interactive, and real-time within a single framework
• programming at a higher level of abstraction
• more general: map/reduce is just one set of supported constructs
• functional programming / ease of use ⇒ reduction in cost to maintain large apps
• lower overhead for starting jobs
• less expensive shuffles
…
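To make the "higher level of abstraction" and "functional programming" points concrete, here is a minimal word-count sketch in plain Python. It does not use Spark or any Spark API. The helpers `flat_map`, `reduce_by_key` and `word_count` are hypothetical names written for this illustration, and they only mimic the style of chaining small functional steps, of which map and reduce are two among several.

```python
from collections import defaultdict
from functools import reduce
from itertools import chain


def flat_map(fn, items):
    """Apply fn to each item and flatten the resulting iterables (hypothetical helper)."""
    return chain.from_iterable(map(fn, items))


def reduce_by_key(fn, pairs):
    """Group (key, value) pairs by key, then fold each group's values with fn."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(fn, values) for key, values in grouped.items()}


def word_count(lines):
    """Count words by chaining flat-map, map, and reduce-by-key steps."""
    words = flat_map(str.split, lines)            # split each line into words
    pairs = map(lambda w: (w.lower(), 1), words)  # emit (word, 1) pairs
    return reduce_by_key(lambda a, b: a + b, pairs)


# Illustrative input, not taken from the slides.
lines = ["Spark is fast", "spark is general"]
counts = word_count(lines)
```

The whole job is expressed as a short pipeline of small functions rather than as separate mapper and reducer classes, which is the kind of maintenance saving the bullets above point to.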

TL;DR: Smashing The Previous Petabyte Sort Record
databricks.com/blog/2014/11/05/spark-officially-sets-a-new-record-in-large-scale-sorting.html

Metric | Hadoop MR Record | Spark Record | Spark 1 PB
Data size | 102.5 TB | 100 TB | 1000 TB
Elapsed time | 72 mins | 23 mins | 234 mins
Nodes | 2100 | 206 | 190
Cores | 50400 physical | 6592 virtualized | 6080 virtualized
Cluster disk throughput | 3150 GB/s (est.) | 618 GB/s | 570 GB/s
Sort Benchmark Daytona Rules | Yes | Yes | No
Network | dedicated data center, 10Gbps | virtualized (EC2), 10Gbps network | virtualized (EC2), 10Gbps network
Sort rate | 1.42 TB/min | 4.27 TB/min | 4.27 TB/min
Sort rate/node | 0.67 GB/min | 20.7 GB/min | 22.5 GB/min
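The sort rates quoted for the record runs follow from dividing data size by elapsed time, and the per-node rates from dividing that result by the node count. The sketch below recomputes them from the slide's figures. The function names are hypothetical, and treating 1 TB as 1000 GB is an assumption, since the slide does not state which unit convention it uses.

```python
# Figures copied from the slide's comparison of the three sort runs.
# Assumption: decimal units (1 TB = 1000 GB); the slide does not say.
RUNS = {
    "Hadoop MR record": {"data_tb": 102.5, "minutes": 72, "nodes": 2100},
    "Spark record": {"data_tb": 100.0, "minutes": 23, "nodes": 206},
    "Spark 1 PB": {"data_tb": 1000.0, "minutes": 234, "nodes": 190},
}


def sort_rate_tb_per_min(data_tb, minutes):
    """Overall sort rate: data sorted divided by elapsed time."""
    return data_tb / minutes


def per_node_gb_per_min(rate_tb_per_min, nodes):
    """Per-node sort rate in GB/min, converting TB to GB with a factor of 1000."""
    return rate_tb_per_min * 1000 / nodes


def summarize(runs):
    """Return {run name: (TB/min, GB/min per node)} for every run."""
    summary = {}
    for name, run in runs.items():
        rate = sort_rate_tb_per_min(run["data_tb"], run["minutes"])
        summary[name] = (rate, per_node_gb_per_min(rate, run["nodes"]))
    return summary


if __name__ == "__main__":
    for name, (rate, node_rate) in summarize(RUNS).items():
        print(f"{name}: {rate:.2f} TB/min, {node_rate:.2f} GB/min per node")
```

Recomputing this way reproduces the slide's 1.42 TB/min for Hadoop MapReduce and 4.27 TB/min and 22.5 GB/min per node for the 1 PB Spark run. For the 100 TB Spark run it gives about 4.35 TB/min and 21.1 GB/min per node, slightly above the slide's 4.27 and 20.7, which would be consistent with the elapsed time having been rounded down to 23 minutes.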

TL;DR: Sustained Exponential Growth
Spark is one of the most active Apache projects
ohloh.net/orgs/apache
[Figure: number of contributors who made changes to the project source code each month, 2012 to 2014]

TL;DR: Spark Just Passed Hadoop in Popularity on Web
datanami.com/2014/11/21/spark-just-passed-hadoop-popularity-web-heres/
In October, Apache Spark (blue line) passed Apache Hadoop (red line) in popularity according to Google Trends.
[Figure: Google Trends chart, Apache Spark vs. Apache Hadoop, 2009 to 2013]


