Bay Area Alumni Tech Salon (湾区同学技术沙龙)

(Bay Area) Large-scale data science and engineering with Spark (Reynold Xin)

1 March 2015

1:30PM ~ 4:00PM, 03/01/2015, Sunday

Registration

Registration link: tiny.cc/signup-20150301
Event link: Large-scale data science and engineering with Spark

Event Info

Language: Chinese
Time: 1:30PM ~ 4:00PM, 03/01/2015, Sunday
Location: 1601 McCarthy Boulevard, Milpitas, CA 95035 (TIPark Silicon Valley)

Agenda

1:30pm – 2:00pm: Reception and social time
2:00pm – 3:30pm: Talk and Q&A
3:30pm – 4:00pm: Offline networking

Abstract

Apache Spark has taken Big Data by storm, subsuming Hadoop MapReduce. In this talk, Reynold Xin from Databricks will give a quick introduction to Spark, with a focus on the latest development activities aimed at making large-scale data science and engineering more approachable. In particular, the following will be discussed:

Spark's basic programming API
The new DataFrame API for big data
Machine learning pipeline integration
Databricks Cloud

Speaker's bio

Reynold Xin is a committer and PMC member of Apache Spark and a co-founder of Databricks. He has been instrumental in the development of Spark as the maintainer of many of its components. He recently led an effort to scale up Spark and set a new world record in 100 TB sorting (Daytona GraySort). Before Databricks, he was pursuing a PhD at the UC Berkeley AMPLab. He wrote the two highest-cited papers at SIGMOD 2011 and SIGMOD 2013.

Organizer

Bay Area Alumni Tech Salon (湾区同学技术沙龙, tech-meetup.com)

Co-organizers

TIPark Silicon Valley (thanks to TIPark for sponsoring the venue)
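Nanjing University Silicon Valley Alumni Association (南京大学硅谷校友会)
Silicon Valley Tsinghua Network (硅谷清华联网)
University of Science and Technology of China Alumni Association Entrepreneurship Club (中国科技大学校友会创业俱乐部)
Zhejiang University Alumni Association Haina Innovation and Entrepreneurship Club (浙江大学校友会海纳创新创业俱乐部)
Peking University Alumni Association of Northern California (北京大学北加州校友会)
Wuhan University Alumni Association of Northern California (武汉大学北加州校友会)
Southeast University Silicon Valley Alumni Association (东南大学硅谷校友会)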