湾区同学技术沙龙
(Bay Area) Cloud Data Platform: Data Orchestration and Platform Operations
15 September 2019
1:30PM ~ 4:00PM, 9/15/2019, Sunday
Registration
- Registration link: tech-meetup-9-15-2019.eventbrite.com/
- Event link: (Bay Area) Cloud Data Platform: Data Orchestration and Platform Operations
Join tech-meetup community:
- LinkedIn group: www.linkedin.com/groups/8362423
- WeChat group / Google group: tech-meetup.com/groups
Event Info
- Time: 1:30PM ~ 4:00PM, 9/15/2019, Sunday
- Location: 1st Floor Pitch Room, 4500 Great America Parkway, Santa Clara 95054 (ZGC Innovation Center)
- Language: Chinese
Agenda
- 1:30pm - 2:00pm: Reception and social time
- 2:00pm - 3:30pm: Talks + Q&A
- 3:30pm - 4:30pm: Offline networking
Talk1: How to Reduce Alerts by Two Orders of Magnitude: Exploring and Evolving Cloud Operations for the EA Data Platform
Abstract: Today, one can launch or terminate services with hundreds or thousands of compute instances in just a few seconds on cloud services such as AWS. However, operating, monitoring, and maintaining those resources can easily become a nightmare if the corresponding tooling is not designed in a cloud-native way.
Detail: In this talk, we share our lessons from building and rebuilding a cloud-native monitoring system to solve this problem at Electronic Arts (EA). In the first generation of the monitoring system, configurations were created manually for many individual software components and spread across all the resources. Because services were started and terminated rapidly, it was extremely difficult to keep all the configurations up to date. Consequently, we received on average over 1,000 alerts per day from thousands of machines, which put the operations team under stress. In late 2018 we redesigned the system in a project called Monitoring As Code (MAC), which emphasizes version control and automation. MAC manages all the configurations in a Git project, in the same way as software code. Moreover, it establishes standards so that the configurations are automatically generated and deployed, keeping everything in sync. As a result, the daily average number of alerts dropped by two orders of magnitude. A big data problem was reduced to a small data problem, improving human productivity and operational efficiency.
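The idea of generating monitoring configurations from shared standards can be illustrated with a minimal sketch. This is not EA's MAC implementation, which the abstract does not describe in detail. The service types, metric names, thresholds, and the `generate_configs` and `render` helpers are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical standards: every service of a given type gets the same alert rules.
# Metric names and thresholds are illustrative assumptions, not values from the talk.
ALERT_STANDARDS = {
    "kafka": [{"metric": "under_replicated_partitions", "op": ">", "threshold": 0}],
    "hdfs": [{"metric": "disk_used_percent", "op": ">", "threshold": 85}],
}


def generate_configs(inventory):
    """Derive one alert config per running service from the shared standards."""
    configs = {}
    for service in inventory:
        rules = ALERT_STANDARDS.get(service["type"], [])
        configs[service["name"]] = {
            "hosts": sorted(service["hosts"]),
            "rules": rules,
        }
    return configs


def render(configs):
    """Serialize deterministically so the output diffs cleanly under version control."""
    return json.dumps(configs, indent=2, sort_keys=True)


# Hypothetical inventory of currently running services.
inventory = [
    {"name": "events-kafka", "type": "kafka", "hosts": ["k2", "k1"]},
    {"name": "warehouse-hdfs", "type": "hdfs", "hosts": ["h1"]},
]
configs = generate_configs(inventory)
rendered = render(configs)
```

Because the rendered output depends only on the inventory and the standards, regenerating it whenever services start or stop keeps the configurations in sync without manual edits, and the result can be committed to a Git project like any other source file.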
Speaker Bio: The speaker, Du Li, is currently an Architect of Big Data Infrastructure at Electronic Arts. He earned his BS from Wuhan University, MS from Peking University, and PhD from UCLA. He worked in academia and industrial labs for many years. Prior to joining EA in mid-2018, he worked at Yahoo and Apple as a senior software engineer.
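Talk2: Building an Efficient Metadata Service That Stores 1 Billion Files with Raft, gRPC, RocksDB, and High-Concurrency Algorithms
Abstract: Alluxio, which originated at UC Berkeley's AMPLab, is an open source data orchestration system widely deployed in the cloud by industry leaders such as Baidu, Tencent, Huawei, Walmart, and Two Sigma. One of its design goals is to efficiently store and serve the metadata of all files and directories in very large file systems.
Detail: This talk covers the architecture, implementation, and optimization of the Alluxio metadata service (the master node) to address scalability challenges. In particular, we describe how several state-of-the-art engineering techniques and practices were designed and combined: tiered metadata storage built on RocksDB, an off-heap KV store; a fine-grained locking scheme for the file system inode tree; high availability across multiple master nodes based on the Raft distributed consensus protocol; and the community's exploration and use of gRPC. With these techniques, the Alluxio 2.0 metadata service can store at least 1 billion files and, while significantly reducing memory requirements, scale to 3,000 worker nodes and serve 30,000 clients.
Speaker Bio: Bin Fan (范斌) is a founding member of Alluxio, Inc. and a PMC member of the Alluxio open source project. Before joining the Alluxio team, he worked at Google on the research and development of next-generation large-scale distributed storage systems. He received his PhD in computer science from Carnegie Mellon University, where he published papers on distributed systems algorithms and systems implementation at top venues including SIGCOMM, SOSP, and NSDI, and designed and implemented CuckooFilter, MemC3, and libCuckoo.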
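The tiered metadata storage mentioned above can be pictured with a minimal sketch: a bounded in-memory tier for hot inodes in front of a larger key-value store. This is not Alluxio's implementation. The `TieredInodeStore` class is hypothetical, a plain dict stands in for RocksDB, and least-recently-used eviction is an assumed policy.

```python
from collections import OrderedDict


class TieredInodeStore:
    """Hypothetical two-tier inode store: a bounded in-memory tier over a KV store."""

    def __init__(self, cache_capacity, backing_store=None):
        self.cache_capacity = cache_capacity
        self.cache = OrderedDict()  # hot tier, ordered from least to most recently used
        # A dict stands in for an off-heap KV store such as RocksDB (assumption).
        self.backing = backing_store if backing_store is not None else {}

    def put(self, inode_id, metadata):
        """Write to the hot tier, spilling the coldest entries if over capacity."""
        self.cache[inode_id] = metadata
        self.cache.move_to_end(inode_id)
        self._evict_if_needed()

    def get(self, inode_id):
        """Read from the hot tier first, then fall back to the backing store."""
        if inode_id in self.cache:
            self.cache.move_to_end(inode_id)
            return self.cache[inode_id]
        metadata = self.backing.get(inode_id)
        if metadata is not None:
            self.put(inode_id, metadata)  # promote the entry to the hot tier
        return metadata

    def _evict_if_needed(self):
        """Move least-recently-used entries to the backing store."""
        while len(self.cache) > self.cache_capacity:
            old_id, old_metadata = self.cache.popitem(last=False)
            self.backing[old_id] = old_metadata


store = TieredInodeStore(cache_capacity=2)
for inode_id in (1, 2, 3):
    store.put(inode_id, {"name": f"file-{inode_id}"})
first = store.get(1)  # served from the backing store, then promoted
```

In this sketch, memory use is bounded by `cache_capacity` regardless of the total number of inodes, which is the general reason a tiered layout lowers memory requirements.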
Organizers
- 湾区同学技术沙龙(TechM)
- ZGC Innovation Center
Co-organizers
- 硅谷新创汇
- 南京大学湾区校友会
- 东南大学硅谷校友会
- 中国科大硅谷校友会
- 北加州清华校友会
- 硅谷清华联网
- 浙江大学校友会海纳创新创业俱乐部
- 北京大学北加州校友会
- 武汉大学北加州校友会
- 吉林大学硅谷校友会
- 复旦大学北加州校友会
- 华南理工大学美国校友会
- 北加州华中科技大学校友会
- 北京航空航天大学硅谷校友会
- 北京邮电大学北美校友会
- 上海交通大学硅谷校友会
- 兰州大学北加州校友会
- 电子科技大学硅谷校友会
- 安徽大学北美校友会
- 湖南大学北美校友会
- 湘潭大学北美校友会
- 哈工大硅谷校友会
- 中山大学海外校友联网
- 华人事业互助会
- 长城会 RobotX Space