FlowCon 2013 San Francisco, November 1

Presentation: "Cloud Operations at Netflix: Optimizing Innovation Speed While Supporting Availability."

Time: Friday 13:30 - 14:00 / Location: Robertson 1

This talk is about the role CORE (Cloud Operations and Reliability Engineering) plays at Netflix and its support of Netflix's core software delivery goals (largely focused on maximizing speed of innovation).
In most organizations, groups like CORE end up either playing a gatekeeping role for changes to production (slowing down innovation and conflicting with continuous delivery approaches) or taking a manual, runbook-oriented approach to problem solving (which externalizes the costs of, for example, not having an automated deployment pipeline, not to mention the risk of rolling out changes of insufficient quality). At Netflix, CORE is a high-value enabling group focused on making developers faster and on both minimizing and resolving production outages.

Roy Rapoport, Manager, Monitoring Engineering at Netflix

Biography: Roy Rapoport

Roy Rapoport manages the Monitoring Engineering group at Netflix, which is responsible for building Netflix's internal cloud telemetry and alerting systems. He originally joined Netflix as part of its datacenter-based IT/Ops group and, before transferring to Product Engineering, managed Service Delivery for IT/Ops. He provided input into the formation of the Cloud Operations and Reliability Engineering (CORE) group at Netflix and continues to play an advisory role to the group and its members. He also built the majority of the Python infrastructure libraries that allow developers at Netflix to access cloud systems.

Roy has been in tech for about 20 years with positions in IT engineering and operations, software development, and software quality engineering, but his first loves were operations and automation.

Twitter: @royrapoport