Metaflow: Supercharging Our Data Scientist Productivity // Ravi Kiran Chirravuri // MLOps Meetup #41
MLOps.community - A podcast by Demetrios Brinkmann
MLOps community meetup #41! Last Wednesday's episode was so exciting that some attendees couldn't help asking when the next season of their favorite series would arrive! The conversation was around Metaflow: Supercharging Data Scientist Productivity with none other than Netflix’s very own Ravi Kiran Chirravuri.

// Abstract: Netflix's unique culture affords its data scientists an extraordinary amount of freedom. They are expected to build, deploy, and operate large machine learning workflows autonomously without the need to be significantly experienced with systems or data engineering. Metaflow, our ML framework (now open-source at metaflow.org), provides them with delightful abstractions to manage their project's lifecycle end-to-end, leveraging the strengths of the cloud: elastic compute and high-throughput storage. In this talk, we preface with our experience working alongside data scientists, present our human-centric design principles when building Machine Learning Infrastructure, and showcase how you can adopt these yourself with ease with open-source Metaflow.

// Bio: Ravi is an individual contributor to the Machine Learning Infrastructure (MLI) team at Netflix. With almost a decade of industry experience, he has been building large-scale systems focusing on performance, simplified user journeys, and intuitive APIs in MLI and, previously, in Search Indexing and TensorFlow at Google.

----------- Connect With Us ✌️-------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Ravi on LinkedIn: https://www.linkedin.com/in/seeravikiran/

Timestamps:
[00:00] - Introduction to Ravi Kiran Chirravuri
[02:21] - Ravi's background
[05:19] - Metaflow: Supercharging Data Scientist Productivity
[05:31] - Why do we have to build Metaflow?
[06:14] - Infographic of a very simplified view of a machine learning workflow
[07:01] - "An idea is typically meaningless without execution."
[07:38] - Scheduling
[08:14] - Life is great!
[08:24] - Life happens and things are crashing and burning!
[09:04] - What is Metaflow?
[12:01] - How much data scientists care
[12:25] - How infrastructure is needed
[13:03] - What Metaflow does
[13:44] - How can you go about using Metaflow for your data science needs?
[14:20] - People love DAGs
[16:00] - Baseline
[16:16] - Architecture
[17:28] - Syntax
[19:00] - Vertical Scalability
[21:10] - Horizontal Scalability
[22:59] - Failures are a feature
[23:57] - State Transfer and Persistence
[27:05] - Dependencies
[30:57] - Model Ops: Versioning
[33:19] - Monitoring in Notebooks
[35:16] - Decouple Orchestration
[36:48] - AWS Step Functions
[37:16] - Export to AWS Step Functions
[38:10] - From Prototype to Production and Back
[42:07] - What are the prerequisites to use Metaflow?
[43:32] - Where does Metaflow store everything?
[45:10] - Are there any tutorials available?
[45:22] - Have the tutorials been updated?
[47:27] - How do you deploy Metaflow?
[49:02] - Do you see Metaflow becoming a tool to develop and support AutoML?
[50:34] - What were some of the biggest learnings that you saw people doing that they're not doing at Netflix?
[52:19] - Does Metaflow exist to help data scientists orchestrate everything?
[54:30] - What do you version?