How to Avoid Suffering in MLOps/Data Engineering Role // Igor Lushchyk // MLOps Meetup #55

MLOps.community - A podcast by Demetrios Brinkmann

MLOps community meetup #55! Last Wednesday we talked to Igor Lushchyk, Data Engineer at Adyen.

// Abstract:
Building Data Science and Machine Learning platforms at a scale-up. The main difficulty is finding the right processes, which is basically like being a toddler learning to walk on a steep staircase. The talk covers the transition from a homegrown platform to open-source solutions, supporting old solutions, and maturing them while keeping data scientists happy.

// Bio:
Igor is a software engineer with more than 10 years of experience. With a background in bioinformatics, he even started a PhD but didn't finish it. Igor has worked as a data engineer for the last 6 or 7 years, or maybe more, because he was doing almost the same data engineering work under a different job title. He has also been doing a lot of MLOps for the last 4-5 years, and he can't say which he has done more of, Data Engineering or MLOps. And that’s how this topic came about.

----------- Connect With Us ✌️-------------

Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Igor on LinkedIn: https://www.linkedin.com/in/igor-lushchyk/

Timestamps:
[00:00] Introduction to Igor Lushchyk
[02:05] Igor's background in tech
[07:42] Tips you can pass on
[11:05] How do these tools work, how do they play together, and what is underneath?
[13:18] Dedicated MLOps team
[13:55] Central Data Infrastructure Section
[16:57] Transfer over to open-source
[20:24] If you don't plan for production from the beginning, then it's going to be painful trying to go from POC to production.
[22:08] How do you handle data lineage?
[25:09] You chose that back in the day but you're regretting it.
[26:34] "Try to use tools which solve 80% of your use cases and maybe 20% you'll have the suffering but at least it's not 100% suffering."
[27:27] Friction points
[28:53] Interaction with Data Scientists
[29:21] "We have alignment sessions. We have different levels of representations. We share our progress."
[32:42] Build versus buy decisions
[34:04] When to build or grab an open-source tool
[35:51] Build your own or buy open-source?
[37:11] Certain maturity and a certain number of engineers
[38:11] Startup to go with open-source
[40:14] Correct transition process
[40:56] "There are no other ways but to communicate with data scientists. Your team needs to have a close loop for future priorities, what to take with you and what to leave behind."
[44:51] What to use for the monitoring piece
[45:36] Prometheus and Grafana
[48:07] Do you have automatic retriggering set up from model monitoring?
[51:55] Hardware for on-prem model training
[52:38] "Machine Learning model prediction is a spear bomb."
[53:55] War or horror stories
[54:15] "Guys, don't do context switching!"
[55:54] "I won't say that Adyen is a company that allows you to make mistakes but you can make mistakes."
