Real-Time Exactly-Once Event Processing with Apache Flink, Kafka, and Pinot // Jacob Tsafatinos // MLOps Coffee Sessions #97

MLOps.community - A podcast by Demetrios Brinkmann

MLOps Coffee Sessions #97 with Jacob Tsafatinos, Real-Time Exactly-Once Event Processing with Apache Flink, Kafka, and Pinot, co-hosted by Mihail Eric.

// Abstract
A few years ago, Uber set out to create an ads platform for the Uber Eats app that relied heavily on three pillars: speed, reliability, and accuracy. Some of the technical challenges they faced included exactly-once semantics in real time. To accomplish this goal, they created the architecture diagram above with lots of love from Flink, Kafka, Hive, and Pinot. You can dig into the whole paper (https://go.mlops.community/k8gzZd) to see all the reasoning behind their design decisions.

// Bio
Jacob Tsafatinos is a Staff Software Engineer at Elemy. He led the efforts of the Ad Events Processing system at Uber and has previously worked on a range of problems including data ingestion for search and machine learning recommendation pipelines. In his spare time, he can be found playing lead guitar in his band Good Kid.

// MLOps Jobs board
https://mlops.pallet.xyz/jobs

// Related Links
Uber blog: https://eng.uber.com/author/jacob-tsafatinos/
https://eng.uber.com/real-time-exactly-once-ad-event-processing/

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/
Connect with Jacob on LinkedIn: https://www.linkedin.com/in/jacobtsaf/

Timestamps:
[00:00] Introduction to Jacob Tsafatinos
[00:40] Takeaways
[04:25] Jacob's band
[05:29] Lyrics about software engineers or artistic stuff
[06:20] Connection of hobby and real-time system
[08:43] How to game Spotify Algorithm?
[10:00] Data stack for analytics
[13:28] Uber blog
[16:28] Video mess up
[17:04] Considerations and importance of the Uber System
[21:22] Challenges encountered through the Uber System journey
[26:06] Crucial to building the system
[28:13] Not exactly real-time
[30:22] Design decisions main questions
[34:23] Testament to OSS
[36:58] Real-time processing systems for analytical use cases vs Real-time processing systems for predictive use cases
[38:46] Real-time systems necessity
[41:04] Potential that opens up new doors
[41:40] Runaway or learn it?
[46:09] Real-time use case target
[49:31] Resource constrained
[50:48] ML Oops stories
[52:45] Wrap up
