First, an apology: due to some technical issues, the release of the second “Best of ML Engineered in 2020” episode is delayed until this weekend. Sorry about that! Until then, you can check out last week’s compilation episode of the best ML engineering highlights:
Click here to listen to the episode, or find it in your podcast player of choice: https://www.mlengineered.com/listen
On to this week’s newsletter!
Only very recently have I worked with models that needed to learn from streams of data instead of batches. This article by Max Halford, published in the middle of last year, was one of the most helpful resources I came across while getting up to speed on the topic.
With lots of code examples and explanatory graphs, he starts with standard cross-validation and then walks through two more evaluation methods specific to online learning. Read it here: https://maxhalford.github.io/blog/online-learning-evaluation/
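To give a flavor of what stream-specific evaluation can look like, here is a minimal sketch of one common approach, usually called progressive validation (or test-then-train): each example is scored before the model is allowed to learn from it. This is my own illustration rather than code from the article. The `RunningMean` model and the `progressive_validation` helper are hypothetical names, and the toy model ignores features entirely to keep the loop easy to read.

```python
from typing import Iterable


class RunningMean:
    """Toy online model (hypothetical): predicts the mean of the targets seen so far."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def predict(self) -> float:
        return self.mean

    def learn(self, y: float) -> None:
        # Incremental mean update, one observation at a time.
        self.n += 1
        self.mean += (y - self.mean) / self.n


def progressive_validation(stream: Iterable[float]) -> float:
    """Return the mean absolute error over the stream.

    Every prediction is scored before the model sees the matching
    label, so no example is both trained on and evaluated on.
    """
    model = RunningMean()
    total_error = 0.0
    count = 0
    for y in stream:
        y_pred = model.predict()        # 1. predict with the current model
        total_error += abs(y - y_pred)  # 2. score the prediction
        model.learn(y)                  # 3. only then update the model
        count += 1
    return total_error / count if count else 0.0


if __name__ == "__main__":
    print(progressive_validation([2.0, 4.0, 3.0, 5.0]))
```

Because the model only ever learns from examples it has already been scored on, the whole stream serves as both training and evaluation data without leaking labels.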
In last week’s newsletter, I featured Chip Huyen’s article on the rise of real-time machine learning in industry. This week, Eugene Yan published an excellent blog post drawing from his own experience working on real-time recommendation systems.
He goes over the system designs at five different companies (Alibaba, Tencent, YouTube, Instagram, and Netflix), and then shows a surprisingly simple implementation of an MVP.
I highly recommend you check it out, whether you’re interested in RecSys or ML systems at huge scale. Read it here: https://eugeneyan.com/writing/real-time-recommendations/
In addition to being a prolific blogger and ML engineer at Snorkel, Chip is also teaching a course at Stanford (!) this semester on ML system design. While the lectures and assignments are private, the notes and slides are available to all of us. The semester started this week and I will definitely be following along.
Check it out here: https://stanford-cs329s.github.io/syllabus.html
This is a massive blog post covering the research Google AI did in 2020. They managed to publish over 800 papers across a wide range of topics.
No matter what your interests are, I’m sure you can find something in here to dig into: https://ai.googleblog.com/2021/01/google-research-looking-back-at-2020.html
…according to Sandeep Uttamchandani, CDO and VP of Engineering at Unravel Data.
In this Medium post, he lists some of his battle scars from implementing ML products and features. Read it here: https://medium.com/wrong-ml/51-things-that-can-go-wrong-in-a-real-world-ml-project-c36678065a75