Looks like you missed last week’s ML Engineered newsletter, so I wanted to send it again in case it got lost in your inbox.
Have a great week!
In this week’s episode, I interviewed Luigi Patruno, the man behind ML in Production, my favorite blog on the topic of building machine learning systems for the real world.
He discusses best practices for putting ML into production, how to make sure your data science efforts are actually adding business value, and what the future of building software might be (“Code 2.0”).
I wrote out the best quotes and takeaways from the episode in this Twitter thread. Check it out and like/re-tweet if you found it helpful!
Click here to listen to the episode, or find it in your podcast player of choice: https://www.mlengineered.com/listen
People online like to argue about how many years ML tooling is behind traditional software. But whether we’re 5 or 15 years behind, everyone can agree that we’ve got a looooong way to go.
Which is why I’m so excited whenever I see a new tool come out that’s targeted specifically at people actually using ML in the real world, especially when it’s open source!
So today I’m highlighting two of the most recent releases I’ve seen, both of which I’ll be trying out when the use case comes up.
I’ve tried using various experiment tracking tools before (Comet, DVC), but came to the same conclusion as Ben and Andreas: they were too heavyweight and inflexible. It’s great to see that instead of accepting that experiments will always be tracked in a spreadsheet (guilty!), they decided to do something about it.
They have the basic functionality working already and are building the rest of it with the community’s input. Ben ran community meetings when working at Docker and has started to do the same here. They have an open Discord and are very responsive to feedback!
Check it out here: https://replicate.ai/
One of the recurring themes on the podcast and in this newsletter is the need to monitor both data and models in an ML pipeline. Evidently's first release deals with the former, tackling the issue of knowing when your production data has drifted away from your training data.
Their tool takes two pandas dataframes as input (reference and test) and produces an interactive report, either as a notebook cell or as a standalone HTML page. If you deal with numerical or categorical features, this is bound to be extremely useful!
Check out their release blog post here: https://evidentlyai.com/blog/evidently-001-open-source-tool-to-analyze-data-drift
They've also written an excellent series of articles on ML monitoring here: https://evidentlyai.com/blog/machine-learning-monitoring-what-it-is-and-how-it-differs
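If you're wondering what "comparing a reference set against a test set" looks like in practice, here's a rough sketch of the idea. To be clear, this is not Evidently's API or necessarily the method they use: it's a stand-in that computes a two-sample Kolmogorov-Smirnov statistic for each numerical feature using only the Python standard library. Plain dicts of lists stand in for the two dataframes, and the function names and the 0.3 threshold are my own placeholders.

```python
from bisect import bisect_right


def ks_statistic(reference, test):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(test)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those is enough.
    for x in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def drift_report(reference, test, threshold=0.3):
    """Compare every feature present in both datasets.

    `reference` and `test` map a feature name to a list of numbers.
    `threshold` is an arbitrary cut-off chosen for illustration only.
    """
    report = {}
    for name in reference:
        if name in test:
            stat = ks_statistic(reference[name], test[name])
            report[name] = {"ks": stat, "drifted": stat > threshold}
    return report


# Toy data: "age" looks the same in both sets, "income" has shifted entirely.
training = {"age": [20, 25, 30, 35, 40], "income": [1, 2, 3, 4, 5]}
production = {"age": [21, 24, 31, 36, 39], "income": [6, 7, 8, 9, 10]}

for feature, result in drift_report(training, production).items():
    print(feature, result)
```

A real tool would handle categorical features, use a proper significance test instead of a fixed threshold, and render the results visually, which is exactly the work Evidently saves you.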
Goku Mohandas released Made With ML and it quickly made a splash in the community, with over 20k people signing up within months. Since then, he has made the difficult decision to pivot away from the project-sharing platform it started as.
The first project he’s working on is a free online course, “Applied ML in Production”.
There’s a dearth of online courses for practical ML, especially from people who’ve done it before, with Full Stack Deep Learning being the only exception. This pivot certainly gets my 👍!
Goku’s released the videos for the first two sections and they’re phenomenally useful. Check it out here: https://madewithml.com/courses/applied-ml-in-production/