
Thread

9 tweets

2
Just to provide some examples, an ML model could be:
- a computer vision object detection system
- a fraud detection system
- a recommendation system
- a sales forecast
3
For each of these examples, you have widely different requirements when it comes to the amount of data, the type of data, the complexity of preprocessing, the complexity of the model, and how the model is deployed.
4
A sales forecast might be a classical time series model like ARMA that fetches a year of sales data from a database, trains the model, computes the forecasts, and stores them back in the database. The data isn’t very big, it doesn’t change often, and you don’t need realtime predictions.
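As a rough sketch of that batch pattern, assuming statsmodels for the ARMA fit and a local SQLite file; the table names, column names, and model order are made up for illustration:

```python
# Minimal sketch of the batch sales-forecast workflow: fetch from a db,
# fit a classical ARMA model (ARIMA with d=0), store results back.
import sqlite3

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

conn = sqlite3.connect("sales.db")

# Seed a dummy year of weekly sales so the sketch runs end-to-end;
# in reality this table would already exist in your database.
weeks = pd.date_range("2023-01-01", periods=52, freq="W")
revenue = 100 + np.arange(52) * 0.5 + np.random.default_rng(0).normal(0, 5, 52)
pd.DataFrame({"week": weeks, "revenue": revenue}).to_sql(
    "weekly_sales", conn, if_exists="replace", index=False
)

# 1. Fetch a year of sales data from the database.
sales = pd.read_sql(
    "SELECT week, revenue FROM weekly_sales ORDER BY week",
    conn, parse_dates=["week"], index_col="week",
)

# 2. Train the ARMA model and compute a 4-week forecast.
model = ARIMA(sales["revenue"], order=(2, 0, 1)).fit()
forecast = model.forecast(steps=4)

# 3. Store the results back in the database; no realtime serving needed.
forecast.rename("revenue").to_frame().to_sql("sales_forecast", conn, if_exists="replace")
conn.close()
```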
5
Object detection systems require a lot of hand-labelled data and are deep learning systems that need GPUs to train; the data is large, but it doesn’t change that much. You will probably want low-latency, realtime evaluation of your model (e.g. in autonomous cars).
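A minimal sketch of the latency-sensitive inference side, with a pretrained torchvision detector standing in for the custom-trained model (assumes torchvision >= 0.13; a production system would be far more optimized):

```python
# Measure single-frame detection latency on a dummy camera frame.
import time

import torch
import torchvision

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval().to(device)

# Dummy 640x480 RGB frame standing in for a camera image (values in [0, 1]).
frame = torch.rand(3, 480, 640, device=device)

with torch.no_grad():
    start = time.perf_counter()
    detections = model([frame])[0]  # dict with "boxes", "labels", "scores"
    if device == "cuda":
        torch.cuda.synchronize()
    latency_ms = (time.perf_counter() - start) * 1000

print(f"{len(detections['boxes'])} boxes in {latency_ms:.1f} ms")
```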
6
Recommendation systems typically need to process a large amount of click data with some Big Data style of system and need to be retrained often (daily) because the data changes. Predictions should be low latency, but in practice you can also precompute them and store them in a db.
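A minimal sketch of that precompute-and-store serving pattern; the nightly job would really run on a Big Data stack, so here a random score matrix and a SQLite table stand in for the trained recommender and the serving store:

```python
# Precompute top-k recommendations offline, store them for cheap lookups.
import sqlite3

import numpy as np

n_users, n_items, top_k = 1_000, 500, 10
rng = np.random.default_rng(0)

# Stand-in for model output: one relevance score per (user, item) pair.
scores = rng.random((n_users, n_items))

# Precompute each user's top-k items in the batch job ...
top_items = np.argsort(-scores, axis=1)[:, :top_k]

# ... and store them in a lookup table for low-latency serving.
conn = sqlite3.connect("recs.db")
conn.execute("CREATE TABLE IF NOT EXISTS recs (user_id INTEGER PRIMARY KEY, item_ids TEXT)")
conn.executemany(
    "INSERT OR REPLACE INTO recs VALUES (?, ?)",
    [(u, ",".join(map(str, items))) for u, items in enumerate(top_items)],
)
conn.commit()

# Serving is then a single indexed lookup instead of a live model call.
print(conn.execute("SELECT item_ids FROM recs WHERE user_id = 42").fetchone()[0])
conn.close()
```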
7
What’s peculiar about fraud detection is that it is quite hard to measure the quality of your system: after each prediction it might take weeks or months to know whether the bill has been paid or not. That poses real challenges for tracking model drift, etc.
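A minimal sketch of scoring a fraud model under delayed labels (the column names and the 0.5 threshold are made up for illustration): predictions are logged at scoring time and joined against outcomes whenever they finally arrive:

```python
# Join a prediction log with delayed outcomes and score only mature rows.
import pandas as pd

# Prediction log written at serving time (hypothetical columns).
preds = pd.DataFrame({
    "txn_id": [1, 2, 3, 4],
    "scored_at": pd.to_datetime(["2024-01-02", "2024-01-05", "2024-02-01", "2024-02-03"]),
    "fraud_score": [0.91, 0.12, 0.08, 0.77],
})

# Outcomes that trickle in weeks or months later; txn 4 is still unknown.
outcomes = pd.DataFrame({
    "txn_id": [1, 2, 3],
    "was_fraud": [True, False, False],
})

labelled = preds.merge(outcomes, on="txn_id", how="left")
mature = labelled.dropna(subset=["was_fraud"])

# Only 3 of the 4 predictions can be evaluated yet; this lagged accuracy
# is the kind of signal you'd watch over time for drift.
correct = (mature["fraud_score"] > 0.5) == mature["was_fraud"].astype(bool)
print(f"labelled so far: {len(mature)}/{len(preds)}, accuracy: {correct.mean():.2f}")
```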
8
I hope that illustrates how hard it is to build a one-size-fits-all MLOps product. I’d love to see products differentiate and target specific application areas, but currently most seem to promise to be general.
9
But yeah, I think what really helps, if you have a concrete application, is to figure out where you sit along these different dimensions and then pick the tools that fit your profile.