Model monitoring with Seldon Alibi

In our previous blog, we looked at how to serve models with Seldon Core.

While Seldon makes it easy to deploy and serve a model in production, once it is deployed, how do we know if that model is doing the right thing?

Good performance during training doesn’t necessarily mean good performance after a few months of running in production. Things happen in the real world that we can’t account for, like the input data gradually drifting away from the training data, as well as outliers and biases.

In this blog we’re going to look at Seldon’s open source monitoring tool, Alibi Detect. This tool is designed to work in tandem with Seldon’s model serving capabilities, making it easy to detect performance degradation early.

This is the second of a 3-part series on Seldon, covering serving, monitoring, and finally, integration into ZenML. If you missed the first instalment, click here.

How does model monitoring work?

Typically when we talk about model monitoring, we’re interested in one or more of the following things:

Data drift: data in the real world starts to differ from the training data. The further it drifts, the more likely it is for the model to behave in unexpected ways.
Outliers: inputs that are outside the range we expect. In this case, the model may produce outputs that make no sense.
Bias: the model shows a bias towards certain predictions in a way that we didn't predict during training. Dealing with bias is a complex topic in its own right, but identifying it is the first step in resolving it.

We can use this kind of monitoring to make decisions, such as when to re-train or get new data.
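To make the idea of an outlier concrete, here is a minimal sketch of the simplest possible check: a z-score test on a single numeric feature. It is not how Alibi Detect works internally; the function names, the sample values, and the threshold of 3 standard deviations are all assumptions chosen for illustration.

```python
import statistics


def fit_reference(values):
    """Summarise the training data for one numeric feature."""
    return statistics.mean(values), statistics.stdev(values)


def is_outlier(x, mean, std, z_threshold=3.0):
    """Flag x if it lies more than z_threshold standard deviations from the mean."""
    if std == 0:
        return x != mean
    return abs(x - mean) / std > z_threshold


# Hypothetical training values for a single feature.
training_values = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
mean, std = fit_reference(training_values)

print(is_outlier(10.1, mean, std))  # close to the training data
print(is_outlier(25.0, mean, std))  # far outside the range we expect
```

Real detectors work on whole images or rows of data rather than one number, but the principle is the same: learn what "normal" looks like from the training data, then measure how far each new input is from it.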

You may already be familiar with monitoring in a more traditional setting such as for web applications, where the task is to catch errors and server crashes. For models, monitoring is a bit more complicated, primarily because every model is different: the monitoring system needs to be built around the particular models and training data that we want to work with.

About Alibi Detect

Like Seldon Core, Alibi Detect runs on top of Kubernetes. In order to monitor a model, you need to create something called a detector. Each model needs its own detector, deployed specifically for that model.

As with models, detectors run inside containers. Once again, Seldon takes advantage of Kubernetes’ strengths as a container orchestration engine, so your detectors will run with the same reliability as your models.

Out of the box Alibi can monitor for data drift and outliers, two of the most common things that can trip up your models.

It also supports something known as adversarial detection. The documentation around this feature is a bit obscure, so allow me to try a simple explanation: imagine that somebody with bad intentions is trying to make your model do something that you didn’t intend, in other words, trying to hack your model. They might craft inputs that look normal to a person but are subtly altered to trick the model into a wrong prediction. Adversarial detection is a technique for automatically spotting inputs like these as they arrive, which helps you to defend against these kinds of attack. Because it’s ordinarily difficult to predict how a model is going to behave, it’s also difficult to catch such attacks without adversarial detection.

All of the documentation for Alibi is available here. Now let’s see it in action.

Monitoring a model

There’s nothing too mystical about a detector: it’s really a special kind of model which, like any model, needs to be trained using some data. Alibi Detect includes a Python library to help you do that, along with very good documentation.

But often you don’t need to train a detector from scratch, because Alibi also comes with a bunch of pre-trained detectors for various use-cases, and these are the simplest way to get started. To illustrate how this works, let’s walk through an example, using an image classifier with a drift detector.

You can find all of the code for this example in the Seldon documentation. Rather than repeating the code here, I’ll just give you the outline.

The image classifier

The first thing we’ll deploy is an image classification model. This example uses a dataset called CIFAR-10, which is a collection of 60,000 images commonly used for demonstration purposes.

The model runs on Seldon’s pre-built Docker image for TensorFlow. To deploy it, we first create a Kubernetes manifest (the contents of which are on the above-mentioned Seldon docs page) containing all of the model’s configuration, and then run:

<pre><code>kubectl apply -f my_config.yaml
</code></pre>

Having done that, we have a model up and running. It’s a relatively straightforward model: whenever you show it an image, it will respond with one of the ten CIFAR-10 classes, like automobile, bird, airplane, or horse.
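As a rough sketch of what talking to the deployed classifier might look like, the snippet below builds a request body and decodes a response. It assumes the model is exposed through a TensorFlow Serving style REST endpoint (an `instances` list in, a `predictions` list out) and that each prediction is a list of ten class scores; both are assumptions to check against your own deployment. No request is actually sent, and the response shown is made up.

```python
import json

# The ten CIFAR-10 class names, in label order.
CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]


def build_request(images):
    """Wrap a batch of images (nested lists of pixel values) in a request body."""
    return json.dumps({"instances": images})


def decode_response(body):
    """Return the highest-scoring class name for each prediction in the response."""
    predictions = json.loads(body)["predictions"]
    return [CIFAR10_CLASSES[max(range(len(p)), key=p.__getitem__)]
            for p in predictions]


# A made-up response for one image, standing in for what the server would return.
fake_response = json.dumps(
    {"predictions": [[0.01, 0.02, 0.05, 0.02, 0.03, 0.02, 0.01, 0.80, 0.02, 0.02]]}
)
print(decode_response(fake_response))
```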

The drift detector

We can be reasonably confident that the model performs well on images that resemble the training images. For example, the training data has lots of pictures of animals: horses, deer, dogs, and so on. If most of those pictures are taken in good light, outdoors, in rural landscapes with lots of grass, then similar pictures shouldn’t present a problem.

On the other hand, a horse wearing a hat in a dimly lit swimming pool might throw the model off. And while a one-off thing like that isn’t a big deal, if there’s a large drift in what kind of image the model sees on average, then we may be in trouble.

To detect drift, we need to look at a large number of cases. The Seldon drift detector works by collecting a bunch of observations using a specified batch size. In this example, it’s configured to look for drift using the most recent 5000 observations.
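The batching idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Seldon’s implementation: it works on a single numeric feature, compares each full batch with the reference data using a two-sample Kolmogorov-Smirnov statistic, and uses a fixed threshold on that statistic where a real detector would more likely use a p-value. The class name, batch size, and threshold are all hypothetical.

```python
import bisect


def ks_statistic(reference, sample):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref, smp = sorted(reference), sorted(sample)
    gap = 0.0
    for v in ref + smp:
        cdf_ref = bisect.bisect_right(ref, v) / len(ref)
        cdf_smp = bisect.bisect_right(smp, v) / len(smp)
        gap = max(gap, abs(cdf_ref - cdf_smp))
    return gap


class BatchDriftDetector:
    """Buffers observations and tests each full batch against the reference data."""

    def __init__(self, reference, batch_size, threshold):
        self.reference = list(reference)
        self.batch_size = batch_size
        self.threshold = threshold
        self.buffer = []

    def observe(self, value):
        """Return None until a batch is full, then True (drift) or False (no drift)."""
        self.buffer.append(value)
        if len(self.buffer) < self.batch_size:
            return None
        batch, self.buffer = self.buffer, []
        return ks_statistic(self.reference, batch) > self.threshold


# A tiny batch size keeps the example readable; the example above uses 5000.
detector = BatchDriftDetector(reference=range(100), batch_size=5, threshold=0.5)
print([detector.observe(v) for v in [10, 30, 50, 70, 90]])       # resembles the reference
print([detector.observe(v) for v in [150, 160, 170, 180, 190]])  # far from the reference
```

The detector stays silent (returns `None`) until it has seen a full batch, which is why a single odd input never triggers a drift alert on its own.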

To set it up we need to do two things:

First, we deploy the drift detector. This is just as easy as deploying a model. Alibi comes with a Docker image, alibi-detect-server, from which we deploy the detector as a container on Seldon.

Second, we need to arrange it so that every time the model receives an input, the detector receives an event. This is how the drift detector is able to observe what the model is doing.

It’s worth knowing that Seldon itself doesn’t have the functionality to stream events. Instead, it relies on another open source project called Knative to do this. So to use Alibi, you’ll need to set up Knative alongside Seldon on your cluster too. With Knative installed, the final step is to create something called a trigger, which ensures that every time the model receives an inference request, the detector will get an event.
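For a feel of what a trigger involves, here is a sketch that builds a Knative Eventing `Trigger` manifest as a Python dictionary and prints it as JSON (which `kubectl apply -f` accepts as well as YAML). The trigger name, namespace, detector service name, and broker are hypothetical, and the event type string and the assumption that the detector runs as a Knative Service should both be checked against the Seldon documentation for your version.

```python
import json


def build_trigger(name, namespace, detector_service, broker="default"):
    """Build a Knative Trigger that forwards inference request events to a detector."""
    return {
        "apiVersion": "eventing.knative.dev/v1",
        "kind": "Trigger",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "broker": broker,
            # Assumed event type for Seldon inference requests; verify in the docs.
            "filter": {"attributes": {"type": "io.seldon.serving.inference.request"}},
            "subscriber": {
                "ref": {
                    "apiVersion": "serving.knative.dev/v1",
                    "kind": "Service",
                    "name": detector_service,
                    "namespace": namespace,
                }
            },
        },
    }


# Hypothetical names, for illustration only.
manifest = build_trigger("drift-trigger", "cifar10", "drift-detector")
print(json.dumps(manifest, indent=2))
```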

The setup steps for all of the above are included in the earlier-mentioned Seldon documentation page.

Getting metrics out of Alibi

Alibi Detect will log an alert to its application log whenever it detects something like drift or an outlier, and those logs can be accessed through Kubernetes using the kubectl logs command.

Logs on their own are of limited use. Typically, you’ll want to build dashboards and alerting systems on top of this. For instance, you might want to have some way to send notifications to the team if model drift has been detected.
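As a small sketch of the notification idea, suppose each detector result has already been parsed into a dictionary with an `is_drift` flag nested under `data`. That shape is an assumption (it mirrors what I’d expect from the alibi-detect Python library’s drift detectors, but check what your detector actually emits), and the `send` callable is a hypothetical stand-in for whatever delivers the message, such as a chat webhook or an email.

```python
def drift_alert(model_name, result):
    """Return an alert message if the result reports drift, otherwise None."""
    data = result.get("data", {})
    if not data.get("is_drift"):
        return None
    return (f"Drift detected for model {model_name!r}; "
            "consider re-training or collecting new data.")


def notify(results, model_name, send=print):
    """Send one message per drift alert and return how many were sent."""
    sent = 0
    for result in results:
        message = drift_alert(model_name, result)
        if message is not None:
            send(message)
            sent += 1
    return sent


# Two made-up detector results: one without drift, one with.
results = [{"data": {"is_drift": 0}}, {"data": {"is_drift": 1}}]
notify(results, "cifar10")
```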

The good news is that these things can all be done with standard tooling for alerting and monitoring. For instance you might opt to use whatever your cloud provider has, such as Google Cloud Monitoring.

However, building dashboards and doing alerting for models is a big enough topic to warrant a separate blog post, so we won’t get into all that stuff just now.

Is Alibi right for you?

To begin with, Alibi is a natural choice if you’re also using Seldon to serve models, because of how well integrated the two tools are.

It usually doesn’t make sense to use Alibi on its own, that is to say if your models are being served with something other than Seldon. That’s because to deploy detectors in the way described here, you’ll need to set up Kubernetes and Seldon anyway, even if you don’t intend to use Seldon for serving.

Alibi Detect is mature, with a wide set of features. We’ve only discussed images in the example above, but it works well across a lot of different types of data, including tabular data, images, time series, and text.

This amount of choice makes it readily applicable to the vast majority of machine learning applications. And while choosing the right algorithm for you does take a good amount of data science expertise, Seldon have thought carefully about making it easy to configure, and reconfigure, its detectors, so you can focus on the data science and the machine learning without messing around too much with infrastructure.

In the next instalment of this series, we'll combine Seldon with ZenML and show how it's possible to integrate Seldon deployment into your ML training pipelines.
