1.1. ML lifecycle. What can go wrong with ML in production?

What can go wrong with data and machine learning services in production. Data quality issues, data drift, and concept drift.

Video 1, by Emeli Dral

Evaluations in ML model lifecycle

Building a successful ML model involves the following stages:

  • Data preparation,

  • Feature engineering,

  • Model training,

  • Model evaluation,

  • Model deployment.

You can perform different types of evaluations at each of these stages. For example:

  • During data preparation, exploratory data analysis (EDA) helps to understand the dataset and validate the problem statement.

  • At the experiment stage, cross-validation and holdout testing help validate that candidate models are actually useful and estimate how well they generalize to unseen data (see the sketch after this list).
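
Below is a minimal sketch of what these offline evaluations can look like in code, assuming a scikit-learn-style tabular setup with a synthetic dataset; the model, metric, and split sizes are placeholders, not the course's prescribed choices.

```python
# A minimal sketch of offline evaluation at the experiment stage.
# The dataset, model, and metric below are placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Holdout testing: keep a slice of data the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
print("Holdout accuracy:", accuracy_score(y_test, model.predict(X_test)))

# Cross-validation: repeat the train/test split to get a more stable estimate.
cv_scores = cross_val_score(RandomForestClassifier(random_state=42), X, y, cv=5)
print(f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```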

However, the work does not stop here! Once the best model is deployed to production and starts bringing business value, every erroneous prediction has a cost. It is crucial to ensure that the model functions stably and reliably. To do that, you must continuously monitor both the production ML model and the data it receives.

What can go wrong in production?

Many things can go wrong once you deploy an ML model to the real world. Here are some examples.

Training-serving skew. The model degrades when the data it encounters in production is very different from the data it was trained on.

Data quality issues. In most cases, when something is wrong with the model, it is due to data quality and integrity issues. These can be caused by (see the sketch after this list):

  • Data processing issues, e.g., broken pipelines or infrastructure updates.

  • Data schema changes in the upstream system, third-party APIs, or catalogs.

  • Data loss at source when dealing with broken sensors, logging errors, database outages, etc.
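
Many of these issues can be caught with simple programmatic checks on each incoming batch. Here is a minimal sketch, assuming a pandas DataFrame and a hypothetical expected schema; the column names and thresholds are illustrative only.

```python
# A minimal sketch of basic data quality checks on an incoming batch.
# The expected schema, column names, and thresholds are hypothetical.
import pandas as pd

EXPECTED_SCHEMA = {"user_id": "int64", "channel": "object", "amount": "float64"}

def check_batch(batch: pd.DataFrame) -> list:
    issues = []
    if batch.empty:
        issues.append("empty batch: possible data loss at source")
    # Schema changes in the upstream system: missing columns or changed types.
    for col, dtype in EXPECTED_SCHEMA.items():
        if col not in batch.columns:
            issues.append(f"missing column: {col}")
        elif str(batch[col].dtype) != dtype:
            issues.append(f"unexpected dtype for {col}: {batch[col].dtype}")
    # Broken pipelines often show up as spikes in missing values.
    for col in batch.columns.intersection(list(EXPECTED_SCHEMA)):
        null_share = batch[col].isna().mean()
        if null_share > 0.10:
            issues.append(f"{col}: {null_share:.0%} missing values")
    return issues
```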

Broken upstream model. Often, not one model but a chain of ML models operates in production. If one model gives wrong outputs, it can affect downstream models.

Concept drift. Gradual concept drift occurs when the target function continuously changes over time, leading to model degradation. If the change is sudden – like the recent pandemic – you’re dealing with sudden concept drift.
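
One common way to surface concept drift is to track model quality over time once ground-truth labels arrive. A minimal sketch, assuming a hypothetical log of predictions joined with labels:

```python
# A minimal sketch: track model quality over time to surface concept drift.
# The file path and column names (timestamp, label, prediction) are hypothetical.
import pandas as pd

log = pd.read_parquet("predictions_with_labels.parquet")
log["week"] = pd.to_datetime(log["timestamp"]).dt.to_period("W")
log["correct"] = log["label"] == log["prediction"]

# A slow, sustained decline suggests gradual concept drift;
# a sharp drop in a single period points to sudden drift.
weekly_accuracy = log.groupby("week")["correct"].mean()
print(weekly_accuracy)
```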

Data drift. Distribution changes in the input features may signal data drift and potentially cause ML model performance degradation. For example, a significant number of users coming from a new acquisition channel can negatively affect the model trained on user data. Chances are that users from different channels behave differently. To get back on track, the model needs to learn new patterns.
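
There are many ways to detect data drift. One simple approach is to compare the distribution of each input feature between a reference window and the current window with a statistical test. A minimal sketch using a two-sample Kolmogorov-Smirnov test (the dataframes, column set, and significance level are assumptions, not the course's prescribed method):

```python
# A minimal sketch of univariate data drift detection with a two-sample
# Kolmogorov-Smirnov test. `reference` and `current` are assumed to be
# pandas DataFrames with the same numeric feature columns.
import pandas as pd
from scipy.stats import ks_2samp

def drifted_features(reference: pd.DataFrame, current: pd.DataFrame, alpha: float = 0.05):
    drifted = []
    for col in reference.columns:
        _, p_value = ks_2samp(reference[col], current[col])
        if p_value < alpha:  # distributions differ significantly
            drifted.append((col, p_value))
    return drifted
```

Module 2 of the course covers data and prediction drift detection in more detail, including code practice.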

Underperforming segments. A model might perform differently on diverse data segments. It is crucial to monitor performance across all segments.
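
A minimal sketch of a per-segment quality check, assuming a scored dataset with hypothetical `segment`, `label`, and `prediction` columns:

```python
# A minimal sketch of checking model quality per data segment.
# The file path and column names are hypothetical.
import pandas as pd

df = pd.read_parquet("scored_data.parquet")
df["correct"] = df["label"] == df["prediction"]

# Compare each segment against the overall quality; a segment that lags far
# behind may need its own alerting threshold, more data, or a separate model.
print("overall accuracy:", df["correct"].mean())
print(df.groupby("segment")["correct"].mean().sort_values())
```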

Adversarial adaptation. In the era of neural networks, models might face adversarial attacks. Monitoring helps detect these issues in time.

Summing up

Many factors can impact the performance of an ML model in production. ML monitoring and observability are crucial to ensure that models perform as expected and provide value.
