5.3. Test input data quality, stability and drift [CODE PRACTICE]

A code example walkthrough of running test suites for data quality, data stability, and data drift on raw and pre-processed data.

Video 3. Test input data quality, stability and drift [CODE PRACTICE], by Emeli Dral

In this video, we run test suites for data quality, data stability, and data drift on raw and pre-processed data. We also export the results as JSON or a Python dictionary to show how to add conditional checks to a prediction pipeline.
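As a rough illustration of the conditional-check idea, the sketch below gates a prediction step on a test suite result that has already been exported as a Python dictionary. It does not call any library: `sample_result` is a hand-written stand-in, and its keys (`summary`, `all_passed`, `tests`, `name`, `status`), the test names, and the helpers `failed_test_names`, `can_run_predictions`, and `run_batch` are all assumptions made for this example. The actual structure of the exported dictionary may differ between library versions, so check it against the notebook before reusing this.

```python
# Hand-written stand-in for a test suite result exported as a dictionary.
# The keys and test names here are assumptions for illustration only;
# inspect the real output in the notebook before relying on them.
sample_result = {
    "summary": {"all_passed": False, "total_tests": 3},
    "tests": [
        {"name": "Share of Missing Values", "status": "SUCCESS"},
        {"name": "Number of Columns", "status": "SUCCESS"},
        {"name": "Share of Drifted Columns", "status": "FAIL"},
    ],
}


def failed_test_names(result: dict) -> list:
    """Return the names of all tests whose status is not SUCCESS."""
    return [
        test["name"]
        for test in result.get("tests", [])
        if test.get("status") != "SUCCESS"
    ]


def can_run_predictions(result: dict) -> bool:
    """Allow predictions only when the summary reports that all tests passed."""
    return bool(result.get("summary", {}).get("all_passed", False))


def run_batch(result: dict) -> str:
    """Decide what the pipeline should do with the current batch."""
    if can_run_predictions(result):
        return "predict"
    # Hypothetical fallback: skip scoring and report which checks failed.
    return "skip: " + ", ".join(failed_test_names(result))


print(run_batch(sample_result))
```

Treating every status other than `SUCCESS` as blocking is a deliberately strict choice for this sketch. A real pipeline might let warnings through, or block only on a chosen subset of tests such as data stability.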

Want to go straight to code? Here is the example notebook to follow along.

Outline:
00:00 Introduction
01:10 Imports and data preparation
03:50 Test data stability on raw data
06:40 Run the test suite and explore the results
11:07 Test data quality on raw data
12:34 Test data drift on raw data
15:47 Run tests and interpret data drift on pre-processed data
19:28 Whether to run tests on raw or pre-processed data
20:07 Get output as JSON or Python dictionary and create conditions

