5.3. Test input data quality, stability and drift [CODE PRACTICE]
A code example walkthrough of running test suites for data quality, data stability, and data drift on raw and pre-processed data.
Last updated
A code example walkthrough of running test suites for data quality, data stability, and data drift on raw and pre-processed data.
Last updated
Video 3. Test input data quality, stability and drift [CODE PRACTICE], by Emeli Dral
In this video, we run test suites for data quality, data stability, and data drift on raw and pre-processed data. We also get the output as a Python dictionary to show how to integrate conditional checks in the prediction pipelines.
Want to go straight to code? Here is the example notebook to follow along.
Outline: 00:00 Introduction 01:10 Imports and data preparation 03:50 Test data stability on raw data 06:40 Run the test suite and explore the results 11:07 Test data quality on raw data 12:34 Test data drift on raw data 15:47 Run tests and interpret data drift on pre-processed data 19:28 Whether to run tests on raw or pre-processed data 20:07 Get output as JSON or Python dictionary and create conditions