5.3. Test input data quality, stability and drift [CODE PRACTICE]
A code example walkthrough of running test suites for data quality, data stability, and data drift on raw and pre-processed data.
Last updated
A code example walkthrough of running test suites for data quality, data stability, and data drift on raw and pre-processed data.
Last updated
Video 3. , by Emeli Dral
In this video, we run test suites for data quality, data stability, and data drift on raw and pre-processed data. We also get the output as a Python dictionary to show how to integrate conditional checks in the prediction pipelines.
Want to go straight to code? Here is the to follow along.
Outline: Introduction Imports and data preparation Test data stability on raw data Run the test suite and explore the results Test data quality on raw data Test data drift on raw data Run tests and interpret data drift on pre-processed data Whether to run tests on raw or pre-processed data Get output as JSON or Python dictionary and create conditions