PAW Business Expert Round 1: Data Science Development & Operations
Thursday, June 17, 2021
1. Gaining Efficiencies through Scalable Data Science Assets (Tobias Lampert)
Putting data science projects into production is one of the main challenges organisations face today. Projects often get stuck in the PoC phase and must be fully reimplemented to meet production quality requirements. Development frequently starts from scratch and rarely benefits from previous work. Tobias shows how these issues can be overcome by making components reusable, which criteria scalable components need to meet, and how organisational challenges can be solved.
2. Beyond Data Sanity Checks: Machine Learning Data Quality Assurance (Lisa Maag and Lana Caldarevic)
GfK hosts the world’s largest retail panel to track products and deliver insights based on actual sales data. One essential aspect of the reliability of our insights is the continuous quality assurance of our ML lifecycle, particularly assuring the quality of the models, which differs from ordinary quality checks. In this talk you will learn how we built an extensive data validation pipeline that goes far beyond the usual panel data health checks to ensure the precision of the models.
3. Code Quality – Bridging the Gap between Data Science & Engineering (Jean Metz and Pamela Hathway)
One major challenge of deploying machine learning models to production is the technical debt created during the research phase. In this talk, we show how GfK speeds up the transition from proof of concept to production by ensuring easy access to data, high-quality coding standards, and automation. We present our learnings, discuss challenges, and show how to find the right balance of code quality improvements whilst respecting core responsibilities and different skill sets and mindsets.