Session Previews 2023
Mine Your Own Business: How Object-Centric Process Mining Improves the Things You Do Not See
Prof. Dr. Wil van der Aalst
The keynote introduces Object-Centric Process Mining (OCPM), which can be seen as a major breakthrough in process mining. Traditional approaches to process modeling and process analysis tend to focus on one type of object, and each event refers to precisely one such object. OCPM takes a more holistic and comprehensive approach to process analysis and improvement by considering multiple object types and events that involve any number of objects. The keynote presents the basic concepts and the need for object-centric process mining techniques using many examples. OCPM is rapidly being adopted in commercial systems, showing its practical relevance. Process mining experts expect that OCPM will become the “normal” way of doing process mining and will therefore be of value to any organization that wants to improve its processes.
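To make the contrast concrete, here is a minimal sketch of an object-centric event log. It is not taken from the keynote: the `Event` class, the order/item/package example, and the `flatten` helper are all hypothetical illustrations. Each event may reference any number of objects of several types, whereas the traditional view picks one object type and builds one trace per object of that type.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Event:
    """One event that may involve any number of objects (hypothetical schema)."""
    activity: str
    timestamp: str                 # ISO 8601 string, so lexical order is time order
    objects: Dict[str, List[str]]  # object type -> ids of the objects involved


# Hypothetical order-handling log: one order, two items, one package.
events = [
    Event("place order", "2023-01-01T09:00", {"order": ["o1"], "item": ["i1", "i2"]}),
    Event("pick item", "2023-01-01T10:00", {"item": ["i1"]}),
    Event("pick item", "2023-01-01T10:30", {"item": ["i2"]}),
    Event("ship package", "2023-01-02T08:00",
          {"order": ["o1"], "item": ["i1", "i2"], "package": ["p1"]}),
]


def flatten(log: List[Event], object_type: str) -> Dict[str, List[str]]:
    """Traditional single-case view: one activity trace per object of one type."""
    traces: Dict[str, List[str]] = {}
    for event in sorted(log, key=lambda e: e.timestamp):
        for object_id in event.objects.get(object_type, []):
            traces.setdefault(object_id, []).append(event.activity)
    return traces


order_view = flatten(events, "order")
item_view = flatten(events, "item")
```

In this sketch the order view no longer contains the picking steps, while the item view repeats the shared "place order" and "ship package" events once per item. Keeping all object types in one log avoids having to choose between these partial views.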
Workshop: Organizing & Managing Analytic Projects with David Stephenson
Dr. David Stephenson
This workshop covers the key stages of scoping and executing data and analytics projects. This is especially critical due to the inherent uncertainty involved in this highly innovative field. We start by discussing in detail how to align on goals, expectations and resources at the very beginning of the project, and then describe the importance of and process behind kickoff meetings. We focus on stakeholder alignment, not only at the beginning stages but also on keeping alignment throughout the project and dealing with complications such as delays and setbacks. We’ll touch on common tooling and frameworks, and how these can help along the way, and we’ll zoom out to see the big picture of product lifecycle, which starts with business goals and ends with ML ops, model updates, and communicating your success stories.
ML Week Europe: Unveiling the Director's Favorite Sessions
Martin Szugat
Discover the personal favorites of Martin Szugat, the esteemed program director of Machine Learning Week Europe!
Prepare to be captivated as he presents a selection of must-attend sessions. Dive into these remarkable talks to unravel the reasons behind his enthusiasm and why you simply can’t afford to miss them!
Workshop: Data Design Thinking with Martin Szugat
Martin Szugat
1. Learn what the critical factors are for a successful data strategy and a data-driven business.
2. Get to know the Data Strategy Design method to develop an individual data strategy for your company on your own.
3. Explore the power of analytical solutions and discover how to design, evaluate and prioritize analytics projects efficiently and effectively.
How to sell AI products to internal customers?
Jack Lampka
70% of AI projects fail. Why? Could it be that the “people” aspect is ignored, including the people selling AI products to internal customers? Why not leverage the 5 Ps of marketing framework to address this challenge and learn how to market data & AI solutions? This keynote will show how this concept is being applied at a pharma company to market data products, addressing the product, price, place, promotion, and people: all elements essential for the successful use of data & AI in any organization.
Improving Text-to-Image Multilingual Search at Toloka and Jina
Evgeniya Sukhodolskaya
In this talk, Evgeniya discusses how multilingual CLIP-style models are changing the game in multimodal AI, particularly in Text-to-Image search applications. She will share her experience and techniques for fine-tuning multilingual and monolingual CLIP models on non-English data, specifically German data gathered through crowdsourcing. Additionally, she will demonstrate how to gather data in over 40 languages for fine-tuning such models. If you are interested in improving the performance of Text-to-Image search across multiple languages, this talk is for you!
Reducing Customer Onboarding Time for Data Mapping from Weeks to Minutes at 6sense
Rohit Kewalramani
Most B2B SaaS companies spend weeks onboarding their customers, with the majority of that time going to data integrations. As there is no agreed way of organizing data, data mapping becomes manual and deliberative. In this session, Rohit presents a solution that combines NLP, transformers, and deep learning, that reduced customer onboarding time at 6sense from weeks to minutes, and that made the process self-service. Unlike conventional classifiers that predict at the data-point level, the developed classifier predicts at a “distribution” level, making it one of the few of its kind.
Conformal Prediction: a Universal Method for Uncertainty Quantification
Dr. Michael Allgöwer
Conformal prediction, once a small research niche, is now drawing attention from the machine learning mainstream, as shown by, for example, Amazon’s implementation within the new Fortuna library. Conformal prediction adds uncertainty quantification to any machine learning model, the confidence regions it produces are provably reliable, and it does not need unrealistically strong assumptions. Applications to classification, time series, and causal machine learning will be shown in a hands-on manner.
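As a rough illustration of how a model-agnostic uncertainty wrapper can work, the following is a minimal sketch of split conformal prediction for regression using only the Python standard library. It is not the speaker's material and does not use the Fortuna API. The `predict` function, the synthetic calibration data, and the choice of absolute residuals as the conformity score are all assumptions made for illustration.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    If that rank exceeds n, the interval must be unbounded to keep the guarantee.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def predict(x):
    # Hypothetical stand-in for any already-trained point predictor.
    return 2.0 * x


# Synthetic held-out calibration set (assumption): y = 2x + Gaussian noise.
random.seed(0)
calibration = []
for _ in range(500):
    x = random.uniform(0.0, 10.0)
    calibration.append((x, 2.0 * x + random.gauss(0.0, 1.0)))

# Conformity score: absolute residual of the model on each calibration point.
scores = [abs(y - predict(x)) for x, y in calibration]
q = conformal_quantile(scores, alpha=0.1)


def predict_interval(x):
    """Interval that covers the true y with probability >= 90% under exchangeability."""
    center = predict(x)
    return center - q, center + q


low, high = predict_interval(3.0)
```

The only assumption behind the coverage guarantee in this setup is that calibration and test points are exchangeable; the underlying model is treated as a black box.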
Data Mesh Beyond Theory
Norbert Wirth
Data Mesh has grown beyond theory and buzzword. The initial concept, as introduced by Zhamak Dehghani in 2019, is described as a socio-technological approach. It has implications for how data is organized and worked with. This keynote will reflect on a Data Mesh implementation journey, on pitfalls and how they can be avoided. In addition to the organizational transformation, Norbert will in particular describe the foundational data platform development, because it’s one of the key success factors.
Handling Multiple Languages in Text Classification Problems at ING
Katharina Wenzel
Text classification is a well-known use case. But how do you approach this classification task if the text is in multiple different languages? In this session, Katharina presents different options for solving multi-language classification problems, such as language embeddings as well as different translation models and APIs, and discusses the challenges of using them in practice. Finally, she presents a new approach that works especially well on short texts and which is now used to classify banking transactions at ING.
Innovative Condition Monitoring in the Packaging and Paper Industry at Mondi Group
Günter Röhrich
Industrial paper manufacturing is a highly technological process that requires continuous effort to prevent interruptions. Early detection and proper identification of problems are crucial for timely servicing, optimal personnel allocation, and avoiding breakdowns. Mondi Group and d-fine developed a smart condition monitoring solution that uses the existing technical infrastructure as a maintenance assistant.
Understanding and Visualizing Your Online Marketing Audiences using Machine Learning
Dr. Christoph Best
Targeting the right audience is central to successful online marketing. However, traditional audience targeting (e.g. age and gender) is severely limited. Modern audience targeting incorporates and predicts real user behavior, and uses machine learning to create audiences that are truly relevant for online advertising. Christoph will discuss the data sources and machine-learning methods to construct relevant audiences for online advertisers, and the tools to explore and visualize such audiences.
Accelerating Public Consultations with Large Language Models (LLMs) for the UK Planning Inspectorate
Dr. Michele Dallachiesa
This talk discusses the challenges faced by the UK Local Planning Authorities (LPAs) in managing written representations from the community for their Local Plans. The volume of information can be overwhelming, making it difficult for LPAs and the Planning Inspectorate to review and assess plans. The talk presents a case study that explores the potential of Large Language Models (LLMs) to streamline the analysis of representations, with significantly reduced processing time and improved accuracy.
Retrieval-Augmented Generation: Using Large Language Models to Get the Most from Your Text Data
Dr. Rob Pasternak
Large language models have taken the world by storm, but what if you’d like to use one with your internal text data? LLMs aren’t trained on your data, and fine-tuning a bespoke LLM is daunting. However, retrieval-augmented generation (RAG) combines LLMs with document search to create powerful generative NLP systems for your text data. In this talk we will discuss how to build effective RAG pipelines, as well as how to address potential concerns like privacy and hallucination effects.
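The "document search plus LLM" idea can be sketched in a few lines. The example below is an illustration only, not the pipeline from the talk: it assumes a simple bag-of-words cosine similarity as the retriever (real systems typically use learned embeddings), and the sample documents, the prompt wording, and the function names are hypothetical. The final call to a language model is deliberately left out, since it depends on the provider.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages):
    """Assemble a grounded prompt; sending it to an LLM is left to the caller."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical internal documents.
documents = [
    "Invoices are approved by the finance team within five days.",
    "The cafeteria opens at eight in the morning.",
    "Travel expenses require a manager's approval.",
]

query = "Who approves invoices?"
top_passages = retrieve(query, documents, k=1)
prompt = build_prompt(query, top_passages)
```

Restricting the model to the retrieved context is one common way to limit hallucination, and keeping retrieval local means only the selected passages need to leave your environment.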