Conference on Foundations and Advances of Machine Learning in Official Statistics, 3rd to 5th April, 2024

Program (Date: 25.03.2024)

Program of the Conference on Foundations and Advances of Machine Learning in Official Statistics, 3rd to 5th April, 2024
Day 1: April 3rd 2024 (all times in CEST)
13:00 - 14:30 Check-In, Welcome Coffee
14:30 - 15:00 Opening
Chair: Florian Dumpert
15:00 - 16:00

Plenary 1
Chair: Florian Dumpert

Advancing Interpretability in Machine Learning: Model Summaries and Interpretable Regional Descriptors
Speaker: Susanne Dandl
(Slides)

16:00 - 16:30 Coffee Break + Group Photo
16:30 - 17:00

Short Plenary
Chair: Florian Dumpert

The AIML4OS project - A first overview
Speaker: Francesca Kay
(Slides)

17:00 - 18:00

Plenary 2
Chair: Florian Dumpert

Leveraging Machine Learning for Official Statistics: Methodology and Application
Speaker: Marco Puts, Piet Daas
(Slides)

Day 2: April 4th 2024 (all times in CEST)
08:30 - 09:00 Welcome Coffee
09:00 - 10:30 Parallel Sessions 1

Using Large Language Models
Chair: Ariane Lestrade
Room: A.13.207

LLMs for statistical methodology advice - findings and advice
Speaker: Joni Karanka
(Slides)

Extracting Data Citations with Large Language Models
Speaker: Hendrik Doll
(Slides)

Enhancing Accessibility to Statistical Data through an Open-Source Chatbot Integrated with Language Models
Speaker: Eva Charlotte Berner
(Slides)

Data Validation and Imputation
Chair: Steffen Moritz
Room: E.03.112

"There can be only one" - Deduplicating personal records in census data using exact matching and Machine Learning techniques
Speaker: Eszter Milibak
(Slides)

Applying k-NN utilizing similarity measures for categorical data for anomaly detection in the German Federal Employment Agency’s statistics
Speaker: Hinnerk Müller
(Slides)

ML-Based Imputation Methods in R Package VIM: Performance and Considerations
Speaker: Alexander Kowarik
(Slides)

Applied ML 1
Chair: Maren Köhlmann
Room: F.04.208

Machine learning and wealth measurement: an experiment on housing wealth of French households
Speaker: Olivier Meslin
(Slides)

Extracting meaningful information from web data on real estate – challenges and experiences
Speaker: Dominik Dabrowski, Bartosz Grancow
(Slides)

Modelling the local housing situation of households based on the multi-sectoral regional microsimulation model (MikroSim)
Speaker: Sarah Bohnensteffen
(Slides)

10:30 - 11:00 Coffee Break
11:00 - 12:30 Parallel Sessions 2

Text Classification and Language Models
Chair: Susanne Wegner
Room: A.13.207

Use of a large language model to derive the economic sector of businesses from unstructured text on economic activities
Speaker: Gerald Heß
(Slides)

Automatic text classification for the German Household Budget Survey (HBS)
Speaker: Jerome Olsen, Ariane Lestrade
(Slides)

Domain adaptation of a BERT Model for analyzing job advertisements at the German Federal Employment Agency
Speaker: Barbara Hofmann, Tobias Scherl
(Slides)

Streamlining Processes and Data Integration
Chair: Markus Zwick
Room: E.03.112

Streamlining Business Functions in Official Statistical Production with Machine Learning
Speaker: David Salgado
(Slides)

Choosing the Right Tool for the Job: Machine Learning vs. Regular Expressions in the Analysis of Text Data From the German Register of Driver Fitness
Speaker: Daniel Kopper
(Slides)

Use of statistical learning algorithms to integrate administrative and survey data in a Short-Term Business Statistics
Speaker: Sandra Barragan
(Slides)

Applied ML 2
Chair: Andreas Tang
Room: F.04.208

Nowcasting for Local Population Counts (Births, Deaths, Migration) via a Self-Developed Application
Speaker: Kerstin Erfurth
(Slides)

Facilitating Regulatory Impact Assessments: The Benefits of Machine Learning in Legislation
Speaker: Sylvana Walprecht, Catharina Lewerenz
(Slides)

The Journey of Machine Learning at the Italian National Institute of Statistics
Speaker: Mauro Bruno
(Slides)

12:30 - 14:00 Lunch Break
14:00 - 15:30 Parallel Sessions 3

From Text to Code
Chair: Stefanie Setzer
Room: A.13.207

An overview of STATEC's projects on automatic coding.
Speaker: Yu-Lin Huang, Claude Lamboray
(Slides)

Navigating model drift in an occupation classification study
Speaker: Susie Jentoft
(Slides)

NACE-Coding with Machine Learning at the Federal Statistical Office
Speaker: Susanne Wegner
(Slides)

Quality, Fairness and Reproducibility
Chair: Jannek Mühlhan
Room: E.03.112

Fairness in Machine Learning for National Statistical Organizations
Speaker: Christoph Kern
(Slides)

Legal Implications of the use of Machine Learning in Official Statistics
Speaker: Leon Krög
(Slides)

Quality Dimensions of Machine Learning in Official Statistics
Speaker: Florian Dumpert
(Slides)

Methodology 1
Chair: Marcel Preising
Room: F.04.208

Challenges in constructing confidence intervals around point estimates for resampling based performance
Speaker: Roman Hornung
(Slides)

Optimising machine learning classification for statistical outcomes
Speaker: Keno Krewer
(Slides)

Evaluating machine learning models in non-standard settings: An overview and new findings
Speaker: Roman Hornung
(Slides)

15:30 - 16:00 Coffee Break
16:00 - 17:00

Plenary 3
Chair: Theresa Küntzler

Data Science at Statistics Canada: Successes and Outstanding Challenges
Speaker: Wesley Yung
(Slides)

17:00 - 18:00

Plenary 4
Chair: Theresa Küntzler

How to Decide Which Imputation Method to Use? Our Vote for Realistic Simulation Comparisons
Speaker: Markus Pauly
(Slides)

18:00 - 19:00 Break
19:00 - 20:00 Guided Walking Tour around Wiesbaden
From 20:00 Conference Dinner
Day 3: April 5th 2024 (all times in CEST)
08:30 - 09:00 Welcome Coffee
09:00 - 10:30 Parallel Sessions 4

Measurement Error and Sampling
Chair: Kai Lorentz
Room: A.13.207

Design-based predictive inference
Speaker: Luis Sanguiao
(Slides)

Incorporating machine learning in capture-recapture estimation of survey measurement error
Speaker: Joep Burger
(Slides)

Weighting for internet quality measurements from a self-selected Brazilian public schools sample
Speaker: Thiago Meireles
(Slides)

Processes
Chair: Jannik Reichel
Room: E.03.112

The process for machine learning implementation at Statistics Sweden
Speaker: Jens Malmros
(Slides)

Enhancing Quality Aspects of ML in Official Statistics with the right MLOps Framework
Speaker: Florian Karl
(Slides)

An open source data science platform to foster innovative and production-ready machine learning systems
Speaker: Romain Avouac, Thomas Faria
(Slides)

Methodology 2
Chair: Daniel Knapp
Room: F.04.208

Machine learning for model-assisted estimation in survey sampling: bridging the rigor of statistical inference with the power of machine learning
Speaker: Boriska Toth
(Slides)

Effects of Training Data Collection Methods: Evidence of Annotation Sensitivity
Speaker: Bolei Ma
(Slides)

10:30 - 11:00 Coffee Break
11:00 - 12:00

Plenary 5
Chair: Steffen Moritz

On Three Methodological Challenges – Machine Learning under Complex Sample Designs, Complex Evaluation Structures and Complex Uncertainty
Speaker: Thomas Augustin
(Slides)

12:00 - 12:30 Closing
Chair: Steffen Moritz