22–25 Jul 2025
EAM2025
Atlantic/Canary timezone

Addressing Measurement Error in Machine Learning-Assisted Social Science Modeling

24 Jul 2025, 10:40
15m
Faculty of Social Sciences and Communication. (The Pyramid)/10 - Room (Faculty of Social Sciences and Communication. (The Pyramid))

Speakers

Erik-Jan van Kesteren, Javier García Bernardo, Qixiang Fang (Utrecht University)

Description

With the advent of machine learning tools and large language models (LLMs), collecting measurements of social science constructs (e.g., personality traits, political attitudes, human values) has become easier, faster and more affordable. Social scientists then use these measurements to model societal and group processes and to make inferences from samples to populations. Valid modelling and inference, however, require high-quality measurements or, at the very least, methods for dealing with measurement error. Just like traditional questionnaire-based measurements, machine learning- and LLM-based measurements have been shown to suffer from validity and reliability issues.
While there is abundant research literature on dealing with measurement error, it focuses on questionnaire-based measurement error. It is not yet clear how measurement issues arising from machine learning tools and LLMs should be handled in social science modelling research.
This study has two primary objectives. First, we review existing literature to identify practices for addressing machine learning- and LLM-related measurement error, in both computer science and the social sciences.
Second, we synthesise these findings with existing measurement modelling literature to propose a practical framework for making valid inferences using machine learning- and LLM-based measurements in the social sciences. By bridging the gap between modern machine prediction capabilities and social science inference requirements, our framework aims to enhance the reliability and validity of social science research outcomes in the era of machine learning and LLMs.
