Description
Practitioners and researchers use open-ended questions when designing surveys to obtain validity evidence about the response processes elicited by survey items. Data cleaning and response coding pose significant challenges for “web probes” in particular, given their self-administered nature and the large number of participants relative to the small samples typical of cognitive interviewing. The integration of generative AI, especially models based on GPT architectures, offers new opportunities to automate these processes efficiently and accurately. This proposal focuses on the development of a data post-processing solution for the automated cleaning and coding of responses to web probes. Depending on the performance of each option, this solution could be implemented through advanced prompting techniques in a GPT Store application following a standardized data processing procedure, or through an application that uses the OpenAI API to offer advanced features.
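An API-based implementation of such automated coding could be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the prompt wording, the worked example, and the helper names (`build_messages`, `parse_code`) are assumptions. The actual network call to the OpenAI client is shown only in a comment.

```python
# Illustrative sketch of prompting an LLM to code a web-probe response as
# substantive (1) or non-substantive (0). Prompt text and helpers are
# hypothetical, not the study's implementation.

ONE_SHOT_EXAMPLE = {
    "probe": "What did you think of when answering this question?",
    "response": "I thought about how worried I feel about rising temperatures.",
    "code": 1,  # substantive
}

def build_messages(probe: str, response: str) -> list[dict]:
    """Build a one-shot chat prompt asking for a binary 1/0 code."""
    system = (
        "You code open-ended web-probe responses. Reply with 1 if the "
        "response is substantive, and 0 if it is non-substantive "
        "(e.g. 'I don't know', gibberish, or a refusal)."
    )
    ex = ONE_SHOT_EXAMPLE
    return [
        {"role": "system", "content": system},
        # one-shot demonstration: a labeled probe/response pair
        {"role": "user", "content": f"Probe: {ex['probe']}\nResponse: {ex['response']}"},
        {"role": "assistant", "content": str(ex["code"])},
        # the item to be coded
        {"role": "user", "content": f"Probe: {probe}\nResponse: {response}"},
    ]

def parse_code(reply: str) -> int:
    """Map the model's reply to a binary code; default to 0 if unclear."""
    return 1 if reply.strip().startswith("1") else 0

# With the openai Python client, the call would look like (not executed here):
# from openai import OpenAI
# client = OpenAI()
# out = client.chat.completions.create(model="gpt-4o",
#                                      messages=build_messages(probe, response))
# code = parse_code(out.choices[0].message.content)
```

Keeping the prompt construction and reply parsing in plain functions makes the coding step model-agnostic, so the same procedure can be pointed at different models for comparison.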
The objective of the paper is twofold: a) to present the development and validation of a generative AI-based data post-processing application and procedure for cleaning and coding web-probe responses; and b) to illustrate how such a procedure, depending on how and in which model it is applied, deductively codes themes and sub-themes in substantive responses and automatically detects indicators of low involvement in the response process, such as mismatched answers and motivational losses.
Textual data from the CAS questionnaire on climate change anxiety and from other questionnaires on quality of life will be coded as substantive (1) or non-substantive (0) by human coders. This manual coding will then be compared with the coding generated by four different AI models (GPT-4, a custom GPT-4, o1, and DeepSeek) and by the state-of-the-art OpenAI model accessed via the API. A one-shot approach will be applied, and the correlation between manual and automated ratings will be calculated. The application is expected to demonstrate high accuracy in response cleaning and coding, significantly reducing the time and effort required for manual data processing.
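For two binary codings (substantive/non-substantive), the correlation between manual and automated ratings reduces to the phi coefficient, which equals Pearson's r computed on 0/1 variables. A minimal sketch with toy data (the vectors below are illustrative, not the study's data):

```python
from math import sqrt

def phi_coefficient(manual: list[int], auto: list[int]) -> float:
    """Phi coefficient between two binary codings (Pearson r on 0/1 data)."""
    n11 = sum(m == 1 and a == 1 for m, a in zip(manual, auto))
    n00 = sum(m == 0 and a == 0 for m, a in zip(manual, auto))
    n10 = sum(m == 1 and a == 0 for m, a in zip(manual, auto))
    n01 = sum(m == 0 and a == 1 for m, a in zip(manual, auto))
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # degenerate case (no variance in one coding): phi is undefined, return 0
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0

# Toy example: human codes vs. one model's codes for six responses
manual = [1, 1, 0, 0, 1, 0]
auto   = [1, 1, 0, 1, 1, 0]
agreement = sum(m == a for m, a in zip(manual, auto)) / len(manual)  # raw % agreement
phi = phi_coefficient(manual, auto)
```

Reporting raw percent agreement alongside phi is useful because agreement alone is inflated when one category dominates.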
This project aims to set a new standard for the automated processing of open-ended responses in psychometrics and to foster the use of “web probing” to obtain validity evidence of response processes. Future developments in generative AI for improving the validation of response processes “in vivo” will also be discussed.