The scientific literature supports the relationship between self-esteem and interest or motivation, academic achievements, learning processes and academic performance of students in general, and also in highly able students. Thus, the objective of this study is to analyse how the different levels of self-esteem are related to interest, importance and learning achievements in highly able...
The increasing accessibility of large-scale international surveys has provided new opportunities for social scientists to conduct comparative research. Such studies frequently examine relations between latent constructs (e.g., how perceived economic threat affects political ideology) and compare them across groups (e.g., countries) to reveal cultural variations in value priorities, attitudes...
The recently proposed Mixture Multigroup Structural Equation Modeling (MMG-SEM) efficiently compares groups by clustering them based on their structural relations while accounting for the reality of measurement (non-)invariance. Currently, MMG-SEM relies on maximum likelihood (ML), which assumes continuous and normally distributed observed indicators. However, this can introduce bias when...
Introduction:
Mathematics anxiety negatively impacts performance, achievement, and career choices. This study investigates how intelligence and gender influence this anxiety and explores the coping strategies used by those who dislike math. Existing research shows that lower intelligence and being female often correlate with higher math anxiety, but how these factors interact with coping...
In the social sciences, a common research objective is the comparison of latent variables among different groups, such as in cross-cultural studies. Valid comparisons require measurement invariance (MI), which implies that constructs are measured consistently across populations. When dealing with many groups, MI often does not hold, requiring pairwise comparisons between the...
Context: The number of unpaid carers has risen in recent years, with a growing proportion providing high-intensity care. Balancing caregiving with employment can create significant challenges, such as financial strain, poor mental and physical well-being and reduced labour market participation. This study explores how caregiving intensity and socio-demographic characteristics combine to...
Introduction: One of the myths surrounding high intellectual abilities is the belief that individuals with higher intelligence are not interested in physical activity, implying a relationship between intelligence and exercise.
Objective: To analyze the relationships between intelligence and physical activity, as well as gender differences in interest in physical activity, while also studying...
Comparing relations between latent constructs across groups is essential for understanding social phenomena in different contexts. A key assumption for valid comparisons of such relations is that the constructs are measured equivalently across the groups, referred to as “measurement invariance”. Specifically, partial metric invariance is sufficient, meaning that at least some factor loadings...
Introduction: Several studies have indicated that the tendency to lie is more prevalent in adolescents compared to children and adults (Buta et al., 2020; DePaulo et al., 1996; Levine et al., 2013). Studying the motivations behind this behaviour can be essential to gaining a deeper understanding of this phenomenon.
Objective: Study the different reasons for lying among the adolescent...
Structural equation modelling (SEM) is the state-of-the-art method for analysing relations between latent variables (e.g., attitudes or behaviours), also called ‘factors’. SEM consists of a measurement model (MM), which specifies how questionnaire items measure the factors, and a structural model (SM), which captures the relations of interest. Traditionally, SEM estimates the MM and the SM...
Introduction: Multipotentiality, defined as the ability to excel in diverse areas of interest (Cordero, 2019), has been explored through individual differences and educational factors that favor its development. Previous studies highlight that factors such as gender and high abilities play crucial roles in how individuals explore and manage multiple talents (Kerr & Huffman, 2018). For example,...
Researchers often use vector autoregressive models to study dynamic processes of latent variables in daily life, such as the extent to which positive and negative affect carry over and interact with each other from one moment to the next. Mixture modeling allows finding clusters of individuals that are similar to each other in their dynamic processes. However, applying MMG-SEM to vector...
Introduction. Fear of Public Speaking (FoPS) or Public Speaking Anxiety (PSA), considered a specific subtype of Social Anxiety Disorder, profoundly affects the personal, academic, and professional spheres. This phenomenon is characterized by cognitive, emotional, and physical manifestations that limit the performance and social interactions of those who experience it.
Objectives. The...
Meta-analytic structural equation modeling (MASEM) is a method to systematically synthesize results from primary studies, allowing researchers to simultaneously examine multiple relations among variables by fitting a structural equation model to the pooled correlations. Incorporating dichotomous variables (e.g., having a specific disease or not) into MASEM poses challenges. While primary...
In a recent paper we presented a way of incorporating mean structures in meta-analytic structural equation modeling (MASEM). MASEM with means is applicable when the studies included in the meta-analysis used the same indicators, measured on the same scales. The meta-analytic data consist of the studies’ covariance matrices and mean vectors. The MASEM then restricts the vector of meta-analyzed...
The statistical foundations of person parameter estimation for the multivariate Thurstonian item response theory (TIRT) model of pairwise comparison and forced-choice (FC) ranking data are elaborated, and several misconceptions in IRT and TIRT are addressed. It is shown that directional information (i.e. multivariate information as defined by Reckase & Kinley, 1991; Applied Psychological...
As the use of Likert scales in social research becomes increasingly common, it is necessary to investigate which methodology is most appropriate for analyzing the resulting data. If the data are ordinal, they should be treated as such; however, they are frequently analyzed as if they were continuous variables. One of the most widely used techniques to...
Reporting biases are well-known phenomena that can undermine the credibility of published scientific findings and potentially distort meta-analytic effect estimates. These biases arise when the decision to publish or report results is influenced by their nature or direction. Traditionally, methods for assessing small-study effects and evaluating the robustness of results against publication...
This study aims to analyze the effectiveness of training programs designed for mental health professionals. The analysis focuses on randomized controlled trials (RCTs) and cluster-randomized studies, examining the impact of these interventions across three levels of outcomes (based on Kirkpatrick & Kirkpatrick’s model): knowledge acquisition, attitude changes, and behavioral modifications. The...
Moderator analyses play a crucial role in meta-analysis, as they help to identify relationships between study characteristics and the effect size magnitude. When multiple effect sizes are reported within studies, various methods can be used to perform moderator analysis or meta-regression. These include three-level models (which may or may not account for variability in moderator effects...
One of the biggest limitations of meta-analyses is that the information they provide can be affected by the biases of the included primary studies. To address this, evaluations of primary study risk of bias (RoB) can be performed and incorporated into the meta-analysis. However, research on this topic in clinical psychology is scarce. In this study, we examined this issue using a sample of...
Moderator analysis in meta-analysis is commonly used to study whether certain study characteristics can explain the heterogeneity in effect sizes. Understanding why effect sizes vary between contexts is important for selecting the right intervention for the right context and for guiding further research. In order to rely on the results from moderator analyses, the moderator effect estimates...
Background: This study investigated the effectiveness of a psychoeducational intervention on the quality of life and well-being of patients with myositis, a rare condition that significantly impacts daily life. Methods: All myositis patients in a specific healthcare region were invited to participate. Thirty-four eligible patients were randomly assigned to either an intervention group or a...
Univariate meta-analysis models assume that all effect sizes included in the meta-analysis are independent. This assumption is violated if, for example, a study reports two outcomes that are both of interest to the meta-analyst, or reports multiple experiments administered by the same researchers in the same lab. The multivariate and multilevel meta-analysis models make it possible to model...
Background: Chronic kidney disease (CKD) is a global health issue that significantly impacts patients’ quality of life due to physical and emotional symptoms. Anxiety and depression are common in these patients, negatively affecting their prognosis and treatment adherence. The Hospital Anxiety and Depression Scale (HADS) is a popular tool for assessing these disorders, but it has not been...
As public trust in standardized testing declines, AI-driven methods such as machine learning and natural language processing are increasingly being applied to optimize traditional measurement approaches. While these innovations offer important gains in efficiency, cost, and scalability, there is a risk that, without also addressing broader concerns of trust, equity, and relevance,...
An assessment conducted within competence-based knowledge structure theory (CbKST) aims to uncover the skills that an individual possesses based on their observed responses to test items. This process involves first deriving the set of items that the individual is capable of solving (the knowledge state) from the set of items they actually solved (the response pattern), and then inferring the...
This study investigates the effectiveness of AI-tutored learning environments in implementing evidence-based learning techniques among undergraduate students. Drawing from cognitive science principles, particularly those outlined in Willingham’s (2023) work, we developed an innovative intervention utilizing AI tutors to simulate personalized learning environments focused on three key areas:...
Psychometric evaluations of psychological assessment measures have shown that several instruments produce inconsistent factor structures across groups and contexts and provide questionable reliability and predictive validity. A key conceptual issue concerns how a theoretical construct is defined vs. how it is measured. Given that psychological constructs cannot be observed directly, but only...
The rapid advancement of artificial intelligence (AI), particularly large language models (LLMs), has introduced powerful tools for various research domains, including psychological scale development. This study presents a fully automated method to efficiently generate and select high-quality, non-redundant items for psychological assessments using LLMs and network psychometrics. Our approach...
In quantitative measurement, Likert scales are often treated as continuous variables, potentially distorting results due to their ordinal nature. This study addresses the issue of appropriately handling ordinal variables by integrating classical test theory (CTT) and item response theory (IRT) to validate a novel Scale of Cultural Capital (SCC). SCC consists of 14 items measuring three...
Recent advances in large language models (LLMs) present opportunities for developing performance-based items in educational and psychological assessment. We introduce P-AI-GENIE (Performance-based Automatic Item Generation and Network-Integrated Evaluation), an extension of AI-GENIE that focuses on generating and validating performance items. The talk will cover how items can be developed and...
Knowledge structure theory is a psychometric approach for representing the knowledge of participants in a precise, non-numerical way. The most prominent probabilistic model in knowledge structure theory is the basic local independence model. One of its fundamental assumptions is the constancy of the response error probabilities (guessing and slipping) across all participants. However, it seems...
We propose a novel approach for modeling and understanding the dynamics of emotion facial expression recognition (FER) scores. Recent advancements in deep learning and transformer-based neural network architectures enable the time series analysis of FER scores extracted from images and videos. This type of data can be important for psychological research of affective dynamics and emotion...
As educational and cognitive assessments advance, there is a growing need for innovative, evidence-based methodologies that offer deeper insights into students’ abilities, knowledge representation, and response reliability. Contemporary assessment systems face the challenge of capturing nuanced insights into student learning while ensuring measurement validity, going beyond traditional...
Introduction: Mixed methods research (MMR) refers to integrating quantitative and qualitative approaches within a research study. This century has recognised MMR as a third methodological approach. During the last decade, the Journal of Mixed Methods Research (JMMR) has established the milestones of the MMR together with the Mixed Methods International Research Association...
This talk explores the core principles and practical applications of AI. We begin by defining AI as the discipline that imbues machines with human-like intelligence, encompassing reasoning, learning, and creativity. Key characteristics include the ability to perceive, interact, solve problems, act autonomously, and adapt to environments. We will cover the diverse problems AI addresses, such as...
In this State of the Art Address, I revisit and extend the conceptual boundaries of two core mixed methods transformation techniques: qualitizing and quantitizing. In so doing, I spotlight the expanded methodological and philosophical dimensions that elevate their application in contemporary mixed methods research. The first third of the presentation is dedicated to qualitizing, defined as the...
The proliferation of experience sampling methodology (ESM) has advanced the study of affect dynamics. In ESM, participants respond to multiple questions about the presence and intensity of positive and negative emotions at random moments throughout the day for many consecutive days or weeks. These items are commonly thought to represent two latent constructs, positive affect (PA) and negative...
Introduction. Social cognition allows understanding and predicting both one’s own actions and those of others. It includes processes such as the perception of emotions, theory of mind, empathy and social judgment. Alterations in these processes have been widely studied in people with Schizophrenia (Ef) and Autistic Spectrum Disorders (ASD). Currently, both diagnostic evaluation...
Developing intervention research in the field of hospital pedagogy and health requires defining the concept in a way that helps to make the variables involved visible. In 2017, Violant defined hospital pedagogy as the integral action that assures ethical and bioethical principles and the rights and duties of a person with the aim of improving the individual, the family, and the social...
In latent space item response theory (IRT) modelling, both subjects and items are positioned in an R-dimensional Euclidean latent space. This framework allows for detailed modelling of local dependences among items and subjects, which are assumed to be absent in conventional IRT models. Latent space IRT has demonstrated its value in diverse fields, including intelligence assessment (Kang & Jeon,...
Mixed methods research entails integrated analysis combining data and techniques, enhancing validity and depth by leveraging strengths of each method.
To understand the experiences and meaning that adolescents and young adults with chronic illnesses assign to their student life, a multicentre study using a sequential explanatory mixed methods design (quantitative → qualitative) and a...
Introduction. Fear of Public Speaking or Public Speaking Anxiety is a specific manifestation of Social Anxiety Disorder that can significantly interfere with personal, academic and professional performance. On the other hand, Social Cognition, which includes skills such as emotion recognition, theory of mind, empathy and attributional styles, is fundamental for interpreting the intentions and...
The cornerstone of psychometrics, factor analytical methods, is designed for the interpretable dimensional reduction of response accuracy vector data. This approach can be likened to Variational AutoEncoders (VAEs) with shallow decoders (Urban & Bauer, 2021). However, it is not suitable for analyzing raw process data due to its inability to account for autoregressive dependencies within...
Introduction: The assessment of social cognition through the Theory of Mind can contribute to the study of this construct. One of the instruments proposed to assess the Theory of Mind is the Yoni Task, with which a cross-cultural validation is being carried out for the Spanish-speaking population (Argentina, Mexico and Spain). Objective: This study aims to determine the factor structure of the...
Before the rapid development of artificial intelligence, standardized tests mainly relied on multiple-choice questions because evaluating open-ended tasks required significant resources. Modern large language models (LLMs), such as ChatGPT, Gemini, and Llama, now enable automated assessment of open-ended tasks. Unlike traditional machine learning or deep learning methods, foundational LLMs do...
Amortized variational inference (AVI) has recently been proposed in the field of item response theory as a computationally efficient alternative to marginal maximum likelihood (MML) estimation. The current study investigates whether the computational advantages of AVI for large, high-dimensional data carry over to discrete latent variable models. We adapt three techniques from the machine learning...
Introduction: Theory of Mind (ToM) is a fundamental neurocognitive function for Social Cognition. However, there are still not enough validated and standardized instruments to assess this function in the Latin American population, and even fewer in Colombia, which limits its clinical analysis.
Objective: Analyze the internal structure of instruments for the assessment of Theory of Mind in...
Automated essay scoring systems can support teachers by providing rapid, cost-effective verbal and numerical feedback on student writing. In recent years, these systems have improved significantly with the rise of generative artificial intelligence models based on the transformer architecture. Research consistently shows that these models outperform traditional machine learning approaches...
The concept of social cognition includes a series of processes that allow people to understand the social world.
Theory of mind, understood as the capacity of people to ascribe mental entities such as desires, beliefs, intentions and emotions, has been studied extensively. The classic tasks used to evaluate this process were limited to all-or-nothing tasks, that is, the person evaluated...
Cognitive diagnostic models (CDMs) serve as an effective approach to diagnostic assessment in education. This study explores how CDMs can be applied to criterion-referenced standard setting. The relevance of the study lies in the increasing demand for more detailed assessment in education. Modern education systems require not only the sorting of students on the basis of their performance, but...
Introduction
In today’s rapidly evolving technological landscape, the integration of Information and Communication Technology (ICT) in the workplace has transformed work processes, offering numerous benefits while also introducing new challenges. One such challenge is technostress, a phenomenon describing the strain caused by the pervasive use of ICT. Although technostress affects workers...
Medical professionalism is defined as the commitment of physicians to the health of patients and society, the profession, and themselves. Measuring medical professionalism is crucial as it directly impacts the quality of patient care. The Professionalism Mini-Evaluation Exercise (P-MEX), consisting of 24 items across 4 domains, measures the professional behavior of medical professionals or...
Psychological tests are essential tools that help psychologists make decisions about people. The Board of Assessment (BoA) of the European Federation of Psychologists' Associations (EFPA) has various projects aimed at improving tests and testing practices across Europe and beyond. In this presentation, we share two BoA projects. The first is the BoA’s flagship project, which focuses on tests:...
Introduction
Theory of Mind (ToM) refers to the ability to understand and represent both one’s own mental states and those of others, enabling the process of mentalizing (Happé et al., 2017). This study posits that individuals with high cognitive abilities may exhibit distinct neural processing patterns during ToM tasks, reflecting a potentially more efficient or elaborate engagement of brain...
This study investigated whether combining musical and chromatic stimuli with congruent emotions produces a synergistic effect on emotional responses, measured through subjective self-reports and electroencephalography (EEG). The sample consisted of 33 participants (20 females; M = 20.3 years, SD = 2.4), all free of moderate to severe depressive symptoms (BDI-II: M = 5.5, SD = 5). Professional...
The value of the Programme for International Student Assessment (PISA) in informing evidence-based policymaking relies on the degree of precision with which population-level statistics are estimated and reported. But also, the degree to which those aggregate statistics can be meaningfully compared (e.g., country-level mean scores) and the interpretations made based on those comparisons are...
Electroencephalography is a harmless recording technique (Rivera et al., 2023) employed in both clinical and research settings to obtain an electroencephalogram (EEG). It has been recognized as a gold-standard method for examining brain electrical activity to detect structural or functional damage in people with or without a diagnosis of neurological disease such as epilepsy (Guerrero Aranda, 2020). As a...
Standards play a crucial role in guiding practices in Educational and Psychological Assessment. Various professional associations continuously update guidelines to support practitioners in assessment-related processes, leading to the emergence of different approaches. However, how do these associations gather information to propose new standards? To what extent do they consider users’...
Introduction
Recent investigations point to a link between intelligence and more efficient neural processing, suggesting that people with higher cognitive performance tend to have stronger integration among key brain areas and reduced redundant activity (Jung & Haier, 2007). Grounded in this concept of neural efficiency, the current study examines resting-state functional activity and...
The Standards for Educational and Psychological Testing have been published by the American Psychological Association, the American Educational Research Association, and the National Council on Measurement in Education since the 1950s. They are currently under revision, and the forthcoming version is expected to be published in 2026. In this presentation, a member of the Joint Committee revising the...
Introduction: Sensory processing sensitivity (SPS) is an inherited personality trait that leads people to feel, think and interact differently from others. Several studies of brain processing have shown these differences. Objective: To analyse the differences in resting brain activity, as determined by functional magnetic resonance imaging, between...
In recent years, psychological research has increasingly utilized novel (often digital) data sources. Sensing data, such as those collected from smartphones, enable researchers to monitor human behavior across diverse, ecologically valid contexts and extended periods with relative ease. These rich datasets offer great potential for predicting psychological traits, such as personality facets,...
An exciting development in educational and psychological testing is culturally responsive assessment, which is assessment that is “mindful of student differences and employs assessment methods appropriate for different student groups” (Montenegro & Jankowski, 2017, p. 9). Although the call for culturally responsive assessment is strong, there are few examples of good practices in this area and...
This oral communication examines evidence of content validity in psychometrics, focusing on established procedures and potential complementary approaches. We review traditional validation methods, acknowledging their foundational importance. Our discussion emphasizes the need for isomorphic relationships between construct definitions and operational representations, a cornerstone of...
Psychology is increasingly interested in the prediction of psychological constructs via machine learning (ML) models, for example, predicting a person’s personality or intelligence. To measure these psychological constructs, psychologists often draw on questionnaire data. In supervised ML, these measurements are then used as target variables (i.e., the “ground truth”) for model training....
With the advent of machine learning tools and large language models (LLMs), the collection of measurements related to social science constructs (e.g., personality traits, political attitudes, human values) has become easier, faster and more affordable. These measurements are subsequently used for modelling of societal and group processes that social scientists typically engage in, where...
Item functioning is typically evaluated through pilot studies to identify problematic items and assess their performance. However, such analyses often fail to provide insights into the underlying causes of these problems. To address this gap, alternative strategies such as psychophysiological measures, including eye movements, may offer valuable insights into participants’ response processes....
Despite the popularity of structural equation modeling (SEM), investigating the fit of SEM models is still challenging—especially, if the global model fit evaluation implies non-negligible misfit, and researchers need to further investigate the type and severity of the misspecification in their model. Being overwhelmed by poorly fitting models, researchers sometimes strain the interpretation...
Practitioners and researchers use open-ended questions when designing survey questions to obtain validity evidence of response processes to survey items and questions. Data cleaning and response coding pose significant challenges, particularly for “web probes,” given the self-administered nature of “web probes” and the large number of participants compared to the smaller number of people...
Meta-analytic structural equation modeling (MASEM), originally referred to as model-based meta-analysis, involves testing structural equation models on meta-analytic data. The technique is being applied in a broad range of fields, including education, psychology, environmental research, information security, medicine, and ecology. In this talk I will outline various methods that can be used to...
In survey research, especially under unsupervised online conditions, careless responding—also referred to as insufficient effort responding—remains a significant threat to data quality. When respondents fail to engage meaningfully with questionnaire content, the resulting bias can weaken psychometric properties, distort correlations, and lead to erroneous conclusions. Recent estimates place...
Introduction. Experience sampling methods (ESM) are an increasingly popular strategy for studying affective processes (i.e., mood and emotions). In these studies, the emotional state of one or more individuals is measured several times a day during multiple days or weeks. A unique feature of these studies is the spacing of observations: measurements are frequent during waking hours but...
Questionnaires are a cornerstone of scientific research for measuring non-cognitive constructs. However, low motivation to complete them can lead to improper responses and compromise the validity of the conclusions drawn (e.g., Maniaci & Rogge, 2014; Podsakoff et al., 2012), especially when using unsupervised online formats (e.g., Kroehne et al., 2020). One specific factor related to...
The findings of a collection of studies addressing a common research question can be visualized in terms of a forest plot, showing the effect sizes of the individual studies together with a corresponding confidence interval. A four-sided polygon (sometimes called a summary ‘diamond’) is often added to such a plot to depict the results from a meta-analysis pooling together the effect sizes,...
A popular cost-effective way of collecting longitudinal data is the accelerated longitudinal design (ALD). In ALDs, participants from different cohorts are measured repeatedly but the measures provided by each participant cover only a fraction of the time range of the study. It is then assumed that the common trajectory can be studied by aggregating the information provided by the different...
Unmotivated responses, identified using response times (as rapid guessing in cognitive tests, Wise & Kong, 2005; or as rapid responding in questionnaires, as part of the careless and insufficient effort responding, C/IER), are a known threat to validity (e.g. Wise, 2017). It is known from the literature that unmotivated response behavior occurs more frequently in low-stakes assessments (Wise...
Effect sizes are commonly used in meta-analysis, as they provide a tool to summarize the results from each primary study in a common metric. In psychology and related fields, meta-analyses often involve integrating continuous variables measured with different scales across studies, which leads to using standardized mean differences as the effect size index. One of these indices is the...
The standardized mean change is widely recognized as a key effect size index in pretest-posttest one-group designs with quantitative dependent variables. Different parametric versions of this index are available, depending on the standardizer used to scale the mean difference into standardized units. In addition, various estimators can be applied to each parameter. This study used a Monte...
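As a rough illustration of how the choice of standardizer changes the index, the sketch below computes the standardized mean change for one group in two ways: dividing the mean pretest-posttest difference by the pretest standard deviation, and dividing it by the standard deviation of the change scores. These two standardizers are common choices in the literature, but they are assumed here for illustration and are not necessarily the versions the study compared. The function name `standardized_mean_change` and the data are hypothetical, and no small-sample bias correction is applied.

```python
from statistics import mean, stdev


def standardized_mean_change(pre, post):
    """Return two versions of the standardized mean change for paired scores.

    d_pre    : mean change divided by the pretest SD
    d_change : mean change divided by the SD of the change scores
    Both use the sample SD (n - 1 denominator); no bias correction is applied.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("pre and post must be paired and contain at least 2 scores")
    diffs = [b - a for a, b in zip(pre, post)]
    mean_change = mean(diffs)
    return {
        "d_pre": mean_change / stdev(pre),
        "d_change": mean_change / stdev(diffs),
    }


# Hypothetical pretest and posttest scores for five participants.
pre = [10, 12, 14, 16, 18]
post = [12, 15, 15, 19, 19]
result = standardized_mean_change(pre, post)
print(result)
```

With these made-up scores the two standardizers give clearly different values for the same mean change, which is why the parametric version being estimated needs to be stated explicitly.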
Self-report surveys often suffer from careless and insufficient effort responding (C/IER), which refers to responses provided without paying attention to the items’ content. Mixture modeling approaches are promising tools to assess C/IER by means of latent class variables. However, evidence for the validity of interpreting the latent class variable as C/IER is still pending. To shed more light...
State-space models (SSMs) provide a powerful framework for modeling dynamic systems, capturing both intra-individual and inter-individual variability in longitudinal data. In the context of cognitive development research, one interesting feature of SSMs is their ability to model deviations, or “shocks,”in individual trajectories. Such shocks may signal atypical changes that could be considered...
Disengaged test-taking behavior is a problem in low-stakes assessments. To account for low engagement, popular approaches rely on item response times to classify responses as disengaged (rapid guessing) or inconspicuous (engaged). Although conceptually elegant, this binary classification has been found to miss a substantial proportion of disengaged responses. This paper introduces an extended...
In meta-analysis, the Q statistic is traditionally used to test the hypothesis of homogeneity of the parametric effect sizes of the set of studies. Several criticisms have been raised against that test, especially when it is applied to the standardized mean difference (g). Among them are that the weights are based on estimated, not true, variances, that the variances of the estimates correlate with the...
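For readers unfamiliar with the statistic, a minimal sketch of the conventional Q computation follows, assuming inverse-variance weights built from the estimated sampling variances (the feature the criticisms above refer to). The function name and the example values are hypothetical. Under homogeneity, Q is usually referred to a chi-square distribution with k - 1 degrees of freedom; that step is omitted so the sketch needs only the standard library.

```python
from typing import Sequence, Tuple


def cochran_q(effects: Sequence[float],
              variances: Sequence[float]) -> Tuple[float, int, float]:
    """Return (Q, df, pooled) for k study effect sizes.

    Weights are the inverses of the *estimated* sampling variances,
    which is the conventional (and criticised) choice.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need >= 2 effects with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    # Inverse-variance weighted mean effect.
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Weighted sum of squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1, pooled


# Hypothetical effect sizes and estimated variances for three studies.
q, df, pooled = cochran_q([0.2, 0.5, 0.8], [0.04, 0.04, 0.04])
print(round(q, 3), df, round(pooled, 3))
```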
One of the key questions in longitudinal research is when to take measurements of the variables of interest. Panel studies usually focus on the dynamics between two processes over time (e.g., depressive symptoms and self-esteem), and include few repeated measures (<10). This forces researchers to find the most efficient way to design their study and collect their data. Recently introduced in...
In reading comprehension tests, test-takers can choose to reread the text of the task while working on an item. So far, it is not well understood how rereading the text relates to test performance and its measurement. To close this gap, the aim of the present study was to investigate the relationship between text rereads on the one hand and item parameters of item response models and test...
Underpowered studies are ubiquitous in psychology and related disciplines. Meta-analysis can help alleviate this problem, increasing the statistical power by combining the results of a set of primary studies. However, this is not necessarily true when we use a random-effects model, which is currently the predominant approach when carrying out meta-analyses. In this study, we examined the...
Differential effect analysis can reveal the preconditions for effective interventions by highlighting variations in intervention outcomes. The growing use of digital tools, such as learning apps, provides rich process data on response times and response behavior, offering insights into how participants interact with these apps. We use this information source and bridge psychometric research on...
Recent research has identified several limitations in traditional methods for conducting meta-analyses of reliability generalization, such as the lack of equivalence between total and subscale reliability indices and the violation of error independence assumptions. In response, multivariate statistical techniques have been developed to offer more accurate estimations of measurement...
Meta-analysis is a statistical methodology for synthesizing findings across multiple studies. However, publication bias is arguably one of the most important threats to the validity of a meta-analysis. One major consequence of publication bias is overestimation of the meta-analytic effect size. To address this, various methods have been developed to correct for publication bias in a...
Large Language Models (LLMs) have shown promise in text clustering and dimensionality analysis through embeddings, yet their potential for optimization remains largely unexplored. We conducted a comprehensive simulation study to enhance the accuracy of LLM embeddings in trait mapping using Dynamic Exploratory Graph Analysis (Dynamic EGA). The simulation generated 200 items across 4 traits of...
Background. Careless responding (CR) occurs when individuals do not pay adequate attention to item content. Research has shown that CR introduces bias and compromises data quality (Podsakoff et al., 2012), highlighting the need for effective prevention and management strategies (e.g., Arthur et al., 2021; Edwards, 2019; Ward & Meade, 2022). Different methods have been proposed to detect CR,...
The rapid advancement of large language models (LLMs) has enabled automated psychological scale development, yet questions remain about the correspondence between in silico and human-gathered validation. This study examines whether structural validity metrics computed during automated item development match empirical validation results. Using AI-GENIE (Automatic Item Generation and Validation...
Background. Careless and insufficient effort responding (C/IER) occurs when respondents fail to give sufficient attention to item content, which leads to poor-quality data (Podsakoff et al., 2012). There are several methods to detect this phenomenon, one being Instructed Response Items (IRI), valued for its simplicity, robust metric properties, and ability to identify different C/IER patterns...
The construction of forced-choice questionnaires often relies on item banks with single-stimulus or Likert-type items. In its simplest form, items must be paired to create a desired number of blocks. A key challenge in this process is pairing items while accounting for factors such as item polarity and social desirability, which can impact the quality of the measures. Recent combinatorial...
Background. To prevent response styles associated with the use of rating scales, test items may be presented in so-called ipsative (or relative to self) formats including popular ‘forced choice’, and also ‘graded preferences’ or ‘proportions-of-total’. Like any other questionnaires, ipsative questionnaires can be subject to careless responding when respondents are not sufficiently motivated to...
Background. Careless and insufficient effort responding (C/IER) on self-report measures produces responses that fail to accurately reflect the trait being measured, posing a major threat to the quality and validity of survey data. While detecting C/IER is vital to ensure validity of conclusions drawn from self-report data, it is a non-trivial endeavor, with each detection method involving...
Parallel to the development of new technologies, computational language models have emerged as automated tools for analyzing semantic relationships between linguistic units. Due to their success in performing human-like tasks, such as vocabulary tests and sentiment analysis, interest in the practical applications of these models has grown exponentially, resulting in the development of larger...
The field of mixed methods research continues to evolve, pushing the boundaries of methodological innovation to address complex and multifaceted research problems. This keynote address introduces the Integrated Mixed Methods Transformation Approach (IMMTA) as a meta-framework that systematically transforms monomethod research designs into fully integrated mixed methods research approaches....
Meta-analysis is a statistical technique that combines effect sizes from independent primary studies on the same topic, and is currently seen as the “gold standard” for synthesizing and summarizing results from multiple primary studies. The main research objectives of a meta-analysis are (i) estimating the average effect, (ii) assessing the heterogeneity of true effect sizes, and if the true...
Self-report remains one of the most commonly used methods for gathering information about individuals due to its simplicity and low cost. However, it is highly susceptible to biases, including faking, which involves portraying oneself more positively than one truly is. One key methodological approach to identifying faking is overclaiming—exaggerated self-reports of competence that tend to...