Matthias von Davier

ISBN 978-0387329161
  • The Role of International Large-Scale Assessments: Perspectives from Technology, Economy, and Educational Research (2012) ISBN 978-9400797116
  • Handbook of International Large-Scale Assessment: Background, Technical Issues, and Methods of Data Analysis (2013) ISBN 978-1439895122
  • Advancing Human Assessment: The Methodological, Psychological and Policy Contributions of ETS (2017) ISBN 978-3319586878
  • Handbook of Diagnostic Classification Models: Models and Model Extensions, Applications, Software Packages (2019) ISBN 978-3030055837
  • Advancing Natural Language Processing in Educational Assessment (2023) ISBN 978-1032244525
  • Selected articles

    Related Research Articles

    Psychometrics is a field of study within psychology concerned with the theory and technique of measurement. Psychometrics generally covers specialized fields within psychology and education devoted to testing, measurement, assessment, and related activities. Psychometrics is concerned with the objective measurement of latent constructs that cannot be directly observed. Examples of latent constructs include intelligence, introversion, mental disorders, and educational achievement. The levels of individuals on nonobservable latent variables are inferred through mathematical modeling based on what is observed from individuals' responses to items on tests and scales.

    Educational Testing Service: Educational testing and assessment organization

    Educational Testing Service (ETS), founded in 1947, is the world's largest private educational testing and assessment organization. It is headquartered in Lawrence Township, New Jersey, but has a Princeton address.

    In psychometrics, item response theory (IRT) is a paradigm for the design, analysis, and scoring of tests, questionnaires, and similar instruments measuring abilities, attitudes, or other variables. It is a theory of testing based on the relationship between individuals' performances on a test item and the test takers' levels of performance on an overall measure of the ability that item was designed to measure. Several different statistical models are used to represent both item and test taker characteristics. Unlike simpler alternatives for creating scales and evaluating questionnaire responses, it does not assume that each item is equally difficult. This distinguishes IRT from, for instance, Likert scaling, in which "All items are assumed to be replications of each other or in other words items are considered to be parallel instruments". By contrast, item response theory treats the difficulty of each item as information to be incorporated in scaling items.
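As a concrete illustration of how IRT incorporates item difficulty, the two-parameter logistic (2PL) model expresses the probability of a correct response as a function of the test taker's ability θ, the item's discrimination a, and its difficulty b. A minimal sketch in Python (the item parameters below are invented for illustration):

```python
import math

def icc_2pl(theta, a, b):
    """Item characteristic curve of the two-parameter logistic (2PL)
    IRT model: P(correct) = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# An easy item (b = -1) versus a hard item (b = +1), same discrimination:
for theta in (-2.0, 0.0, 2.0):
    p_easy = icc_2pl(theta, a=1.0, b=-1.0)
    p_hard = icc_2pl(theta, a=1.0, b=1.0)
    print(f"theta={theta:+.1f}  P(easy)={p_easy:.2f}  P(hard)={p_hard:.2f}")
```

Unlike a Likert-style sum score, the same response pattern contributes differently to the ability estimate depending on which items were answered correctly.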

    Likert scale: Psychometric measurement scale

    A Likert scale is a psychometric scale named after its inventor, American social psychologist Rensis Likert, which is commonly used in research questionnaires. It is the most widely used approach to scaling responses in survey research, such that the term is often used interchangeably with rating scale, although there are other types of rating scales.

    Questionnaire: Series of questions for gathering information

    A questionnaire is a research instrument that consists of a set of questions for the purpose of gathering information from respondents through a survey or statistical study. A research questionnaire is typically a mix of closed-ended and open-ended questions. Open-ended, long-form questions allow the respondent to elaborate on their thoughts. The research questionnaire was developed by the Statistical Society of London in 1838.

    Programme for International Student Assessment: Scholastic performance study by the OECD

    The Programme for International Student Assessment (PISA) is a worldwide study by the Organisation for Economic Co-operation and Development (OECD) in member and non-member nations intended to evaluate educational systems by measuring 15-year-old school pupils' scholastic performance on mathematics, science, and reading. It was first performed in 2000 and then repeated every three years. Its aim is to provide comparable data with a view to enabling countries to improve their education policies and outcomes. It measures problem solving and cognition.

    The Rasch model, named after Georg Rasch, is a psychometric model for analyzing categorical data, such as answers to questions on a reading assessment or questionnaire responses, as a function of the trade-off between the respondent's abilities, attitudes, or personality traits and the item difficulty. For example, it may be used to estimate a student's reading ability or the extremity of a person's attitude to capital punishment from responses on a questionnaire. In addition to psychometrics and educational research, the Rasch model and its extensions are used in other areas, including the health professions, agriculture, and market research.
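The trade-off between ability and difficulty in the Rasch model can be made explicit: for a dichotomous item, the probability of a correct response depends only on the difference between person ability θ and item difficulty b. A minimal sketch, with illustrative parameter values:

```python
import math

def rasch_prob(theta, b):
    """Dichotomous Rasch model: P(correct) = exp(theta - b) / (1 + exp(theta - b)),
    which depends only on the gap between ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Raising ability and difficulty by the same amount leaves the
# probability unchanged; only the gap matters.
print(rasch_prob(0.5, 0.0))   # ability exceeds difficulty by 0.5
print(rasch_prob(1.5, 1.0))   # same gap, same probability
```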

    The polytomous Rasch model is a generalization of the dichotomous Rasch model. It is a measurement model that has potential application in any context in which the objective is to measure a trait or ability through a process in which responses to items are scored with successive integers. For example, the model is applicable to the use of Likert scales, rating scales, and to educational assessment items for which successively higher integer scores are intended to indicate increasing levels of competence or attainment.
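One common parameterisation of the polytomous Rasch model, the partial credit model, assigns each integer score x a probability proportional to exp of the accumulated sum of (θ − δ_k) over the thresholds k ≤ x. A minimal sketch (the threshold values below are invented for illustration):

```python
import math

def pcm_probs(theta, deltas):
    """Partial-credit (polytomous Rasch) category probabilities.
    deltas[k] is the threshold between categories k and k+1; a score of x
    accumulates sum_{k<=x} (theta - delta_k) in the log-numerator, with the
    empty sum (score 0) equal to zero."""
    cumulative = [0.0]
    for d in deltas:
        cumulative.append(cumulative[-1] + (theta - d))
    exps = [math.exp(c) for c in cumulative]
    total = sum(exps)
    return [e / total for e in exps]

# A 3-category item (scores 0, 1, 2) with thresholds at -0.5 and +0.5:
probs = pcm_probs(theta=0.0, deltas=[-0.5, 0.5])
print([round(p, 3) for p in probs])  # category probabilities sum to 1
```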

    Estimation of a Rasch model is the process of estimating the model's parameters from matrices of response data. The most common approaches are types of maximum likelihood estimation, such as joint and conditional maximum likelihood estimation. Joint maximum likelihood (JML) equations are efficient but inconsistent for a finite number of items, whereas conditional maximum likelihood (CML) equations give consistent and unbiased item estimates. Person estimates are generally thought to carry bias, although weighted likelihood estimation methods for the person parameters reduce it.
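CML works by conditioning on each person's raw score, which removes the person parameters from the likelihood; the normalising constants that result are the elementary symmetric functions of the item "easiness" parameters, computable with a standard recursion. A minimal sketch with invented item difficulties:

```python
import math

def elementary_symmetric(eps):
    """Elementary symmetric functions gamma_r of item 'easiness' values
    eps_i = exp(-b_i). gamma_r sums the products of eps over all item
    subsets of size r; in CML these normalise the conditional likelihood
    of a response pattern given a raw score of r."""
    gamma = [1.0] + [0.0] * len(eps)
    for e in eps:
        # update in reverse so each item enters each product at most once
        for r in range(len(gamma) - 1, 0, -1):
            gamma[r] += e * gamma[r - 1]
    return gamma

# Three items with difficulties -1, 0, +1:
eps = [math.exp(-b) for b in (-1.0, 0.0, 1.0)]
print([round(g, 3) for g in elementary_symmetric(eps)])
```

In a full CML routine these functions appear in both the conditional likelihood and its derivatives, and are recomputed at each Newton step when solving for the item difficulties.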

    Psychometric software refers to specialized programs used for the psychometric analysis of data obtained from tests, questionnaires, polls, or inventories that measure latent psychoeducational variables. Although some psychometric analyses can be conducted using general statistical software such as SPSS, most require dedicated tools designed specifically for psychometric purposes.

    Benjamin Drake Wright: American psychometrician (1926–2015)

    Benjamin Drake Wright was an American psychometrician. He is largely responsible for the widespread adoption of Georg Rasch's measurement principles and models. In the wake of what Rasch referred to as Wright's “almost unbelievable activity in this field” in the period from 1960 to 1972, Rasch's ideas entered the mainstream in high-stakes testing, professional certification and licensure examinations, and in research employing tests, surveys, and assessments across a range of fields. Wright's seminal contributions to measurement continued until 2001 and included articulation of philosophical principles, production of practical results and applications, software development, development of estimation methods and model fit statistics, vigorous support for students and colleagues, and the founding of professional societies and new publications.

    Educational measurement refers to the use of educational assessments and the analysis of data such as scores obtained from educational assessments to infer the abilities and proficiencies of students. The approaches overlap with those in psychometrics. Educational measurement is the assigning of numerals to traits such as achievement, interest, attitudes, aptitudes, intelligence and performance.

    David Andrich is an Australian academic and assessment specialist. He has made substantial contributions to quantitative social science, including seminal work on the polytomous Rasch model for measurement, which is used in the social sciences, in health, and in other areas.

    Klaus Kubinger

    Klaus D. Kubinger is a psychologist and statistician, and professor of psychological assessment at the University of Vienna's Faculty of Psychology. His main research focuses on fundamental research into assessment processes and on the application and advancement of item response theory models. He is also known as an author of textbooks on psychological assessment and on statistics.

    The Mokken scale is a psychometric method of data reduction. A Mokken scale is a unidimensional scale that consists of hierarchically-ordered items that measure the same underlying, latent concept. This method is named after the political scientist Rob Mokken who suggested it in 1971.
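Mokken scaling evaluates whether items form such a hierarchy using Loevinger's scalability coefficient H, which for an item pair compares the observed covariance to its maximum given the item marginals (H = 1 means no Guttman errors, H = 0 means independence). A minimal sketch for one item pair, with toy response data:

```python
import numpy as np

def item_pair_h(x, y):
    """Loevinger's H for a pair of dichotomous items (0/1 arrays):
    observed covariance divided by the maximum covariance attainable
    given the two items' proportions correct."""
    p_x, p_y = x.mean(), y.mean()
    cov = np.mean(x * y) - p_x * p_y
    cov_max = min(p_x, p_y) - p_x * p_y
    return cov / cov_max

# Perfectly hierarchical (Guttman) data: everyone who passes the harder
# item also passes the easier one, so H should equal 1.
easy = np.array([1, 1, 1, 1, 0, 0])
hard = np.array([1, 1, 0, 0, 0, 0])
print(item_pair_h(easy, hard))
```

In practice, Mokken analysis aggregates such pairwise coefficients into item- and scale-level H values and applies rules of thumb (for example, H ≥ 0.3) to judge scalability.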

    Computational psychometrics is an interdisciplinary field fusing theory-based psychometrics, learning and cognitive sciences, and data-driven AI-based computational models as applied to large-scale/high-dimensional learning, assessment, biometric, or psychological data. Computational psychometrics is frequently concerned with providing actionable and meaningful feedback to individuals based on measurement and analysis of individual differences as they pertain to specific areas of enquiry.

    Automatic item generation (AIG), or automated item generation, is a process linking psychometrics with computer programming. It uses a computer algorithm to automatically create test items that are the basic building blocks of a psychological test. The method was first described by John R. Bormuth in the 1960s but was not developed until recently. AIG uses a two-step process: first, a test specialist creates a template called an item model; then, a computer algorithm is developed to generate test items. So, instead of a test specialist writing each individual item, computer algorithms generate families of items from a smaller set of parent item models. More recently, neural networks, including Large Language Models, such as the GPT family, have been used successfully for generating items automatically.
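The two-step AIG workflow described above can be sketched in a few lines: an item model (a template with variable slots) plus an algorithm that instantiates it over ranges of slot values. The template, slot names, and value ranges below are invented for demonstration:

```python
import itertools

# Step 1: a test specialist writes an item model (template with slots).
ITEM_MODEL = "A train travels {speed} km/h for {hours} hours. How far does it go?"

def generate_items(speeds, hours):
    """Step 2: an algorithm instantiates the item model for every
    combination of slot values, computing the key (correct answer)
    alongside each generated stem."""
    items = []
    for s, h in itertools.product(speeds, hours):
        stem = ITEM_MODEL.format(speed=s, hours=h)
        items.append({"stem": stem, "key": s * h})
    return items

# One parent item model yields a family of items:
family = generate_items(speeds=[60, 80], hours=[2, 3])
for item in family:
    print(item["stem"], "->", item["key"])
```

Neural approaches replace the hand-written template with a trained generator, but the underlying idea of producing item families from a compact specification is the same.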

    Randy Elliot Bennett is an American educational researcher who specializes in educational assessment. He is currently the Norman O. Frederiksen Chair in Assessment Innovation at Educational Testing Service in Princeton, NJ. His research and writing focus on bringing together advances in cognitive science, technology, and measurement to improve teaching and learning. He received the ETS Senior Scientist Award in 1996, the ETS Career Achievement Award in 2005, the Teachers College, Columbia University Distinguished Alumni Award in 2016, Fellow status in the American Educational Research Association (AERA) in 2017, the National Council on Measurement in Education's (NCME) Bradley Hanson Award for Contributions to Educational Measurement in 2019, the E. F. Lindquist Award from AERA and ACT in 2020, elected membership in the National Academy of Education in 2022, and the AERA Cognition and Assessment Special Interest Group Outstanding Contribution to Research in Cognition and Assessment Award in 2024. Randy Bennett was elected President of both the International Association for Educational Assessment (IAEA), a worldwide organization primarily constituted of governmental and NGO measurement organizations, and the National Council on Measurement in Education (NCME), whose members are employed in universities, testing organizations, state and federal education departments, and school districts.

    Mark Daniel Reckase is an educational psychologist and expert on quantitative methods and measurement who is known for his work on computerized adaptive testing, multidimensional item response theory, and standard setting in educational and psychological tests. Reckase is University Distinguished Professor Emeritus in the College of Education at Michigan State University.

    Jacqueline P. Leighton is a Canadian-Chilean educational psychologist, academic and author. She is a full professor in the Faculty of Education as well as vice-dean of Faculty Development and Faculty Affairs at the University of Alberta.

    Occupation(s): Psychometrician, academic, inventor, and author
    Awards: ETS Scientist Award, Educational Testing Service (2006);
    Bradley Hanson Award for Contributions to Educational Measurement, National Council on Measurement in Education (2012);
    Award for Significant Contribution to Educational Measurement and Research Methodology, American Educational Research Association (AERA) (2017)
    Academic background
    Education: Master's in Psychology; Dr. rer. nat.
    Alma mater: Kiel University