Mark Reckase

  • Reckase, Mark (2022). The Psychometrics of Standard Setting: Connecting Policy and Test Scores. Boca Raton. ISBN 978-0-429-53098-2. OCLC 1370241241.
  • Reckase, Mark (2009). Multidimensional Item Response Theory. New York: Springer. ISBN 978-0-387-89976-3. OCLC 437345698.

Representative publications

    • Reckase, M. (1979). Unifactor latent trait models applied to multifactor tests: Results and implications. Journal of Educational Statistics, 4, 207–230.
    • Reckase, M. (1985). The difficulty of test items that measure more than one ability. Applied Psychological Measurement, 9, 401–412.
    • Reckase, M. (1997). The past and future of multidimensional item response theory. Applied Psychological Measurement, 21, 25–36.
    • Reckase, M. (2006). A conceptual framework for a psychometric theory for standard setting with examples of its use for evaluating the functioning of two standard setting methods. Educational Measurement: Issues and Practice, 25, 4–18.
    • Reckase, M. D., & McKinley, R. L. (1991). The discriminating power of items that measure more than one dimension. Applied Psychological Measurement, 15(4), 361–373.

    Related Research Articles

    <span class="mw-page-title-main">Psychological statistics</span>

    Psychological statistics is the application of formulas, theorems, numbers, and laws to psychology. Statistical methods for psychology include the development and application of statistical theory and methods for modeling psychological data. These methods include psychometrics, factor analysis, experimental design, and Bayesian statistics.

    <span class="mw-page-title-main">Psychometrics</span> Theory and technique of psychological measurement

    Psychometrics is a field of study within psychology concerned with the theory and technique of measurement. Psychometrics generally refers to specialized fields within psychology and education devoted to testing, measurement, assessment, and related activities. Psychometrics is concerned with the objective measurement of latent constructs that cannot be directly observed. Examples of latent constructs include intelligence, introversion, mental disorders, and educational achievement. The levels of individuals on nonobservable latent variables are inferred through mathematical modeling based on what is observed from individuals' responses to items on tests and scales.

    <span class="mw-page-title-main">Psychological testing</span> Administration of psychological tests

    Psychological testing is the administration of psychological tests. Psychological tests are administered by trained evaluators. A person's responses are evaluated according to carefully prescribed guidelines. Scores are thought to reflect individual or group differences in the construct the test purports to measure. The science behind psychological testing is psychometrics.

    In statistics and psychometrics, reliability is the overall consistency of a measure. A measure is said to have a high reliability if it produces similar results under consistent conditions:

    "It is the characteristic of a set of test scores that relates to the amount of random error from the measurement process that might be embedded in the scores. Scores that are highly reliable are precise, reproducible, and consistent from one testing occasion to another. That is, if the testing process were repeated with a group of test takers, essentially the same results would be obtained. Various kinds of reliability coefficients, with values ranging between 0.00 and 1.00, are usually used to indicate the amount of error in the scores."

    Validity is the extent to which a concept, conclusion, or measurement is well-founded and likely corresponds accurately to the real world. The word "valid" is derived from the Latin validus, meaning strong. The validity of a measurement tool is the degree to which the tool measures what it claims to measure. Validity is based on the strength of a collection of different types of evidence.

    Classical test theory (CTT) is a body of related psychometric theory that predicts outcomes of psychological testing such as the difficulty of items or the ability of test-takers. It is a theory of testing based on the idea that a person's observed or obtained score on a test is the sum of a true score (error-free score) and an error score. Generally speaking, the aim of classical test theory is to understand and improve the reliability of psychological tests.
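
    The additive decomposition at the heart of classical test theory can be made concrete with a small simulation; the sketch below (simulated true scores and errors with arbitrary variances, not any particular data set) shows reliability emerging as the ratio of true-score variance to observed-score variance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate true scores T and independent random errors E for 10,000 examinees.
true_scores = rng.normal(loc=50, scale=10, size=10_000)
errors = rng.normal(loc=0, scale=5, size=10_000)

# Classical test theory: observed score X = T + E.
observed_scores = true_scores + errors

# Reliability is the proportion of observed-score variance due to true scores.
reliability = true_scores.var() / observed_scores.var()
print(f"Reliability (var(T) / var(X)): {reliability:.2f}")  # roughly 10^2 / (10^2 + 5^2) = 0.80
```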

    In psychometrics, item response theory (IRT) is a paradigm for the design, analysis, and scoring of tests, questionnaires, and similar instruments measuring abilities, attitudes, or other variables. It is a theory of testing based on the relationship between individuals' performances on a test item and the test takers' levels of performance on an overall measure of the ability that item was designed to measure. Several different statistical models are used to represent both item and test taker characteristics. Unlike simpler alternatives for creating scales and evaluating questionnaire responses, it does not assume that each item is equally difficult. This distinguishes IRT from, for instance, Likert scaling, in which "All items are assumed to be replications of each other or in other words items are considered to be parallel instruments". By contrast, item response theory treats the difficulty of each item as information to be incorporated in scaling items.
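
    As one sketch of how an IRT model links item characteristics to ability, the two-parameter logistic model below (with arbitrary, purely illustrative item parameters) gives the probability of a correct response as a function of ability θ, item discrimination a, and item difficulty b:

```python
import math

def two_parameter_logistic(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the 2PL model.

    theta: examinee ability, a: item discrimination, b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# An easy item (b = -1) and a hard item (b = +1) evaluated for an average examinee.
for difficulty in (-1.0, 1.0):
    p = two_parameter_logistic(theta=0.0, a=1.2, b=difficulty)
    print(f"b = {difficulty:+.1f}: P(correct) = {p:.2f}")
```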

    <span class="mw-page-title-main">Raymond Cattell</span> British-American psychologist (1905–1998)

    Raymond Bernard Cattell was a British-American psychologist, known for his psychometric research into intrapersonal psychological structure. His work also explored the basic dimensions of personality and temperament, the range of cognitive abilities, the dynamic dimensions of motivation and emotion, the clinical dimensions of abnormal personality, patterns of group syntality and social behavior, applications of personality research to psychotherapy and learning theory, predictors of creativity and achievement, and many multivariate research methods including the refinement of factor analytic methods for exploring and measuring these domains. Cattell authored, co-authored, or edited almost 60 scholarly books, more than 500 research articles, and over 30 standardized psychometric tests, questionnaires, and rating scales. According to a widely cited ranking, Cattell was the 16th most eminent, 7th most cited in the scientific journal literature, and among the most productive psychologists of the 20th century. He was, however, a controversial figure, due in part to allegations of friendships with, and intellectual respect for, white supremacists and neo-Nazis.

    Computerized adaptive testing (CAT) is a form of computer-based test that adapts to the examinee's ability level. For this reason, it has also been called tailored testing. In other words, it is a form of computer-administered test in which the next item or set of items selected to be administered depends on the correctness of the test taker's responses to the most recent items administered.
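
    A minimal sketch of such an adaptive loop is given below; the item pool, the simulated examinee, and the crude fixed-step ability update are all invented for illustration, whereas operational CATs typically use maximum-information item selection with maximum-likelihood or Bayesian ability estimation:

```python
import math
import random

# Hypothetical item pool: (item id, difficulty). A real pool holds calibrated IRT parameters.
item_pool = [(i, b) for i, b in enumerate([-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0])]

def administer(true_ability: float, difficulty: float) -> bool:
    """Simulate a response from a Rasch-type model."""
    p_correct = 1.0 / (1.0 + math.exp(-(true_ability - difficulty)))
    return random.random() < p_correct

ability_estimate = 0.0        # start at the middle of the scale
true_ability = 0.8            # unknown in practice; used here only to simulate responses
administered: set[int] = set()

for step in range(5):
    # Select the unused item whose difficulty is closest to the current estimate.
    item_id, b = min((it for it in item_pool if it[0] not in administered),
                     key=lambda it: abs(it[1] - ability_estimate))
    administered.add(item_id)
    correct = administer(true_ability, b)
    # Crude fixed-step update: raise the estimate after a correct response, lower it otherwise.
    ability_estimate += 0.5 * (1 if correct else -1)
    print(f"step {step}: item {item_id} (b={b:+.1f}) "
          f"{'correct' if correct else 'incorrect'}, estimate={ability_estimate:+.2f}")
```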

    The Rasch model, named after Georg Rasch, is a psychometric model for analyzing categorical data, such as answers to questions on a reading assessment or questionnaire responses, as a function of the trade-off between the respondent's abilities, attitudes, or personality traits, and the item difficulty. For example, they may be used to estimate a student's reading ability or the extremity of a person's attitude to capital punishment from responses on a questionnaire. In addition to psychometrics and educational research, the Rasch model and its extensions are used in other areas, including the health professions, agriculture, and market research.
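
    The trade-off between person ability and item difficulty can be written out directly; the sketch below uses arbitrary ability values only to show that the success probability is exactly 0.5 when ability equals difficulty and that each additional unit of ability adds one logit to the log-odds of success:

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Rasch model: P(correct) = exp(theta - b) / (1 + exp(theta - b))."""
    return math.exp(ability - difficulty) / (1.0 + math.exp(ability - difficulty))

# Probabilities for an item of difficulty b = 0 across a range of abilities.
for ability in (-1.0, 0.0, 1.0, 2.0):
    print(f"theta = {ability:+.1f}, b = 0.0 -> P = {rasch_probability(ability, 0.0):.2f}")
```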

    <span class="mw-page-title-main">Quantitative psychology</span> Field of scientific study

    Quantitative psychology is a field of scientific study that focuses on the mathematical modeling, research design and methodology, and statistical analysis of psychological processes. It includes tests and other devices for measuring cognitive abilities. Quantitative psychologists develop and analyze a wide variety of research methods, including those of psychometrics, a field concerned with the theory and technique of psychological measurement.

    A latent variable model is a statistical model that relates a set of observable variables to a set of latent variables.

    A computerized classification test (CCT) refers to, as its name would suggest, a test that is administered by computer for the purpose of classifying examinees. The most common CCT is a mastery test where the test classifies examinees as "Pass" or "Fail," but the term also includes tests that classify examinees into more than two categories. While the term may generally be considered to refer to all computer-administered tests for classification, it is usually used to refer to tests that are interactively administered or of variable-length, similar to computerized adaptive testing (CAT). Like CAT, variable-length CCTs can accomplish the goal of the test with a fraction of the number of items used in a conventional fixed-form test.
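
    One common way to make a variable-length classification decision is Wald's sequential probability ratio test; the sketch below (with invented cut points, error rates, and responses, and a Rasch response model) stops administering items as soon as the accumulated evidence clears a pass or fail threshold:

```python
import math

def rasch_p(theta: float, b: float) -> float:
    """Probability of a correct response under the Rasch model."""
    return math.exp(theta - b) / (1.0 + math.exp(theta - b))

def sprt_classify(responses, difficulties, theta_fail=-0.5, theta_pass=0.5,
                  alpha=0.10, beta=0.10):
    """Variable-length pass/fail decision via a sequential probability ratio test."""
    upper = math.log((1 - beta) / alpha)    # evidence threshold for "Pass"
    lower = math.log(beta / (1 - alpha))    # evidence threshold for "Fail"
    log_ratio = 0.0
    for x, b in zip(responses, difficulties):
        p_pass, p_fail = rasch_p(theta_pass, b), rasch_p(theta_fail, b)
        # Accumulate the log-likelihood ratio of the "pass" vs. "fail" ability points.
        log_ratio += math.log(p_pass / p_fail) if x else math.log((1 - p_pass) / (1 - p_fail))
        if log_ratio >= upper:
            return "Pass"
        if log_ratio <= lower:
            return "Fail"
    return "Undecided (administer more items)"

print(sprt_classify(responses=[1, 1, 0, 1, 1, 1, 1], difficulties=[0.0] * 7))
```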

    Test equating traditionally refers to the statistical process of determining comparable scores on different forms of an exam. It can be accomplished using either classical test theory or item response theory.
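
    A simple linear (mean-sigma) equating under classical test theory can be sketched as follows; the score distributions are invented and serve only to show how a Form X score is placed on the Form Y scale:

```python
import numpy as np

# Hypothetical observed scores on two forms of the same exam (not real data).
form_x = np.array([20, 25, 30, 35, 40], dtype=float)   # new form
form_y = np.array([22, 28, 33, 39, 43], dtype=float)   # reference form

# Linear (mean-sigma) equating: match the mean and standard deviation of Form X
# scores to the Form Y scale: y = (sigma_y / sigma_x) * (x - mean_x) + mean_y.
slope = form_y.std() / form_x.std()
intercept = form_y.mean() - slope * form_x.mean()

raw_score_on_x = 32.0
equated_score = slope * raw_score_on_x + intercept
print(f"Form X score {raw_score_on_x} ~ Form Y score {equated_score:.1f}")
```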

    Differential item functioning (DIF) is a statistical characteristic of an item that shows the extent to which the item might be measuring different abilities for members of separate subgroups. Average item scores for subgroups having the same overall score on the test are compared to determine whether the item is measuring in essentially the same way for all subgroups. The presence of DIF requires review and judgment, and it does not necessarily indicate the presence of bias. DIF analysis provides an indication of unexpected behavior of items on a test. An item does not display DIF simply because people from different groups have different probabilities of giving a certain response; it displays DIF if and only if people from different groups with the same underlying true ability have a different probability of giving a certain response. Common procedures for assessing DIF are Mantel-Haenszel, item response theory (IRT) based methods, and logistic regression.
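
    A minimal sketch of the Mantel-Haenszel procedure is shown below; the stratified counts are invented, with each stratum matching examinees on total test score, and a common odds ratio near 1.0 indicating little evidence of DIF:

```python
# Minimal Mantel-Haenszel sketch for DIF on a single item. Within each stratum
# (examinees matched on total test score), (correct, incorrect) counts are given
# for the reference and focal groups. All counts here are invented.
strata = [
    # (ref_correct, ref_incorrect, focal_correct, focal_incorrect)
    (30, 20, 28, 22),
    (45, 15, 43, 17),
    (55, 5, 54, 6),
]

numerator = 0.0
denominator = 0.0
for ref_c, ref_i, foc_c, foc_i in strata:
    n = ref_c + ref_i + foc_c + foc_i
    numerator += ref_c * foc_i / n
    denominator += ref_i * foc_c / n

# The Mantel-Haenszel common odds ratio; values near 1.0 suggest the item
# functions similarly for both groups once overall ability is matched.
odds_ratio_mh = numerator / denominator
print(f"Mantel-Haenszel common odds ratio: {odds_ratio_mh:.2f}")
```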

    Nambury S. Raju was an American psychology professor known for his work in psychometrics, meta-analysis, and utility theory. He was a Fellow of the Society for Industrial and Organizational Psychology.

    Psychometric software is software that is used for psychometric analysis of data from tests, questionnaires, or inventories reflecting latent psychoeducational variables. While some psychometric analyses can be performed with standard statistical software like SPSS, most analyses require specialized tools.

    Li Cai is a statistician and quantitative psychologist. He is a professor of Advanced Quantitative Methodology at the UCLA Graduate School of Education and Information Studies with a joint appointment in the quantitative area of the UCLA Department of Psychology. He is also Director of the National Center for Research on Evaluation, Standards, and Student Testing, and Managing Partner at Vector Psychometric Group.

    Computational psychometrics is an interdisciplinary field fusing theory-based psychometrics, learning and cognitive sciences, and data-driven AI-based computational models as applied to large-scale/high-dimensional learning, assessment, biometric, or psychological data. Computational psychometrics is frequently concerned with providing actionable and meaningful feedback to individuals based on measurement and analysis of individual differences as they pertain to specific areas of enquiry.

    Alina Anca von Davier is a psychometrician and researcher in computational psychometrics, machine learning, and education. Von Davier is a researcher, innovator, and an executive leader with over 20 years of experience in EdTech and in the assessment industry. She is the Chief of Assessment at Duolingo, where she leads the Duolingo English Test research and development area. She is also the Founder and CEO of EdAstra Tech, a service-oriented EdTech company. In 2022, she joined the University of Oxford as an Honorary Research Fellow, and Carnegie Mellon University as a Senior Research Fellow.

Mark D. Reckase
Occupation: Professor Emeritus
Spouse: Charlene (Char) Repenty (died 2017)
Awards: NCME Award for Career Contributions to Educational Measurement (2016)
Academic background
Alma mater: University of Illinois at Urbana-Champaign; Syracuse University