| Mark D. Reckase | |
|---|---|
| Occupation | Professor Emeritus |
| Spouse | Charlene (Char) Repenty (died 2017) |
| Awards | NCME Award for Career Contributions to Educational Measurement (2016) |
| Academic background | |
| Alma mater | University of Illinois at Urbana-Champaign; Syracuse University |
| Academic work | |
| Institutions | Michigan State University |
Mark Daniel Reckase is an educational psychologist and expert on quantitative methods and measurement [1] [2] who is known for his work on computerized adaptive testing, [3] multidimensional item response theory, and standard setting in educational and psychological tests. [4] Reckase is University Distinguished Professor Emeritus in the College of Education at Michigan State University. [5]
2016: NCME Award for Career Contributions to Educational Measurement, National Council on Measurement in Education. [6]
2009: University Distinguished Professor, Michigan State University. [7]
Reckase graduated from the University of Illinois at Urbana-Champaign in 1966 with a B.S. in psychology. He earned his Ph.D. in psychology from Syracuse University in 1972, where his advisor was Eric F. Gardner. [8] His dissertation was titled Development and application of a multivariate logistic latent trait model.
Reckase began his career on the faculty of the University of Missouri in 1972. In 1981, he moved to a position as Assistant Vice President for Assessment Innovations at ACT. In 1998, he left ACT to join the faculty of the College of Education at Michigan State University. He retired in 2015. [9]
Reckase has served as President of the International Association for Computerized Adaptive Testing (2017) [10] and as President of the National Council on Measurement in Education (2008-2009). [11]
Reckase specializes in the development of educational and psychological assessments, particularly the psychometric theories underpinning the development of such tests. [9] He has worked in multidimensional item response theory since the beginning of his academic career: his dissertation concerned the early development of a multidimensional item response model, and he went on to write a book on the topic in 2009. [12] He has also branched out into other psychometric areas, such as standard setting. [13]
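A common compensatory form of the multidimensional two-parameter logistic model, as presented in the MIRT literature (the notation here follows general conventions rather than any single source), gives the probability of a correct response to item $i$ as a function of a vector of abilities $\boldsymbol{\theta}$:

```latex
P(U_i = 1 \mid \boldsymbol{\theta}) =
  \frac{\exp(\mathbf{a}_i^{\top}\boldsymbol{\theta} + d_i)}
       {1 + \exp(\mathbf{a}_i^{\top}\boldsymbol{\theta} + d_i)}
```

Here $\mathbf{a}_i$ is a vector of discrimination parameters and $d_i$ an intercept term; because the abilities combine additively, a high standing on one dimension can compensate for a low standing on another.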
Psychological statistics is the application of statistical theory, methods, and formulas to psychology. Statistical methods for psychology include the development and application of statistical theory and methods for modeling psychological data. These methods include psychometrics, factor analysis, experimental design, and Bayesian statistics.
Psychometrics is a field of study within psychology concerned with the theory and technique of measurement. Psychometrics generally refers to specialized fields within psychology and education devoted to testing, measurement, assessment, and related activities. Psychometrics is concerned with the objective measurement of latent constructs that cannot be directly observed. Examples of latent constructs include intelligence, introversion, mental disorders, and educational achievement. The levels of individuals on nonobservable latent variables are inferred through mathematical modeling based on what is observed from individuals' responses to items on tests and scales.
Psychological testing is the administration of psychological tests. Psychological tests are administered by trained evaluators. A person's responses are evaluated according to carefully prescribed guidelines. Scores are thought to reflect individual or group differences in the construct the test purports to measure. The science behind psychological testing is psychometrics.
In statistics and psychometrics, reliability is the overall consistency of a measure. A measure is said to have a high reliability if it produces similar results under consistent conditions:
"It is the characteristic of a set of test scores that relates to the amount of random error from the measurement process that might be embedded in the scores. Scores that are highly reliable are precise, reproducible, and consistent from one testing occasion to another. That is, if the testing process were repeated with a group of test takers, essentially the same results would be obtained. Various kinds of reliability coefficients, with values ranging between 0.00 and 1.00, are usually used to indicate the amount of error in the scores."
Validity is the extent to which a concept, conclusion, or measurement is well-founded and likely corresponds accurately to the real world. The word "valid" is derived from the Latin validus, meaning strong. The validity of a measurement tool is the degree to which the tool measures what it claims to measure. Validity is based on the strength of a collection of different types of evidence.
Classical test theory (CTT) is a body of related psychometric theory that predicts outcomes of psychological testing such as the difficulty of items or the ability of test-takers. It is a theory of testing based on the idea that a person's observed or obtained score on a test is the sum of a true score (error-free score) and an error score. Generally speaking, the aim of classical test theory is to understand and improve the reliability of psychological tests.
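The core decomposition can be written compactly; assuming the error is uncorrelated with the true score (the standard CTT assumption), the variances add:

```latex
X = T + E, \qquad \sigma_X^2 = \sigma_T^2 + \sigma_E^2
```

The reliability coefficient above is then simply $\sigma_T^2 / \sigma_X^2$.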
In psychometrics, item response theory (IRT) is a paradigm for the design, analysis, and scoring of tests, questionnaires, and similar instruments measuring abilities, attitudes, or other variables. It is a theory of testing based on the relationship between individuals' performances on a test item and the test takers' levels of performance on an overall measure of the ability that item was designed to measure. Several different statistical models are used to represent both item and test taker characteristics. Unlike simpler alternatives for creating scales and evaluating questionnaire responses, it does not assume that each item is equally difficult. This distinguishes IRT from, for instance, Likert scaling, in which "All items are assumed to be replications of each other or in other words items are considered to be parallel instruments". By contrast, item response theory treats the difficulty of each item as information to be incorporated in scaling items.
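As a concrete example, the widely used three-parameter logistic (3PL) model expresses the probability of a correct response to item $i$ as a function of the latent ability $\theta$:

```latex
P_i(\theta) = c_i + (1 - c_i)\,
  \frac{1}{1 + \exp\bigl(-a_i(\theta - b_i)\bigr)}
```

Here $a_i$ is the item's discrimination, $b_i$ its difficulty, and $c_i$ a lower asymptote accommodating guessing; setting $c_i = 0$ yields the 2PL model, and additionally constraining all $a_i$ to be equal yields the Rasch/1PL form.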
Raymond Bernard Cattell was a British-American psychologist, known for his psychometric research into intrapersonal psychological structure. His work also explored the basic dimensions of personality and temperament, the range of cognitive abilities, the dynamic dimensions of motivation and emotion, the clinical dimensions of abnormal personality, patterns of group syntality and social behavior, applications of personality research to psychotherapy and learning theory, predictors of creativity and achievement, and many multivariate research methods, including the refinement of factor analytic methods for exploring and measuring these domains. Cattell authored, co-authored, or edited almost 60 scholarly books, more than 500 research articles, and over 30 standardized psychometric tests, questionnaires, and rating scales. According to a widely cited ranking, Cattell was the 16th most eminent, 7th most cited in the scientific journal literature, and among the most productive psychologists of the 20th century. He was, however, a controversial figure, due in part to alleged friendships with, and accusations of intellectual sympathy for, white supremacists and neo-Nazis.
Computerized adaptive testing (CAT) is a form of computer-based test that adapts to the examinee's ability level. For this reason, it has also been called tailored testing. In other words, it is a form of computer-administered test in which the next item or set of items selected to be administered depends on the correctness of the test taker's responses to the most recent items administered.
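A minimal sketch of the central CAT step, selecting the item that is most informative at the current ability estimate under a 2PL model (the function names and the 2PL choice are assumptions of this example; operational CATs add content balancing and exposure control):

```python
import numpy as np

def icc_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability level theta."""
    p = icc_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

def select_next_item(theta_hat, a, b, administered):
    """Pick the unadministered item that is most informative at theta_hat."""
    info = item_information(theta_hat, np.asarray(a), np.asarray(b))
    info[list(administered)] = -np.inf  # never repeat an item
    return int(np.argmax(info))

# Example: three items, item 0 already given, current ability estimate 0.5
print(select_next_item(0.5, a=[1.2, 0.8, 1.5], b=[0.0, 0.5, 1.0], administered={0}))
```

After each response, the ability estimate is updated (for example, by maximum likelihood) and the selection step repeats until a stopping rule is met.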
The Rasch model, named after Georg Rasch, is a psychometric model for analyzing categorical data, such as answers to questions on a reading assessment or questionnaire responses, as a function of the trade-off between the respondent's abilities, attitudes, or personality traits, and the item difficulty. For example, it may be used to estimate a student's reading ability or the extremity of a person's attitude toward capital punishment from responses on a questionnaire. In addition to psychometrics and educational research, the Rasch model and its extensions are used in other areas, including the health professions, agriculture, and market research.
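In its dichotomous form (standard notation), the model gives the probability that person $n$ answers item $i$ correctly as:

```latex
P(X_{ni} = 1) = \frac{\exp(\beta_n - \delta_i)}{1 + \exp(\beta_n - \delta_i)}
```

Here $\beta_n$ is the person's ability and $\delta_i$ the item's difficulty; the response probability depends only on their difference.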
Quantitative psychology is a field of scientific study that focuses on the mathematical modeling, research design and methodology, and statistical analysis of psychological processes. It includes tests and other devices for measuring cognitive abilities. Quantitative psychologists develop and analyze a wide variety of research methods, including those of psychometrics, a field concerned with the theory and technique of psychological measurement.
A latent variable model is a statistical model that relates a set of observable variables to a set of latent variables.
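A canonical example is the linear common factor model (standard notation, not tied to any particular source cited here), in which observed scores $\mathbf{x}$ are linear functions of latent factors $\boldsymbol{\xi}$ plus error:

```latex
\mathbf{x} = \boldsymbol{\mu} + \boldsymbol{\Lambda}\boldsymbol{\xi} + \boldsymbol{\varepsilon},
\qquad \mathbb{E}[\boldsymbol{\varepsilon}] = \mathbf{0},\;
\operatorname{Cov}(\boldsymbol{\xi}, \boldsymbol{\varepsilon}) = \mathbf{0}
```

Item response models are latent variable models of the same kind, with a nonlinear link between the latent variable and the observed responses.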
A computerized classification test (CCT) refers to, as its name would suggest, a test that is administered by computer for the purpose of classifying examinees. The most common CCT is a mastery test where the test classifies examinees as "Pass" or "Fail," but the term also includes tests that classify examinees into more than two categories. While the term may generally be considered to refer to all computer-administered tests for classification, it is usually used to refer to tests that are interactively administered or of variable-length, similar to computerized adaptive testing (CAT). Like CAT, variable-length CCTs can accomplish the goal of the test with a fraction of the number of items used in a conventional fixed-form test.
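One common engine for variable-length mastery decisions is the sequential probability ratio test (SPRT). The sketch below is illustrative only; the function name, the two-point hypotheses p0/p1, and the error rates are assumptions of this example, not a fixed standard:

```python
import math

def sprt_mastery(responses, p0=0.55, p1=0.75, alpha=0.05, beta=0.05):
    """Classify an examinee as pass/fail with a sequential probability ratio test.

    responses: sequence of 0/1 item scores
    p0: correct-response probability for a just-failing examinee
    p1: correct-response probability for a just-passing examinee
    """
    n = len(responses)
    correct = sum(responses)
    # Log-likelihood ratio of the "pass" hypothesis over the "fail" hypothesis
    llr = (correct * math.log(p1 / p0)
           + (n - correct) * math.log((1 - p1) / (1 - p0)))
    upper = math.log((1 - beta) / alpha)  # classify "pass" at or above this bound
    lower = math.log(beta / (1 - alpha))  # classify "fail" at or below this bound
    if llr >= upper:
        return "pass"
    if llr <= lower:
        return "fail"
    return "continue"  # keep administering items

# Example: 9 correct out of the 10 items administered so far
print(sprt_mastery([1, 1, 1, 0, 1, 1, 1, 1, 1, 1]))
```

Because testing stops as soon as a bound is crossed, examinees far from the cut score are classified after very few items, which is how variable-length CCTs achieve their efficiency.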
Test equating traditionally refers to the statistical process of determining comparable scores on different forms of an exam. It can be accomplished using either classical test theory or item response theory.
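Under classical test theory, for instance, linear equating maps a score $x$ on form X to the scale of form Y by matching means and standard deviations:

```latex
y(x) = \frac{\sigma_Y}{\sigma_X}\,(x - \mu_X) + \mu_Y
```

IRT-based equating instead places the item parameters of both forms on a common latent scale.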
Differential item functioning (DIF) is a statistical characteristic of an item that shows the extent to which the item might be measuring different abilities for members of separate subgroups. Average item scores for subgroups having the same overall score on the test are compared to determine whether the item is measuring in essentially the same way for all subgroups. The presence of DIF requires review and judgment, and it does not necessarily indicate the presence of bias. DIF analysis provides an indication of unexpected behavior of items on a test. An item does not display DIF merely because people from different groups have different probabilities of giving a certain response; it displays DIF if and only if people from different groups with the same underlying true ability have a different probability of giving a certain response. Common procedures for assessing DIF are Mantel-Haenszel, item response theory (IRT) based methods, and logistic regression.
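The logistic-regression approach, for example, models the probability of a correct response from the matching score $S$ (typically the total test score), group membership $G$, and their interaction (a standard formulation; the symbol names here are illustrative):

```latex
\operatorname{logit} P(Y = 1) = \beta_0 + \beta_1 S + \beta_2 G + \beta_3 (S \times G)
```

A nonzero $\beta_2$ with $\beta_3 = 0$ indicates uniform DIF; a nonzero $\beta_3$ indicates nonuniform DIF.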
Nambury S. Raju was an American psychology professor known for his work in psychometrics, meta-analysis, and utility theory. He was a Fellow of the Society for Industrial and Organizational Psychology.
Psychometric software is software that is used for psychometric analysis of data from tests, questionnaires, or inventories reflecting latent psychoeducational variables. While some psychometric analyses can be performed with standard statistical software like SPSS, most analyses require specialized tools.
Li Cai is a statistician and quantitative psychologist. He is a professor of Advanced Quantitative Methodology at the UCLA Graduate School of Education and Information Studies with a joint appointment in the quantitative area of the UCLA Department of Psychology. He is also Director of the National Center for Research on Evaluation, Standards, and Student Testing, and Managing Partner at Vector Psychometric Group.
Computational psychometrics is an interdisciplinary field fusing theory-based psychometrics, learning and cognitive sciences, and data-driven AI-based computational models as applied to large-scale/high-dimensional learning, assessment, biometric, or psychological data. Computational psychometrics is frequently concerned with providing actionable and meaningful feedback to individuals based on measurement and analysis of individual differences as they pertain to specific areas of enquiry.
Alina Anca von Davier is a psychometrician and researcher in computational psychometrics, machine learning, and education. Von Davier is a researcher, innovator, and executive leader with over 20 years of experience in EdTech and the assessment industry. She is the Chief of Assessment at Duolingo, where she leads the Duolingo English Test research and development area. She is also the Founder and CEO of EdAstra Tech, a service-oriented EdTech company. In 2022, she joined the University of Oxford as an Honorary Research Fellow, and Carnegie Mellon University as a Senior Research Fellow.