Item bank

Last updated

An item bank Or Question Bank is a term for a repository of test items that belong to a testing program, as well as all information pertaining to those items. In most applications of testing and assessment, the items are of multiple choice format, but any format can be used. Items are pulled from the bank and assigned to test forms for publication either as a paper-and-pencil test or some form of e-assessment.

Contents

Types of information

An item bank will not only include the text of each item, but also extensive information regarding test development and psychometric characteristics of the items. Examples of such information include: [1]

Item banking software

Because an item bank is essentially a simple database, it can be stored in database software or even a spreadsheet such as Microsoft Excel. However, there are several dozen commercially-available software programs specifically designed for item banking. The advantages that these provide are related to assessment. For example, items are presented on the computer screen as they would appear to a test examinee, and item response theory parameters can be translated into item response functions or information functions. Additionally, there are functionalities for publication, such as formatting a set of items to be printed as a paper-and-pencil test.

Some item banks also have test administration functionalities, such as being able to deliver e-assessment or process "bubble" answer sheets.

Related Research Articles

<span class="mw-page-title-main">Configuration management</span> Process for maintaining consistency of a product attributes with its design

Configuration management (CM) is a systems engineering process for establishing and maintaining consistency of a product's performance, functional, and physical attributes with its requirements, design, and operational information throughout its life. The CM process is widely used by military engineering organizations to manage changes throughout the system lifecycle of complex systems, such as weapon systems, military vehicles, and information systems. Outside the military, the CM process is also used with IT service management as defined by ITIL, and with other domain models in the civil engineering and other industrial engineering segments such as roads, bridges, canals, dams, and buildings.

<span class="mw-page-title-main">Graduate Management Admission Test</span> Computer adaptive test (CAT)

The Graduate Management Admission Test is a computer adaptive test (CAT) intended to assess certain analytical, writing, quantitative, verbal, and reading skills in written English for use in admission to a graduate management program, such as a Master of Business Administration (MBA) program. Answering the test questions requires knowledge of English grammatical rules, reading comprehension, and mathematical skills such as arithmetic, algebra, and geometry. The Graduate Management Admission Council (GMAC) owns and operates the test, and states that the GMAT assesses analytical writing and problem-solving abilities while also addressing data sufficiency, logic, and critical reasoning skills that it believes to be vital to real-world business and management success. It can be taken up to five times a year but no more than eight times total. Attempts must be at least 16 days apart.

Questionnaire construction refers to the design of a questionnaire to gather statistically useful information about a given topic. When properly constructed and responsibly administered, questionnaires can provide valuable data about any given subject.

<span class="mw-page-title-main">Educational Testing Service</span> Educational testing and assessment organization

Educational Testing Service (ETS), founded in 1947, is the world's largest private educational testing and assessment organization. It is headquartered in Lawrence Township, New Jersey, but has a Princeton address.

In psychometrics, item response theory (IRT) is a paradigm for the design, analysis, and scoring of tests, questionnaires, and similar instruments measuring abilities, attitudes, or other variables. It is a theory of testing based on the relationship between individuals' performances on a test item and the test takers' levels of performance on an overall measure of the ability that item was designed to measure. Several different statistical models are used to represent both item and test taker characteristics. Unlike simpler alternatives for creating scales and evaluating questionnaire responses, it does not assume that each item is equally difficult. This distinguishes IRT from, for instance, Likert scaling, in which "All items are assumed to be replications of each other or in other words items are considered to be parallel instruments". By contrast, item response theory treats the difficulty of each item as information to be incorporated in scaling items.

Optical mark recognition (OMR) collects data from people by identifying markings on a paper. OMR enables the hourly processing of hundreds or even thousands of documents. For instance, students may remember completing quizzes or surveys that required them to use a pencil to fill in bubbles on paper. A teacher or teacher's aide would fill out the form, then feed the cards into a system that grades or collects data from them.

<span class="mw-page-title-main">Likert scale</span> Psychometric measurement scale

A Likert scale is a psychometric scale named after its inventor, American social psychologist Rensis Likert, which is commonly used in research questionnaires. It is the most widely used approach to scaling responses in survey research, such that the term is often used interchangeably with rating scale, although there are other types of rating scales.

<span class="mw-page-title-main">Multiple choice</span> Assessment that are responded by choosing correct answers from a list of choices

Multiple choice (MC), objective response or MCQ is a form of an objective assessment in which respondents are asked to select only correct answers from the choices offered as a list. The multiple choice format is most frequently used in educational testing, in market research, and in elections, when a person chooses between multiple candidates, parties, or policies.

<span class="mw-page-title-main">Questionnaire</span> Series of questions for gathering information

A questionnaire is a research instrument that consists of a set of questions for the purpose of gathering information from respondents through survey or statistical study. A research questionnaire is typically a mix of close-ended questions and open-ended questions. Open-ended, long-term questions offer the respondent the ability to elaborate on their thoughts. The Research questionnaire was developed by the Statistical Society of London in 1838.

Reliability engineering is a sub-discipline of systems engineering that emphasizes the ability of equipment to function without failure. Reliability describes the ability of a system or component to function under stated conditions for a specified period of time. Reliability is closely related to availability, which is typically described as the ability of a component or system to function at a specified moment or interval of time.

Computerized adaptive testing (CAT) is a form of computer-based test that adapts to the examinee's ability level. For this reason, it has also been called tailored testing. In other words, it is a form of computer-administered test in which the next item or set of items selected to be administered depends on the correctness of the test taker's responses to the most recent items administered.

The Rasch model, named after Georg Rasch, is a psychometric model for analyzing categorical data, such as answers to questions on a reading assessment or questionnaire responses, as a function of the trade-off between the respondent's abilities, attitudes, or personality traits, and the item difficulty. For example, they may be used to estimate a student's reading ability or the extremity of a person's attitude to capital punishment from responses on a questionnaire. In addition to psychometrics and educational research, the Rasch model and its extensions are used in other areas, including the health profession, agriculture, and market research.

<span class="mw-page-title-main">National Assessment of Educational Progress</span> Assessment

The National Assessment of Educational Progress (NAEP) is the largest continuing and nationally representative assessment of what U.S. students know and can do in various subjects. NAEP is a congressionally mandated project administered by the National Center for Education Statistics (NCES), within the Institute of Education Sciences (IES) of the United States Department of Education. The first national administration of NAEP occurred in 1969. The National Assessment Governing Board (NAGB) is an independent, bipartisan board that sets policy for NAEP and is responsible for developing the framework and test specifications.The National Assessment Governing Board, whose members are appointed by the U.S. Secretary of Education, includes governors, state legislators, local and state school officials, educators, business representatives, and members of the general public. Congress created the 26-member Governing Board in 1988.

The National Council Licensure Examination (NCLEX) is a nationwide examination for the licensing of nurses in the United States, Canada, and Australia since 1982, 2015, and 2020, respectively. There are two types: the NCLEX-RN and the NCLEX-PN. After graduating from a school of nursing, one takes the NCLEX exam to receive a nursing license. A nursing license gives an individual the permission to practice nursing, granted by the state where they met the requirements.

<span class="mw-page-title-main">Raw data</span> A collection of information which has not been fully processed or analyzed

Raw data, also known as primary data, are data collected from a source. In the context of examinations, the raw data might be described as a raw score.

Standard-setting study is an official research study conducted by an organization that sponsors tests to determine a cutscore for the test. To be legally defensible in the US, in particular for high-stakes assessments, and meet the Standards for Educational and Psychological Testing, a cutscore cannot be arbitrarily determined; it must be empirically justified. For example, the organization cannot merely decide that the cutscore will be 70% correct. Instead, a study is conducted to determine what score best differentiates the classifications of examinees, such as competent vs. incompetent. Such studies require quite an amount of resources, involving a number of professionals, in particular with psychometric background. Standard-setting studies are for that reason impractical for regular class room situations, yet in every layer of education, standard setting is performed and multiple methods exist.

Psychometric software is software that is used for psychometric analysis of data from tests, questionnaires, or inventories reflecting latent psychoeducational variables. While some psychometric analyses can be performed with standard statistical software like SPSS, most analyses require specialized tools.

With the application of probability sampling in the 1930s, surveys became a standard tool for empirical research in social sciences, marketing, and official statistics. The methods involved in survey data collection are any of a number of ways in which data can be collected for a statistical survey. These are methods that are used to collect information from a sample of individuals in a systematic way. First there was the change from traditional paper-and-pencil interviewing (PAPI) to computer-assisted interviewing (CAI). Now, face-to-face surveys (CAPI), telephone surveys (CATI), and mail surveys are increasingly replaced by web surveys. In addition, remote interviewers could possibly keep the respondent engaged while reducing cost as compared to in-person interviewers.

<span class="mw-page-title-main">Exam</span> Educational assessment

An examination or test is an educational assessment intended to measure a test-taker's knowledge, skill, aptitude, physical fitness, or classification in many other topics. A test may be administered verbally, on paper, on a computer, or in a predetermined area that requires a test taker to demonstrate or perform a set of skills.

The Michigan English Test (MET) is a multilevel, modular English language examination, which measures English language proficiency in personal, public, occupational and educational contexts. It is developed by CaMLA, a not-for-profit collaboration between the University of Michigan and the University of Cambridge and has been in use since 2008.

References

  1. Vale, C.D. (2004). Computerized item banking. In Downing, S.D., & Haladyna, T.M. (Eds.) The Handbook of Test Development. Routledge.