Psychological testing

ICD-10-PCS GZ1
ICD-9-CM 94.02
MeSH D011581

Psychological testing refers to the administration of psychological tests. [1] Psychological tests are administered or scored by trained evaluators. [1] A person's responses are evaluated according to carefully prescribed guidelines. Scores are thought to reflect individual or group differences in the construct the test purports to measure. [1] The science behind psychological testing is psychometrics. [1] [2]

Psychological tests

According to Anastasi and Urbina, psychological tests involve observations made on a "carefully chosen sample [emphasis in original] of an individual's behavior." [1] A psychological test is often designed to measure unobserved constructs, also known as latent variables. Psychological tests can include a series of tasks, problems to solve, and characteristics (e.g., behaviors, symptoms) the presence of which the respondent affirms or denies to varying degrees. Psychological tests can include questionnaires and interviews. Questionnaire- and interview-based scales typically differ from psychoeducational tests, which ask for a respondent's maximum performance. Questionnaire- and interview-based scales, by contrast, ask for the respondent's typical behavior. [3] Symptom and attitude tests are more often called scales. A useful psychological test/scale must be both valid, i.e., show evidence that the test or scale measures what it is purported to measure, [1] [4] and reliable, i.e., show evidence of consistency across items, across raters, and over time.

It is important that people who are equal on the measured construct (e.g., mathematics ability, depression) have an approximately equal probability of answering a test item accurately or acknowledging the presence of a symptom. [5] An example of an item on a mathematics test that might be used in the United Kingdom but not the United States could be the following: "In a football match two players get a red card; how many players are left on the pitch?" This item requires knowledge of football (soccer) to be answered correctly, not just mathematical ability. Thus, group membership can influence the probability of correctly answering items, as encapsulated in the concept of differential item functioning. Often tests are constructed for a specific population and the nature of that population should be taken into account when administering tests outside that population. A test should be invariant between relevant subgroups (e.g., demographic groups) within a larger population. [6] For example, for a test to be used in the United Kingdom, the test and its items should have approximately the same meaning for British males and females. That invariance does not necessarily apply to similar groups in another population, such as males and females in the United States or between populations, for example, the populations of the UK and the US. In test construction, it is important to establish invariance at least for the subgroups of the population of interest. [6]
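One way to make the idea of differential item functioning concrete is the Mantel-Haenszel procedure, a common screening method that is not named in the text above and is used here only for illustration. The sketch below matches test-takers on total score and then compares the odds of answering one item correctly in a reference group and a focal group. The group labels, the record layout, and the counts are hypothetical.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Common odds ratio for one item across matched-score strata.

    records: iterable of (group, total_score, item_correct) tuples, where
    group is "ref" or "focal" and item_correct is 0 or 1 (assumed layout).
    A value near 1 suggests little DIF; a value well above 1 suggests the
    item favors the reference group among equally able test-takers.
    """
    # One 2x2 table per stratum: [correct count, incorrect count] per group.
    strata = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for group, total, correct in records:
        strata[total][group][0 if correct else 1] += 1

    numerator = denominator = 0.0
    for table in strata.values():
        a, b = table["ref"]    # reference group: correct, incorrect
        c, d = table["focal"]  # focal group: correct, incorrect
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator if denominator else float("nan")


# Hypothetical data: among test-takers with a total score of 5,
# 8 of 10 in the reference group and 4 of 10 in the focal group
# answer the item correctly.
records = (
    [("ref", 5, 1)] * 8 + [("ref", 5, 0)] * 2
    + [("focal", 5, 1)] * 4 + [("focal", 5, 0)] * 6
)
print(mantel_haenszel_odds_ratio(records))
```

Matching on total score is what separates DIF from a simple difference in group means: the comparison is made only among people who appear equal on the measured construct.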

Psychological assessment is similar to psychological testing but usually involves a more comprehensive assessment of the individual. According to the American Psychological Association, psychological assessment involves the collection and integration of data for the purpose of evaluating an individual’s "behavior, abilities, and other characteristics." [7] Each assessment is a process that involves integrating information from multiple sources, such as personality inventories, ability tests, symptom scales, interest inventories, and attitude scales, as well as information from personal interviews. Collateral information can also be collected from occupational records or medical histories; information can also be obtained from parents, spouses, teachers, friends, or past therapists or physicians. One or more psychological tests are sources of information used within the process of assessment. Many psychologists conduct assessments when providing services. Psychological assessment is a complex, detailed, in-depth process. Examples of assessments include providing a diagnosis, [7] identifying a learning disability in schoolchildren, [8] determining if a defendant is mentally competent, [9] [10] and selecting job applicants. [11]

History

A Song dynasty painting of candidates participating in the imperial examination, a rudimentary form of psychological testing.
Physiognomy was used to assess personality traits based on an individual's outer appearance.

The first large-scale tests may have been part of the imperial examination system in China. The tests, an early form of psychological testing, assessed candidates based on their proficiency in topics such as civil law and fiscal policies. [12] Early tests of intelligence were made for entertainment rather than analysis. [13] Modern mental testing began in France in the 19th century. It contributed to identifying individuals with intellectual disabilities for the purpose of humanely providing them with an alternative form of education. [14]

Englishman Francis Galton coined the terms psychometrics and eugenics. He developed a method for measuring intelligence based on nonverbal sensory-motor tests. The test was initially popular but was later abandoned. [14] [15] In 1905, French psychologists Alfred Binet and Théodore Simon published the Échelle métrique de l'Intelligence (Metric Scale of Intelligence), known in English-speaking countries as the Binet–Simon test. The test focused heavily on verbal ability. Binet and Simon intended that the test be used to aid in identifying schoolchildren who were intellectually challenged, which in turn would pave the way for providing the children with professional help. [14] The Binet–Simon test became the foundation for the later-developed Stanford–Binet Intelligence Scales.

The origins of personality testing date back to the 18th and 19th centuries, when phrenology was the basis for assessing personality characteristics. Phrenology, a pseudoscience, involved assessing personality by way of skull measurement. [16] Early pseudoscientific techniques eventually gave way to empirical methods. One of the earliest modern personality tests was the Woodworth Personal Data Sheet, a self-report inventory developed during World War I to be used by the United States Army for the purpose of screening potential soldiers for mental health problems and identifying victims of shell shock (the instrument was completed too late to be used for the purposes it was designed for). [16] [1] The Woodworth Inventory, however, became the forerunner of many later personality tests and scales. [1]

Principles

The development of a psychological test requires careful research. Some of the elements of test development involve the following:

  • Standardization - All procedures and steps must be conducted with consistency from one testing site/testing occasion to another. Examiner subjectivity is minimized (see objectivity next). Major standardized tests are normed on large try-out samples in order to understand what constitutes high, low, and intermediate scores.
  • Objectivity - Scoring such that subjective judgments and biases are minimized; scores are obtained in a similar manner for every test taker (see below).
  • Discrimination - Scores on a test should discriminate between members of extreme groups; for example, each subscale of the original MMPI distinguished between hospitalized patients suffering from mental illness and members of a well comparison group. [17] [18]
  • Test Norms - Part of the standardization of large-scale tests (see above). Norms help psychologists learn about individual differences. For example, a normed personality scale can help psychologists understand how some people are high in negative affectivity (NA) and others are low or intermediate in NA. With many psychoeducational tests, test norms allow educators and psychologists to obtain an age- or grade-referenced percentile rank, for example, in reading achievement.
  • Reliability - Refers to test or scale consistency. It is important that individuals score about the same if they take a test and an alternate form of the test or if they take the same test twice, within a short time window. Reliability also refers to response consistency from test item to test item.
  • Validity - Refers to evidence that demonstrates that a test or scale measures what it is purported to measure. [2] [19]
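The item-to-item consistency mentioned under reliability is often summarized with Cronbach's alpha, a coefficient the list above does not name and which is used here as one common choice. A minimal sketch, assuming each respondent's answers are stored as a list of numeric item scores (the data are hypothetical):

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of respondents, each a list of k numeric item scores.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(scores[0])
    items = list(zip(*scores))  # transpose to one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses from three people to a two-item scale.
responses = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(responses), 3))  # 0.667
```

Values closer to 1 indicate that the items rank respondents in a similar way. The function needs at least two items and at least two respondents, and the total scores must not all be identical.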

Sample of behavior

The term sample of behavior refers to an individual's performance on tasks that have usually been prescribed beforehand. For example, a spelling test for middle school students cannot include all the words in the vocabularies of middle schoolers because there are thousands of words in their lexicon; a middle school spelling test must include only a sample of words in their vocabulary. The samples of behavior must be reasonably representative of the behavior in question. The samples of behavior that make up a paper-and-pencil test, the most common type of psychological test, are written into the test items. Total performance on the items produces a test score. A score on a well-constructed test is believed to reflect a psychological construct such as achievement in a school subject like vocabulary or mathematics knowledge, cognitive ability, dimensions of personality such as introversion/extraversion, etc. Differences in test scores are thought to reflect individual differences in the construct the test is purported to measure. [2]

Types

There are several broad categories of psychological tests:

Achievement tests

Achievement tests assess an individual's knowledge in a subject domain. Some academic achievement tests are designed to be administered by a trained evaluator. By contrast, group achievement tests are often administered by a teacher. A score on an achievement test is believed to reflect the individual's knowledge of a subject area. [1]

There are generally two types of achievement tests: norm-referenced and criterion-referenced. Most achievement tests are norm-referenced. The individual's responses are scored according to standardized protocols and the results can be compared to the results of a norming group. [1] Norm-referenced tests can be used to underline individual differences, that is to say, to compare each test-taker to every other test-taker. By contrast, the purpose of criterion-referenced achievement tests is to ascertain whether the test-taker has mastered a predetermined body of knowledge rather than to compare the test-taker to everyone else who took the test. These types of tests are often a component of a mastery-based classroom. [1]

The Kaufman Test of Educational Achievement is an example of an individually administered achievement test for students. [20]

Aptitude tests

Psychological tests have been designed to measure both specific abilities (e.g., clerical skill, as assessed by the Minnesota Clerical Test) and general abilities (e.g., as assessed by traditional IQ tests such as the Stanford–Binet or the Wechsler Adult Intelligence Scale). A widely used but brief aptitude test used in business is the Wonderlic Test. Aptitude tests have been used in assessing specific abilities or the general ability of potential new employees (the Wonderlic was once used by the NFL). [21] Aptitude tests have also been used for career guidance. [22]

Evidence suggests that aptitude tests like IQ tests are sensitive to past learning and are not pure measures of untutored ability. [23] The SAT, which used to be called the Scholastic Aptitude Test, had its name changed because performance on the test is sensitive to training. [24]

Attitude scales

An attitude scale assesses an individual's disposition regarding an event (e.g., a Supreme Court decision), person (e.g., a governor), concept (e.g., wearing face masks during a pandemic), organization (e.g., the Boy Scouts), or object (e.g., nuclear weapons) on a unidimensional favorable-unfavorable attitude continuum. Attitude scales are used in marketing to determine individuals' preferences for brands. Historically social psychologists have developed attitude scales to assess individuals' attitudes toward the United Nations and race relations. [25] Typically Likert scales are used in attitude research. Historically, the Thurstone scale was used prior to the development of the Likert scale. The Likert scale has largely supplanted the Thurstone scale. [1]

Biographical Information Blank

The Biographical Information Blank (BIB) is a paper-and-pencil form that includes items that ask about detailed personal and work history. It is used to aid in the hiring of employees by matching the backgrounds of individuals to the requirements of the job.

Clinical tests

The purpose of clinical tests is to assess the presence of symptoms of psychopathology. [26] Examples of clinical assessments include the Minnesota Multiphasic Personality Inventory (MMPI), Millon Clinical Multiaxial Inventory-IV, [27] Child Behavior Checklist, [28] Symptom Checklist 90, [29] and the Beck Depression Inventory. [26]

Many large-scale clinical tests are normed. For example, scores on the MMPI are rescaled such that 50 is the middlemost score on the MMPI Depression scale and 60 is a score that places the individual one standard deviation above the mean for depressive symptoms; 40 represents a symptom level that is one standard deviation below the mean. [30]
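The rescaling described above, with a mean of 50 and a standard deviation of 10, corresponds to a linear T-score transformation. The sketch below shows only that arithmetic. The norm-group mean and standard deviation are hypothetical, and published instruments are scored from the publisher's own norm tables, which may apply further adjustments.

```python
def t_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a linear T score (mean 50, SD 10)."""
    z = (raw - norm_mean) / norm_sd  # distance from the norm mean in SD units
    return 50 + 10 * z


# Hypothetical norm group: raw-score mean 20, standard deviation 5.
print(t_score(20, 20, 5))  # 50.0, the norm-group mean
print(t_score(25, 20, 5))  # 60.0, one standard deviation above the mean
print(t_score(15, 20, 5))  # 40.0, one standard deviation below the mean
```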

Criterion-referenced

A criterion-referenced test is an achievement test in a specific knowledge domain. [1] An individual's performance on the test is compared to a criterion. Test-takers are not compared to each other. A passing score, i.e., the criterion performance, is established by the teacher or an educational institution. Criterion-referenced tests are part and parcel of mastery-based education.

Direct observation

Psychological assessment can involve the observation of people as they engage in activities. This type of assessment is usually conducted with families in a laboratory or at home. Sometimes the observation can involve children in a classroom or the schoolyard. [31] The purpose may be clinical, such as to establish a pre-intervention baseline of a child's hyperactive or aggressive classroom behaviors or to observe the nature of parent-child interaction in order to understand a relational disorder. [32] Time sampling methods are also part of direct observational research. The reliability of observers in direct observational research can be evaluated using Cohen's kappa.
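Cohen's kappa corrects the raw agreement between two observers for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the proportion expected from each observer's category frequencies. A minimal sketch for two observers coding the same intervals, with hypothetical category labels and codes:

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same observations.

    rater_a, rater_b: equal-length sequences of category labels.
    Undefined (division by zero) when chance agreement equals 1,
    i.e., when both raters used a single identical category throughout.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions, summed.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical codes for four observation intervals of classroom behavior.
observer_1 = ["on-task", "on-task", "off-task", "off-task"]
observer_2 = ["on-task", "off-task", "off-task", "off-task"]
print(cohens_kappa(observer_1, observer_2))  # 0.5
```

Here the observers agree on 3 of 4 intervals (0.75), but half of that agreement is expected by chance, so kappa is 0.5.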

The Parent-Child Interaction Assessment-II (PCIA) [33] is an example of a direct observation procedure that is used with school-age children and parents. The parents and children are video recorded playing at a make-believe zoo. The Parent-Child Early Relational Assessment [34] is used to study parents and young children and involves a feeding and a puzzle task. The MacArthur Story Stem Battery (MSSB) [35] is used to elicit narratives from children. The Dyadic Parent-Child Interaction Coding System-II [36] tracks the extent to which children follow the commands of parents and vice versa and is well suited to the study of children with Oppositional Defiant Disorders and their parents.

Interest inventories

Psychological tests include interest inventories. [37] These tests are used primarily for career counseling. Interest inventories include items that ask about the preferred activities and interests of people seeking career counseling. The rationale is that if the individual's activities and interests are similar to the modal pattern of activities and interests of people who are successful in a given occupation, then the chances are high that the individual would find satisfaction in that occupation. A widely used instrument is the Strong Interest Inventory, which is used in career assessment, career counseling, and educational guidance. [38] [39]

Neuropsychological tests

Neuropsychological tests are designed to assess behaviors that are linked to brain structure and function. An examiner, following strict pre-set procedures, administers the test to a single person in a quiet room largely free of distractions. [1] An example of a widely-used neuropsychological test is the Stroop test.

Norm-referenced tests

Items on norm-referenced tests have been tried out on a norming group, and scores on the test can be classified as high, medium, or low, with gradations in between. [1] These tests allow for the study of individual differences. Scores on norm-referenced achievement tests are associated with percentile ranks vis-à-vis other individuals who are the test-taker's age or grade.
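A percentile rank can be computed directly from the scores of a norming group. The sketch below uses one common convention, counting the scores below the test-taker's score plus half of the scores equal to it. The norm sample is hypothetical, and published tests typically report percentile ranks from age- or grade-specific norm tables instead.

```python
def percentile_rank(score, norm_scores):
    """Percentile rank of a score relative to a norming sample.

    Counts scores strictly below, plus half of the ties (an assumed
    convention; other definitions count ties differently).
    """
    below = sum(s < score for s in norm_scores)
    equal = sum(s == score for s in norm_scores)
    return 100 * (below + 0.5 * equal) / len(norm_scores)


# Hypothetical reading-achievement scores from a same-grade norm group.
norm_group = [10, 20, 30, 40, 50]
print(percentile_rank(30, norm_group))  # 50.0
print(percentile_rank(50, norm_group))  # 90.0
```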

Personality tests

Personality tests assess constructs that are thought to be the constituents of personality. Examples of personality constructs include traits in the Big Five, such as introversion-extraversion and conscientiousness. Personality constructs are thought to be dimensional. Personality measures are used in research and in the selection of employees. They include self-report and observer-report scales. [40] Examples of norm-referenced personality tests include the NEO-PI, the 16PF Questionnaire, the Occupational Personality Questionnaires, [16] and the Five-Factor Personality Inventory. [41]

The International Personality Item Pool (IPIP) scales assess the same traits that the NEO and other personality scales assess. All IPIP scales and items are in the public domain and, therefore, are available free of charge. [42]

Projective tests

Projective testing originated in the first half of the 1900s. [43] The idea animating projective tests is that the examinee projects hidden aspects of his or her personality, including unconscious content, onto the ambiguous stimuli presented in the test. Examples of projective tests include the Rorschach test, [44] the Thematic Apperception Test, [45] and the Draw-A-Person test. [46] Available evidence, however, suggests that projective tests have limited validity. [47]

Psychological symptom scales

Public safety employment tests

Applicants and employees in vocations within the public safety field (e.g., fire service, law enforcement, corrections, emergency medical services) are often required to take industrial or organizational psychological tests for initial employment and promotion. The National Firefighter Selection Inventory, the National Criminal Justice Officer Selection Inventory, and the Integrity Inventory are prominent examples of these tests. [91] [92] [93] [94]

Sources of psychological tests

Thousands of psychological tests have been developed. Some were produced by commercial testing companies that charge for their use. Others have been developed by researchers, and can be found in the academic research literature. Tests to assess specific psychological constructs can be found by conducting a database search. Some databases are open access, for example, Google Scholar (although many tests found in the Google Scholar database are not free of charge). [95] Other databases are proprietary, for example, PsycINFO, but are available through university libraries and many public libraries (e.g., the Brooklyn Public Library and the New York Public Library). [96]

There are online archives available that contain tests on various topics.

  • APA PsycTests. Requires subscription [97]
  • Mental Measurements Yearbook [98] - a non-profit that provides independent reviews of thousands of distinct psychological tests.
  • Assessment Psychology Online has links to dozens of tests for clinical assessment. [99]
  • International Personality Item Pool (IPIP) contains items to assess more than 100 personality traits, including those of the Five Factor Model. [100]
  • Organization of Work: Measurement Tools for Research and Practice. NIOSH site devoted to Occupational Health and Safety [101]

Test security

Many psychological and psychoeducational tests are not available to the public. Test publishers put restrictions on who has access to the test. Psychology licensing boards also restrict access to the tests used in licensing psychologists. [102] [103] Test publishers hold that both copyright and professional ethics require them to protect the tests. Publishers sell tests only to people who have proved their educational and professional qualifications. Purchasers are legally bound not to give test answers or the tests themselves to members of the public unless permitted by the publisher. [104]

The International Test Commission (ITC), an international association of national psychological societies and test publishers, publishes the International Guidelines for Test Use, which prescribes measures to take to "protect the integrity" of the tests by not publicly describing test techniques and by not "coaching individuals" so that they "might unfairly influence their test performance." [105]

See also

Related Research Articles

Psychometrics is a field of study within psychology concerned with the theory and technique of measurement. Psychometrics generally covers specialized fields within psychology and education devoted to testing, measurement, assessment, and related activities. Psychometrics is concerned with the objective measurement of latent constructs that cannot be directly observed. Examples of latent constructs include intelligence, introversion, mental disorders, and educational achievement. The levels of individuals on nonobservable latent variables are inferred through mathematical modeling based on what is observed from individuals' responses to items on tests and scales.

The Minnesota Multiphasic Personality Inventory (MMPI) is a standardized psychometric test of adult personality and psychopathology. A version for adolescents also exists, the MMPI-A, and was first published in 1992. Psychologists and other mental health professionals use various versions of the MMPI to help develop treatment plans, assist with differential diagnosis, help answer legal questions, screen job candidates during the personnel selection process, or as part of a therapeutic assessment procedure.

<span class="mw-page-title-main">Personality test</span> Method of assessing human personality constructs

A personality test is a method of assessing human personality constructs. Most personality assessment instruments are in fact introspective self-report questionnaire measures or reports from life records (L-data) such as rating scales. Attempts to construct actual performance tests of personality have been very limited even though Raymond Cattell with his colleague Frank Warburton compiled a list of over 2000 separate objective tests that could be used in constructing objective personality tests. One exception, however, was the Objective-Analytic Test Battery, a performance test designed to quantitatively measure 10 factor-analytically discerned personality trait dimensions. A major problem with both L-data and Q-data methods is that because of item transparency, rating scales, and self-report questionnaires are highly susceptible to motivational and response distortion ranging from lack of adequate self-insight to downright dissimulation depending on the reason/motivation for the assessment being undertaken.

The Beck Depression Inventory, created by Aaron T. Beck, is a 21-question multiple-choice self-report inventory, one of the most widely used psychometric tests for measuring the severity of depression. Its development marked a shift among mental health professionals, who had until then, viewed depression from a psychodynamic perspective, instead of it being rooted in the patient's own thoughts.

Personality Assessment Inventory (PAI), developed by Leslie Morey, is a self-report 344-item personality test that assesses a respondent's personality and psychopathology. Each item is a statement about the respondent that the respondent rates with a 4-point scale. It is used in various contexts, including psychotherapy, crisis/evaluation, forensic, personnel selection, pain/medical, and child custody assessment. The test construction strategy for the PAI was primarily deductive and rational. It shows good convergent validity with other personality tests, such as the Minnesota Multiphasic Personality Inventory and the Revised NEO Personality Inventory.

Suicide risk assessment is a process of estimating the likelihood for a person to attempt or die by suicide. The goal of a thorough risk assessment is to learn about the circumstances of an individual person with regard to suicide, including warning signs, risk factors, and protective factors. Risk for suicide is re-evaluated throughout the course of care to assess the patient's response to personal situational changes and clinical interventions. Accurate and defensible risk assessment requires a clinician to integrate a clinical judgment with the latest evidence-based practice, although accurate prediction of low base rate events, such as suicide, is inherently difficult and prone to false positives.

A self-report inventory is a type of psychological test in which a person fills out a survey or questionnaire with or without the help of an investigator. Self-report inventories often ask direct questions about personal interests, values, symptoms, behaviors, and traits or personality types. Inventories are different from tests in that there is no objectively correct answer; responses are based on opinions and subjective perceptions. Most self-report inventories are brief and can be taken or administered within five to 15 minutes, although some, such as the Minnesota Multiphasic Personality Inventory (MMPI), can take several hours to fully complete. They are popular because they can be inexpensive to give and to score, and their scores can often show good reliability.

<span class="mw-page-title-main">Beck Anxiety Inventory</span> Psychological assessment tool

The Beck Anxiety Inventory (BAI) is a formative assessment and rating scale of anxiety. This self-report inventory, or 21-item questionnaire uses a scale ; the BAI is an ordinal scale; more specifically, a Likert scale that measures the scale quality of magnitude of anxiety.

Psychological evaluation is a method to assess an individual's behavior, personality, cognitive abilities, and several other domains. A common reason for a psychological evaluation is to identify psychological factors that may be inhibiting a person's ability to think, behave, or regulate emotion functionally or constructively. It is the mental equivalent of physical examination. Other psychological evaluations seek to better understand the individual's unique characteristics or personality to predict things like workplace performance or customer relationship management.

Sexuality can be inscribed in a multidimensional model comprising different aspects of human life: biology, reproduction, culture, entertainment, relationships and love.

Positive affectivity (PA) is a human characteristic that describes how much people experience positive affects ; and as a consequence how they interact with others and with their surroundings.

A depression rating scale is a psychometric instrument (tool), usually a questionnaire whose wording has been validated with experimental evidence, having descriptive words and phrases that indicate the severity of depression for a time period. When used, an observer may make judgements and rate a person at a specified scale level with respect to identified characteristics. Rather than being used to diagnose depression, a depression rating scale may be used to assign a score to a person's behaviour where that score may be used to determine whether that person should be evaluated more thoroughly for a depressive disorder diagnosis. Several rating scales are used for this purpose.

The Revised NEO Personality Inventory is a personality inventory that assesses an individual on five dimensions of personality. These are the same dimensions found in the Big Five personality traits. These traits are openness to experience, conscientiousness, extraversion(-introversion), agreeableness, and neuroticism. In addition, the NEO PI-R also reports on six subcategories of each Big Five personality trait.

The Patient Health Questionnaire (PHQ) is a multiple-choice self-report inventory that is used as a screening and diagnostic tool for mental health disorders of depression, anxiety, alcohol, eating, and somatoform disorders. It is the self-report version of the Primary Care Evaluation of Mental Disorders (PRIME-MD), a diagnostic tool developed in the mid-1990s by Pfizer Inc. The length of the original assessment limited its feasibility; consequently, a shorter version, consisting of 11 multi-part questions - the Patient Health Questionnaire was developed and validated.

The Children's Depression Inventory is a psychological assessment that rates the severity of symptoms related to depression or dysthymic disorder in children and adolescents. The CDI is a 27-item scale that is self-rated and symptom-oriented. The assessment is now in its second edition. The 27 items on the assessment are grouped into five major factor areas. Clients rate themselves based on how they feel and think, with each statement being identified with a rating from 0 to 2. The CDI was developed by American clinical psychologist Maria Kovacs, PhD, and was published in 1979. It was developed by using the Beck Depression Inventory (BDI) of 1967 for adults as a model. The CDI is a widely used and accepted assessment for the severity of depressive symptoms in children and youth, with high reliability. It also has a well-established validity using a variety of different techniques, and good psychometric properties. The CDI is a "Level B test," which means that the test is somewhat complex to administer and score, with the administrator requiring training.

The Mood Disorder Questionnaire (MDQ) is a self-report questionnaire designed to help detect bipolar disorder. It focuses on symptoms of hypomania and mania, which are the mood states that separate bipolar disorders from other types of depression and mood disorder. It has 5 main questions, and the first question has 13 parts, for a total of 17 questions. The MDQ was originally tested with adults, but it also has been studied in adolescents ages 11 years and above. It takes approximately 5–10 minutes to complete. In 2006, a parent-report version was created to allow for assessment of bipolar symptoms in children or adolescents from a caregiver perspective, with the research looking at youths as young as 5 years old. The MDQ has become one of the most widely studied and used questionnaires for bipolar disorder, and it has been translated into more than a dozen languages.

The General Behavior Inventory (GBI) is a 73-question psychological self-report assessment tool designed by Richard Depue and colleagues to identify the presence and severity of manic and depressive moods in adults, as well as to assess for cyclothymia. It is one of the most widely used psychometric tests for measuring the severity of bipolar disorder and the fluctuation of symptoms over time. The GBI is intended to be administered for adult populations; however, it has been adapted into versions that allow for juvenile populations, as well as a short version that allows for it to be used as a screening test.

The nine-item Patient Health Questionnaire (PHQ-9) is a depressive symptom scale and diagnostic tool introduced in 2001 to screen adult patients in primary care settings. The instrument assesses for the presence and severity of depressive symptoms and a possible depressive disorder. The PHQ-9 is a component of the larger self-administered Patient Health Questionnaire (PHQ), but can be used as a stand-alone instrument. The PHQ is part of Pfizer's larger suite of trademarked products, called the Primary Care Evaluation of Mental Disorders (PRIME-MD). The PHQ-9 takes less than three minutes to complete. It is scored by simply adding up the individual items' scores. Each of the nine items reflects a DSM-5 symptom of depression. Primary care providers can use the PHQ-9 to screen for possible depression in patients.

The Somatic Symptom Disorder - B Criteria Scale (SSD-12) is a brief self-report questionnaire used to assess the B criteria of DSM-5 somatic symptom disorder, i.e., patients' perceptions of their symptom-related thoughts, feelings, and behaviors.

References

  1. Urbina, Susana; Anastasi, Anne (1997). Psychological testing (7th ed.). Upper Saddle River, NJ: Prentice Hall. p. 4. ISBN 9780023030857. OCLC 35450434.
  2. Nunnally, J.C., & Bernstein, I.H. (1994). Psychometric theory. New York: McGraw-Hill.
  3. Mellenbergh, G.J. (2008). Chapter 10: Surveys. In H.J. Adèr & G.J. Mellenbergh (Eds.) (with contributions by D.J. Hand), Advising on Research Methods: A consultant's companion (pp. 183-209). Huizen, The Netherlands: Johannes van Kessel Publishing.
  4. American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
  5. Mellenbergh, Gideon J. (1989). "Item bias and item response theory". International Journal of Educational Research. 13 (2): 127–143. doi:10.1016/0883-0355(89)90002-5.
  6. Putnick, D. L., & Bornstein, M. H. (2016). Measurement invariance conventions and reporting: The state of the art and future directions for psychological research. Developmental Review, 41, 71–90. https://doi.org/10.1016/j.dr.2016.06.004
  7. American Psychological Association. (n.d.). Psychological assessment. APA Dictionary of Psychology. Accessed Oct. 11, 2023.
  8. Barnes, M.A., Fletcher, J., & Fuchs, L. (2007). Learning disabilities: From identification to intervention. New York: The Guilford Press.
  9. Neal, Tess M.S.; Mathers, Elizabeth; Frizzell, Jason R. (2022), Asmundson, Gordon J. G. (ed.), "Psychological Assessments in Forensic Settings", Comprehensive Clinical Psychology (Second Edition), Oxford: Elsevier, pp. 243–257, doi:10.1016/b978-0-12-818697-8.00150-3, ISBN 978-0-12-822232-4, S2CID 244328284, retrieved 2022-09-21
  10. Neal, Tess M.S.; Sellbom, Martin; de Ruiter, Corine (2022). "Personality Assessment in Legal Contexts: Introduction to the Special Issue". Journal of Personality Assessment. 104 (2): 127–136. doi:10.1080/00223891.2022.2033248. ISSN 0022-3891. PMID 35235475. S2CID 247219451.
  11. Board of Trustees of the Society for Personality Assessment (2006). "Standards for Education and Training in Psychological Assessment" (PDF). Journal of Personality Assessment. 87 (3): 355–357. doi:10.1207/s15327752jpa8703_17. PMID   17134344. S2CID   7572353. Archived (PDF) from the original on 2018-02-05. Retrieved 2017-06-26.
  12. Robert J. Gregory (2003). "The History of Psychological Testing" (PDF). Psychological Testing : History, Principles, and Applications. Allyn & Bacon. p. 4 in chapter 1. ISBN   9780205354726. Archived (PDF) from the original on 2019-11-24. Retrieved 2013-05-31.
  13. Shi, Jiannong (2 February 2004). Sternberg, Robert J. (ed.). International Handbook of Intelligence. Cambridge University Press. pp. 330–331. ISBN   978-0-521-00402-2.
  14. Kaufman, Alan S. (2009). IQ testing 101. Springer Pub. Co. ISBN 978-0826106292. OCLC 255892649.
  15. Gillham, Nicholas W. (2001). "Sir Francis Galton and the birth of eugenics". Annual Review of Genetics. 35 (1): 83–101. doi:10.1146/annurev.genet.35.102401.090055. PMID   11700278.
  16. Nezami, Elahe; Butcher, James N. (16 February 2000). Goldstein, G.; Hersen, Michel (eds.). Handbook of Psychological Assessment. Elsevier. p. 415. ISBN 978-0-08-054002-3.
  17. Domino, George; Domino, Marla L. (2006-04-24). Psychological Testing: An Introduction. Cambridge University Press. pp. 34+. ISBN   978-1-139-45514-5.
  18. Hogan, Thomas P. (2019). Psychological Testing: A Practical Introduction. John Wiley & Sons, Incorporated. pp. 171+. ISBN   978-1-119-50690-4.
  19. Schultz, Duane P.; Schultz, Sydney Ellen (2010). Psychology and work today: An introduction to industrial and organizational psychology (10th ed.). Upper Saddle River, N.J.: Prentice Hall. pp. 99–102. ISBN   978-0205683581. OCLC   318765451.
  20. "Kaufman Test of Educational Achievement | Third Edition". Archived from the original on 2020-07-10. Retrieved 2020-07-10.
  21. NFL Wonderlic
  22. Aiken, Lewis R. (1998). Tests and examinations: Measuring abilities and performance. Wiley. ISBN   9780471192633. OCLC   37820003.
  23. Ceci, S. J. (1991). How much does schooling influence general intelligence and its cognitive components? A reassessment of the evidence. Developmental Psychology, 27, 703–722. https://doi.org/10.1037/0012-1649.27.5.703 Archived 2022-08-22 at the Wayback Machine
  24. Lemann, N. (1999). The big test: The secret history of the American meritocracy. New York: Farrar, Straus and Giroux.
  25. Brown, R. (1965). Social psychology. New York: The Free Press.
  26. Beck, A. T.; Steer, R. A.; Brown, G. K. (1996). Manual for the Beck Depression Inventory-II (2nd ed.). San Antonio, TX: Psychological Corporation.
  27. Millon, T. (1994). Millon Clinical Multiaxial Inventory-III. Minneapolis, MN: National Computer Systems.
  28. Achenbach, T. M.; Rescorla, Leslie A. (2001). Manual for the ASEBA school-age forms & profiles: An integrated system of multi-informant assessment. Burlington, Vt: ASEBA. ISBN   978-0938565734. OCLC   53902766.
  29. Derogatis L. R. (1983). SCL90: Administration, Scoring and Procedures Manual for the Revised Version. Baltimore: Clinical Psychometric Research.
  30. Ben-Porath, Y.-S., Tellegen, A. (2011). Minnesota Multiphasic Personality Inventory Manual of Administration-2-RF. Minneapolis: University of Minnesota Press
  31. Reid, J. B., Eddy, J. M., Fetrow, R. A., & Stoolmiller, M. (1999). Description and immediate impacts of a preventive intervention for conduct problems. American Journal of Community Psychology, 27, 483–517.
  32. Waters, E., & Deane, K.E. (1985). Defining and assessing individual differences in attachment relationships: Q-methodology and the organization of behavior in infancy and early childhood. Monographs of the Society for Research in Child Development, 50, 41-65.
  33. Holigrocki, R. J; Kaminski, P. L.; Frieswyk, S. H. (1999). "Introduction to the Parent-Child Interaction Assessment". Bulletin of the Menninger Clinic. 63 (3): 413–428. PMID   10452199.
  34. Clark, R (1999). "The Parent-Child Early Relational Assessment: A Factorial Validity Study". Educational and Psychological Measurement. 59 (5): 821–846. doi:10.1177/00131649921970161. S2CID   146211674.
  35. Bretherton, I., Oppenheim, D., Buchsbaum, H., Emde, R. N., & the MacArthur Narrative Group. (1990). MacArthur Story-Stem battery. Unpublished manual.
  36. Robinson, Elizabeth A.; Eyberg, Sheila M. (1981). "The dyadic parent–child interaction coding system: Standardization and validation". Journal of Consulting and Clinical Psychology. 49 (2): 245–250. doi:10.1037/0022-006x.49.2.245. PMID   7217491.
  37. Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). Upper Saddle River, NJ: Prentice Hall.
  38. Donnay, D.A.C. (1997). E.K. Strong's legacy and beyond: 70 years of the Strong Interest Inventory. The Career Development Quarterly, 46(1), 2–22. doi:10.1002/j.2161-0045.1997.tb00688.x
  39. Blackwell, T., & Case, J. (2008). Test Review - Strong Interest Inventory, Revised Edition. Rehabilitation Counseling Bulletin, 51(2), 122–126. doi:10.1177/0034355207311350
  40. Ashton, M. C., (2017). Individual Differences and Personality (3rd ed.). Amsterdam: Elsevier.
  41. Hendriks, A.A.J., Hofstee, W.K.B., & De Raad, B. (1999). The Five-Factor Personality Inventory (FFPI). Personality and Individual Differences, 27(2), 307-325. https://doi.org/10.1016/S0191-8869(98)00245-1
  42. International Personality Item Pool. Archived 2019-08-20 at the Wayback Machine. Accessed July 14, 2020.
  43. Wasserman, John D. (2003). "Nonverbal Assessment of Personality and Psychopathology". In McCallum, Steve R. (ed.). Handbook of Nonverbal Assessment. New York: Kluwer Academic / Plenum Publishers. ISBN 978-0-306-47715-7. Retrieved 20 November 2010.
  44. Meyer, G.J., Hilsenroth, M.J., Baxter, D., Exner, J.E., Fowler, J.C., Piers, C.C., & Resnick, J. (2002). An examination of interrater reliability for scoring the Rorschach comprehensive system in eight data sets. Journal of Personality Assessment, 78(2), 219–274. doi:10.1207/S15327752JPA7802_03.
  45. Murray, H. (1943). The Thematic Apperception Technique. Cambridge, MA: Harvard University Press. OCLC 223083.
  46. Murray, Henry A. (1943). Thematic Apperception Test manual. Cambridge, MA: Harvard University Press. OCLC   223083.
  47. Lilienfeld, S.O., Wood, J.M., & Garb, H.N. (2000). The scientific status of projective techniques. Psychological Science in the Public Interest, 1(2), 27–66. doi:10.1111/1529-1006.002
  48. Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Manual for the Beck Depression Inventory-II. San Antonio, TX: Psychological Corporation.
  49. Beck A.T. (1988). Beck Hopelessness Scale. Harcourt Assessment / The Psychological Corporation
  50. Bortner, R.W., Gallacher, J.E.J., Sweetnam, P.M., Yarnell, J.W.G., Elwood, P.C., & Stansfeld, S.A. (2003). Bortner Type A Scale. Psychosomatic Medicine. 65, 339-346.
  51. Radloff, L.S. (1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385–401.
  52. Cole, J. C., Rabin, A. S., Smith, T. L., & Kaufman, A. S. (2004). Development and validation of a Rasch-derived CES-D short form. Psychological Assessment, 16, 360–372. doi:10.1037/1040-3590.16.4.360
  53. Kovacs, M. (1992). Children's Depression Inventory. North Tonawanda, NY: Multi-Health Systems
  54. Kovacs, M. (2014). Children's Depression Inventory, 2nd ed. Upper Saddle River, NJ: Pearson
  55. Lovibond, S.H., & Lovibond, P.F. (1995). Manual for the Depression Anxiety Stress Scales (2nd. Ed.). Sydney: Psychology Foundation.
  56. Goldberg, D.P. (1972). The detection of psychiatric illness by questionnaire. Maudsley Monograph No. 21. Oxford: Oxford University Press.
  57. Spitzer, R. L., Kroenke, K., Williams, J. B., & Löwe, B. (2006). A brief measure for assessing generalized anxiety disorder: The GAD-7. Archives of Internal Medicine, 166, 1092–1097. http://dx.doi.org/10.1001/archinte.166.10.1092
  58. Hamilton, M. (1959). The assessment of anxiety states by rating. British Journal of Medical Psychology, 32, 50–55. doi:10.1111/j.2044-8341.1959.tb00467.x
  59. Hamilton, M. (1960). A rating scale for depression. Journal of Neurology, Neurosurgery, and Psychiatry, 23(1), 56–62. doi:10.1136/jnnp.23.1.56
  60. Hamilton, M. (1980). Rating depressive patients. Journal of Clinical Psychiatry, 41(12 Pt 2), 21–24.
  61. Harburg, E., Erfurt, J.C., Hauenstein, L.S., Chape, C., Schull, W.J., & Schork, M.A. (1973). Socio-ecological stress, suppressed hostility, skin color, and black-white male blood pressure: Detroit. Psychosomatic Medicine, 35, 276–296. https://doi.org/10.1037/0033-2909.120.2.293
  62. Harburg, E., Blakelock, E. H., & Roeper, P. J. (1979). Resentful and reflective coping with arbitrary authority and blood pressure: Detroit. Psychosomatic Medicine, 41, 189–202. https://doi.org/10.1097/00006842-197905000-00002
  63. Derogatis, L.R., Lipman, R.S., Rickels, K., Uhlenhuth, E.H. & Covi, L. (1974). The Hopkins Symptom Checklist (HSCL): A self-report symptom inventory. Behavioral Science, 19, 1-15.
  64. Zigmond, A.S., & Snaith, R.P. (1983). The Hospital Anxiety and Depression Scale. Acta Psychiatrica Scandinavica, 67, 361-370.
  65. Jenkins, D.C., Hames, C.G., Zyzanski, S.J., Rosenman, R.H., & Friedman, M. (1969). Psychological traits and serum lipids. Psychosomatic Medicine, 31(2), 115–128. doi:10.1097/00006842-196903000-00004
  66. Kessler, R. C., Andrews, G., Colpe, L. J., Hiripi, E., Mroczek, D. K., Normand, S. L. T., ... Zaslavsky, A. M. (2002). Short screening scales to monitor population prevalences and trends in non-specific psychological distress. Psychological Medicine, 32, 959–976. http://dx.doi.org/10.1017/S0033291702006074
  67. Furukawa, T.A., Kessler, R.C., Slade, T., & Andrews, G. (2003). The performance of the K6 and K10 screening scales for psychological distress in the Australian National Survey of Mental Health and Well-Being. Psychological Medicine, 33(2), 357-362. doi:10.1017/s0033291702006700.
  68. Langner, T.S. (1962). A twenty-two item screening score of psychiatric symptoms indicating impairment. Journal of Health and Human Behaviour 3, 269-276.
  69. Srole, L., Langner, T.S., Michael, S.T., Opler, M.K. & Rennie, T.A.C. (1962). Mental health in the metropolis. McGraw-Hill: New York.
  70. Siegel, J.M. (1986). The Multidimensional Anger Inventory. Journal of Personality and Social Psychology, 51(1), 191-200.
  71. Bianchi, R., & Schonfeld, I. S. (2020). The Occupational Depression Inventory: A new tool for clinicians and epidemiologists. Journal of Psychosomatic Research, 138, Article 110249. https://doi.org/10.1016/j.jpsychores.2020.110249
  72. Schonfeld, I. S., & Bianchi, R. (2022). Distress in the workplace: Characterizing the relationship of burnout measures to the Occupational Depression Inventory. International Journal of Stress Management, 29, 253-259. https://doi.org/10.1037/str0000261
  73. Cohen, S., Kamarck, T., & Mermelstein, R. (1983). A global measure of perceived stress. Journal of Health and Social Behavior, 24(4), 385–396. doi:10.2307/2136404
  74. Kroenke, K., Spitzer, R.L., & Williams, J.B. (2001). The PHQ-9: Validity of a brief depression severity measure. Journal of General Internal Medicine, 16(9), 606–613. doi:10.1046/j.1525-1497.2001.016009606.x
  75. Kroenke, K. & Spitzer, R.L. (2002). The PHQ-9: A new depression diagnostic and severity measure. Psychiatric Annals, 32, 509-515.
  76. Stöber, J., & Bittencourt, J. (1998). Weekly assessment of worry: an adaptation of the Penn state worry questionnaire for monitoring changes during treatment. Behaviour Research and Therapy, 36(6), 645–656. doi:10.1016/S0005-7967(98)00031-X
  77. Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54(6), 1063–1070. doi:10.1037/0022-3514.54.6.1063
  78. Lorr, M., McNair, D. M., & Fisher, S. (1982). Evidence for bipolar mood states. Journal of Personality Assessment, 46(4), 432–436. doi:10.1207/s15327752jpa4604_16
  79. Dohrenwend, B.P., Shrout, P.E., Egri, G.E. & Mendelsohn, F.S. (1980). Measures of non-specific psychological distress and other dimensions of psychopathology in the general population. Archives of General Psychiatry, 37, 1229-1236.
  80. Fried, Y., & Tiegs, R. B. (1993). The main effect model versus buffering model of shop steward social support: A study of rank-and-file auto workers in the USA. Journal of Organizational Behavior, 14(5), 481–493. doi:10.1002/job.4030140509
  81. Caplan, R. D., Cobb, S., French, J. R. P., Harrison, R. V., & Pinneau, S. R. (1980). Job demands and worker health: Main effects and occupational differences. Ann Arbor, MI: Survey Research Center, Institute for Social Research, University of Michigan.
  82. Mojtabai, R., Corey-Lisle, P. K., Ip, E. H.-S., Kopeykina, I., Haeri, S., Cohen, L. J., & Shumaker, S. (2012). Psychotic Symptoms Subscale. [Subscale from: Patient Assessment Questionnaire]. Psychiatry Research, 200(2-3), 857–866.
  83. Weathers, F.W., Litz, B.T., Keane, T.M., Palmieri, P.A., Marx, B.P., & Schnurr, P.P. (2013). The PTSD Checklist for DSM-5 (PCL-5). National Center for PTSD. https://www.ptsd.va.gov/professional/assessment/adult-sr/ptsd-checklist.asp
  84. Rosenberg, M. (1965). Society and the adolescent self-image. Princeton, NJ: Princeton University Press
  85. Spector, P.E., & Jex, S.M. (1998). Development of four self-report measures of job stressors and strain: Interpersonal Conflict at Work Scale, Organizational Constraints Scale, Quantitative Workload Inventory, and Physical Symptoms Inventory. Journal of Occupational Health Psychology, 3(4), 356-367. doi:10.1037/1076-8998.3.4.356
  86. Low, C. A., Matthews, K. A., Kuller, L. H., & Edmundowicz, D. (2011). Psychosocial predictors of coronary artery calcification progression in postmenopausal women. Psychosomatic Medicine, 73(9), 789–794. doi:10.1097/PSY.0b013e318236b68a
  87. Russell, D., Peplau, L. A., & Ferguson, M. L. (1978). Developing a measure of loneliness. Journal of Personality Assessment, 42(3), 290–294. doi:10.1207/s15327752jpa4203_11
  88. Russell, D., Peplau, L. A., & Cutrona, C. E. (1980). The revised UCLA Loneliness Scale: Concurrent and discriminant validity evidence. Journal of Personality and Social Psychology, 39(3), 472–480. doi:10.1037/0022-3514.39.3.472
  89. Zung, W. W. (1971). A rating instrument for anxiety disorders. Psychosomatics, 12, 371–379.
  90. Zung, W. W. (1965). A self-rating depression scale. Archives of General Psychiatry, 12, 63–70. doi:10.1001/archpsyc.1965.01720310065008
  91. Public Safety Self Assessment. National Testing Network.
  92. National Firefighter Selection Inventory Technical Report, 2011, I/O Solutions, Inc., Westchester, Illinois 60154
  93. National Criminal Justice Officer Selection Inventory Squared
  94. Integrity Inventory
  95. "Home". scholar.google.com. Archived from the original on 2014-01-31. Retrieved 2021-10-19.
  96. "APA PsycInfo". American Psychological Association (APA). Archived from the original on 2022-08-22. Retrieved 2021-10-19.
  97. "APA PsycTests". American Psychological Association (APA). Archived from the original on 2021-10-19. Retrieved 2021-10-19.
  98. Carlson, Janet (2022). "Mental Measurements Yearbook". Retrieved 2022-09-20.
  99. "Psychological Test List". Assessment Psychology Online. Archived from the original on 2022-08-22. Retrieved 2021-10-30.
  100. "Home". ipip.ori.org. Archived from the original on 2019-08-20. Retrieved 2020-07-14.
  101. "Organization of Work | NIOSH | CDC". 15 October 2021. Archived from the original on 19 October 2021. Retrieved 19 October 2021.
  102. The Committee on Psychological Tests and Assessment (CPTA), American Psychological Association (1994). "Statement on the Use of Secure Psychological Tests in the Education of Graduate and Undergraduate Psychology Students". American Psychological Association. Archived from the original on 2009-12-11. Retrieved 2009-11-08. It should be recognized that certain tests used by psychologists and related professionals may suffer irreparable harm to their validity if their items, scoring keys or protocols, and other materials are publicly disclosed.
  103. Kenneth R. Morel (2009-09-24). "Test Security in Medicolegal Cases: Proposed Guidelines for Attorneys Utilizing Neuropsychology Practice". Archives of Clinical Neuropsychology. 24 (7): 635–646. doi:10.1093/arclin/acp062. PMID 19778915.
  104. Pearson Assessments (2009). "Legal Policies". Psychological Corporation. Archived from the original on 2011-07-15. Retrieved 2009-11-15.
  105. International Test Commission (2000). International Guidelines for Test Use.