Social-desirability bias

In social science research, social-desirability bias is a type of response bias: the tendency of survey respondents to answer questions in a manner that will be viewed favorably by others. [1] It can take the form of over-reporting "good" behavior or under-reporting "bad" or undesirable behavior. The tendency poses a serious problem for research based on self-reports, interfering with the interpretation of average tendencies as well as of individual differences.

Topics subject to social-desirability bias

Topics where socially desirable responding (SDR) is of special concern are self-reports of abilities, personality, sexual behavior, and drug use. When confronted with the question "How often do you masturbate?", for example, respondents may be pressured by the societal taboo against masturbation, and either under-report the frequency or avoid answering the question. Therefore, the mean rates of masturbation derived from self-report surveys are likely to be severely underestimated.

When confronted with the question, "Do you use drugs/illicit substances?" the respondent may be influenced by the fact that controlled substances, including the more commonly used marijuana, are generally illegal. Respondents may feel pressured to deny any drug use or rationalize it, e.g. "I only smoke marijuana when my friends are around." The bias can also influence reports of number of sexual partners. In fact, the bias may operate in opposite directions for different subgroups: Whereas men tend to inflate the numbers, women tend to underestimate theirs. In either case, the mean reports from both groups are likely to be distorted by social desirability bias.

Other topics that are sensitive to social-desirability bias include contraceptive use and family planning, [3] abortion, [4] religious attendance, [5] and voter turnout. [6] [7] [8] [9]

Individual differences in socially desirable responding

In 1953, Allen L. Edwards introduced the notion of social desirability to psychology, demonstrating its role in the measurement of personality traits. He showed that social-desirability ratings of personality-trait descriptions are very highly correlated with the probability that a later group of people will endorse those same trait self-descriptions. In his first demonstration of this pattern, the correlation between one group of college students' desirability ratings of a set of traits and the probability that students in a second group would endorse self-descriptions of the same traits was so high that it calls into question what such self-descriptions actually measure: the personality traits themselves, or their social desirability? [10]

Edwards subsequently developed the first Social Desirability Scale (SDS), a set of 39 true-false items drawn from the Minnesota Multiphasic Personality Inventory (MMPI) that judges could, with high agreement, order according to their social desirability. [2] These items were subsequently found to be very highly correlated with a wide range of MMPI personality and diagnostic scales. [11] The SDS is also highly correlated with the Beck Hopelessness Inventory. [12]

The fact that people differ in their tendency to engage in socially desirable responding (SDR) is a special concern to those measuring individual differences with self-reports. Individual differences in SDR make it difficult to distinguish those people with good traits who are responding factually from those distorting their answers in a positive direction.

When SDR cannot be eliminated, researchers may resort to measuring the tendency and then controlling for it. A separate SDR measure must be administered together with the primary measure (test or interview) aimed at the subject matter of the investigation. The key assumption is that respondents who answer in a socially desirable manner on that scale are also responding desirably to all self-reports throughout the study.

In some cases, the entire questionnaire package from high-scoring respondents may simply be discarded. Alternatively, respondents' answers on the primary questionnaires may be statistically adjusted commensurate with their SDR tendencies. For example, this adjustment is performed automatically in the standard scoring of MMPI scales.
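One simple way to implement such a statistical adjustment is to partial the SDR scale out of the primary scores by regression, keeping the residuals. The sketch below is illustrative only: a one-predictor least-squares correction, not the MMPI's actual scoring algorithm, and the function name is an assumption.

```python
import numpy as np

def adjust_for_sdr(primary_scores, sdr_scores):
    """Partial an SDR scale out of primary-measure scores.

    Regresses the primary scores on the SDR scores and returns
    residuals re-centered on the original mean, so the adjusted
    scores are uncorrelated with the measured SDR tendency.
    """
    primary = np.asarray(primary_scores, dtype=float)
    sdr = np.asarray(sdr_scores, dtype=float)
    # Ordinary least-squares slope of primary on SDR.
    slope = np.cov(primary, sdr)[0, 1] / np.var(sdr, ddof=1)
    return primary - slope * (sdr - sdr.mean())
```

Under this scheme, respondents with high SDR scores have their primary scores revised downward (when the two measures are positively related), which is one concrete form of the adjustment described above.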

The major concern with SDR scales is that they confound style with content. After all, people actually differ in the degree to which they possess desirable traits (e.g. nuns versus criminals). Consequently, measures of social desirability confound true differences with social-desirability bias.

Standard measures of individual SDR

Until the 1990s, the most commonly used measure of socially desirable responding was the Marlowe–Crowne Social Desirability Scale. [13] The original version comprised 33 true-false items. A shortened version, the Strahan–Gerbasi, comprises only ten items, but some have raised questions about the reliability of this measure. [14]

In 1991, Delroy L. Paulhus published the Balanced Inventory of Desirable Responding (BIDR): a questionnaire designed to measure two forms of SDR. [15] This forty-item instrument provides separate subscales for "impression management", the tendency to give inflated self-descriptions to an audience, and "self-deceptive enhancement", the tendency to give honest but inflated self-descriptions. The commercial version of the BIDR is called the Paulhus Deception Scales (PDS). [16]

Scales designed to tap response styles are available in all major languages, including Italian [17] [18] and German. [19]

Techniques to reduce social-desirability bias

Anonymity and confidentiality

Anonymous survey administration, compared with in-person or phone-based administration, has been shown to elicit higher reporting on items subject to social-desirability bias. [20] In anonymous settings, subjects are assured that their responses cannot be linked to them, and they are not asked to divulge sensitive information directly to a surveyor. Anonymity can be established through self-administration of paper surveys returned by envelope, mail, or ballot box, or through self-administration of electronic surveys via computer, smartphone, or tablet. [1] [21] Audio-assisted electronic surveys have also been developed for low-literacy or non-literate study subjects. [1] [22]

Confidentiality can be established in non-anonymous settings by ensuring that only study staff are present and by maintaining data confidentiality after surveys are complete. Including assurances of data confidentiality in surveys has a mixed effect on sensitive-question response; it may either increase response due to increased trust, or decrease response by increasing suspicion and concern. [1]

Specialized questioning techniques

Several techniques have been established to reduce bias when asking questions sensitive to social desirability. [20] Complex question techniques may reduce social-desirability bias, but may also be confusing or misunderstood by respondents.

Beyond specific techniques, social-desirability bias may be reduced by neutral question and prompt wording. [1]

Randomized response techniques

The randomized response technique asks participants to give a fixed answer, or to answer truthfully, depending on the outcome of a random act. [22] For example, respondents secretly toss a coin and respond "yes" if it comes up heads (regardless of their true answer to the question), and are instructed to respond truthfully if it comes up tails. This enables the researcher to estimate the actual prevalence of the given behavior in the study population without needing to know the true state of any one individual respondent. Research shows that the validity of the randomized response technique is limited. [23]
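With the fair-coin design above, the probability of a "yes" response is 0.5 + 0.5 × prevalence, so the observed proportion of "yes" answers can be inverted to estimate prevalence. A minimal sketch (the function name and the clipping to [0, 1] are illustrative choices):

```python
def estimate_prevalence(yes_count, n, p_forced_yes=0.5):
    """Prevalence estimator for the forced-response coin design.

    With probability p_forced_yes the respondent must answer "yes";
    otherwise they answer truthfully, so
        P(yes) = p_forced_yes + (1 - p_forced_yes) * prevalence.
    Inverting this gives the estimator below.
    """
    p_yes = yes_count / n
    estimate = (p_yes - p_forced_yes) / (1 - p_forced_yes)
    # Sampling noise can push the raw estimate outside [0, 1]; clip it.
    return min(max(estimate, 0.0), 1.0)
```

For example, if 60 of 100 respondents say "yes" under a fair coin, the estimated prevalence is (0.60 − 0.50) / 0.50 = 0.20.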

Nominative and best-friend techniques

The nominative technique asks a participant about the behavior of their close friends, rather than about their own behavior. [24] Participants are asked how many close friends they know for certain have engaged in a sensitive behavior, and how many other people they think know about that behavior. Population estimates of the behavior can be derived from the responses.

The similar best-friend methodology asks the participant about the behavior of one best friend. [25]
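Under the (strong) assumption that best friends are roughly representative of the target population, best-friend reports yield a simple prevalence estimate. A hypothetical sketch with a rough normal-approximation interval (function name and interval choice are assumptions, not the published procedure):

```python
import math

def best_friend_prevalence(reports, z=1.96):
    """Estimate prevalence from 0/1 best-friend reports.

    Each entry of `reports` indicates whether the respondent's best
    friend engaged in the behavior.  Returns the point estimate and a
    Wald-type confidence interval clipped to [0, 1].
    """
    n = len(reports)
    p = sum(reports) / n
    se = math.sqrt(p * (1 - p) / n)
    return p, (max(0.0, p - z * se), min(1.0, p + z * se))
```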

Unmatched-count technique

The unmatched-count technique asks respondents to indicate how many items in a list apply to them, without saying which ones. [26] Respondents are randomized to receive either a list of non-sensitive items or that same list plus the sensitive item of interest. The difference in the mean number of items endorsed by the two groups estimates the proportion of respondents answering "yes" to the sensitive item.
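The estimator for this design is a simple difference in mean item counts between the two randomized groups, sketched below (function and variable names are illustrative):

```python
def list_experiment_estimate(control_counts, treatment_counts):
    """Difference-in-means estimator for the unmatched-count technique.

    The treatment group saw the control list plus one sensitive item,
    so the difference in mean counts estimates the proportion that
    would answer "yes" to the sensitive item.
    """
    mean_control = sum(control_counts) / len(control_counts)
    mean_treatment = sum(treatment_counts) / len(treatment_counts)
    return mean_treatment - mean_control
```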

Grouped-answer method

The grouped-answer method, also known as the two-card or three-card method, combines answer choices such that the sensitive response is combined with at least one non-sensitive response option. [27]

Crosswise, triangular, and hidden-sensitivity methods

These methods ask participants to give a single response based on two or more questions, only one of which is sensitive. [28] For example, a participant might be asked whether their birth year is even and whether they have performed an illegal activity, and told to select A if the answers to both questions are the same (both yes or both no), and B otherwise. By combining sensitive and non-sensitive questions, the participant's response to the sensitive item is masked. Research shows that the validity of the crosswise model is limited. [29]
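For the crosswise model, if p, the known probability of a "yes" on the non-sensitive question, differs from 0.5, then the proportion of respondents reporting that their two answers matched identifies the sensitive prevalence; in practice the non-sensitive question is chosen so that p is well away from 0.5 (e.g. a birthday falling in a particular pair of months). A hedged sketch of the standard estimator, with illustrative names:

```python
def crosswise_estimate(same_count, n, p_nonsensitive):
    """Prevalence estimator for the crosswise model.

    `same_count` of n respondents reported that their answers to the
    sensitive and non-sensitive questions matched.  Since
        P(same) = p * pi + (1 - p) * (1 - pi),
    the sensitive prevalence pi is recovered by inverting this formula.
    """
    if abs(p_nonsensitive - 0.5) < 1e-9:
        raise ValueError("p_nonsensitive must differ from 0.5")
    lam = same_count / n
    pi = (lam + p_nonsensitive - 1) / (2 * p_nonsensitive - 1)
    # Clip to the unit interval to absorb sampling noise.
    return min(max(pi, 0.0), 1.0)
```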

Bogus pipeline

Bogus-pipeline techniques are those in which the participant believes that an objective test, such as a lie detector, will be used alongside their survey responses, whether or not that test or procedure is actually used. [1]

Other response styles

"Extreme-response style" (ERS) takes the form of exaggerated-extremity preference, e.g. for '1' or '7' on 7-point scales. Its converse, 'moderacy bias' entails a preference for middle-range (or midpoint) responses (e.g. 3–5 on 7-point scales).

"Acquiescence" (ARS) is the tendency to respond to items with agreement/affirmation independent of their content ("yea"-saying).

These kinds of response styles differ from social-desirability bias in that they are unrelated to the question's content and may be present in both socially neutral and in socially favorable or unfavorable contexts, whereas SDR is, by definition, tied to the latter.


References

  1. Krumpal, Ivar (2013). "Determinants of social desirability bias in sensitive surveys: a literature review". Quality & Quantity. 47 (4): 2025–2047. doi:10.1007/s11135-011-9640-9. S2CID 143045969.
  2. Edwards, Allen (1957). The social desirability variable in personality assessment and research. New York: The Dryden Press.
  3. Stuart, Gretchen S.; Grimes, David A. (2009). "Social desirability bias in family planning studies: A neglected problem". Contraception. 80 (2): 108–112. doi:10.1016/j.contraception.2009.02.009. PMID 19631784.
  4. Sedgh, Gilda; Keogh, Sarah C. (2019). "Novel approaches to estimating abortion incidence". Reproductive Health. 16 (1): 44. doi:10.1186/s12978-019-0702-0. PMC 6472065. PMID 30999917.
  5. Presser, Stanley; Stinson, Linda (1998). "Data Collection Mode and Social Desirability Bias in Self-Reported Religious Attendance". American Sociological Review. 63 (1): 137–145. doi:10.2307/2657486. JSTOR 2657486.
  6. Duff, Brian; Hanmer, Michael J.; Park, Won-Ho; White, Ismail K. (2007). "Good Excuses: Understanding Who Votes With An Improved Turnout Question". Public Opinion Quarterly. 71 (1): 67–90. doi:10.1093/poq/nfl045.
  7. Hanmer, Michael J.; Banks, Antoine J.; White, Ismail K. (2013). "Experiments to reduce the over-reporting of voting: A pipeline to the truth". Political Analysis. 22 (1): 130–141. doi:10.1093/pan/mpt027.
  8. Morin-Chassé, Alexandre; Bol, Damien; Stephenson, Laura B.; Labbé St-Vincent, Simon (2017). "How to survey about electoral turnout? The efficacy of the face-saving response items in 19 different contexts". Political Science Research and Methods. 5 (3): 575–584. doi:10.1017/psrm.2016.31. S2CID 148277964.
  9. Morin-Chassé, Alexandre (2018). "How to Survey About Electoral Turnout? Additional Evidence". Journal of Experimental Political Science. 5 (3): 230–233. doi:10.1017/XPS.2018.1. S2CID 158608425.
  10. Edwards, Allen (1953). "The relationship between the judged desirability of a trait and the probability that the trait will be endorsed". Journal of Applied Psychology. 37 (2): 90–93. doi:10.1037/h0058073.
  11. Fordyce, William (1956). "Social desirability in the MMPI". Journal of Consulting Psychology. 20 (3): 171–175. doi:10.1037/h0048547. PMID 13357640.
  12. Linehan, Marsha (1981). "Assessment of suicide ideation and parasuicide: Hopelessness and social desirability". Journal of Consulting and Clinical Psychology. 49 (5): 773–775. doi:10.1037/0022-006X.49.5.773. PMID 7287996.
  13. Crowne, Douglas P.; Marlowe, David (1960). "A new scale of social desirability independent of psychopathology". Journal of Consulting Psychology. 24 (4): 349–354. doi:10.1037/h0047358. PMID 13813058. S2CID 9781635.
  14. Thompson, Edmund R.; Phua, Florence T. T. (2005). "Reliability among Senior Managers of the Marlowe–Crowne Short-Form Social Desirability Scale". Journal of Business and Psychology. 19 (4): 541–554. doi:10.1007/s10869-005-4524-4. S2CID 143818289.
  15. Paulhus, D. L. (1991). "Measurement and control of response biases". In J. P. Robinson et al. (Eds.), Measures of Personality and Social Psychological Attitudes. San Diego: Academic Press.
  16. Paulhus, D. L. (1998). Paulhus Deception Scales (PDS). Toronto: Multi-Health Systems.
  17. Roccato, M. (2003). Desiderabilità Sociale e Acquiescenza. Alcune Trappole delle Inchieste e dei Sondaggi. Torino: LED Edizioni Universitarie. ISBN 88-7916-216-0.
  18. Corbetta, P. (2003). La ricerca sociale: metodologia e tecniche. Vol. I–IV. Bologna: Il Mulino.
  19. Stöber, Joachim (2001). "The Social Desirability Scale-17 (SDS-17)". European Journal of Psychological Assessment. 17 (3): 222–232. doi:10.1027//1015-5759.17.3.222.
  20. Nederhof, Anton J. (1985). "Methods of coping with social desirability bias: A review". European Journal of Social Psychology. 15 (3): 263–280. doi:10.1002/ejsp.2420150303.
  21. McBurney, D. H. (1994). Research Methods. Pacific Grove, California: Brooks/Cole.
  22. Tourangeau, R.; Yan, T. (2007). "Sensitive questions in surveys". Psychological Bulletin. 133 (5): 859–883. CiteSeerX 10.1.1.563.2414. doi:10.1037/0033-2909.133.5.859. PMID 17723033. S2CID 7160451.
  23. John, Leslie K.; Loewenstein, George; Acquisti, Alessandro; Vosgerau, Joachim (2018). "When and why randomized response techniques (fail to) elicit the truth". Organizational Behavior and Human Decision Processes. 148: 101–123. doi:10.1016/j.obhdp.2018.07.004. S2CID 52263233.
  24. Miller, J. D. (1985). "The nominative technique: a new method of estimating heroin prevalence". NIDA Research Monograph. 54: 104–124. PMID 3929108.
  25. Yeatman, Sara; Trinitapoli, Jenny (2011). "Best-Friend Reports: A Tool for Measuring the Prevalence of Sensitive Behaviors". American Journal of Public Health. 101 (9): 1666–1667. doi:10.2105/AJPH.2011.300194. PMC 3154247. PMID 21778489.
  26. Droitcour, Judith; Caspar, Rachel A.; Hubbard, Michael L.; Parsley, Teresa L.; Visscher, Wendy; Ezzati, Trena M. (2011). "The Item Count Technique as a Method of Indirect Questioning: A Review of Its Development and a Case Study Application". In Measurement Errors in Surveys. John Wiley & Sons, pp. 185–210. doi:10.1002/9781118150382.ch11. ISBN 9781118150382.
  27. Droitcour, Judith A.; Larson, Eric M. (2016). "An Innovative Technique for Asking Sensitive Questions: the Three-Card Method". Bulletin de Méthodologie Sociologique. 75: 5–23. doi:10.1177/075910630207500103. S2CID 73189531.
  28. Yu, Jun-Wu; Tian, Guo-Liang; Tang, Man-Lai (2007). "Two new models for survey sampling with sensitive characteristic: design and analysis". Metrika. 67 (3): 251. doi:10.1007/s00184-007-0131-x. S2CID 122941401.
  29. Schnapp, Patrick (2019). "Sensitive Question Techniques and Careless Responding: Adjusting the Crosswise Model for Random Answers". Methods, Data, Analyses. 13: 307–320. doi:10.12758/mda.2019.03.