Privacy for research participants

Privacy for research participants is a concept in research ethics which holds that a person taking part in human subject research has a right to privacy. Typical scenarios include, for example, a surveyor doing social research who interviews a participant, or a medical researcher in a clinical trial who asks a participant for a blood sample to see whether something measurable in blood is related to a person's health. In both cases, the ideal outcome is that any participant can join the study and that neither the researcher, the study design, nor the publication of the study results ever identifies any participant. In this way, the privacy rights of these individuals are preserved.

Privacy for medical research participants is protected by several procedures such as informed consent, compliance with medical privacy laws, and transparency in how patient data is accumulated and analyzed. [1]

People decide to participate in research for many reasons, such as personal interest or a desire to promote research which benefits their community. Various guidelines for human subject research protect those who choose to participate, and the international consensus is that participants' rights are best protected when they can trust that researchers will not connect their identities with their input into the study.

Many study participants have experienced problems when their privacy was not upheld after participating in research. Sometimes privacy is lost because the study's protections were insufficient, and sometimes because unanticipated problems with the study design inadvertently compromise it. The research organizer is typically responsible for protecting participants' privacy, while the institutional review board acts as a designated overseer that monitors the organizer to ensure that study participants are protected.

Information privacy

Researchers publish data that they collect from participants. To preserve participants' privacy, the data goes through a process to de-identify it. The goal of this process is to remove protected health information which could be used to connect a study participant to their contribution to a research project, so that participants will not suffer from data re-identification. [citation needed]
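As a rough illustration of de-identification, the following minimal Python sketch strips direct identifiers from a set of records. The field names, the list of identifiers, and the study ID format are all hypothetical; which fields count as identifying depends on the study and on the applicable privacy rules.

```python
# Hypothetical field names; a real study would define its own list of
# direct identifiers according to the applicable privacy rules.
DIRECT_IDENTIFIERS = {"name", "social_security_number", "address", "phone"}


def deidentify(records):
    """Return copies of the records with direct identifiers removed.

    Each record gets an arbitrary study ID so that rows can still be told
    apart without revealing who contributed them.
    """
    cleaned = []
    for index, record in enumerate(records, start=1):
        kept = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        kept["study_id"] = f"P{index:04d}"
        cleaned.append(kept)
    return cleaned


# Made-up example data.
identified = [
    {"name": "Alice Example", "social_security_number": "000-00-0001",
     "zip_code": "12345", "birth_year": 1980, "diagnosis": "diabetes"},
    {"name": "Bob Example", "social_security_number": "000-00-0002",
     "zip_code": "12399", "birth_year": 1975, "diagnosis": "asthma"},
]

for row in deidentify(identified):
    print(row)
```

Fields such as the postal code and birth year remain in the output of this sketch. Removing direct identifiers alone may therefore leave combinations of values that still single out a person.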

Privacy attacks

A privacy attack is the exploitation of an opportunity to identify a study participant based on public research data. [2] Typically, researchers collect data, including confidential identifying data, from study participants, which produces an identified dataset. [2] Before the data is sent for research processing, it is "de-identified", which means that personally identifying data is removed from the dataset. Ideally, this means that the dataset alone could not be used to identify a participant. [2]

In some cases, the researchers misjudge the information in a de-identified dataset and it is in fact identifying, or the advent of new technology makes the data identifying. [2] In other cases, the published de-identified data can be cross-referenced with other datasets, and by finding matches between an identified dataset and the de-identified dataset, participants in the de-identified set may be revealed. [2] This is particularly the case with medical research data, because traditional data anonymization techniques designed for numerical data are not as effective for the nonnumerical data contained in medical data, such as rare diagnoses and personalized treatments. Thus, for medical research data that contain unique nonnumerical data, removing only identifying numerical features, such as age and social security number, may not be enough to mitigate privacy attacks. [3]
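The cross-referencing described above can be sketched in a few lines of Python. This is only an illustration under assumed conditions: the field names, the choice of postal code and birth year as matching fields, and the example records are all made up.

```python
def link_records(deidentified, public_records, quasi_identifiers):
    """Match de-identified rows to named rows that share all quasi-identifiers.

    Only unique matches are returned, since an ambiguous match does not
    pin down a single person.
    """
    # Index the named records by their quasi-identifier values.
    index = {}
    for person in public_records:
        key = tuple(person[q] for q in quasi_identifiers)
        index.setdefault(key, []).append(person["name"])

    reidentified = {}
    for row in deidentified:
        key = tuple(row[q] for q in quasi_identifiers)
        candidates = index.get(key, [])
        if len(candidates) == 1:
            reidentified[row["study_id"]] = candidates[0]
    return reidentified


# Made-up example: a published study table and a separate named dataset.
study_data = [
    {"study_id": "P0001", "zip_code": "12345", "birth_year": 1980,
     "diagnosis": "rare condition"},
    {"study_id": "P0002", "zip_code": "12399", "birth_year": 1975,
     "diagnosis": "asthma"},
]
named_data = [
    {"name": "Alice Example", "zip_code": "12345", "birth_year": 1980},
    {"name": "Bob Example", "zip_code": "12399", "birth_year": 1975},
    {"name": "Carol Example", "zip_code": "12399", "birth_year": 1975},
]

print(link_records(study_data, named_data, ["zip_code", "birth_year"]))
```

In this example only the first participant is re-identified, because two named people share the second participant's postal code and birth year. The fewer people who share a given combination of values, the easier such a match becomes.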

Risk mitigation

The ideal situation from the research perspective is the free sharing of data. [2] Because privacy for research participants is a priority, however, various proposals for protecting participants have been made for different purposes. [2] Replacing the real data with synthetic data allows the researchers to show data which gives a conclusion equivalent to the one drawn by the researchers, but synthetic data may have drawbacks, such as being unfit for repurposing for other research. [2] Other strategies include "noise addition", which makes random changes to values, and "data swapping", which exchanges values across entries. [2] [4] Still another approach is to separate the identifiable variables in the data from the rest, aggregate the identifiable variables, and reattach them to the rest of the data. This principle has been used successfully in creating maps of diabetes in Australia [5] and the United Kingdom [6] using confidential General Practice clinic data.
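The following Python sketch illustrates noise addition, data swapping, and aggregation in their simplest forms. It is not the procedure used in any of the cited studies; the function names, the uniform noise, the full shuffle used for swapping, and the postcode-prefix grouping are assumptions chosen for brevity.

```python
import random


def add_noise(values, scale, rng):
    """Noise addition: shift each numeric value by a random amount in [-scale, scale]."""
    return [v + rng.uniform(-scale, scale) for v in values]


def swap_values(records, field, rng):
    """Data swapping: shuffle one field across records.

    The set of values in the field is unchanged, but a value no longer
    belongs to the record it originally came from.
    """
    shuffled = [r[field] for r in records]
    rng.shuffle(shuffled)
    return [{**r, field: new} for r, new in zip(records, shuffled)]


def aggregate_field(records, field, to_group):
    """Aggregation: replace a precise identifying value with a coarser group."""
    return [{**r, field: to_group(r[field])} for r in records]


# Made-up example data; the seed only makes the demonstration repeatable.
rng = random.Random(42)
patients = [
    {"postcode": "2601", "weight_kg": 70.5},
    {"postcode": "2602", "weight_kg": 82.0},
    {"postcode": "2913", "weight_kg": 64.3},
]

print(add_noise([p["weight_kg"] for p in patients], scale=2.0, rng=rng))
print(swap_values(patients, "weight_kg", rng))
# Hypothetical grouping: keep only the first two characters of the postcode.
print(aggregate_field(patients, "postcode", lambda code: code[:2] + "xx"))
```

Each technique trades accuracy for privacy: larger noise or coarser groups make individuals harder to single out but also make the data less useful for further analysis.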

Biobank privacy

A biobank is a place where human biological specimens are kept for research, and often where genomics data is paired with phenotype data and personally identifying data. For many reasons, biobank research has created new controversies, perspectives, and challenges for satisfying the rights of study participants and the needs of researchers to access resources for their work. [2]

One problem is that if even a small percentage of genetic information is available, that information can be used to uniquely identify the individual from whom it came. Studies have shown that a determination of whether an individual participated in a study can be made even from reported aggregate data. [7]
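To give an intuition for how aggregate data can reveal participation, the Python sketch below uses a simple distance-based score on simulated data. It is not the Bayesian method of the cited study; the score, the simulation, and all parameters (number of markers, pool size, frequency range) are assumptions made for illustration.

```python
import random


def membership_score(genotype, pool_freqs, reference_freqs):
    """Sum over markers of |y - reference| - |y - pool|.

    A clearly positive score means the person's genotype sits closer to
    the published pool frequencies than to the reference population,
    which hints that they contributed to the pool.
    """
    return sum(
        abs(y - ref) - abs(y - pool)
        for y, pool, ref in zip(genotype, pool_freqs, reference_freqs)
    )


def simulate(num_markers=5000, pool_size=20, seed=0):
    """Simulate a study pool and score one member and one outsider."""
    rng = random.Random(seed)
    # Assumed allele frequencies in the general (reference) population.
    reference = [rng.uniform(0.1, 0.9) for _ in range(num_markers)]

    def random_person():
        # Allele dosage 0, 0.5 or 1 at each marker (two draws per marker).
        return [(int(rng.random() < p) + int(rng.random() < p)) / 2
                for p in reference]

    pool = [random_person() for _ in range(pool_size)]
    outsider = random_person()
    # Only these per-marker averages are treated as "published".
    pool_freqs = [sum(column) / pool_size for column in zip(*pool)]
    return (
        membership_score(pool[0], pool_freqs, reference),
        membership_score(outsider, pool_freqs, reference),
    )


member_score, outsider_score = simulate()
print("member:", member_score)
print("outsider:", outsider_score)
```

In this simulation the score for the pool member is expected to be noticeably higher than the score for the outsider, even though only per-marker averages were shared. Real attacks and defences involve many complications that the sketch ignores.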

Negative consequences

When research participants have their identities revealed, they may face various problems. Concerns include genetic discrimination by an insurance company or employer. [8] Respondents in the United States have expressed a desire for their research data to be restricted from access by law enforcement agencies, and would want to prevent a connection between study participation and legal consequences of the same. [9] Another fear of study participants is that the research may reveal private personal matters which a person may not want to discuss, such as a medical history which includes a sexually transmitted disease, substance abuse, psychiatric treatment, or an elective abortion. [10] In the case of genomic studies on families, genetic screening may reveal that paternity is different from what had been supposed. [10] Some people, without any more specific reason, may feel invaded if their private information is disclosed because of research participation, and may find the entire system distasteful. [10]

An Australian study investigating violence, bullying, and harassment towards LGBTIQ people revealed that some of the participants, who were all members of the LGBTIQ community, had been subjected to levels of violence that would constitute a crime. However, the participants were reluctant to report their victimization to the police. This placed the researchers in a position where they could have reported a crime to the police, but ethical practices obliged them to respect the privacy and wishes of the participants, and so they could not do so. [11]

Privacy controversies

References

  1. "Privacy and Information Security Issues in Clinical Research". ACRP. 2018-03-13. Retrieved 2020-11-08.
  2. Malin, B.; Loukides, G.; Benitez, K.; Clayton, E. W. (2011). "Identifiability in biobanks: Models, measures, and mitigation strategies". Human Genetics. 130 (3): 383–392. doi:10.1007/s00439-011-1042-5. PMC 3621020. PMID 21739176.
  3. Nurmi, Sanna-Maria; Kangasniemi, Mari; Halkoaho, Arja; Pietilä, Anna-Maija (2018-10-24). "Privacy of Clinical Research Subjects: An Integrative Literature Review". Journal of Empirical Research on Human Research Ethics. 14 (1): 33–48. doi:10.1177/1556264618805643. PMID 30353779.
  4. Adam, Nabil R.; Worthmann, John C. (1989). "Security-control methods for statistical databases: a comparative study". ACM Computing Surveys. 21 (4): 515–556. CiteSeerX 10.1.1.115.3011. doi:10.1145/76894.76895. S2CID 3060114.
  5. Mazumdar, S.; Konings, P.; Hewett, M.; et al. (2014). "Protecting the privacy of individual general practice patient electronic records for geospatial epidemiology research". Australian and New Zealand Journal of Public Health. 38 (6): 548–552. doi:10.1111/1753-6405.12262. hdl:1885/76167. PMID 25308525. S2CID 41505067. http://onlinelibrary.wiley.com/doi/10.1111/1753-6405.12262/full
  6. Nobel, D.; Smith, D.; Mathur, R.; Robson, J.; Greenhalgh, T. (15 February 2012). "Feasibility study of geospatial mapping of chronic disease risk to inform public health commissioning". BMJ Open. 2 (1): e000711. doi:10.1136/bmjopen-2011-000711. PMC 3282296. PMID 22337817.
  7. Clayton, D. (2010). "On inferring presence of an individual in a mixture: A Bayesian approach". Biostatistics. 11 (4): 661–673. doi:10.1093/biostatistics/kxq035. PMC 2950790. PMID 20522729.
  8. Lemke, A. A.; Wolf, W. A.; Hebert-Beirne, J.; Smith, M. E. (2010). "Public and Biobank Participant Attitudes toward Genetic Research Participation and Data Sharing". Public Health Genomics. 13 (6): 368–377. doi:10.1159/000276767. PMC 2951726. PMID 20805700.
  9. Kaufman, D. J.; Murphy-Bollinger, J.; Scott, J.; Hudson, K. L. (2009). "Public Opinion about the Importance of Privacy in Biobank Research". The American Journal of Human Genetics. 85 (5): 643–654. doi:10.1016/j.ajhg.2009.10.002. PMC 2775831. PMID 19878915.
  10. Greely, H. T. (2007). "The Uneasy Ethical and Legal Underpinnings of Large-Scale Genomic Biobanks". Annual Review of Genomics and Human Genetics. 8: 343–364. doi:10.1146/annurev.genom.7.080505.115721. PMID 17550341.
  11. Roffee, James; Waling, Andrea. "Resolving ethical challenges when researching with minority and vulnerable populations: LGBTIQ victims of violence, harassment and bullying".
  12. "Netflix Spilled Your Brokeback Mountain Secret, Lawsuit Claims". Wired. ISSN 1059-1028. Retrieved 2020-10-31.
  13. Donald, Laura M. Mac (2007-09-02). "Opinion | America's Toe-Tapping Menace (Published 2007)". The New York Times. ISSN 0362-4331. Retrieved 2020-10-31.