Protected health information (PHI) under U.S. law is any information about health status, provision of health care, or payment for health care that is created or collected by a Covered Entity (or a Business Associate of a Covered Entity), and can be linked to a specific individual. This is interpreted rather broadly and includes any part of a patient's medical record or payment history.
Rather than being anonymized, datasets are often searched for PHI so that it can be de-identified before researchers share the dataset publicly. Researchers remove individually identifiable PHI from a dataset to preserve privacy for research participants.
There are many forms of PHI, with the most common being physical storage in the form of paper-based personal health records (PHR). Other types of PHI include electronic health records, wearable technology, and mobile applications. In recent years, there has been a growing number of concerns regarding the safety and privacy of PHI.
Under the U.S. Health Insurance Portability and Accountability Act (HIPAA), PHI that is linked based on the following list of 18 identifiers must be treated with special care:
The HIPAA Privacy Rule addresses the privacy and security aspects of PHI. There are three main purposes which include:
In 2018, the U.S. Court of Appeals for the Eleventh Circuit overturned the decision in LabMD, Inc. v. Federal Trade Commission (FTC). The FTC had filed a complaint against medical testing laboratory LabMD, Inc., alleging that the company failed to reasonably protect the security of consumers’ personal data, including medical information. The FTC alleged that in two separate incidents, LabMD collectively exposed the personal information of approximately 10,000 consumers. The court vacated the original cease-and-desist order, stating that it would "mandate a complete overhaul of LabMD’s data-security program and says little about how this is to be accomplished." [1]
Anonymization is a process in which PHI elements are eliminated or manipulated with the purpose of hindering the possibility of going back to the original data set. [2] This involves removing all identifying data to create unlinkable data. [3] De-identification under the HIPAA Privacy Rule occurs when data has been stripped of common identifiers by two methods:
De-identified data is coded, with a link to the original, fully identified data set kept by an honest broker. Because these links exist, coded de-identified data is considered indirectly identifiable rather than anonymized. Coded de-identified data is not protected by the HIPAA Privacy Rule, but is protected under the Common Rule. The purpose of de-identification and anonymization is to allow health care data to be used in larger increments for research purposes. Universities, government agencies, and private health care entities use such data for research, development, and marketing purposes. [3]
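The coding arrangement described above can be sketched in a few lines of Python. This is a minimal illustration rather than a compliant implementation: the field names (`name`, `mrn`), the `code_dataset` function, and the use of random hexadecimal study codes are all assumptions made for the example. The research copy of each record carries only a study code, while the table that maps codes back to identities is returned separately, standing in for what an honest broker would hold.

```python
import secrets

# Hypothetical direct-identifier fields; real datasets will differ.
DIRECT_IDENTIFIERS = ("name", "mrn")


def code_dataset(records):
    """Split records into coded research records and a separate link table.

    Returns (coded_records, link_table). In the arrangement described in
    the text, only the honest broker would hold link_table.
    """
    coded_records = []
    link_table = {}
    for record in records:
        # Draw a random study code that reveals nothing about the patient.
        study_code = secrets.token_hex(8)
        while study_code in link_table:  # regenerate on the rare collision
            study_code = secrets.token_hex(8)
        # The broker's table keeps the identifying fields under the code.
        link_table[study_code] = {
            field: record[field] for field in DIRECT_IDENTIFIERS if field in record
        }
        # The research copy keeps everything else plus the code.
        coded = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        coded["study_code"] = study_code
        coded_records.append(coded)
    return coded_records, link_table


patients = [
    {"name": "Alice Example", "mrn": "000123", "diagnosis": "asthma"},
    {"name": "Bob Example", "mrn": "000456", "diagnosis": "diabetes"},
]
coded, links = code_dataset(patients)
```

Because `links` can restore the original identities, the coded records remain indirectly identifiable, which is why such data is not treated as anonymized.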
Covered Entities
In general, U.S. law governing PHI applies to data collected in the course of providing and paying for health care. Privacy and security regulations govern how healthcare professionals, hospitals, health insurers, and other Covered Entities use and protect the data they collect. The source of the data is as relevant as the data itself when determining if information is PHI under U.S. law. For example, sharing information about someone on the street with an obvious medical condition such as an amputation is not restricted by U.S. law. However, obtaining information about the amputation exclusively from a protected source, such as from an electronic medical record, would breach HIPAA regulations.
Business Associates
Covered Entities often use third parties to provide certain health and business services. If they need to share PHI with those third parties, it is the responsibility of the Covered Entity to put in place a Business Associate Agreement that holds the third party to the same standards of privacy and confidentiality as the Covered Entity. [6]
Protected health information can be stored in many different forms. According to HIPAA, there are many requirements and limitations regarding how PHI can be stored.
Until recently, physical storage was the most common method of storing PHI. Physical safeguards for PHI include storing paper records in locked cabinets and controlling access to the records. A security authority, PIN pad, or identification card may be required to access physical storage of PHI.
Much PHI is stored in electronic health records (EHR). Cloud computing and other services allow healthcare providers to store vast amounts of data for easy access. For example, Kaiser Permanente has over 9 million members and stores anywhere from 25 to 44 petabytes. [7] In Australia, over 90% of healthcare institutions have implemented EHRs in an attempt to improve efficiency. [8] E-health architectures can be public, private, hybrid, or community, depending on the data stored. Healthcare providers often store their data on a vast network of remote servers, which can leave it susceptible to privacy breaches. According to a study, the US could save $81 billion annually by switching to a universal electronic health record (EHR). [9]
In the context of PHI, wearable technology often comes in the form of smartwatches, ECG monitors, blood pressure monitors, and biosensors. Wearable technology has grown rapidly, with 102.4 million units shipped in 2016, up 25% from the 81.9 million units shipped in 2015. According to Insider Intelligence research, the number of health and fitness app users will remain over 84 million through 2022. [10] Health and fitness tracking capabilities are a target for companies producing wearable technology. Privacy concerns for consumers arise when these technology companies are not considered covered entities or business associates under HIPAA, or where the health information collected is not PHI.
Mobile applications have been proven essential, especially for the elderly or disabled. The adoption of mobile healthcare is said to be attractive due to factors like patient behavior, subjective norm, personal innovativeness, perceived behavioral control, and behavioral intention. The legitimacy of certain mobile applications that store PHI can be determined by the user reviews on the application. [11]
In one study, 14 patients were asked for their opinions on privacy concerns and healthcare perceptions. Researchers found that all participants agreed on the importance of healthcare privacy, but participants demonstrated only a vague understanding of their legislated patient privacy rights. There were differing opinions on whose responsibility it should be to protect health information; some thought it was their own responsibility, while others thought that the government was responsible. Consent was rarely brought up within the discussion. [12]
Because patient privacy is the reason for regulations on PHI, consumer data for analysis can be extremely difficult to come by. Luca Bonomi and Xiaoqian Jiang developed a technique to perform temporal record linkage using non-protected health information data. Because standard linkage processes cannot incorporate the temporal setting, they are ineffective. Bonomi and Jiang propose using the patient’s non-protected health information data to match records and establish patterns. This approach allows patient records to be linked using non-PHI data, giving doctors patterns and a better idea of important diagnoses. [13]
From 2005 to 2019, the total number of individuals affected by healthcare data breaches was 249.09 million. According to an IBM report, the average cost of a data breach in 2019 was $3.92 million, while a healthcare industry breach usually costs $6.45 million. However, the average cost of a healthcare data breach (average breach size 25,575 records) in the U.S. is $15 million. [16]
In 2017, healthcare compliance analytics platform Protenus stated that 477 healthcare breaches were reported to the U.S. Department of Health and Human Services (HHS). Of these, 407 showed that 5.579 million patient records were affected. [17]
The 2018 Verizon Protected Health Information Data Breach Report (PHIDBR) examined 27 countries and 1368 incidents, detailing that the focus of healthcare breaches was mainly the patients, their identities, health histories, and treatment plans. According to HIPAA, 255.18 million people were affected from 3051 healthcare data breach incidents from 2010 to 2019.
Health-related fraud is estimated to cost the U.S. nearly $80 billion annually. The healthcare industry remains the most costly and most targeted industry for data breaches. Healthcare companies have been criticized for not adapting and prioritizing data security. One reason is the leeway and minimal penalties for those that fail to comply with the HIPAA Security Rule. There is also limited competition and a stable customer base within the healthcare industry. [18] Researchers are searching for more secure ways to protect PHI. [19]
In the case of PHI, there are ethical concerns regarding how information is treated on a daily basis by healthcare personnel. HIPAA, signed into law in 1996 under the Clinton Administration, led to the HIPAA Privacy Rule, which limits a physician's ability to arbitrarily disclose patients’ personal medical records. [20]
As health artificial intelligence (AI) applications are expected to save U.S. healthcare over $150 billion annually, researchers are studying the risks of potential PHI leaks. [21] Currently, 21% of U.S. consumers, or 57 million people, use a quantified self health and fitness tracking (QSHFT) application. In a study conducted by Nancy Brinson and Danielle Rutherford, over 90% of consumers were comfortable with the opportunity to share data with a healthcare provider. However, Brinson and Rutherford claim that consumers fail to make privacy a priority when they choose to share this information. To combat misuse of PHI on mobile healthcare platforms, Brinson and Rutherford suggest the creation of a policy rating system for consumers. A rating system monitored by the Federal Trade Commission would give consumers a centralized way to evaluate data collection methods among mobile health providers. [22]
In 2019, the US Department of Health and Human Services Office for Civil Rights (OCR) promised to enforce patients’ right of access under HIPAA through the Right of Access Initiative. There have already been two settlements with the OCR under the Right of Access Initiative, after companies failed to provide patients with their medical records. [23]
Information privacy is the relationship between the collection and dissemination of data, technology, the public expectation of privacy, contextual information norms, and the legal and political issues surrounding them. It is also known as data privacy or data protection.
Medical privacy, or health privacy, is the practice of maintaining the security and confidentiality of patient records. It involves both the conversational discretion of health care providers and the security of medical records. The terms can also refer to the physical privacy of patients from other patients and providers while in a medical facility, and to modesty in medical settings. Modern concerns include the degree of disclosure to insurance companies, employers, and other third parties. The advent of electronic medical records (EMR) and patient care management systems (PCMS) has raised new concerns about privacy, balanced with efforts to reduce duplication of services and medical errors.
The Health Insurance Portability and Accountability Act of 1996 is a United States Act of Congress enacted by the 104th United States Congress and signed into law by President Bill Clinton on August 21, 1996. It aimed to alter the transfer of healthcare information, stipulated the guidelines by which personally identifiable information maintained by the healthcare and healthcare insurance industries should be protected from fraud and theft, and addressed some limitations on healthcare insurance coverage. It generally prohibits healthcare providers and businesses called covered entities from disclosing protected information to anyone other than a patient and the patient's authorized representatives without their consent. The bill does not restrict patients from receiving information about themselves. Furthermore, it does not prohibit patients from voluntarily sharing their health information however they choose, nor does it require confidentiality where a patient discloses medical information to family members, friends or other individuals not employees of a covered entity.
The terms medical record, health record and medical chart are used somewhat interchangeably to describe the systematic documentation of a single patient's medical history and care across time within one particular health care provider's jurisdiction. A medical record includes a variety of types of "notes" entered over time by healthcare professionals, recording observations and administration of drugs and therapies, orders for the administration of drugs and therapies, test results, X-rays, reports, etc. The maintenance of complete and accurate medical records is a requirement of health care providers and is generally enforced as a licensing or certification prerequisite.
An electronic health record (EHR) is the systematized collection of patient and population electronically stored health information in a digital format. These records can be shared across different health care settings. Records are shared through network-connected, enterprise-wide information systems or other information networks and exchanges. EHRs may include a range of data, including demographics, medical history, medication and allergies, immunization status, laboratory test results, radiology images, vital signs, personal statistics like age and weight, and billing information.
A personal health record (PHR) is a health record where health data and other information related to the care of a patient is maintained by the patient. This stands in contrast to the more widely used electronic medical record, which is operated by institutions and contains data entered by clinicians to support insurance claims. The intention of a PHR is to provide a complete and accurate summary of an individual's medical history which is accessible online. The health data on a PHR might include patient-reported outcome data, lab results, and data from devices such as wireless electronic weighing scales or from a smartphone.
Health technology is defined by the World Health Organization as the "application of organized knowledge and skills in the form of devices, medicines, vaccines, procedures, and systems developed to solve a health problem and improve quality of lives". This includes pharmaceuticals, devices, procedures, and organizational systems used in the healthcare industry, as well as computer-supported information systems. In the United States, these technologies involve standardized physical objects, as well as traditional and designed social means and methods to treat or care for patients.
Patient portals are healthcare-related online applications that allow patients to interact and communicate with their healthcare providers, such as physicians and hospitals. Typically, portal services are available on the Internet at all hours of the day and night. Some patient portal applications exist as stand-alone web sites and sell their services to healthcare providers. Other portal applications are integrated into the existing web site of a healthcare provider. Still others are modules added onto an existing electronic medical record (EMR) system. What all of these services share is the ability of patients to interact with their medical information via the Internet. Currently, the lines between an EMR, a personal health record, and a patient portal are blurring. For example, Intuit Health and Microsoft HealthVault describe themselves as personal health records (PHRs), but they can interface with EMRs and communicate through the Continuity of Care Record standard, displaying patient data on the Internet so it can be viewed through a patient portal.
De-identification is the process used to prevent someone's personal identity from being revealed. For example, data produced during human subject research might be de-identified to preserve the privacy of research participants. Biological data may be de-identified in order to comply with HIPAA regulations that define and stipulate patient privacy laws.
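As a rough illustration of de-identification, the Python sketch below drops direct identifiers from a record and coarsens two quasi-identifiers. The field names, the `deidentify` function, and the specific generalizations (keeping only the birth year and the first three digits of a ZIP code) are assumptions chosen for the example; the sketch does not by itself satisfy HIPAA's de-identification requirements.

```python
# Hypothetical fields treated as direct identifiers in this example.
DROP_FIELDS = {"name", "phone", "email", "ssn"}


def deidentify(record):
    """Return a copy of record with identifiers dropped or generalized."""
    cleaned = {}
    for key, value in record.items():
        if key in DROP_FIELDS:
            continue  # remove direct identifiers entirely
        if key == "birth_date":
            # Assumes an ISO "YYYY-MM-DD" string; keep only the year.
            cleaned["birth_year"] = value[:4]
        elif key == "zip":
            # Keep only the first three digits of the ZIP code.
            cleaned["zip3"] = value[:3]
        else:
            cleaned[key] = value
    return cleaned


record = {
    "name": "Alice Example",
    "phone": "555-0100",
    "birth_date": "1980-04-12",
    "zip": "30301",
    "diagnosis": "asthma",
}
print(deidentify(record))
# {'birth_year': '1980', 'zip3': '303', 'diagnosis': 'asthma'}
```

The function builds a new dictionary instead of editing the input, so the fully identified record stays intact for whoever is authorized to hold it.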
The Health Information Technology for Economic and Clinical Health Act, abbreviated the HITECH Act, was enacted under Title XIII of the American Recovery and Reinvestment Act of 2009. Under the HITECH Act, the United States Department of Health and Human Services resolved to spend $25.9 billion to promote and expand the adoption of health information technology. The Washington Post reported the inclusion of "as much as $36.5 billion in spending to create a nationwide network of electronic health records." At the time it was enacted, it was considered "the most important piece of health care legislation to be passed in the last 20 to 30 years" and the "foundation for health care reform."
Digital health is a discipline that includes digital care programs, technologies with health, healthcare, living, and society to enhance the efficiency of healthcare delivery and to make medicine more personalized and precise. It uses information and communication technologies to facilitate understanding of health problems and challenges faced by people receiving medical treatment and social prescribing in more personalised and precise ways. The definitions of digital health and its remits overlap in many ways with those of health and medical informatics.
Health information on the Internet refers to all health-related information communicated through or available on the Internet.
Health care analytics is the set of health care analysis activities that can be undertaken as a result of data collected from four areas within healthcare: claims and cost data, pharmaceutical and research and development (R&D) data, clinical data, and patient behavior and sentiment data (patient behaviors and preferences). Health care analytics is a growing industry in the United States, expected to grow to more than $31 billion by 2022. The industry focuses on the areas of clinical analysis, financial analysis, supply chain analysis, as well as marketing, fraud and HR analysis.
Medical data, including patients' identity information, health status, disease diagnosis and treatment, and biogenetic information, not only involve patients' privacy but also have a special sensitivity and important value; once leaked, they may bring physical and mental distress and property loss to patients and even negatively affect social stability and national security. However, the development and application of medical AI must rely on a large amount of medical data for algorithm training, and the larger and more diverse the data, the more accurate the results of its analysis and prediction will be. At the same time, the application of big data technologies such as data collection, analysis and processing, cloud storage, and information sharing has increased the risk of data leakage. In the United States, the rate of such breaches has increased over time, with 176 million records breached by the end of 2017. There have been 245 data breaches of 10,000 or more records, 68 breaches of the healthcare data of 100,000 or more individuals, 25 breaches that affected more than half a million individuals, and 10 breaches of the personal and protected health information of more than 1 million individuals.
Data re-identification or de-anonymization is the practice of matching anonymous data with publicly available information, or auxiliary data, in order to discover the person the data belong to. This is a concern because companies with privacy policies, health care providers, and financial institutions may release the data they collect after the data has gone through the de-identification process.
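The matching step behind re-identification can be sketched as a join on attributes shared by a released dataset and an auxiliary one. The Python below is a toy illustration under stated assumptions: the `link_records` function, the field names, and the sample rows are all invented for the example, and real linkage attacks typically use fuzzier matching than exact equality.

```python
def link_records(deidentified, auxiliary, keys=("birth_year", "zip3", "sex")):
    """Match de-identified rows to named auxiliary rows on shared attributes.

    Returns a list of (deidentified_row, name) pairs for rows whose key
    values match exactly one auxiliary row.
    """
    # Index the auxiliary data by its quasi-identifier values.
    index = {}
    for row in auxiliary:
        index.setdefault(tuple(row[k] for k in keys), []).append(row["name"])

    matches = []
    for row in deidentified:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the row
            matches.append((row, candidates[0]))
    return matches


released = [
    {"birth_year": "1980", "zip3": "303", "sex": "F", "diagnosis": "asthma"},
    {"birth_year": "1975", "zip3": "100", "sex": "M", "diagnosis": "diabetes"},
]
public = [
    {"name": "Alice Example", "birth_year": "1980", "zip3": "303", "sex": "F"},
    {"name": "Bob Example", "birth_year": "1975", "zip3": "100", "sex": "M"},
    {"name": "Carl Example", "birth_year": "1975", "zip3": "100", "sex": "M"},
]
for row, name in link_records(released, public):
    print(name, "->", row["diagnosis"])
# Alice Example -> asthma
```

The second released row is not linked because two people in the auxiliary data share its attributes; the fewer people who share a combination of attributes, the easier a record is to re-identify.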
Privacy in education refers to the broad area of ideologies, practices, and legislation that involve the privacy rights of individuals in the education system. Concepts that are commonly associated with privacy in education include the expectation of privacy, the Family Educational Rights and Privacy Act (FERPA), the Fourth Amendment, and the Health Insurance Portability and Accountability Act of 1996 (HIPAA). Most privacy in education concerns relate to the protection of student data and the privacy of medical records. Many scholars are engaging in an academic discussion that covers the scope of students’ privacy rights, from K-12 students to those in higher education, and the management of student data in an age of rapid access and dissemination of information.
DNA encryption is the process of hiding or obscuring genetic information by a computational method in order to improve genetic privacy in DNA sequencing processes. The human genome is complex and long, but it is entirely possible to extract important, identifying information from small variations rather than by reading the entire genome. A whole human genome is a string of 3.2 billion base-paired nucleotides, the building blocks of life, but between individuals the genetic variation differs only by 0.5%, an important 0.5% that accounts for all of human diversity, the pathology of different diseases, and ancestral story. Emerging strategies incorporate different methods, such as randomization algorithms and cryptographic approaches, to de-identify the genetic sequence from the individual and, fundamentally, to isolate only the necessary information while protecting the rest of the genome from unnecessary inquiry. The priority now is to ascertain which methods are robust, and how policy should ensure the ongoing protection of genetic privacy.
Health data is any data "related to health conditions, reproductive outcomes, causes of death, and quality of life" for an individual or population. Health data includes clinical metrics along with environmental, socioeconomic, and behavioral information pertinent to health and wellness. A plurality of health data are collected and used when individuals interact with health care systems. This data, collected by health care providers, typically includes a record of services received, conditions of those services, and clinical outcomes or information concerning those services. Historically, most health data has been sourced from this framework. The advent of eHealth and advances in health information technology, however, have expanded the collection and use of health data—but have also engendered new security, privacy, and ethical concerns. The increasing collection and use of health data by patients is a major component of digital health.
Federal and state governments, insurance companies and other large medical institutions are heavily promoting the adoption of electronic health records. The US Congress included a formula of both incentives and penalties for EMR/EHR adoption versus continued use of paper records as part of the Health Information Technology for Economic and Clinical Health (HITECH) Act, enacted as part of the American Recovery and Reinvestment Act of 2009.
Project Nightingale is a data storage and processing project by Google Cloud and Ascension, a Catholic health care system comprising a chain of 2,600 hospitals, doctors' offices and other related facilities, in 21 states, with tens of millions of patient records available for processing health care data. Ascension is one of the largest health-care systems in the United States with comprehensive and specific health care information of millions who are part of its system. The project is Google's attempt to gain a foothold into the healthcare industry on a large scale. Amazon, Microsoft and Apple Inc. are also actively advancing into health care, but none of their business arrangements are equal in scope to Project Nightingale.