Latanya Sweeney

Known for: k-anonymity
Institutions: Harvard University, Carnegie Mellon University
Thesis: Computational Disclosure Control: Theory and Practice (2001)
Website: latanyasweeney.org

Latanya Arvette Sweeney is an American computer scientist. She is the Daniel Paul Professor of the Practice of Government and Technology at the Harvard Kennedy School and in the Harvard Faculty of Arts and Sciences at Harvard University. [1] She is the founder and director of both the Data Privacy Lab and the Public Interest Tech Lab, the latter founded in 2021 with a $3 million grant from the Ford Foundation. [2] [3] She is the current Faculty Dean in Currier House at Harvard. [4] [5]

Sweeney is the former Chief Technologist of the Federal Trade Commission and Editor-in-Chief of Technology Science. [6] [7] [8] Her best known academic work is on the theory of k-anonymity, and she is credited with the observation that "87% of the U.S. population is uniquely identified by date of birth, gender, postal code". [9]

Education

Sweeney graduated from Dana Hall Schools in Wellesley, Massachusetts, receiving her high school diploma in 1977. She delivered the valedictory at the graduation ceremony. [10]

She began her undergraduate work in computer science at MIT, but left to found a company. She completed her undergraduate degree in computer science at the Harvard University Extension School. [11] In 2001, Sweeney was awarded a Ph.D. in computer science from the Massachusetts Institute of Technology, becoming the first African American woman to do so. [12] Of her decision to return to school, she has said, "I would do something that was really quite noteworthy, but there was nowhere to publish about it. You could get paid for it, but there was no way to say, 'You won’t believe what I just did!' The only way to get it was to go back to school." [13]

Career

In 2001, Sweeney founded the Data Privacy Lab at Carnegie Mellon University. In 2004, she founded the Journal of Privacy Technology, becoming its editor-in-chief in 2006, and in 2005 she served on the Program Committee for Modeling Decisions for Artificial Intelligence. [14] She is currently one of the faculty deans of Harvard's Currier House. [15]

Health Insurance Portability and Accountability Act (HIPAA)

In 1997, Sweeney conducted her first re-identification experiment, in which she linked the then-Governor of Massachusetts, Bill Weld, to his medical records using publicly accessible records. [16] [17] Her results had a significant impact on privacy-centered policymaking, including the health privacy legislation known as the Health Insurance Portability and Accountability Act (HIPAA). However, publication of the experiment was rejected twenty times, and the re-identification experiments she conducted afterward met serious publication challenges as well. In one case, a court ruling in Southern Illinoisan v. Department of Public Health barred her from publishing or sharing her methods for a successful re-identification experiment. [18]

In her landmark article “Only You, Your Doctor, and Many Others May Know,” [19] Sweeney described a research project in which she was able to match identities to personal health records through a number of methods, including the use of public health records from hospitals and newspaper stories. Towards the end of the article, she describes the approaches she used to analyze and match the data, whether by computer program or by human effort. She concludes that new and improved methods of data sharing are necessary. [20]

Medical dataset de-anonymization

In 1998, Sweeney published a now-famous example of data de-anonymization, demonstrating that a medical dataset in the public domain could be used to identify individuals, despite the removal of all explicit identifiers, when it was combined with a public voter list. Sweeney found that 87% of the U.S. population in a census dataset could be identified by combining data attributes known as quasi-identifiers. [21]
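
The kind of linkage described above can be illustrated with a minimal sketch. It is not Sweeney's actual method or data: the records, field names, and the helper functions `qi_key`, `unique_fraction`, and `link` are all hypothetical, and the quasi-identifiers are assumed to be date of birth, gender, and postal code, as in the quotation in the lead.

```python
from collections import Counter

# Hypothetical toy data: every name and value below is invented for illustration.
QUASI_IDENTIFIERS = ("birth_date", "gender", "zip_code")

# A "de-identified" medical dataset: explicit identifiers such as names are removed.
medical_records = [
    {"birth_date": "1950-01-15", "gender": "M", "zip_code": "02139", "diagnosis": "A"},
    {"birth_date": "1962-07-04", "gender": "F", "zip_code": "02139", "diagnosis": "B"},
    {"birth_date": "1962-07-04", "gender": "F", "zip_code": "02139", "diagnosis": "C"},
]

# A public voter list that still carries names alongside the same attributes.
voter_list = [
    {"name": "Voter One", "birth_date": "1950-01-15", "gender": "M", "zip_code": "02139"},
    {"name": "Voter Two", "birth_date": "1962-07-04", "gender": "F", "zip_code": "02139"},
    {"name": "Voter Three", "birth_date": "1971-11-30", "gender": "F", "zip_code": "02140"},
]


def qi_key(record):
    """Return the record's quasi-identifier values as a hashable tuple."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def unique_fraction(records):
    """Fraction of records whose quasi-identifier combination occurs exactly once."""
    counts = Counter(qi_key(r) for r in records)
    return sum(1 for r in records if counts[qi_key(r)] == 1) / len(records)


def link(medical, voters):
    """Pair a name with a diagnosis when the quasi-identifier combination
    is unique in both datasets, so the match is unambiguous."""
    medical_counts = Counter(qi_key(r) for r in medical)
    voter_counts = Counter(qi_key(v) for v in voters)
    names = {qi_key(v): v["name"] for v in voters}
    return [
        (names[qi_key(r)], r["diagnosis"])
        for r in medical
        if medical_counts[qi_key(r)] == 1 and voter_counts[qi_key(r)] == 1
    ]


reidentified = link(medical_records, voter_list)
print(reidentified)                      # [('Voter One', 'A')]
print(unique_fraction(medical_records))  # one of three records is unique
```

In this toy data, the two medical records that share the same quasi-identifier values cannot be told apart, so only the remaining record is linked to a name. Requiring every combination to be shared by at least k records is the intuition behind k-anonymity.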

Data Privacy Lab at Harvard

Since 2011, Sweeney's Data Privacy Lab at Harvard has been conducting research on data privacy. It aims to provide a cross-disciplinary perspective on privacy in the dissemination of data. [22]

In 2021, Sweeney launched the Public Interest Technology Lab at the Harvard Kennedy School, housed in the Shorenstein Center on Media, Politics and Public Policy. [23] [24]

References

  1. "Latanya Sweeney". Harvard Kennedy School Faculty Profile. Retrieved 2021-06-21.
  2. "People". Public Interest Tech Lab. Retrieved 2021-07-21.
  3. "Humanizing Technology". Harvard Gazette. 23 June 2021. Retrieved 2021-07-22.
  4. "Latanya Sweeney". Harvard Business School Digital Initiative. Retrieved 2019-11-12.
  5. "Data Privacy Lab". dataprivacylab.org. Retrieved 2022-06-24.
  6. "FTC Names Latanya Sweeney as Chief Technologist; Andrea Matwyshyn as Policy Advisor". Federal Trade Commission. November 18, 2013. Retrieved August 6, 2014.
  7. "Hello world! | Federal Trade Commission". www.ftc.gov. Retrieved 2016-03-22.
  8. "About Us | Technology Science". techscience.org. Retrieved 2022-06-24.
  9. L. Sweeney. "Simple Demographics Often Identify People Uniquely (Data Privacy Working Paper 3) Pittsburgh 2000" (PDF). Carnegie Mellon University. Retrieved 18 January 2014.
  10. "Dr. Latanya Sweeney, Curriculum Vitae". latanyasweeney.org. Retrieved 2017-10-24.
  11. "Biographical sketches of Latanya Sweeney". Data Privacy Lab, Harvard University. Retrieved 18 January 2014.
  12. "New faculty deans appointed". Harvard Gazette. 9 May 2016. Retrieved 2016-06-06.
  13. "Tech Pioneer Latanya Sweeney's Advice to her Younger Self". Ms. Magazine. 2019-09-04. Retrieved 22 July 2021.
  14. "Biographical sketches of Latanya Sweeney, Ph.D." latanyasweeney.org. Retrieved 2017-10-24.
  15. "New faculty deans appointed". Harvard Gazette. 9 May 2016. Retrieved 6 June 2016.
  16. "Law, Ethics & Science of Re-identification Demonstrations". Petrie Flom Center at Harvard University. Retrieved 12 June 2021.
  17. Paul Ohm (13 August 2009). "Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization". UCLA Law Review. ssrn.com. 57: 1701.
  18. "No. 5-02-0836, The Southern Illinoisan v. Department of Public Health". www.illinoiscourts.gov. Retrieved 2022-06-24.
  19. Sweeney, Latanya. "Only You, Your Doctor, and Many Others May Know". Technology Science via techscience.org.
  20. Sweeney, Latanya (2015-09-29). "Only You, Your Doctor, and Many Others May Know". Technology Science.
  21. Tianqing Zhu; Gang Li; Wanlei Zhou; Philip S. Yu (2017). Differential Privacy and Applications. Springer International Publishing. p. 2. ISBN 9783319620046.
  22. Caroline Perry (18 October 2011). "You're not so anonymous". Harvard Gazette.
  23. "New Public Interest Tech Lab". Shorenstein Center. 23 June 2021. Retrieved 24 November 2021.
  24. "Public Interest Tech Lab". Techlab. Retrieved 24 June 2022.