Local differential privacy

Local differential privacy (LDP) is a model of differential privacy with the added requirement that even if an adversary has access to the personal responses of an individual in the database, that adversary will still be unable to learn much about the user's personal data. This contrasts with global differential privacy, a model of differential privacy that incorporates a central aggregator with access to the raw data. [1]

Local differential privacy (LDP) is an approach that mitigates the risk that data fusion and analysis techniques expose individuals to attacks and disclosures. LDP is a well-known privacy model for distributed architectures that aims to provide privacy guarantees for each user while data are collected and analyzed, protecting both the client and the server from privacy leaks. [2] LDP has been widely adopted to alleviate contemporary privacy concerns in the era of big data. [3]

History

In 2003, Alexandre V. Evfimievski, Johannes Gehrke, and Ramakrishnan Srikant [4] gave a definition equivalent to local differential privacy. In 2008, Kasiviswanathan et al. [5] gave a formal definition conforming to the now-standard definition of differential privacy.

The prototypical example of a mechanism with local differential privacy is the randomized response survey technique proposed by Stanley L. Warner in 1965. [6] Warner's innovation was the introduction of the “untrusted curator” model, where the entity collecting the data may not be trustworthy. Before users' responses are sent to the curator, the answers are randomized in a controlled manner, guaranteeing differential privacy while still allowing valid population-wide statistical inferences.
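As a minimal sketch of the idea (in Python, using the common simplified yes/no form of randomized response rather than Warner's original spinner design; the probability 0.75 and the function names are illustrative assumptions), each respondent reports the truth with probability p and the opposite answer otherwise, and the curator debiases the aggregate:

```python
import random

def randomized_response(true_answer: bool, p: float = 0.75) -> bool:
    """Report the true yes/no answer with probability p, the opposite otherwise.

    With p = 0.75 the probability of any report changes by at most a factor of
    p / (1 - p) = 3 when the true answer changes, i.e. ln(3)-local differential privacy.
    """
    return true_answer if random.random() < p else not true_answer

def estimate_true_fraction(reports: list, p: float = 0.75) -> float:
    """Debias the observed 'yes' rate: E[observed] = p*f + (1-p)*(1-f), solved for f."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)

# Example: 10,000 respondents, 30% of whom truly hold the sensitive attribute.
truths = [random.random() < 0.3 for _ in range(10_000)]
reports = [randomized_response(t) for t in truths]
print(round(estimate_true_fraction(reports), 2))  # close to 0.30
```

No single report reveals much about its sender, yet the population-level estimate remains accurate once the known noise rate is inverted out.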

Applications

The era of big data exhibits a high demand for machine learning services that provide privacy protection for users. Demand for such services has pushed research into algorithmic paradigms that provably satisfy specific privacy requirements.

Anomaly Detection

Anomaly detection is formally defined as the process of identifying unexpected items or events in data sets. The rise of social networking has led to many potential concerns related to information privacy: as more and more users rely on social networks, they are often threatened by privacy breaches, unauthorized access to personal information, and leakage of sensitive data. To address this issue, the authors of "Anomaly Detection over Differential Preserved Privacy in Online Social Networks" propose a privacy-preserving model that sanitizes the collection of user information from a social network using restricted local differential privacy (LDP), storing synthetic copies of the collected data. The model uses the reconstructed data to classify user activity and detect abnormal network behavior. The experimental results demonstrate that the proposed method achieves high data utility alongside improved privacy preservation, and that LDP-sanitized data remain suitable for subsequent analyses such as anomaly detection: detection on the reconstructed data achieves an accuracy similar to that on the original data. [7]

Blockchain Technology

Potential combinations of blockchain technology with local differential privacy have received research attention. Blockchains implement distributed, secured, and shared ledgers used to record and track data within a decentralized network, and they have successfully replaced certain prior systems of economic transactions within and between organizations. Increased usage of blockchains has raised some questions regarding privacy and security of data they store, and local differential privacy of various kinds has been proposed as a desirable property for blockchains containing sensitive data. [8]

Context-Free Privacy

Local differential privacy provides context-free privacy even in the absence of a trusted data collector, though often at the expense of a significant drop in utility. The classical definition of LDP assumes that all elements in the data domain are equally sensitive. However, in many applications, some symbols are more sensitive than others. A context-aware framework of local differential privacy [9] can allow a privacy designer to incorporate the application’s context into the privacy definition. For binary data domains, algorithmic research has provided a universally optimal privatization scheme and highlighted its connections to Warner’s randomized response [10] (RR) and Mangat’s improved response. For k-ary data domains, motivated by geolocation and web search applications, researchers have considered at least two special cases of context-aware LDP: block-structured LDP and high-low LDP (the latter is also defined in [11] ). The research has provided communication-efficient, sample-optimal schemes and information theoretic lower bounds for both models.
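The following toy sketch (Python; the block partition, ε value, and function name are illustrative assumptions, not the optimal constructions from the cited papers) conveys the block-structured idea: the report may reveal which block the true value belongs to, but values within the same block remain ε-indistinguishable.

```python
import math
import random

def block_structured_report(value: str, blocks: list, eps: float) -> str:
    """Toy block-structured local randomizer: apply k-ary randomized response
    only within the block containing the true value, so any two values in the
    same block are eps-indistinguishable while the block itself may be revealed."""
    block = next(b for b in blocks if value in b)
    k = len(block)
    keep_prob = math.exp(eps) / (math.exp(eps) + k - 1)
    if k == 1 or random.random() < keep_prob:
        return value
    return random.choice([v for v in block if v != value])

# Example: coarse location reporting -- the city can be inferred, the district cannot.
blocks = [["downtown", "harbor", "old town"], ["airport", "suburb-east", "suburb-west"]]
print(block_structured_report("harbor", blocks, eps=1.0))
```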

Facial Recognition

Facial recognition, though convenient, can potentially lead to a leak of biometric features that identify the user.

Facial recognition has become more and more widespread in recent years. Recent smartphones, for example, use facial recognition to unlock the user's phone and to authorize payments with a credit card. Though convenient, this poses privacy concerns: facial recognition is a resource-intensive task that often involves third parties, creating a gap in which the user's privacy can be compromised. Biometric information delivered to untrusted third-party servers in an uncontrolled manner constitutes a significant privacy leak, since biometrics can be correlated with sensitive data such as healthcare or financial records. Chamikara and colleagues propose a privacy-preserving technique for “controlled information release” that disguises an original face image and prevents leakage of the biometric features while still identifying a person. They introduce a privacy-preserving face recognition protocol named PEEP (Privacy using Eigenface Perturbation) that utilizes local differential privacy: PEEP applies perturbation to eigenfaces and stores only the perturbed data on third-party servers, which run a standard eigenface recognition algorithm. As a result, the trained model is not vulnerable to privacy attacks such as membership inference and model memorization attacks. [12] This model offers a potential solution to such privacy leaks.
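A rough sketch of the underlying idea (Python/NumPy; the clipping bound, noise scale, and function names here are assumptions for illustration, and the published PEEP protocol's exact sensitivity analysis and parameters differ) is to project a face image onto a precomputed eigenface basis and add calibrated Laplace noise before anything leaves the device:

```python
import numpy as np

def perturb_eigenface_projection(face_vec, eigenfaces, mean_face, eps, clip=1.0):
    """Illustrative eigenface perturbation: project, clip, add Laplace noise.

    Only the noisy coefficient vector would be sent to the third-party server,
    which then runs its eigenface matching on perturbed data.
    """
    coeffs = eigenfaces @ (face_vec - mean_face)      # project onto eigenface basis
    coeffs = np.clip(coeffs, -clip, clip)             # bound each coefficient
    sensitivity = 2.0 * clip * len(coeffs)            # worst-case L1 change between two faces
    noise = np.random.laplace(scale=sensitivity / eps, size=coeffs.shape)
    return coeffs + noise
```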

Federated Learning (FL)

Federated learning coupled with local differential privacy has been found to be effective in facilitating crowdsourcing applications while protecting users' privacy.

Federated learning aims to protect data privacy through distributed learning methods that keep the data in its original storage. Likewise, differential privacy (DP) aims to improve the protection of data privacy by measuring the privacy loss in the communication among the elements of federated learning. The prospective match between federated learning and differential privacy for data privacy protection has prompted the release of several software tools that support their functionalities, but these tools lack a unified vision of the techniques and a methodological workflow that supports their usage. In a study sponsored by the Andalusian Research Institute in Data Science and Computational Intelligence, researchers developed Sherpa.ai FL, an open-research unified FL and DP framework that aims to foster the research and development of AI services at the edge while preserving data privacy. The characteristics of FL and DP tested and summarized in the study suggest that they are good candidates for supporting AI services at the edge, and the study finds that setting the privacy parameter ε to lower values guarantees higher privacy at the cost of lower accuracy. [13]
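A generic client-side step illustrating how local noise enters such a pipeline (Python/NumPy; a sketch under assumed L1 clipping and Laplace noise, not the Sherpa.ai FL API) is to bound each local model update and randomize it before it is sent to the aggregator:

```python
import numpy as np

def privatize_update(local_update, eps, clip_norm=1.0):
    """Clip a local model update to a fixed L1 norm and add Laplace noise,
    so the report about any one client's data is eps-locally differentially private."""
    l1 = np.abs(local_update).sum()
    if l1 > clip_norm:
        local_update = local_update * (clip_norm / l1)
    # Any two possible clipped updates differ by at most 2 * clip_norm in L1 norm.
    noise = np.random.laplace(scale=2.0 * clip_norm / eps, size=local_update.shape)
    return local_update + noise
```

Averaging many such privatized updates on the server lets much of the noise cancel, which is why lower values of ε (more noise per client) trade accuracy for stronger privacy, as the study observes.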

Health Data Aggregation

The rise of technology has not only changed the way we work and live; the health industry has also changed markedly as a result of the big data era. With the rapid growth of health data, the limited storage and computation resources of wireless body area sensor networks are becoming a barrier to the development of the health industry. To address this, outsourcing encrypted health data to the cloud has been an appealing strategy. However, it has potential downsides: data aggregation becomes more difficult, and patients' sensitive information becomes more vulnerable to data breaches. In the article "Privacy-Enhanced and Multifunctional Health Data Aggregation under Differential Privacy Guarantees," Hao Ren and his team propose a privacy-enhanced and multifunctional health data aggregation scheme (PMHA-DP) under differential privacy. The aggregation function is designed to protect the aggregated data from cloud servers. The performance evaluation in their study shows that the proposal incurs less communication overhead than existing data aggregation models. [14]
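As a simplified illustration of noisy aggregation (Python/NumPy; this generic local-noise sketch is not the PMHA-DP scheme, which additionally aggregates under encryption, and the bounds and ε used here are assumptions), each device can clamp and perturb its reading so that only noisy values reach the aggregator:

```python
import numpy as np

def report_reading(reading, lower=40.0, upper=180.0, eps=1.0):
    """Clamp one health reading (e.g. heart rate) to a plausible range and add
    Laplace noise scaled to that range, giving an eps-locally private report."""
    clamped = min(max(reading, lower), upper)
    sensitivity = upper - lower        # the most any single reading can differ
    return clamped + np.random.laplace(scale=sensitivity / eps)

# The mean of many noisy reports concentrates around the true mean as the
# cohort grows, because the independent noise terms average out.
readings = np.random.normal(72, 8, size=50_000)              # simulated heart rates
noisy = [report_reading(r) for r in readings]
print(round(float(np.mean(noisy)), 1))                        # close to 72
```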

Internet Connected Vehicles

Having internet access in one's car would have been only a dream in the last century, but most new vehicles now include this feature for the convenience of their users. Though convenient, it poses yet another threat to users' privacy. The internet of connected vehicles (IoV) is expected to enable intelligent traffic management, intelligent dynamic information services, intelligent vehicle control, and more. However, vehicles' data privacy is argued to be a major barrier to the application and development of IoV, and has therefore attracted wide attention. Local differential privacy (LDP) is a relaxed version of the privacy standard differential privacy, and it can protect users' data privacy against an untrusted third party in the worst-case adversarial setting. The computational cost of using LDP is one concern among researchers, as it is expensive to implement in a setting that requires high mobility and short connection times. [15] Furthermore, as the number of vehicles increases, the frequent communication between vehicles and the cloud server incurs unexpected amounts of communication cost. To avoid the privacy threat and reduce the communication cost, researchers propose integrating federated learning and local differential privacy (LDP) to facilitate crowdsourcing applications while training the machine learning model. [16]

Phone Blacklisting

LDP-based systems have been shown to counter the ever-growing volume of spam calls while protecting users' privacy.

The topic of spam phone calls has become increasingly relevant, and though spam calls have been a growing nuisance in the digital world, researchers have been looking at potential solutions to minimize the issue. To counter this increasingly successful attack vector, federal agencies such as the US Federal Trade Commission (FTC) have been working with telephone carriers to design systems for blocking robocalls. Furthermore, a number of commercial and smartphone apps that promise to block spam phone calls have been created, but they come with a subtle cost: the private information a user surrenders by granting an app access to block spam calls may be leaked without the user's consent or knowledge. In one study, [17] the researchers analyze the challenges and trade-offs related to using local differential privacy, evaluate an LDP-based system on real-world user-reported call records collected by the FTC, and show that it is possible to learn a phone blacklist using a reasonable overall privacy budget while preserving users' privacy and maintaining utility for the learned blacklist.

Trajectory Cross-Correlation Constraint

Aiming to solve the problems of low data utilization and weak privacy protection, researcher Hu proposes a personalized differential privacy protection method based on cross-correlation constraints. By protecting sensitive location points on the trajectory, this extended differential privacy protection model combines the sensitivity of the user's trajectory locations with user privacy protection requirements and the privacy budget. Using an autocorrelation Laplace transform, specific white noise is transformed into noise that is correlated with the user's real trajectory sequence in both time and space; this noise is then used to derive the cross-correlation constraint mechanism of the trajectory sequence in the model. The method addresses the problem that adding independent, uncorrelated noise with the same degree of scrambling at every point results in low privacy protection and poor data availability. [18]
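For illustration only (Python/NumPy; the cited method derives its noise from the trajectory's own autocorrelation, whereas this sketch simply filters i.i.d. Laplace noise with an assumed AR(1) coefficient), temporally correlated noise can be contrasted with the independent per-point noise the paper argues against:

```python
import numpy as np

def correlated_laplace_noise(n, scale, rho=0.8):
    """Generate noise whose successive samples are correlated in time by
    passing i.i.d. Laplace noise through a first-order autoregressive filter."""
    white = np.random.laplace(scale=scale, size=n)
    noise = np.empty(n)
    noise[0] = white[0]
    for t in range(1, n):
        noise[t] = rho * noise[t - 1] + white[t]
    return noise

# Perturbing a latitude series with smoothly varying noise rather than
# independent noise at every point.
trajectory_lat = np.linspace(40.00, 40.10, 50)
noisy_lat = trajectory_lat + correlated_laplace_noise(50, scale=0.01)
```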

ε-local differential privacy

Definition of ε-local differential privacy

Let ε be a positive real number and 𝒜 be a randomized algorithm that takes a user's private data as input. Let im(𝒜) denote the image of 𝒜. The algorithm 𝒜 is said to provide ε-local differential privacy if, for all pairs of users' possible private data x and x′ and all subsets S of im(𝒜):

Pr[𝒜(x) ∈ S] ≤ exp(ε) · Pr[𝒜(x′) ∈ S],

where the probability is taken over the randomness used by the algorithm.
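As a worked instance of the definition, consider the simplified randomized-response mechanism from the History section with an assumed truth-telling probability of 3/4. For any two inputs x, x′ and any output s, the report probability is either 3/4 or 1/4, so

```latex
\frac{\Pr[\mathcal{A}(x) = s]}{\Pr[\mathcal{A}(x') = s]} \;\le\; \frac{3/4}{1/4} \;=\; 3 \;=\; e^{\ln 3},
```

meaning the mechanism satisfies ε-local differential privacy with ε = ln 3 ≈ 1.1; more generally, a truth-telling probability p yields ε = ln(p / (1 − p)).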

The main difference between this definition of local differential privacy and the definition of standard (global) differential privacy is that in standard differential privacy the probabilities are over the outputs of an algorithm that takes all users' data, whereas here they are over the outputs of an algorithm that takes a single user's data.

Other formal definitions of local differential privacy consider algorithms that take all users' data as input and output a collection of all responses (such as the definition in Raef Bassily, Kobbi Nissim, Uri Stemmer and Abhradeep Guha Thakurta's 2017 paper [19] ).

Deployment

Algorithms guaranteeing local differential privacy have been deployed by several internet companies, including:

- Google, whose RAPPOR system collects usage statistics from the Chrome browser under local differential privacy [20]
- Apple, whose "Learning with Privacy at Scale" describes locally differentially private data collection on iOS and macOS devices [21]

References

  1. "Local vs. global differential privacy – Ted is writing things". desfontain.es. Retrieved 2020-02-10.
  2. Joseph, Matthew; Roth, Aaron; Ullman, Jonathan; Waggoner, Bo (2018-11-19). "Local Differential Privacy for Evolving Data". arXiv: 1802.07128 [cs.LG].
  3. Wang, Teng; Zhang, Xuefeng; Feng, Jingyu; Yang, Xinyu (2020-12-08). "A Comprehensive Survey on Local Differential Privacy toward Data Statistics and Analysis". Sensors (Basel, Switzerland). 20 (24): 7030. arXiv: 2010.05253 . Bibcode:2020Senso..20.7030W. doi: 10.3390/s20247030 . ISSN   1424-8220. PMC   7763193 . PMID   33302517.
  4. Evfimievski, Alexandre V.; Gehrke, Johannes; Srikant, Ramakrishnan (June 9–12, 2003). "Limiting privacy breaches in privacy preserving data mining". Proceedings of the Twenty-Second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. pp. 211–222. doi:10.1145/773153.773174. ISBN   1-58113-670-6. S2CID   2379506.
  5. Kasiviswanathan, Shiva Prasad; Lee, Homin K.; Nissim, Kobbi; Raskhodnikova, Sofya; Smith, Adam D. (2008). "What Can We Learn Privately?". 2008 49th Annual IEEE Symposium on Foundations of Computer Science. pp. 531–540. arXiv: 0803.0924 . doi:10.1109/FOCS.2008.27. ISBN   978-0-7695-3436-7.
  6. Warner, Stanley L. (1965). "Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias". Journal of the American Statistical Association. 60 (309): 63–69. doi:10.1080/01621459.1965.10480775. PMID   12261830. S2CID   35435339.
  7. Aljably, Randa; Tian, Yuan; Al-Rodhaan, Mznah; Al-Dhelaan, Abdullah (2019-04-25). "Anomaly detection over differential preserved privacy in online social networks". PLOS ONE. 14 (4): e0215856. Bibcode:2019PLoSO..1415856A. doi: 10.1371/journal.pone.0215856 . ISSN   1932-6203. PMC   6483223 . PMID   31022238.
  8. Ul Hassan, Muneeb; Rehmani, Mubashir Husain; Chen, Jinjun (2020-11-01). "Differential privacy in blockchain technology: A futuristic approach". Journal of Parallel and Distributed Computing. 145: 50–74. arXiv: 1910.04316 . doi:10.1016/j.jpdc.2020.06.003. ISSN   0743-7315. S2CID   204008404.
  9. Acharya, Jayadev; Bonawitz, Kallista; Kairouz, Peter; Ramage, Daniel; Sun, Ziteng (2020-11-21). "Context Aware Local Differential Privacy". International Conference on Machine Learning. PMLR: 52–62. arXiv: 1911.00038 .
  10. Kim, Jong-Min; Warde, William D. (2004-02-15). "A stratified Warner's randomized response model". Journal of Statistical Planning and Inference. 120 (1–2): 155–165. doi:10.1016/S0378-3758(02)00500-1. ISSN   0378-3758.
  11. Murakami, Takao; Kawamoto, Yusuke (2019). "Utility-Optimized Local Differential Privacy Mechanisms for Distribution Estimation" (PDF). Proceedings of the 28th USENIX Security Symposium: 1877–1894. arXiv: 1807.11317 .
  12. Chamikara, M.A.P.; Bertok, P.; Khalil, I.; Liu, D.; Camtepe, S. (2020-10-01). "Privacy Preserving Face Recognition Utilizing Differential Privacy". Computers & Security. 97: 101951. arXiv: 2005.10486 . doi:10.1016/j.cose.2020.101951. ISSN   0167-4048. S2CID   218763393.
  13. Rodríguez-Barroso, Nuria; Stipcich, Goran; Jiménez-López, Daniel; Antonio Ruiz-Millán, José; Martínez-Cámara, Eugenio; González-Seco, Gerardo; Luzón, M. Victoria; Veganzones, Miguel Ángel; Herrera, Francisco (2020). "Federated Learning and Differential Privacy: Software Tools Analysis, the Sherpa.ai FL Framework and Methodological Guidelines for Preserving Data Privacy". Information Fusion. 64: 270–92. arXiv: 2007.00914 . doi:10.1016/j.inffus.2020.07.009. S2CID   220302072.
  14. Ren, Hao; Li, Hongwei; Liang, Xiaohui; He, Shibo; Dai, Yuanshun; Zhao, Lian (2016-09-10). "Privacy-Enhanced and Multifunctional Health Data Aggregation under Differential Privacy Guarantees". Sensors (Basel, Switzerland). 16 (9): 1463. Bibcode:2016Senso..16.1463R. doi: 10.3390/s16091463 . ISSN   1424-8220. PMC   5038741 . PMID   27626417.
  15. Zhao, Ping; Zhang, Guanglin; Wan, Shaohua; Liu, Gaoyang; Umer, Tariq (2020-11-01). "A survey of local differential privacy for securing internet of vehicles". The Journal of Supercomputing. 76 (11): 8391–8412. doi:10.1007/s11227-019-03104-0. S2CID   208869853.
  16. Zhao, Yang; Zhao, Jun; Yang, Mengmeng; Wang, Teng; Wang, Ning; Lyu, Lingjuan; Niyato, Dusit; Lam, Kwok-Yan (2020-11-10). "Local Differential Privacy based Federated Learning for Internet of Things". IEEE Internet of Things Journal. PP (11): 8836–8853. arXiv: 2004.08856 . doi:10.1109/JIOT.2020.3037194. hdl:10356/147888. S2CID   215828540.
  17. Ucci, Daniele; Perdisci, Roberto; Lee, Jaewoo; Ahamad, Mustaque (2020-06-01). "Towards a Practical Differentially Private Collaborative Phone Blacklisting System". Annual Computer Security Applications Conference. pp. 100–115. arXiv: 2006.09287 . doi:10.1145/3427228.3427239. ISBN   978-1-4503-8858-0. S2CID   227911367.
  18. Hu, Zhaowei; Yang, Jing (2020-08-12). "Differential privacy protection method based on published trajectory cross-correlation constraint". PLOS ONE. 15 (8): e0237158. Bibcode:2020PLoSO..1537158H. doi: 10.1371/journal.pone.0237158 . ISSN   1932-6203. PMC   7423147 . PMID   32785242.
  19. Bassily, Raef; Nissim, Kobbi; Stemmer, Uri; Thakurta, Abhradeep Guha (2017). "Practical Locally Private Heavy Hitters". Advances in Neural Information Processing Systems. Vol. 30. pp. 2288–2296. arXiv: 1707.04982 . Bibcode:2017arXiv170704982B.
  20. Erlingsson, Úlfar; Pihur, Vasyl; Korolova, Aleksandra (2014). "RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response". Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. arXiv: 1407.6981 . Bibcode:2014arXiv1407.6981E. doi:10.1145/2660267.2660348. S2CID   6855746.
  21. "Learning with Privacy at Scale". Apple Machine Learning Journal. 2017.