
Differential privacy

Differential privacy (DP) is an approach for providing privacy while sharing information about a group of individuals, by describing the patterns within the group while withholding information about specific individuals.[1][2] This is done by making small, arbitrary changes to individual data that do not change the statistics of interest. Thus the data cannot be used to infer much about any individual.

Another way to describe differential privacy is as a constraint on the algorithms used to publish aggregate information about a statistical database which limits the disclosure of private information of records in the database. For example, differentially private algorithms are used by some government agencies to publish demographic information or other statistical aggregates while ensuring confidentiality of survey responses, and by companies to collect information about user behavior while controlling what is visible even to internal analysts.

Roughly, an algorithm is differentially private if an observer seeing its output cannot tell whether a particular individual's information was used in the computation. Differential privacy is often discussed in the context of identifying individuals whose information may be in a database. Although it does not directly refer to identification and reidentification attacks, differentially private algorithms provably resist such attacks.[3]

Differential privacy was developed by cryptographers and thus is often associated with cryptography, and draws much of its language from cryptography.

History

Historical background

Official statistics organizations are charged with collecting information from individuals or establishments, and publishing aggregate data to serve the public interest. For example, the 1790 United States Census collected information about individuals living in the United States and published tabulations based on sex, age, race, and condition of servitude.[4] Census records were originally posted, but starting with the 1840 Census they were collected under a promise of confidentiality: the information provided would be used for statistical purposes, and the publications would not produce information that could be traced back to a specific individual or establishment.

To accomplish the goal of confidentiality, statistical organizations have long suppressed information in their publications. For example, in a table presenting the sales of each business in a town grouped by business category, a cell that has information from only one company might be suppressed, in order to maintain the confidentiality of that company's specific sales.

The adoption of electronic information processing systems by statistical agencies in the 1950s and 1960s dramatically increased the number of tables that a statistical organization could produce and, in so doing, significantly increased the potential for an improper disclosure of confidential information. For example, if a business that had its sales numbers suppressed also had those numbers appear in the total sales of a region, then it might be possible to determine the suppressed value by subtracting the other sales from that total. But there might also be combinations of additions and subtractions that might cause the private information to be revealed. The number of combinations that needed to be checked increases exponentially with the number of publications, and it is potentially unbounded if data users are able to make queries of the statistical database using an interactive query system.

Early research leading to differential privacy

In 1977, Tore Dalenius, a Swedish statistician, formalized the mathematics of cell suppression.[5] His 1977 paper also articulated a key principle for statistical databases: a database should not reveal information about an individual that is not otherwise accessible.[6]

In 1979, Dorothy Denning, Peter J. Denning and Mayer D. Schwartz formalized the concept of a Tracker, an adversary that could learn the confidential contents of a statistical database by creating a series of targeted queries and remembering the results.[7] This and subsequent research showed that privacy properties in a database could only be preserved by considering each new query in light of (possibly all) previous queries. This line of work is sometimes called query privacy, with the final result being that tracking the impact of a query on the privacy of individuals in the database was shown to be NP-hard.

21st century research into differential privacy

In 2003, Kobbi Nissim and Irit Dinur demonstrated that it is impossible to publish arbitrary queries on a private statistical database without revealing some amount of private information, and that the entire information content of the database can be revealed by publishing the results of a surprisingly small number of random queries—far fewer than was implied by previous work.[8] The general phenomenon is known as the Fundamental Law of Information Recovery, and its key insight, namely that in the most general case, privacy cannot be protected without injecting some amount of noise, led to development of differential privacy.

In 2006, Cynthia Dwork, Frank McSherry, Kobbi Nissim and Adam D. Smith published an article formalizing the amount of noise that needed to be added and proposing a generalized mechanism for doing so.[3] Their work was a co-recipient of the 2016 TCC Test-of-Time Award[9] and the 2017 Gödel Prize.[10]

Since then, subsequent research has shown that there are many ways to produce very accurate statistics from the database while still ensuring high levels of privacy.[1]

ε-differential privacy

The 2006 Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith article introduced the concept of ε-differential privacy, a mathematical definition for the privacy loss associated with any data release drawn from a statistical database. (Here, the term statistical database means a set of data that are collected under the pledge of confidentiality for the purpose of producing statistics that, by their production, do not compromise the privacy of those individuals who provided the data.)

The intuition for the 2006 definition of ε-differential privacy is that a person's privacy cannot be compromised by a statistical release if their data are not in the database. Therefore, with differential privacy, the goal is to give each individual roughly the same privacy that would result from having their data removed. That is, the statistical functions run on the database should not overly depend on the data of any one individual.

Of course, how much any individual contributes to the result of a database query depends in part on how many people's data are involved in the query. If the database contains data from a single person, that person's data contributes 100%. If the database contains data from a hundred people, each person's data contributes just 1%. The key insight of differential privacy is that as the query is made on the data of fewer and fewer people, more noise needs to be added to the query result to produce the same amount of privacy. Hence the name of the 2006 paper, "Calibrating noise to sensitivity in private data analysis."

The 2006 paper presents both a mathematical definition of differential privacy and a mechanism based on the addition of Laplace noise (i.e. noise coming from the Laplace distribution) that satisfies the definition.

Definition of ε-differential privacy

Let ε be a positive real number and A be a randomized algorithm that takes a dataset as input (representing the actions of the trusted party holding the data).

Let im(A) denote the image of A.

The algorithm A is said to provide ε-differential privacy if, for all datasets D1 and D2 that differ on a single element (i.e., the data of one person), and all subsets S of im(A):

Pr[A(D1) ∈ S] / Pr[A(D2) ∈ S] ≤ exp(ε),

where the probability is taken over the randomness used by the algorithm.[11]
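
As a concrete illustration of this definition (not taken from the original article), the following Python sketch checks the worst-case probability ratio for a simple bit-randomizing mechanism on two datasets that differ in one record; the mechanism, its probabilities, and the function names are assumptions made for illustration.

```python
import math

# Illustrative mechanism: release a single private bit, reporting it
# truthfully with probability 3/4 and flipped with probability 1/4.
def output_probability(true_bit: int, reported_bit: int) -> float:
    return 0.75 if reported_bit == true_bit else 0.25

# Two datasets that differ on a single element: the bit is 0 in D1 and 1 in D2.
d1_bit, d2_bit = 0, 1

# For a binary output it suffices to bound the ratio on the two singleton
# output sets; any larger output set only averages these ratios.
worst_ratio = max(
    output_probability(d1_bit, s) / output_probability(d2_bit, s)
    for s in (0, 1)
)
print(math.log(worst_ratio))  # ln(3) ~ 1.0986, so this mechanism is ln(3)-differentially private
```

This is essentially the randomized response procedure described later in the article, which is therefore ln(3)-differentially private under this definition.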


Differential privacy offers strong and robust guarantees that facilitate modular design and analysis of differentially private mechanisms due to its composability, robustness to post-processing, and graceful degradation in the presence of correlated data.

Composability

(Self-)composability refers to the fact that the joint distribution of the outputs of (possibly adaptively chosen) differentially private mechanisms satisfies differential privacy.

Sequential composition. If we query an ε-differential privacy mechanism t times, and the randomization of the mechanism is independent for each query, then the result would be (tε)-differentially private. In the more general case, if there are n independent mechanisms M1, ..., Mn, whose privacy guarantees are ε1, ..., εn-differential privacy, respectively, then any function g of them, g(M1, ..., Mn), is (ε1 + ... + εn)-differentially private.[12]

Parallel composition. If the previous mechanisms are computed on disjoint subsets of the private database, then the function g would be (max_i εi)-differentially private instead.[12]
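
The composition rules are what make privacy-budget accounting possible. The sketch below is a minimal, hypothetical accountant (not drawn from any particular DP library) that tracks per-query ε values and reports the budget consumed under each rule.

```python
# Hypothetical privacy-budget accountant illustrating both composition rules.
class BudgetAccountant:
    def __init__(self) -> None:
        self.epsilons: list[float] = []

    def spend(self, epsilon: float) -> None:
        self.epsilons.append(epsilon)

    def sequential_cost(self) -> float:
        # Sequential composition: epsilons add up when the queries may all
        # touch the same individuals' records.
        return sum(self.epsilons)

    def parallel_cost(self) -> float:
        # Parallel composition: if each mechanism runs on a disjoint subset
        # of the database, only the largest epsilon counts.
        return max(self.epsilons, default=0.0)


accountant = BudgetAccountant()
for eps in (0.1, 0.3, 0.2):
    accountant.spend(eps)
print(accountant.sequential_cost())  # 0.6 when the queries overlap
print(accountant.parallel_cost())    # 0.3 when they hit disjoint partitions
```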

Robustness to post-processing

For any deterministic or randomized function F defined over the image of the mechanism A, if A satisfies ε-differential privacy, so does F(A).

Together, composability and robustness to post-processing permit modular construction and analysis of differentially private mechanisms and motivate the concept of the privacy loss budget. If all the elements of a complex mechanism that access sensitive data are separately differentially private, then so is their combination, followed by arbitrary post-processing.

Group privacy

In general, ε-differential privacy is designed to protect the privacy between neighboring databases which differ only in one row. This means that no adversary with arbitrary auxiliary information can know if one particular participant submitted their information. However, this is also extendable. We may want to protect databases differing in c rows, which amounts to an adversary with arbitrary auxiliary information knowing if c particular participants submitted their information. This can be achieved because if c items change, the probability dilation is bounded by exp(εc) instead of exp(ε),[13] i.e., for D1 and D2 differing on c items:

Pr[A(D1) ∈ S] ≤ exp(εc) · Pr[A(D2) ∈ S]

Thus setting ε instead to ε/c achieves the desired result (protection of c items). In other words, instead of having each item ε-differentially private protected, now every group of c items is ε-differentially private protected (and each item is (ε/c)-differentially private protected).
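
As a worked numeric illustration (the numbers are arbitrary, not from the source): if a mechanism is run with ε = 0.1, a group of c = 5 individuals, such as one household, is protected only at level 5 × 0.1 = 0.5; to guarantee ε = 0.1 for the whole group, the mechanism must instead be run with a per-item parameter of 0.1/5 = 0.02.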

ε-differentially private mechanisms

Since differential privacy is a probabilistic concept, any differentially private mechanism is necessarily randomized. Some of these, like the Laplace mechanism described below, rely on adding controlled noise to the function that we want to compute. Others, like the exponential mechanism[14] and posterior sampling,[15] sample from a problem-dependent family of distributions instead.

Sensitivity

Let d be a positive integer, D be a collection of datasets, and f : D → R^d be a function. The sensitivity[3] of a function, denoted Δf, is defined by

Δf = max ‖f(D1) − f(D2)‖1,

where the maximum is over all pairs of datasets D1 and D2 in D differing in at most one element and ‖·‖1 denotes the ℓ1 norm.

In the example of the medical database below, if we consider f to be the function Q_i, then the sensitivity of the function is one, since changing any one of the entries in the database causes the output of the function to change by either zero or one.

There are techniques (described below) with which we can create a differentially private algorithm for functions with low sensitivity.
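
To make the sensitivity definition concrete, this sketch (an illustration assuming a tiny binary domain; the helper names are made up) brute-forces Δf for a counting query by enumerating all pairs of neighboring datasets.

```python
from itertools import product

# Query of interest: how many records have the attribute set.
def count_query(dataset):
    return sum(dataset)

def are_neighbors(d1, d2):
    # Neighboring datasets differ in exactly one record.
    return sum(a != b for a, b in zip(d1, d2)) == 1

# Enumerate every length-3 binary dataset and compute the largest change
# in the query value between neighbors: this is the L1 sensitivity.
datasets = list(product([0, 1], repeat=3))
sensitivity = max(
    abs(count_query(d1) - count_query(d2))
    for d1 in datasets
    for d2 in datasets
    if are_neighbors(d1, d2)
)
print(sensitivity)  # 1: changing one record moves the count by at most one
```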

The Laplace mechanism

The Laplace mechanism adds Laplace noise (i.e. noise from the Laplace distribution, which can be expressed by the probability density function noise(y) ∝ exp(−|y|/λ), and which has mean zero and standard deviation √2·λ). Now in our case we define the output function of A as a real-valued function (called the transcript output by A) as T_A(x) = f(x) + Y, where Y ~ Lap(λ) and f is the original real-valued query/function we planned to execute on the database. Now clearly T_A(x) can be considered to be a continuous random variable, where

pdf(T_{A,D1}(x) = t) / pdf(T_{A,D2}(x) = t) = noise(t − f(D1)) / noise(t − f(D2)),

which is at most exp(|f(D1) − f(D2)| / λ) ≤ exp(Δf/λ). We can consider Δf/λ to be the privacy factor ε. Thus T follows a differentially private mechanism (as can be seen from the definition above). If we try to use this concept in our diabetes example, then it follows from the fact derived above that, in order to have A as the ε-differentially private algorithm, we need to have λ = 1/ε (since the counting query there has sensitivity 1). Though we have used Laplace noise here, other forms of noise, such as Gaussian noise, can be employed, but they may require a slight relaxation of the definition of differential privacy.[13]
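
A minimal sketch of the Laplace mechanism in Python, assuming a counting query with sensitivity 1 so that the noise scale is λ = Δf/ε = 1/ε; the function and variable names are illustrative, not a reference implementation.

```python
import numpy as np

def laplace_mechanism(dataset, query, sensitivity, epsilon, rng=None):
    """Return query(dataset) plus Laplace noise with scale sensitivity/epsilon."""
    rng = rng if rng is not None else np.random.default_rng()
    scale = sensitivity / epsilon
    return query(dataset) + rng.laplace(loc=0.0, scale=scale)

# The diabetes column from the medical-records example below; the counting
# query has sensitivity 1, so the noise scale is 1/epsilon.
has_diabetes = [1, 1, 0, 0, 1, 0]
noisy_count = laplace_mechanism(has_diabetes, sum, sensitivity=1.0, epsilon=0.5)
print(noisy_count)  # the true count is 3; released answers fluctuate around it
```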

According to this definition, differential privacy is a condition on the release mechanism (i.e., the trusted party releasing information about the dataset) and not on the dataset itself. Intuitively, this means that for any two datasets that are similar, a given differentially private algorithm will behave approximately the same on both datasets. The definition gives a strong guarantee that presence or absence of an individual will not affect the final output of the algorithm significantly.

For example, assume we have a database of medical records D1 where each record is a pair (Name, X), where X is a Boolean denoting whether a person has diabetes or not. For example:

Name Has Diabetes (X)
Ross 1
Monica 1
Joey 0
Phoebe 0
Chandler 1
Rachel 0

Now suppose a malicious user (often termed an adversary) wants to find whether Chandler has diabetes or not. Suppose he also knows in which row of the database Chandler resides. Now suppose the adversary is only allowed to use a particular form of query Q_i that returns the partial sum of the first i rows of column X in the database. In order to find Chandler's diabetes status the adversary executes Q_5(D1) and Q_4(D1), then computes their difference. In this example, Q_5(D1) = 3 and Q_4(D1) = 2, so their difference is 1. This indicates that the "Has Diabetes" field in Chandler's row must be 1. This example highlights how individual information can be compromised even without explicitly querying for the information of a specific individual.

Continuing this example, if we construct D2 by replacing (Chandler, 1) with (Chandler, 0), then this malicious adversary will be able to distinguish D2 from D1 by computing Q_5 − Q_4 for each dataset. If the adversary were required to receive the values Q_i via an ε-differentially private algorithm, for a sufficiently small ε, then he or she would be unable to distinguish between the two datasets.
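
The differencing attack above, and the way calibrated noise blunts it, can be sketched as follows (illustrative code under the same assumptions as the Laplace sketch earlier; the row index and ε value are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_partial_sum(column, i, epsilon):
    # Partial-sum query Q_i released through the Laplace mechanism
    # (sensitivity 1, noise scale 1/epsilon).
    return sum(column[:i]) + rng.laplace(scale=1.0 / epsilon)

column = [1, 1, 0, 0, 1, 0]  # Ross .. Rachel; Chandler is row 5

# Without noise, Q_5 - Q_4 = 3 - 2 = 1 reveals Chandler's value exactly.
exact_difference = sum(column[:5]) - sum(column[:4])

# With noise, each released answer carries independent Laplace error, so the
# difference of two noisy answers is a poor estimator of a single bit.
noisy_difference = noisy_partial_sum(column, 5, epsilon=0.5) - noisy_partial_sum(column, 4, epsilon=0.5)
print(exact_difference, round(noisy_difference, 2))
```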

Randomized response

A simple example, developed especially in the social sciences,[16] is to ask a person to answer the question "Do you have the attribute A?", according to the following procedure:

  1. Toss a coin.
  2. If heads, then toss the coin again (ignoring the outcome), and answer the question honestly.
  3. If tails, then toss the coin again and answer "Yes" if heads, "No" if tails.

(The seemingly redundant extra toss in the first case is needed in situations where just the act of tossing a coin may be observed by others, even if the actual result stays hidden.) The confidentiality then arises from the refutability of the individual responses.

But, overall, these data with many responses are significant, since a positive response is given by a quarter of the people who do not have the attribute A and by three-quarters of the people who actually possess it. Thus, if p is the true proportion of people with A, then we expect to obtain (1/4)(1−p) + (3/4)p = (1/4) + p/2 positive responses. Hence it is possible to estimate p.
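
A short simulation of this coin-flipping procedure, inverting the (1/4) + p/2 relation to estimate p; the population size and true proportion below are made-up illustration values.

```python
import random

def randomized_response(has_attribute: bool) -> bool:
    # First toss: heads -> answer honestly; tails -> answer with a second toss.
    if random.random() < 0.5:
        return has_attribute
    return random.random() < 0.5

# Illustrative population in which 30% truly have attribute A.
true_p = 0.30
population = [random.random() < true_p for _ in range(100_000)]

responses = [randomized_response(x) for x in population]
observed_yes = sum(responses) / len(responses)

# Invert E[yes] = 1/4 + p/2 to recover an estimate of the true proportion.
estimated_p = 2 * (observed_yes - 0.25)
print(round(observed_yes, 3), round(estimated_p, 3))
```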

In particular, if the attribute A is synonymous with illegal behavior, then answering "Yes" is not incriminating, insofar as every person has some probability of a "Yes" response, whatever their true status may be.

Although this example, inspired by randomized response, might suggest applicability to microdata (i.e., releasing datasets with each individual's response), by definition differential privacy excludes microdata releases and is applicable only to queries (i.e., aggregating individual responses into one result), since releasing individual responses would violate the requirements, more specifically the plausible deniability that a subject did or did not participate.[17][18]

Stable transformations

A transformation T is c-stable if the Hamming distance between T(A) and T(B) is at most c times the Hamming distance between A and B for any two databases A, B. Theorem 2 in [12] asserts that if there is a mechanism M that is ε-differentially private, then the composite mechanism M ∘ T is (ε × c)-differentially private.

This could be generalized to group privacy, as the group size could be thought of as the Hamming distance h between A and B (where A contains the group and B doesn't). In this case M ∘ T is (ε × c × h)-differentially private.
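
As a hedged illustration of a stable transformation (the age data, clipping bound, and function names are assumptions for this sketch), a record-by-record preprocessing step is 1-stable, so composing it with an ε-differentially private mechanism keeps the overall guarantee at ε:

```python
import numpy as np

rng = np.random.default_rng(1)

# A record-by-record map is 1-stable: changing one input record changes at
# most one output record, so Hamming distances cannot grow.
def clip_ages(database):
    return [min(age, 90) for age in database]  # top-code ages at 90

# An epsilon-DP mechanism on the transformed data: the clipped values lie in
# [0, 90], so the sum query has sensitivity 90.
def noisy_sum(database, epsilon, sensitivity=90.0):
    return sum(database) + rng.laplace(scale=sensitivity / epsilon)

# By the stability theorem, noisy_sum composed with clip_ages is
# (1 * epsilon)-differentially private with respect to the raw ages.
ages = [34, 29, 102, 56, 47]
print(noisy_sum(clip_ages(ages), epsilon=1.0))
```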

Other notions of differential privacy

Since differential privacy is considered to be too strong or weak for some applications, many versions of it have been proposed.[19] The most widespread relaxation is (ε, δ)-differential privacy,[20] which weakens the definition by allowing an additional small δ density of probability on which the upper bound ε does not hold.
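
For reference, in the standard formulation from the literature (stated here as an addition, not quoted from the text above), (ε, δ)-differential privacy requires that for all neighboring datasets D1 and D2 and all subsets S of the output space:

Pr[A(D1) ∈ S] ≤ exp(ε) · Pr[A(D2) ∈ S] + δ,

so that ε-differential privacy is recovered as the special case δ = 0.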

Adoption of differential privacy in real-world applications

To date there are over 12 real-world deployments of differential privacy, the most noteworthy being:

  • 2008: U.S. Census Bureau, for showing commuting patterns.[21]
  • 2014: Google's RAPPOR, for telemetry such as learning statistics about unwanted software hijacking users' settings.[22][23]
  • 2015: Google, for sharing historical traffic statistics.[24]
  • 2016: Apple iOS 10, for use in Intelligent personal assistant technology.[25]
  • 2017: Microsoft, for telemetry in Windows.[26]
  • 2020: Social Science One and Facebook, a 55 trillion cell dataset for researchers to learn about elections and democracy.[27][28]
  • 2021: The US Census Bureau uses differential privacy to release redistricting data from the 2020 Census.[29]

Public purpose considerations

There are several public purpose considerations regarding differential privacy that are important to consider, especially for policymakers and policy-focused audiences interested in the social opportunities and risks of the technology:[30]

  • Data utility and accuracy. The main concern with differential privacy is the trade-off between data utility and individual privacy. If the privacy loss parameter is set to favor utility, the privacy benefits are lowered (less "noise" is injected into the system); if the privacy loss parameter is set to favor strong privacy, the accuracy and utility of the dataset are lowered (more "noise" is injected into the system). It is important for policymakers to consider the trade-offs posed by differential privacy in order to help set appropriate best practices and standards around the use of this privacy-preserving practice, especially considering the diversity in organizational use cases. It is worth noting, though, that decreased accuracy and utility are a common issue among all statistical disclosure limitation methods and are not unique to differential privacy. What is unique, however, is how policymakers, researchers, and implementers can consider mitigating the risks presented through this trade-off.
  • Data privacy and security. Differential privacy provides a quantified measure of privacy loss and an upper bound, and allows curators to choose the explicit trade-off between privacy and accuracy. It is robust to still-unknown privacy attacks. However, it encourages greater data sharing, which, if done poorly, increases privacy risk. Differential privacy implies that privacy is protected, but this depends very much on the privacy loss parameter chosen and may instead lead to a false sense of security. Finally, though it is robust against unforeseen future privacy attacks, a countermeasure may yet be devised that we cannot predict.

See also

  • Implementations of differentially private analyses – deployments of differential privacy
  • Quasi-identifier
  • Exponential mechanism (differential privacy) – a technique for designing differentially private algorithms
  • k-anonymity
  • Differentially private analysis of graphs
  • Protected health information
  • Local differential privacy
  • Privacy

Publications

  • Calibrating noise to sensitivity in private data analysis, Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006. In Proceedings of the Third conference on Theory of Cryptography (TCC'06). Springer-Verlag, Berlin, Heidelberg, 265–284. https://doi.org/10.1007/11681878_14 (This is the original publication of Differential Privacy, and not the eponymous article by Dwork that was published the same year.)
  • Differential Privacy: A Survey of Results by Cynthia Dwork, Microsoft Research, April 2008 (Presents what was discovered during the first two years of research on differential privacy.)
  • The Algorithmic Foundations of Differential Privacy by Cynthia Dwork and Aaron Roth, 2014. (This is the open-source textbook published by Dwork and Roth.)
  • Learning Statistics with Privacy, aided by the Flip of a Coin by Úlfar Erlingsson, Google Research Blog, October 2014 (Google's use of local differential privacy in the Chrome Browser, later abandoned.)
  • Differential Privacy: A Primer for a Non-Technical Audience, Alexandra Wood, Micah Altman, Aaron Bembenek, Mark Bun, Marco Gaboardi, et al., Vanderbilt Journal of Entertainment & Technology Law, Volume 21, Issue 1, Fall 2018. (A good introductory document, but definitely *not* for non-technical audiences!)
  • Technology Factsheet: Differential Privacy by Raina Gandhi and Amritha Jayanti, Belfer Center for Science and International Affairs, Fall 2020
  • Differential Privacy and the 2020 US Census, MIT Case Studies in Social and Ethical Responsibilities of Computing, no. Winter 2022 (January). https://doi.org/10.21428/2c646de5.7ec6ab93.

Tutorials

  • A Practical Beginner's Guide To Differential Privacy by Christine Task, Purdue University, April 2012

References

  1. ^ a b Hilton, Michael. "Differential Privacy: A Historical Survey". S2CID 16861132.
  2. ^ Dwork, Cynthia (2008-04-25). "Differential Privacy: A Survey of Results". In Agrawal, Manindra; Du, Dingzhu; Duan, Zhenhua; Li, Angsheng (eds.). Theory and Applications of Models of Computation. Lecture Notes in Computer Science. Vol. 4978. Springer Berlin Heidelberg. pp. 1–19. doi:10.1007/978-3-540-79228-4_1. ISBN 978-3-540-79227-7. S2CID 2887752.
  3. ^ a b c Calibrating Noise to Sensitivity in Private Data Analysis by Cynthia Dwork, Frank McSherry, Kobbi Nissim, Adam Smith. In Theory of Cryptography Conference (TCC), Springer, 2006. doi:10.1007/11681878_14. The full version appears in Journal of Privacy and Confidentiality, 7 (3), 17-51. doi:10.29012/jpc.v7i3.405
  4. ^ "1790 Census Records".
  5. ^ Tore Dalenius (1977). "Towards a methodology for statistical disclosure control" (PDF). Statistik Tidskrift. 15.
  6. ^ Dwork, Cynthia (2006). Bugliesi, Michele; Preneel, Bart; Sassone, Vladimiro; Wegener, Ingo (eds.). "Differential Privacy". Automata, Languages and Programming. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer: 1–12. doi:10.1007/11787006_1. ISBN 978-3-540-35908-1.
  7. ^ Dorothy E. Denning; Peter J. Denning; Mayer D. Schwartz (March 1979). "The Tracker: A Threat to Statistical Database Security". ACM Transactions on Database Systems. 4 (1): 76–96. doi:10.1145/320064.320069. S2CID 207655625.
  8. ^ Irit Dinur and Kobbi Nissim. 2003. Revealing information while preserving privacy. In Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (PODS '03). ACM, New York, NY, USA, 202–210. doi:10.1145/773153.773173
  9. ^ "TCC Test-of-Time Award".
  10. ^ "2017 Gödel Prize".
  11. ^ The Algorithmic Foundations of Differential Privacy by Cynthia Dwork and Aaron Roth. Foundations and Trends in Theoretical Computer Science. Vol. 9, no. 3–4, pp. 211‐407, Aug. 2014. doi:10.1561/0400000042
  12. ^ a b c Privacy integrated queries: an extensible platform for privacy-preserving data analysis by Frank D. McSherry. In Proceedings of the 35th SIGMOD International Conference on Management of Data (SIGMOD), 2009. doi:10.1145/1559845.1559850
  13. ^ a b Differential Privacy by Cynthia Dwork, International Colloquium on Automata, Languages and Programming (ICALP) 2006, p. 1–12. doi:10.1007/11787006_1
  14. ^ F. McSherry and K. Talwar. Mechanism Design via Differential Privacy. Proceedings of the 48th Annual Symposium on Foundations of Computer Science, 2007.
  15. ^ Christos Dimitrakakis, Blaine Nelson, Aikaterini Mitrokotsa, Benjamin Rubinstein. Robust and Private Bayesian Inference. Algorithmic Learning Theory 2014
  16. ^ Warner, S. L. (March 1965). "Randomised response: a survey technique for eliminating evasive answer bias". Journal of the American Statistical Association. Taylor & Francis. 60 (309): 63–69. doi:10.1080/01621459.1965.10480775. JSTOR 2283137. PMID 12261830. S2CID 35435339.
  17. ^ Dwork, Cynthia. "A firm foundation for private data analysis." Communications of the ACM 54.1 (2011): 86–95, supra note 19, page 91.
  18. ^ Bambauer, Jane, Krishnamurty Muralidhar, and Rathindra Sarathy. "Fool's gold: an illustrated critique of differential privacy." Vand. J. Ent. & Tech. L. 16 (2013): 701.
  19. ^ SoK: Differential Privacies by Damien Desfontaines, Balázs Pejó. 2019.
  20. ^ Dwork, Cynthia, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. "Our data, ourselves: Privacy via distributed noise generation." In Advances in Cryptology – EUROCRYPT 2006, pp. 486–503. Springer Berlin Heidelberg, 2006.
  21. ^ Ashwin Machanavajjhala, Daniel Kifer, John M. Abowd, Johannes Gehrke, and Lars Vilhuber. "Privacy: Theory meets Practice on the Map". In Proceedings of the 24th International Conference on Data Engineering (ICDE), 2008.
  22. ^ Úlfar Erlingsson, Vasyl Pihur, Aleksandra Korolova. "RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response". In Proceedings of the 21st ACM Conference on Computer and Communications Security (CCS), 2014. doi:10.1145/2660267.2660348
  23. ^ google/rappor, GitHub, 2021-07-15
  24. ^ Tackling Urban Mobility with Technology by Andrew Eland. Google Policy Europe Blog, Nov 18, 2015.
  25. ^ "Apple – Press Info – Apple Previews iOS 10, the Biggest iOS Release Ever". Apple. Retrieved 20 June 2023.
  26. ^ Collecting telemetry data privately by Bolin Ding, Janardhan Kulkarni, Sergey Yekhanin. NIPS 2017.
  27. ^ Messing, Solomon; DeGregorio, Christina; Hillenbrand, Bennett; King, Gary; Mahanti, Saurav; Mukerjee, Zagreb; Nayak, Chaya; Persily, Nate; State, Bogdan (2020), Facebook Privacy-Protected Full URLs Data Set, Zagreb Mukerjee, Harvard Dataverse, doi:10.7910/dvn/tdoapg, retrieved 2023-02-08
  28. ^ Evans, Georgina; King, Gary (January 2023). "Statistically Valid Inferences from Differentially Private Data Releases, with Application to the Facebook URLs Dataset". Political Analysis. 31 (1): 1–21. doi:10.1017/pan.2022.1. ISSN 1047-1987. S2CID 211137209.
  29. ^ "Disclosure Avoidance for the 2020 Census: An Introduction". 2 November 2021.
  30. ^ "Technology Factsheet: Differential Privacy". Belfer Center for Science and International Affairs. Retrieved 2021-04-12.

Further reading

  • A reading list on differential privacy
  • Abowd, John. 2017. “How Will Statistical Agencies Operate When All Data Are Private?”. Journal of Privacy and Confidentiality 7 (3). doi:10.29012/jpc.v7i3.404 (slides)
  • "Differential Privacy: A Primer for a Non-technical Audience", Kobbi Nissim, Thomas Steinke, Alexandra Wood, Micah Altman, Aaron Bembenek, Mark Bun, Marco Gaboardi, David R. O’Brien, and Salil Vadhan, Harvard Privacy Tools Project, February 14, 2018
  • Dinur, Irit and Kobbi Nissim. 2003. Revealing information while preserving privacy. In Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems(PODS '03). ACM, New York, NY, USA, 202–210. doi:10.1145/773153.773173.
  • Dwork, Cynthia, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006. In Halevi, S. & Rabin, T. (Eds.) Calibrating Noise to Sensitivity in Private Data Analysis. Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4–7, 2006. Proceedings, Springer Berlin Heidelberg, 265–284, doi:10.1007/11681878_14.
  • Dwork, Cynthia. 2006. Differential Privacy, 33rd International Colloquium on Automata, Languages and Programming, part II (ICALP 2006), Springer Verlag, 4052, 1–12, ISBN 3-540-35907-9.
  • Dwork, Cynthia and Aaron Roth. 2014. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science. Vol. 9, Nos. 3–4. 211–407, doi:10.1561/0400000042.
  • Machanavajjhala, Ashwin, Daniel Kifer, John M. Abowd, Johannes Gehrke, and Lars Vilhuber. 2008. Privacy: Theory Meets Practice on the Map, International Conference on Data Engineering (ICDE) 2008: 277–286, doi:10.1109/ICDE.2008.4497436.
  • Dwork, Cynthia and Moni Naor. 2010. On the Difficulties of Disclosure Prevention in Statistical Databases or The Case for Differential Privacy, Journal of Privacy and Confidentiality: Vol. 2: Iss. 1, Article 8. Available at: http://repository.cmu.edu/jpc/vol2/iss1/8.
  • Kifer, Daniel and Ashwin Machanavajjhala. 2011. No free lunch in data privacy. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data (SIGMOD '11). ACM, New York, NY, USA, 193–204. doi:10.1145/1989323.1989345.
  • Erlingsson, Úlfar, Vasyl Pihur and Aleksandra Korolova. 2014. RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security (CCS '14). ACM, New York, NY, USA, 1054–1067. doi:10.1145/2660267.2660348.
  • Abowd, John M. and Ian M. Schmutte. 2017. Revisiting the economics of privacy: Population statistics and confidentiality protection as public goods. Labor Dynamics Institute, Cornell University, at https://digitalcommons.ilr.cornell.edu/ldi/37/
  • Abowd, John M. and Ian M. Schmutte. Forthcoming. An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices. American Economic Review, arXiv:1808.06303
  • Apple, Inc. 2016. Apple previews iOS 10, the biggest iOS release ever. Press Release (June 13). https://www.apple.com/newsroom/2016/06/apple-previews-ios-10-biggest-ios-release-ever.html.
  • Ding, Bolin, Janardhan Kulkarni, and Sergey Yekhanin 2017. Collecting Telemetry Data Privately, NIPS 2017.
  • http://www.win-vector.com/blog/2015/10/a-simpler-explanation-of-differential-privacy/
  • Ryffel, Theo, Andrew Trask, et al. "A generic framework for privacy preserving deep learning"

differential, privacy, approach, providing, privacy, while, sharing, information, about, group, individuals, describing, patterns, within, group, while, withholding, information, about, specific, individuals, this, done, making, arbitrary, small, changes, indi. Differential privacy DP is an approach for providing privacy while sharing information about a group of individuals by describing the patterns within the group while withholding information about specific individuals 1 2 This is done by making arbitrary small changes to individual data that do not change the statistics of interest Thus the data cannot be used to infer much about any individual Another way to describe differential privacy is as a constraint on the algorithms used to publish aggregate information about a statistical database which limits the disclosure of private information of records in the database For example differentially private algorithms are used by some government agencies to publish demographic information or other statistical aggregates while ensuring confidentiality of survey responses and by companies to collect information about user behavior while controlling what is visible even to internal analysts Roughly an algorithm is differentially private if an observer seeing its output cannot tell whether a particular individual s information was used in the computation Differential privacy is often discussed in the context of identifying individuals whose information may be in a database Although it does not directly refer to identification and reidentification attacks differentially private algorithms provably resist such attacks 3 Differential privacy was developed by cryptographers and thus is often associated with cryptography and draws much of its language from cryptography Contents 1 History 1 1 Historical background 2 Early research leading to differential privacy 3 21st century research into differential privacy 4 e differential privacy 4 1 Definition of e differential privacy 4 2 Composability 4 3 Robustness to post processing 4 4 Group privacy 5 e differentially private mechanisms 5 1 Sensitivity 5 2 The Laplace mechanism 5 3 Randomized response 5 4 Stable transformations 6 Other notions of differential privacy 7 Adoption of differential privacy in real world applications 8 Public purpose considerations 9 See also 9 1 Publications 9 2 Tutorials 10 References 11 Further readingHistory EditHistorical background Edit Official statistics organizations are charged with collecting information from individuals or establishments and publishing aggregate data to serve the public interest For example the 1790 United States Census collected information about individuals living in the United States and published tabulations based on sex age race and condition of servitude 4 Census records were originally posted but started with the 1840 Census they were collected under a promise of confidentiality that the information provided will be used for statistical purposes but that the publications will not produce information that can be traced back to a specific individual or establishment To accomplish the goal of confidentiality statistical organizations have long suppressed information in their publications For example in a table presenting the sales of each business in a town grouped by business category a cell that has information from only one company might be suppressed in order to maintain the confidentiality of that company s specific sales The adoption of electronic information processing systems by 
statistical agencies in the 1950s and 1960s dramatically increased the number of tables that a statistical organization could produce and in so doing significantly increased the potential for an improper disclosure of confidential information For example if a business that had its sales numbers suppressed also had those numbers appear in the total sales of a region then it might be possible to determine the suppressed value by subtracting the other sales from that total But there might also be combinations of additions and subtractions that might cause the private information to be revealed The number of combinations that needed to be checked increases exponentially with the number of publications and it is potentially unbounded if data users are able to make queries of the statistical database using an interactive query system Early research leading to differential privacy EditIn 1977 Tore Dalenius formalized the mathematics of cell suppression 5 Tore Dalenius was a Swedish statistician who contributed to statistical privacy through his 1977 paper that revealed a key point about statistical databases which was that databases shouldn t reveal information about an individual that isn t otherwise accessible 6 In 1979 Dorothy Denning Peter J Denning and Mayer D Schwartz formalized the concept of a Tracker an adversary that could learn the confidential contents of a statistical database by creating a series of targeted queries and remembering the results 7 This and future research showed that privacy properties in a database could only be preserved by considering each new query in light of possibly all previous queries This line of work is sometimes called query privacy with the final result being that tracking the impact of a query on the privacy of individuals in the database was NP hard 21st century research into differential privacy EditIn 2003 Kobbi Nissim and Irit Dinur demonstrated that it is impossible to publish arbitrary queries on a private statistical database without revealing some amount of private information and that the entire information content of the database can be revealed by publishing the results of a surprisingly small number of random queries far fewer than was implied by previous work 8 The general phenomenon is known as the Fundamental Law of Information Recovery and its key insight namely that in the most general case privacy cannot be protected without injecting some amount of noise led to development of differential privacy In 2006 Cynthia Dwork Frank McSherry Kobbi Nissim and Adam D Smith published an article formalizing the amount of noise that needed to be added and proposing a generalized mechanism for doing so 3 Their work was a co recipient of the 2016 TCC Test of Time Award 9 and the 2017 Godel Prize 10 Since then subsequent research has shown that there are many ways to produce very accurate statistics from the database while still ensuring high levels of privacy 1 e differential privacy EditThe 2006 Cynthia Dwork Frank McSherry Kobbi Nissim and Adam D Smith article introduced the concept of e differential privacy a mathematical definition for the privacy loss associated with any data release drawn from a statistical database Here the term statistical database means a set of data that are collected under the pledge of confidentiality for the purpose of producing statistics that by their production do not compromise the privacy of those individuals who provided the data The intuition for the 2006 definition of e differential privacy is that a person s 
privacy cannot be compromised by a statistical release if their data are not in the database Therefore with differential privacy the goal is to give each individual roughly the same privacy that would result from having their data removed That is the statistical functions run on the database should not overly depend on the data of any one individual Of course how much any individual contributes to the result of a database query depends in part on how many people s data are involved in the query If the database contains data from a single person that person s data contributes 100 If the database contains data from a hundred people each person s data contributes just 1 The key insight of differential privacy is that as the query is made on the data of fewer and fewer people more noise needs to be added to the query result to produce the same amount of privacy Hence the name of the 2006 paper Calibrating noise to sensitivity in private data analysis The 2006 paper presents both a mathematical definition of differential privacy and a mechanism based on the addition of Laplace noise i e noise coming from the Laplace distribution that satisfies the definition Definition of e differential privacy Edit Let e be a positive real number and A displaystyle mathcal A nbsp be a randomized algorithm that takes a dataset as input representing the actions of the trusted party holding the data Let im A displaystyle textrm im mathcal A nbsp denote the image of A displaystyle mathcal A nbsp The algorithm A displaystyle mathcal A nbsp is said to provide e differential privacy if for all datasets D 1 displaystyle D 1 nbsp and D 2 displaystyle D 2 nbsp that differ on a single element i e the data of one person and all subsets S displaystyle S nbsp of im A displaystyle textrm im mathcal A nbsp Pr A D 1 S Pr A D 2 S e e displaystyle frac Pr mathcal A D 1 in S Pr mathcal A D 2 in S leq e varepsilon nbsp where the probability is taken over the randomness used by the algorithm 11 Differential privacy offers strong and robust guarantees that facilitate modular design and analysis of differentially private mechanisms due to its composability robustness to post processing and graceful degradation in the presence of correlated data Composability Edit Self composability refers to the fact that the joint distribution of the outputs of possibly adaptively chosen differentially private mechanisms satisfies differential privacy Sequential composition If we query an e differential privacy mechanism t displaystyle t nbsp times and the randomization of the mechanism is independent for each query then the result would be e t displaystyle varepsilon t nbsp differentially private In the more general case if there are n displaystyle n nbsp independent mechanisms M 1 M n displaystyle mathcal M 1 dots mathcal M n nbsp whose privacy guarantees are e 1 e n displaystyle varepsilon 1 dots varepsilon n nbsp differential privacy respectively then any function g displaystyle g nbsp of them g M 1 M n displaystyle g mathcal M 1 dots mathcal M n nbsp is i 1 n e i displaystyle left sum limits i 1 n varepsilon i right nbsp differentially private 12 Parallel composition If the previous mechanisms are computed on disjoint subsets of the private database then the function g displaystyle g nbsp would be max i e i displaystyle max i varepsilon i nbsp differentially private instead 12 Robustness to post processing Edit For any deterministic or randomized function F displaystyle F nbsp defined over the image of the mechanism A displaystyle mathcal A 
nbsp if A displaystyle mathcal A nbsp satisfies e differential privacy so does F A displaystyle F mathcal A nbsp Together composability and robustness to post processing permit modular construction and analysis of differentially private mechanisms and motivate the concept of the privacy loss budget If all elements that access sensitive data of a complex mechanisms are separately differentially private so will be their combination followed by arbitrary post processing Group privacy Edit In general e differential privacy is designed to protect the privacy between neighboring databases which differ only in one row This means that no adversary with arbitrary auxiliary information can know if one particular participant submitted his information However this is also extendable We may want to protect databases differing in c displaystyle c nbsp rows which amounts to an adversary with arbitrary auxiliary information knowing if c displaystyle c nbsp particular participants submitted their information This can be achieved because if c displaystyle c nbsp items change the probability dilation is bounded by exp e c displaystyle exp varepsilon c nbsp instead of exp e displaystyle exp varepsilon nbsp 13 i e for D1 and D2 differing on c displaystyle c nbsp items Pr A D 1 S exp e c Pr A D 2 S displaystyle Pr mathcal A D 1 in S leq exp varepsilon c cdot Pr mathcal A D 2 in S nbsp Thus setting e instead to e c displaystyle varepsilon c nbsp achieves the desired result protection of c displaystyle c nbsp items In other words instead of having each item e differentially private protected now every group of c displaystyle c nbsp items is e differentially private protected and each item is e c displaystyle varepsilon c nbsp differentially private protected e differentially private mechanisms EditSince differential privacy is a probabilistic concept any differentially private mechanism is necessarily randomized Some of these like the Laplace mechanism described below rely on adding controlled noise to the function that we want to compute Others like the exponential mechanism 14 and posterior sampling 15 sample from a problem dependent family of distributions instead Sensitivity Edit Let d displaystyle d nbsp be a positive integer D displaystyle mathcal D nbsp be a collection of datasets and f D R d displaystyle f colon mathcal D rightarrow mathbb R d nbsp be a function The sensitivity 3 of a function denoted D f displaystyle Delta f nbsp is defined byD f max f D 1 f D 2 1 displaystyle Delta f max lVert f D 1 f D 2 rVert 1 nbsp where the maximum is over all pairs of datasets D 1 displaystyle D 1 nbsp and D 2 displaystyle D 2 nbsp in D displaystyle mathcal D nbsp differing in at most one element and 1 displaystyle lVert cdot rVert 1 nbsp denotes the ℓ 1 displaystyle ell 1 nbsp norm In the example of the medical database below if we consider f displaystyle f nbsp to be the function Q i displaystyle Q i nbsp then the sensitivity of the function is one since changing any one of the entries in the database causes the output of the function to change by either zero or one There are techniques which are described below using which we can create a differentially private algorithm for functions with low sensitivity The Laplace mechanism Edit See also Additive noise mechanisms The Laplace mechanism adds Laplace noise i e noise from the Laplace distribution which can be expressed by probability density function noise y exp y l displaystyle text noise y propto exp y lambda nbsp which has mean zero and standard deviation 2 l 
displaystyle sqrt 2 lambda nbsp Now in our case we define the output function of A displaystyle mathcal A nbsp as a real valued function called as the transcript output by A displaystyle mathcal A nbsp as T A x f x Y displaystyle mathcal T mathcal A x f x Y nbsp where Y Lap l displaystyle Y sim text Lap lambda nbsp and f displaystyle f nbsp is the original real valued query function we planned to execute on the database Now clearly T A x displaystyle mathcal T mathcal A x nbsp can be considered to be a continuous random variable where p d f T A D 1 x t p d f T A D 2 x t noise t f D 1 noise t f D 2 displaystyle frac mathrm pdf mathcal T mathcal A D 1 x t mathrm pdf mathcal T mathcal A D 2 x t frac text noise t f D 1 text noise t f D 2 nbsp which is at most e f D 1 f D 2 l e D f l displaystyle e frac f D 1 f D 2 lambda leq e frac Delta f lambda nbsp We can consider D f l displaystyle frac Delta f lambda nbsp to be the privacy factor e displaystyle varepsilon nbsp Thus T displaystyle mathcal T nbsp follows a differentially private mechanism as can be seen from the definition above If we try to use this concept in our diabetes example then it follows from the above derived fact that in order to have A displaystyle mathcal A nbsp as the e displaystyle varepsilon nbsp differential private algorithm we need to have l 1 e displaystyle lambda 1 varepsilon nbsp Though we have used Laplace noise here other forms of noise such as the Gaussian Noise can be employed but they may require a slight relaxation of the definition of differential privacy 13 According to this definition differential privacy is a condition on the release mechanism i e the trusted party releasing information about the dataset and not on the dataset itself Intuitively this means that for any two datasets that are similar a given differentially private algorithm will behave approximately the same on both datasets The definition gives a strong guarantee that presence or absence of an individual will not affect the final output of the algorithm significantly For example assume we have a database of medical records D 1 displaystyle D 1 nbsp where each record is a pair Name X where X displaystyle X nbsp is a Boolean denoting whether a person has diabetes or not For example Name Has Diabetes X Ross 1Monica 1Joey 0Phoebe 0Chandler 1Rachel 0Now suppose a malicious user often termed an adversary wants to find whether Chandler has diabetes or not Suppose he also knows in which row of the database Chandler resides Now suppose the adversary is only allowed to use a particular form of query Q i displaystyle Q i nbsp that returns the partial sum of the first i displaystyle i nbsp rows of column X displaystyle X nbsp in the database In order to find Chandler s diabetes status the adversary executes Q 5 D 1 displaystyle Q 5 D 1 nbsp and Q 4 D 1 displaystyle Q 4 D 1 nbsp then computes their difference In this example Q 5 D 1 3 displaystyle Q 5 D 1 3 nbsp and Q 4 D 1 2 displaystyle Q 4 D 1 2 nbsp so their difference is 1 This indicates that the Has Diabetes field in Chandler s row must be 1 This example highlights how individual information can be compromised even without explicitly querying for the information of a specific individual Continuing this example if we construct D 2 displaystyle D 2 nbsp by replacing Chandler 1 with Chandler 0 then this malicious adversary will be able to distinguish D 2 displaystyle D 2 nbsp from D 1 displaystyle D 1 nbsp by computing Q 5 Q 4 displaystyle Q 5 Q 4 nbsp for each dataset If the adversary were required to 
receive the values Q i displaystyle Q i nbsp via an e displaystyle varepsilon nbsp differentially private algorithm for a sufficiently small e displaystyle varepsilon nbsp then he or she would be unable to distinguish between the two datasets Randomized response Edit See also Local differential privacy A simple example especially developed in the social sciences 16 is to ask a person to answer the question Do you own the attribute A according to the following procedure Toss a coin If heads then toss the coin again ignoring the outcome and answer the question honestly If tails then toss the coin again and answer Yes if heads No if tails The seemingly redundant extra toss in the first case is needed in situations where just the act of tossing a coin may be observed by others even if the actual result stays hidden The confidentiality then arises from the refutability of the individual responses But overall these data with many responses are significant since positive responses are given to a quarter by people who do not have the attribute A and three quarters by people who actually possess it Thus if p is the true proportion of people with A then we expect to obtain 1 4 1 p 3 4 p 1 4 p 2 positive responses Hence it is possible to estimate p In particular if the attribute A is synonymous with illegal behavior then answering Yes is not incriminating insofar as the person has a probability of a Yes response whatever it may be Although this example inspired by randomized response might be applicable to microdata i e releasing datasets with each individual response by definition differential privacy excludes microdata releases and is only applicable to queries i e aggregating individual responses into one result as this would violate the requirements more specifically the plausible deniability that a subject participated or not 17 18 Stable transformations Edit A transformation T displaystyle T nbsp is c displaystyle c nbsp stable if the Hamming distance between T A displaystyle T A nbsp and T B displaystyle T B nbsp is at most c displaystyle c nbsp times the Hamming distance between A displaystyle A nbsp and B displaystyle B nbsp for any two databases A B displaystyle A B nbsp Theorem 2 in 12 asserts that if there is a mechanism M displaystyle M nbsp that is e displaystyle varepsilon nbsp differentially private then the composite mechanism M T displaystyle M circ T nbsp is e c displaystyle varepsilon times c nbsp differentially private This could be generalized to group privacy as the group size could be thought of as the Hamming distance h displaystyle h nbsp between A displaystyle A nbsp and B displaystyle B nbsp where A displaystyle A nbsp contains the group and B displaystyle B nbsp doesn t In this case M T displaystyle M circ T nbsp is e c h displaystyle varepsilon times c times h nbsp differentially private Other notions of differential privacy EditSince differential privacy is considered to be too strong or weak for some applications many versions of it have been proposed 19 The most widespread relaxation is e d differential privacy 20 which weakens the definition by allowing an additional small d density of probability on which the upper bound e does not hold Adoption of differential privacy in real world applications EditSee also Implementations of differentially private analyses To date there are over 12 real world deployments of differential privacy the most noteworthy being 2008 U S Census Bureau for showing commuting patterns 21 2014 Google s RAPPOR for telemetry such as learning 
statistics about unwanted software hijacking users settings 22 23 2015 Google for sharing historical traffic statistics 24 2016 Apple iOS 10 for use in Intelligent personal assistant technology 25 2017 Microsoft for telemetry in Windows 26 2020 Social Science One and Facebook a 55 trillion cell dataset for researchers to learn about elections and democracy 27 28 2021 The US Census Bureau uses differential privacy to release redistricting data from the 2020 Census 29 Public purpose considerations EditThere are several public purpose considerations regarding differential privacy that are important to consider especially for policymakers and policy focused audiences interested in the social opportunities and risks of the technology 30 Data utility and accuracy The main concern with differential privacy is the trade off between data utility and individual privacy If the privacy loss parameter is set to favor utility the privacy benefits are lowered less noise is injected into the system if the privacy loss parameter is set to favor heavy privacy the accuracy and utility of the dataset are lowered more noise is injected into the system It is important for policymakers to consider the trade offs posed by differential privacy in order to help set appropriate best practices and standards around the use of this privacy preserving practice especially considering the diversity in organizational use cases It is worth noting though that decreased accuracy and utility is a common issue among all statistical disclosure limitation methods and is not unique to differential privacy What is unique however is how policymakers researchers and implementers can consider mitigating against the risks presented through this trade off Data privacy and security Differential privacy provides a quantified measure of privacy loss and an upper bound and allows curators to choose the explicit trade off between privacy and accuracy It is robust to still unknown privacy attacks However it encourages greater data sharing which if done poorly increases privacy risk Differential privacy implies that privacy is protected but this depends very much on the privacy loss parameter chosen and may instead lead to a false sense of security Finally though it is robust against unforeseen future privacy attacks a countermeasure may be devised that we cannot predict See also EditImplementations of differentially private analyses deployments of differential privacy Quasi identifier Exponential mechanism differential privacy a technique for designing differentially private algorithms k anonymity Differentially private analysis of graphs Protected health information Local differential privacy PrivacyPublications Edit Calibrating noise to sensitivity in private data analysis Cynthia Dwork Frank McSherry Kobbi Nissim and Adam Smith 2006 In Proceedings of the Third conference on Theory of Cryptography TCC 06 Springer Verlag Berlin Heidelberg 265 284 https doi org 10 1007 11681878 14 This is the original publication of Differential Privacy and not the eponymous article by Dwork that was published the same year Differential Privacy A Survey of Results by Cynthia Dwork Microsoft Research April 2008 Presents what was discovered during the first two years of research on differential privacy The Algorithmic Foundations of Differential Privacy by Cynthia Dwork and Aaron Roth 2014 This is the open source textbook published by Dwork and Roth Learning Statistics with Privacy aided by the Flip of a Coin by Ulfar Erlingsson Google Research Blog October 
2014 Google s use of local differential privacy in the Chrome Browser later abandoned Differential Privacy A Primer for a Non Technical Audience Alexandra Wood Micah Altman Aaron Bembenek Mark Bun Marco Gaboardi et al Vanderbilt Journal of Entertainment amp Technology LawVanderbilt Journal of Entertainment Volume 21 Issue 1 Fall 2018 A good introductory document but definitely not for non technical audiences Technology Factsheet Differential Privacy by Raina Gandhi and Amritha Jayanti Belfer Center for Science and International Affairs Fall 2020 Differential Privacy and the 2020 US Census MIT Case Studies in Social and Ethical Responsibilities of Computing no Winter 2022 January https doi org 10 21428 2c646de5 7ec6ab93 Tutorials Edit A Practical Beginner s Guide To Differential Privacy by Christine Task Purdue University April 2012References Edit a b Hilton Michael Differential Privacy A Historical Survey S2CID 16861132 a href Template Cite journal html title Template Cite journal cite journal a Cite journal requires journal help Dwork Cynthia 2008 04 25 Differential Privacy A Survey of Results In Agrawal Manindra Du Dingzhu Duan Zhenhua Li Angsheng eds Theory and Applications of Models of Computation Lecture Notes in Computer Science Vol 4978 Springer Berlin Heidelberg pp 1 19 doi 10 1007 978 3 540 79228 4 1 ISBN 978 3 540 79227 7 S2CID 2887752 a b c Calibrating Noise to Sensitivity in Private Data Analysis by Cynthia Dwork Frank McSherry Kobbi Nissim Adam Smith In Theory of Cryptography Conference TCC Springer 2006 doi 10 1007 11681878 14 The full version appears in Journal of Privacy and Confidentiality 7 3 17 51 doi 10 29012 jpc v7i3 405 1790 Census Records Tore Dalenius 1977 Towards a methodology for statistical disclosure control PDF Statistik Tidskrift 15 Dwork Cynthia 2006 Bugliesi Michele Preneel Bart Sassone Vladimiro Wegener Ingo eds Differential Privacy Automata Languages and Programming Lecture Notes in Computer Science Berlin Heidelberg Springer 1 12 doi 10 1007 11787006 1 ISBN 978 3 540 35908 1 Dorothy E Denning Peter J Denning Mayer D Schwartz March 1979 The Tracker A Threat to Statistical Database Security ACM Transactions on Database Systems 4 1 76 96 doi 10 1145 320064 320069 S2CID 207655625 Irit Dinur and Kobbi Nissim 2003 Revealing information while preserving privacy In Proceedings of the twenty second ACM SIGMOD SIGACT SIGART symposium on Principles of database systems PODS 03 ACM New York NY USA 202 210 doi 10 1145 773153 773173 TCC Test of Time Award 2017 Godel Prize The Algorithmic Foundations of Differential Privacy by Cynthia Dwork and Aaron Roth Foundations and Trends in Theoretical Computer Science Vol 9 no 3 4 pp 211 407 Aug 2014 doi 10 1561 0400000042 a b c Privacy integrated queries an extensible platform for privacy preserving data analysis by Frank D McSherry In Proceedings of the 35th SIGMOD International Conference on Management of Data SIGMOD 2009 doi 10 1145 1559845 1559850 a b Differential Privacy by Cynthia Dwork International Colloquium on Automata Languages and Programming ICALP 2006 p 1 12 doi 10 1007 11787006 1 F McSherry and K Talwar Mechasim Design via Differential Privacy Proceedings of the 48th Annual Symposium of Foundations of Computer Science 2007 Christos Dimitrakakis Blaine Nelson Aikaterini Mitrokotsa Benjamin Rubinstein Robust and Private Bayesian Inference Algorithmic Learning Theory 2014 Warner S L March 1965 Randomised response a survey technique for eliminating evasive answer bias Journal of the American Statistical Association 
Warner, S. L. (March 1965). "Randomized response: a survey technique for eliminating evasive answer bias". Journal of the American Statistical Association. Taylor & Francis. 60 (309): 63–69. doi:10.1080/01621459.1965.10480775. JSTOR 2283137. PMID 12261830. S2CID 35435339.
Dwork, Cynthia. "A firm foundation for private data analysis". Communications of the ACM 54.1 (2011): 86–95.
Supra note 19, page 91.
Bambauer, Jane; Krishnamurty Muralidhar; and Rathindra Sarathy. "Fool's gold: an illustrated critique of differential privacy". Vand. J. Ent. & Tech. L. 16 (2013): 701.
SoK: Differential Privacies by Damien Desfontaines and Balázs Pejó, 2019.
Dwork, Cynthia; Krishnaram Kenthapadi; Frank McSherry; Ilya Mironov; and Moni Naor. "Our data, ourselves: Privacy via distributed noise generation". In Advances in Cryptology (EUROCRYPT 2006), pp. 486–503. Springer Berlin Heidelberg, 2006.
Ashwin Machanavajjhala, Daniel Kifer, John M. Abowd, Johannes Gehrke, and Lars Vilhuber. "Privacy: Theory meets Practice on the Map". In Proceedings of the 24th International Conference on Data Engineering (ICDE), 2008.
Úlfar Erlingsson, Vasyl Pihur, Aleksandra Korolova. "RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response". In Proceedings of the 21st ACM Conference on Computer and Communications Security (CCS), 2014. doi:10.1145/2660267.2660348.
google/rappor. GitHub. 2021-07-15.
Tackling Urban Mobility with Technology by Andrew Eland. Google Policy Europe Blog, Nov 18, 2015.
"Apple Press Info: Apple Previews iOS 10, the Biggest iOS Release Ever". Apple. Retrieved 20 June 2023.
Collecting telemetry data privately by Bolin Ding, Janardhan Kulkarni, and Sergey Yekhanin. NIPS 2017.
Messing, Solomon; DeGregorio, Christina; Hillenbrand, Bennett; King, Gary; Mahanti, Saurav; Mukerjee, Zagreb; Nayak, Chaya; Persily, Nate; State, Bogdan (2020). "Facebook Privacy-Protected Full URLs Data Set". Harvard Dataverse. doi:10.7910/DVN/TDOAPG. Retrieved 2023-02-08.
Evans, Georgina; King, Gary (January 2023). "Statistically Valid Inferences from Differentially Private Data Releases, with Application to the Facebook URLs Dataset". Political Analysis. 31 (1): 1–21. doi:10.1017/pan.2022.1. ISSN 1047-1987. S2CID 211137209.
"Disclosure Avoidance for the 2020 Census: An Introduction". 2 November 2021.
"Technology Factsheet: Differential Privacy". Belfer Center for Science and International Affairs. Retrieved 2021-04-12.

Further reading Edit

A reading list on differential privacy:

Abowd, John. 2017. "How Will Statistical Agencies Operate When All Data Are Private?" Journal of Privacy and Confidentiality 7 (3). doi:10.29012/jpc.v7i3.404 (slides).
"Differential Privacy: A Primer for a Non-technical Audience". Kobbi Nissim, Thomas Steinke, Alexandra Wood, Micah Altman, Aaron Bembenek, Mark Bun, Marco Gaboardi, David R. O'Brien, and Salil Vadhan. Harvard Privacy Tools Project, February 14, 2018.
Dinur, Irit and Kobbi Nissim. 2003. "Revealing information while preserving privacy". In Proceedings of the Twenty-Second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS '03). ACM, New York, NY, USA, 202–210. doi:10.1145/773153.773173.
Dwork, Cynthia, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006. "Calibrating Noise to Sensitivity in Private Data Analysis". In Halevi, S. & Rabin, T. (eds.), Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4–7, 2006, Proceedings. Springer Berlin Heidelberg, 265–284. doi:10.1007/11681878_14.
Dwork, Cynthia. 2006. "Differential Privacy". 33rd International Colloquium on Automata, Languages and Programming, part II (ICALP 2006). Springer Verlag, 4052, 1–12. ISBN 3-540-35907-9.
Dwork, Cynthia and Aaron Roth. 2014. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science, Vol. 9, Nos. 3–4, 211–407. doi:10.1561/0400000042.
Machanavajjhala, Ashwin, Daniel Kifer, John M. Abowd, Johannes Gehrke, and Lars Vilhuber. 2008. "Privacy: Theory Meets Practice on the Map". International Conference on Data Engineering (ICDE) 2008: 277–286. doi:10.1109/ICDE.2008.4497436.
Dwork, Cynthia and Moni Naor. 2010. "On the Difficulties of Disclosure Prevention in Statistical Databases or The Case for Differential Privacy". Journal of Privacy and Confidentiality, Vol. 2, Iss. 1, Article 8. Available at: http://repository.cmu.edu/jpc/vol2/iss1/8.
Kifer, Daniel and Ashwin Machanavajjhala. 2011. "No free lunch in data privacy". In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data (SIGMOD '11). ACM, New York, NY, USA, 193–204. doi:10.1145/1989323.1989345.
Erlingsson, Úlfar, Vasyl Pihur, and Aleksandra Korolova. 2014. "RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response". In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security (CCS '14). ACM, New York, NY, USA, 1054–1067. doi:10.1145/2660267.2660348.
Abowd, John M. and Ian M. Schmutte. 2017. "Revisiting the economics of privacy: Population statistics and confidentiality protection as public goods". Labor Dynamics Institute, Cornell University, at https://digitalcommons.ilr.cornell.edu/ldi/37.
Abowd, John M. and Ian M. Schmutte. Forthcoming. "An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices". American Economic Review. arXiv:1808.06303.
Apple, Inc. 2016. "Apple previews iOS 10, the biggest iOS release ever". Press release, June 13. https://www.apple.com/newsroom/2016/06/apple-previews-ios-10-biggest-ios-release-ever.html.
Ding, Bolin, Janardhan Kulkarni, and Sergey Yekhanin. 2017. "Collecting Telemetry Data Privately". NIPS 2017.
A simpler explanation of differential privacy. Win-Vector Blog, October 2015. http://www.win-vector.com/blog/2015/10/a-simpler-explanation-of-differential-privacy/.
Ryffel, Theo, Andrew Trask, et al. "A generic framework for privacy preserving deep learning".
