jmlr jmlr2013 jmlr2013-32 jmlr2013-32-reference knowledge-graph by maker-knowledge-mining

32 jmlr-2013-Differential Privacy for Functions and Functional Data


Source: pdf

Author: Rob Hall, Alessandro Rinaldo, Larry Wasserman

Abstract: Differential privacy is a rigorous cryptographically-motivated characterization of data privacy which may be applied when releasing summaries of a database. Previous work has focused mainly on methods for which the output is a finite dimensional vector, or an element of some discrete set. We develop methods for releasing functions while preserving differential privacy. Specifically, we show that adding an appropriate Gaussian process to the function of interest yields differential privacy. When the functions lie in the reproducing kernel Hilbert space (RKHS) generated by the covariance kernel of the Gaussian process, then the correct noise level is established by measuring the “sensitivity” of the function in the RKHS norm. As examples we consider kernel density estimation, kernel support vector machines, and functions in RKHSs. Keywords: differential privacy, density estimation, Gaussian processes, reproducing kernel Hilbert space


reference text

R.J. Adler. An Introduction to Continuity, Extrema, and Related Topics for General Gaussian Processes, volume 12 of Lecture Notes–Monograph series. Institute of Mathematical Statistics, 1990. R.J. Adler and J.E. Taylor. Random Fields and Geometry (Springer Monographs in Mathematics). Springer, 1 edition, June 2007. ISBN 0387481125. N. Aronszajn. Theory of reproducing kernels. Transactions of the American Mathematical Society, 68, 1950. B. Barak, K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, and K. Talwar. Privacy, accuracy, and consistency too: a holistic solution to contingency table release. Proceedings of the TwentySixth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 273–282, 2007. 725 H ALL , R INALDO AND WASSERMAN A. Bertinet and Thomas C. Agnan. Reproducing Kernel Hilbert Spaces in Probability and Statistics. Kluwer Academic Publishers, 2004. P. Billingsley. Probability and Measure. Wiley-Interscience, 3 edition, 1995. O. Bousquet and A. Elisseeff. Stability and generalization. Journal of Machine Learning Research, 2:499–526, 2002. A. Charest. Creation and Analysis of Differentially-Private Synthetic Datasets. PhD thesis, Carnegie Mellon University, 2012. K. Chaudhuri and D. Hsu. Convergence rates for differentially private statistical estimation. In ICML’12, 2012. K. Chaudhuri and C. Monteleoni. Privacy preserving logistic regression. NIPS 2008, 2008. K. Chaudhuri and C. Monteleoni. Differentially private empirical risk minimization. Journal of Machine Learning Research, 12:1069–1109, 2011. S. Chawla, C. Dwork, F. McSherry, and K. Talwar. On the utility of privacy-preserving histograms. UAI, 2005. C. Dwork. Differential privacy. 33rd International Colloquium on Automata, Languages and Programming, pages 1–12, 2006. C. Dwork and J. Lei. Differential privacy and robust statistics. Proceedings of the 41st ACM Symposium on Theory of Computing, pages 371–380, May–June 2009. C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation. EUROCRYPT, pages 486–503, 2006a. C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. Proceedings of the 3rd Theory of Cryptography Conference, pages 265–284, 2006b. C. Dwork, G.N. Rothblum, and S. Vadhan. Boosting and differential privacy. Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium, pages 51–60, 2010. M. Hardt and K. Talwar. On the geometry of differential privacy. STOC ’10 Proceedings of the 42nd ACM Symposium on Theory of computing, pages 705–714, 2010. M. Hardt, K. Ligett, and F. McSherry. A simple and practical algorithm for differentially private data release. Technical Report, 2010. S. Kasiviswanathan, H. Lee, K. Nissim, S. Raskhodnikova, and A. Smith. What can we learn privately? Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science, pages 531–540, 2008. A. Machanavajjhala, D. Kifer, J. Abowd, J. Gehrke, and L. Vilhuber. Privacy: Theory meets Practice on the Map. Proceedings of the 24th International Conference on Data Engineering, pages 277– 286, 2008. 726 D IFFERENTIAL P RIVACY FOR F UNCTIONS A. McGregor, I. Mironov, T. Pitassi, O. Reingold, K. Talwar, and S. Vadhan. The limits of two-party differential privacy. In FOCS’10, pages 81–90, 2010. F. McSherry and I. Mironov. Differentially private recommender systems: building privacy into the net. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 627–636, New York, NY, USA, 2009. ACM. F. McSherry and K. Talwar. Mechanism design via differential privacy. Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science, pages 94–103, 2007. K. Nissim, S. Raskhodnikova, and A. Smith. Smooth sensitivity and sampling in private data analysis. Proceedings of the 39th Annual ACM Symposium on the Theory of Computing, pages 75–84, 2007. E. Parzen. An approach to time series analysis. The Annals of Mathematical Statistics, 32(4): 951–989, 1961. E. Parzen. Probability density functionals and reproducing kernel hilbert spaces. Proceedings of the Symposium on Time Series Analysis, 196:155–169, 1963. J. Ramsay and B. Silverman. Functional Data Analysis. Springer, 1997. B.I.P. Rubinstein, P.L. Bartlett, L. Huang, and N. Taft. Learning in a large function space: Privacypreserving mechanisms for svm learning. Journal of Privacy and Confidentiality, 2010. D.W. Scott. Multivariate Density Estimation: Theory, Practice, and Visualization (Wiley Series in Probability and Statistics). Wiley, 1992. A. Smith. Privacy-preserving statistical estimation with optimal convergence rates. In Proceedings of the 43rd Annual ACM Symposium on the Theory of Computing, STOC ’11, pages 813–822, New York, NY, USA, 2011. ACM. L. Wasserman. All of Nonparametric Statistics. Springer, 2006. L. Wasserman and S. Zhou. A statistical framework for differential privacy. The Journal of the American Statistical Association, 105:375–389, 2010. 727