
Privacy-preserving logistic regression (NIPS 2008, paper 185)



Authors: Kamalika Chaudhuri, Claire Monteleoni

Abstract: This paper addresses the important tradeoff between privacy and learnability when designing algorithms for learning from private databases. We focus on privacy-preserving logistic regression. First, we apply an idea of Dwork et al. [6] to design a privacy-preserving logistic regression algorithm. This involves bounding the sensitivity of regularized logistic regression, and perturbing the learned classifier with noise proportional to the sensitivity. We then provide a privacy-preserving regularized logistic regression algorithm based on a new privacy-preserving technique: solving a perturbed optimization problem. We prove that our algorithm preserves privacy in the model due to [6]. We provide learning guarantees for both algorithms, which are tighter for our new algorithm in the cases in which one would typically apply logistic regression. Experiments demonstrate improved learning performance of our method versus the sensitivity method. Our privacy-preserving technique does not depend on the sensitivity of the function, and extends easily to a class of convex loss functions. Our work also reveals an interesting connection between regularization and privacy.
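The abstract describes two mechanisms: output perturbation (the sensitivity method of [6], where noise scaled to the sensitivity of regularized logistic regression is added to the learned classifier) and the paper's new objective perturbation (a random linear term is added to the optimization problem before solving). The sketch below is a minimal illustration, not the authors' reference implementation. It assumes L2-regularized logistic regression over n examples with feature vectors of norm at most 1 and labels in {-1, +1}, and uses the sensitivity bound 2/(n*lam) for the regularized logistic loss; all names (`sample_noise`, `output_perturbation`, `objective_perturbation`, `lam`, `eps`) are illustrative.

```python
# Hedged sketch of output perturbation vs. objective perturbation for
# privacy-preserving logistic regression. Assumes ||x_i|| <= 1, y_i in {-1,+1}.
import numpy as np
from scipy.optimize import minimize

def sample_noise(d, alpha, rng):
    """Draw b in R^d with density proportional to exp(-alpha * ||b||):
    a uniformly random direction with a Gamma(d, 1/alpha)-distributed norm."""
    direction = rng.standard_normal(d)
    direction /= np.linalg.norm(direction)
    return rng.gamma(shape=d, scale=1.0 / alpha) * direction

def logistic_objective(w, X, y, lam, b=None):
    """Average logistic loss plus (lam/2)||w||^2; optionally add the
    linear perturbation term (b . w)/n used by objective perturbation."""
    n = X.shape[0]
    margins = y * (X @ w)
    obj = np.mean(np.logaddexp(0.0, -margins)) + 0.5 * lam * (w @ w)
    if b is not None:
        obj += (b @ w) / n
    return obj

def output_perturbation(X, y, lam, eps, rng):
    """Sensitivity method: train as usual, then add noise whose scale
    matches the sensitivity 2/(n*lam), i.e. alpha = n*lam*eps/2."""
    n, d = X.shape
    w = minimize(logistic_objective, np.zeros(d), args=(X, y, lam)).x
    return w + sample_noise(d, alpha=n * lam * eps / 2.0, rng=rng)

def objective_perturbation(X, y, lam, eps, rng):
    """New method: perturb the objective with a random linear term
    (noise scale eps/2, independent of the sensitivity) and solve."""
    n, d = X.shape
    b = sample_noise(d, alpha=eps / 2.0, rng=rng)
    return minimize(logistic_objective, np.zeros(d), args=(X, y, lam, b)).x

if __name__ == "__main__":
    # Toy usage on synthetic data, rescaled so that ||x_i|| <= 1.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 5))
    X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))
    y = np.sign(X @ np.ones(5))
    print(objective_perturbation(X, y, lam=0.1, eps=0.5, rng=rng))
```

Note the structural difference this sketch highlights: in output perturbation the noise magnitude is tied to the sensitivity bound (and hence to lam), whereas in objective perturbation the perturbation scale depends only on eps, which is consistent with the abstract's remark that the new technique does not depend on the sensitivity of the function.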


References

[1] R. Agrawal and R. Srikant. Privacy-preserving data mining. SIGMOD Rec., 29(2):439–450, 2000.

[2] B. Barak, K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, and K. Talwar. Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In PODS, pages 273–282, 2007.

[3] A. Blum, K. Ligett, and A. Roth. A learning theory approach to non-interactive database privacy. In R. E. Ladner and C. Dwork, editors, STOC, pages 609–618. ACM, 2008.

[4] K. Chaudhuri and N. Mishra. When random sampling preserves privacy. In C. Dwork, editor, CRYPTO, volume 4117 of Lecture Notes in Computer Science, pages 198–213. Springer, 2006.

[5] C. Dwork. Differential privacy. In M. Bugliesi, B. Preneel, V. Sassone, and I. Wegener, editors, ICALP (2), volume 4052 of Lecture Notes in Computer Science, pages 1–12. Springer, 2006.

[6] C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pages 265–284, 2006.

[7] A. Evfimievski, J. Gehrke, and R. Srikant. Limiting privacy breaches in privacy preserving data mining. In PODS, pages 211–222, 2003.

[8] S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith. What can we learn privately? In Proc. of Foundations of Computer Science, 2008.

[9] C. T. Kelley. Iterative Methods for Optimization. SIAM, 1999.

[10] A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam. l-diversity: Privacy beyond k-anonymity. In ICDE, page 24, 2006.

[11] F. McSherry and K. Talwar. Mechanism design via differential privacy. In FOCS, pages 94–103, 2007.

[12] A. Narayanan and V. Shmatikov. Robust de-anonymization of large sparse datasets. In IEEE Symposium on Security and Privacy, pages 111–125. IEEE Computer Society, 2008.

[13] K. Nissim, S. Raskhodnikova, and A. Smith. Smooth sensitivity and sampling in private data analysis. In D. S. Johnson and U. Feige, editors, STOC, pages 75–84. ACM, 2007.

[14] P. Samarati and L. Sweeney. Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. In Proc. of the IEEE Symposium on Research in Security and Privacy, 1998.

[15] S. Shalev-Shwartz and N. Srebro. SVM optimization: Inverse dependence on training set size. In International Conference on Machine Learning (ICML), 2008.

[16] K. Sridharan, N. Srebro, and S. Shalev-Shwartz. Fast rates for regularized objectives. In Neural Information Processing Systems, 2008.