nips nips2013 nips2013-102 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ziteng Wang, Kai Fan, Jiaqi Zhang, Liwei Wang
Abstract: We study differentially private mechanisms for answering smooth queries on databases consisting of data points in R^d. A K-smooth query is specified by a function whose partial derivatives up to order K are all bounded. We develop an ε-differentially private mechanism which, for the class of K-smooth queries, has accuracy O(n^{-K/(2d+K)}/ε). The mechanism first outputs a summary of the database. To obtain an answer to a query, the user runs a public evaluation algorithm which contains no information about the database. Outputting the summary runs in time O(n^{1+d/(2d+K)}), and the evaluation algorithm for answering a query runs in time Õ(n^{(d+2+2d/K)/(2d+K)}). Our mechanism is based on L∞-approximation of (transformed) smooth functions by low-degree even trigonometric polynomials with small and efficiently computable coefficients.
Reference: text
sentIndex sentText sentNum sentScore
1 We study differentially private mechanisms for answering smooth queries on databases consisting of data points in R^d. [sent-11, score-0.963]
2 A K-smooth query is specified by a function whose partial derivatives up to order K are all bounded. [sent-12, score-0.433]
3 We develop an ε-differentially private mechanism which for the class of K-smooth queries has accuracy O(n^{-K/(2d+K)}/ε). [sent-13, score-0.798]
4 The mechanism first outputs a summary of the database. [sent-14, score-0.493]
5 To obtain an answer of a query, the user runs a public evaluation algorithm which contains no information of the database. [sent-15, score-0.276]
6 Outputting the summary runs in time O(n^{1+d/(2d+K)}), and the evaluation algorithm for answering a query runs in time Õ(n^{(d+2+2d/K)/(2d+K)}). [sent-16, score-0.772]
7 Our mechanism is based on L∞ -approximation of (transformed) smooth functions by low degree even trigonometric polynomials with small and efficiently computable coefficients. [sent-17, score-0.779]
8 But when releasing statistics of sensitive data, one must trade off accuracy against the amount of privacy loss of the individuals in the database. [sent-20, score-0.313]
9 In this paper we consider differential privacy [9], which has become a standard concept of privacy. [sent-21, score-0.329]
10 Roughly speaking, a mechanism which releases information about the database is said to preserve differential privacy if the change of a single database element does not significantly affect the probability distribution of the output. [sent-22, score-0.853]
11 It ensures that the risk any individual incurs by submitting her information to the database is very small. [sent-24, score-0.216]
12 An adversary can discover almost nothing new from the database that contains the individual’s information compared with that from the database without the individual’s information. [sent-25, score-0.308]
13 Recently there have been extensive studies of machine learning, statistical estimation, and data mining under the differential privacy framework [29, 5, 18, 17, 6, 30, 20, 4]. [sent-26, score-0.349]
14 Accurately answering statistical queries is an important problem in differential privacy. [sent-27, score-0.534]
15 A simple and efficient method is the Laplace mechanism [9], which adds Laplace noise to the true answers. [sent-28, score-0.348]
16 The Laplace mechanism is especially useful for query functions with low sensitivity, where the sensitivity is the maximal difference between the query values on two databases that differ in only one item. [sent-29, score-1.164]
17 A typical class of queries that has low sensitivity is linear queries, whose sensitivity is O(1/n), where n is the size of the database. [sent-30, score-0.355]
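The Laplace mechanism on a linear query can be sketched as follows; this is a minimal illustration, not the paper's exact calibration. The function names, the assumed bound B on |f|, and the resulting 2B/n sensitivity are our own assumptions:

```python
import numpy as np

def linear_query(data, f):
    """True answer q_f(D) = (1/n) * sum over x in D of f(x)."""
    return sum(f(x) for x in data) / len(data)

def laplace_mechanism(data, f, eps, B=1.0, rng=None):
    """Release q_f(D) with Laplace noise calibrated to the query's
    sensitivity: if |f| <= B, replacing one of the n records changes
    the answer by at most 2B/n (an illustrative worst-case bound)."""
    rng = np.random.default_rng() if rng is None else rng
    sensitivity = 2.0 * B / len(data)
    return linear_query(data, f) + rng.laplace(scale=sensitivity / eps)
```

Because the sensitivity shrinks as 1/n, the added noise also shrinks with the database size, which is why this works well for a moderate number of linear queries.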
18 If the number of queries is substantially larger than n^2, the Laplace mechanism is not able to provide differentially private answers with nontrivial accuracy. [sent-33, score-0.911]
19 Considering that there are potentially many users and each user may submit a set of queries, limiting the total number of queries to fewer than n^2 is too restrictive in some situations. [sent-34, score-0.343]
20 A remarkable result due to Blum, Ligett and Roth [2] shows that, information-theoretically, it is possible for a mechanism to answer far more than n^2 linear queries while simultaneously preserving differential privacy and nontrivial accuracy. [sent-35, score-1.141]
21 All these mechanisms are very powerful in the sense that they can answer general and adversarially chosen queries. [sent-37, score-0.26]
22 On the other hand, even the fastest algorithms [16, 14] run in time linear in the size of the data universe to answer a query. [sent-38, score-0.366]
23 Often the size of the data universe is much larger than that of the database, so these mechanisms are inefficient. [sent-39, score-0.332]
24 Recently, [25] showed that there is no polynomial-time algorithm that can answer n^{2+o(1)} general queries while preserving privacy and accuracy (assuming the existence of one-way functions). [sent-40, score-0.747]
25 Given this hardness result, there has recently been growing interest in studying efficient differentially private mechanisms for restricted classes of queries. [sent-41, score-0.44]
26 From a practical point of view, if there exists a class of queries which is rich enough to contain most queries used in applications and allows one to develop fast mechanisms, then the hardness result is not a serious barrier for differential privacy. [sent-42, score-0.687]
27 One class of queries that has attracted a lot of attention is k-way conjunctions. [sent-43, score-0.343]
28 A k-way conjunction query is specified by k features. [sent-46, score-0.368]
29 The query asks what fraction of the individual records in the database have all these k features equal to 1. [sent-47, score-0.546]
30 Another class of queries that yields efficient mechanisms is sparse queries. [sent-50, score-0.396]
31 A query is m-sparse if it takes non-zero values on at most m elements in the data universe. [sent-51, score-0.368]
32 When the data universe is [−1, 1]d , where d is a constant, [2] considers rectangle queries. [sent-53, score-0.306]
33 A rectangle query is specified by an axis-aligned rectangle. [sent-54, score-0.425]
34 The answer to the query is the fraction of the data points that lie in the rectangle. [sent-55, score-0.524]
35 [2] shows that if [−1, 1]d is discretized to poly(n) bits of precision, then there are efficient mechanisms for the class of rectangle queries. [sent-56, score-0.187]
36 There are also works studying related range queries [19]. [sent-57, score-0.287]
37 In this paper we study smooth queries, also defined on the data universe [−1, 1]^d for constant d. [sent-58, score-0.593]
38 A smooth query is specified by a smooth function, which has bounded partial derivatives up to a certain order. [sent-59, score-0.629]
39 The answer to the query is the average of the function values on data points in the database. [sent-60, score-0.505]
40 Our main result is an ε-differentially private mechanism for the class of K-smooth queries, which are specified by functions with bounded partial derivatives up to order K. [sent-63, score-0.6]
41 The mechanism has (α, β)-accuracy, where α = O(n^{−K/(2d+K)}/ε) for β ≥ e^{−O(n^{d/(2d+K)})}. [sent-64, score-0.321]
42 The mechanism first outputs a summary of the database. [sent-65, score-0.493]
43 To obtain an answer to a smooth query, the user runs a public evaluation procedure which contains no information about the database. [sent-66, score-0.374]
44 Outputting the summary has running time O(n^{1+d/(2d+K)}), and the evaluation procedure for answering a query runs in time Õ(n^{(d+2+2d/K)/(2d+K)}). [sent-67, score-0.785]
45 The mechanism has the advantage that both the accuracy and the running time for answering a query improve quickly as K/d increases (see also Table 1 in Section 3). [sent-68, score-0.938]
46 Our algorithm is an L∞-approximation-based mechanism and is motivated by [24], which considers approximation of k-way conjunctions by low-degree polynomials. [sent-69, score-0.392]
47 The basic idea is to approximate the whole query class by linear combinations of a small set of basis functions. [sent-70, score-0.455]
48 The technical difficulty is that, in order for the approximation to induce an efficient and differentially private mechanism, all the linear coefficients of the basis functions must be small and efficiently computable. [sent-71, score-0.357]
49 To guarantee these properties, we first transform the query function. [sent-72, score-0.368]
50 The smoothness of the functions also allows us to use an efficient numerical method to compute the coefficients to a precision at which the accuracy of the mechanism is not significantly affected. [sent-74, score-0.46]
51 2 Background Let D be a database containing n data points in the data universe X . [sent-75, score-0.383]
52 Typically, we assume that the data universe X = [−1, 1]d . [sent-77, score-0.229]
53 A sanitizer S, an algorithm that maps an input database into some range R, is said to preserve (ε, δ)-differential privacy if for all pairs of neighboring databases D, D′ and for any subset A ⊆ R it holds that P(S(D) ∈ A) ≤ P(S(D′) ∈ A) · e^ε + δ. [sent-82, score-0.448]
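The (ε, δ)-condition above can be checked concretely for Laplace noise; a minimal sketch with δ = 0 and purely illustrative numbers (the answers q_D, q_D2 and sensitivity delta_f are assumptions for the demonstration):

```python
import math

def laplace_pdf(z, mean, scale):
    """Density of the Laplace distribution with the given mean and scale."""
    return math.exp(-abs(z - mean) / scale) / (2 * scale)

eps, delta_f = 0.5, 0.01           # privacy parameter and query sensitivity
scale = delta_f / eps              # noise scale used by the Laplace mechanism
q_D, q_D2 = 0.30, 0.30 + delta_f   # answers on two neighboring databases

# At every output z the density ratio is bounded by e^eps, which is
# exactly the (eps, 0)-differential-privacy condition for this mechanism.
for z in (x / 100 for x in range(-100, 200)):
    ratio = laplace_pdf(z, q_D, scale) / laplace_pdf(z, q_D2, scale)
    assert ratio <= math.exp(eps) + 1e-12
```

The bound follows from the triangle inequality: the exponent of the ratio is (|z − q_D2| − |z − q_D|)/scale ≤ delta_f/scale = ε.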
54 Each linear query qf is specified by a function f which maps the data universe [−1, 1]^d to R, and qf is defined by qf(D) := (1/|D|) Σ_{x∈D} f(x). [sent-85, score-1.444]
55 The accuracy of a mechanism with respect to Q is defined as follows. [sent-87, score-0.358]
56 A sanitizer S is said to have (α, β)-accuracy for size-n databases with respect to Q if, for every database D with |D| = n, the following holds: P(∃q ∈ Q, |S(D, q) − q(D)| ≥ α) ≤ β, where S(D, q) is the answer to q given by S. [sent-91, score-0.537]
57 We will make use of the Laplace mechanism [9] in our algorithm. [sent-92, score-0.321]
58 We will design a differentially private mechanism which is accurate with respect to a query set Q possibly consisting of infinitely many queries. [sent-95, score-0.949]
59 Given a database D, the sanitizer outputs a summary which preserves differential privacy. [sent-96, score-0.608]
60 For any qf ∈ Q, the user makes use of an evaluation procedure to measure f on the summary and obtain an approximate answer to qf(D). [sent-97, score-0.899]
61 Although we may think of the evaluation procedure as part of the mechanism, it does not contain any information about the database and is therefore public. [sent-98, score-0.188]
62 We will study the running time for the sanitizer to output the summary. [sent-99, score-0.383]
63 For the evaluation procedure, the running time per query is the focus. [sent-101, score-0.445]
64 In this work we will frequently use trigonometric polynomials. [sent-104, score-0.216]
65 For the univariate case, a function p(θ) is called a trigonometric polynomial of degree m if p(θ) = a0 + Σ_{l=1}^{m} (al cos lθ + bl sin lθ), where al, bl are constants. [sent-105, score-0.654]
66 If p(θ) is an even function, we say that it is an even trigonometric polynomial, and p(θ) = a0 + Σ_{l=1}^{m} al cos lθ. [sent-106, score-0.27]
If p(θ1, . . . , θd) is a linear combination of terms of the form cos(l1 θ1) · · · cos(ld θd), then p is said to be an even trigonometric polynomial (with respect to each variable), and the degree of θi is the upper limit of li. [sent-116, score-0.358]
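In code, a univariate even trigonometric polynomial and its defining symmetry can be sketched as follows (the coefficient list is purely illustrative):

```python
import math

def even_trig_poly(a, theta):
    """Evaluate p(theta) = a[0] + sum_{l=1}^{m} a[l] * cos(l * theta),
    an even trigonometric polynomial of degree m = len(a) - 1."""
    return sum(a_l * math.cos(l * theta) for l, a_l in enumerate(a))

# Even symmetry: p(theta) == p(-theta) for every theta,
# since cosine itself is an even function.
coeffs = [0.5, -0.25, 0.125]
for theta in (0.0, 0.7, 1.3, 3.0):
    assert abs(even_trig_poly(coeffs, theta)
               - even_trig_poly(coeffs, -theta)) < 1e-12
```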
68 3 Efficient differentially private mechanism Let us first describe the set of queries considered in this work. [sent-117, score-0.847]
69 Since each query qf is specified by a function f , a set of queries QF can be specified by a set of functions F . [sent-118, score-0.948]
70 , kd ) is a d-tuple with nonnegative integers, then we define k k Dk := D1 1 · · · Dd d := 3 ∂ kd ∂ k1 · · · kd . [sent-126, score-0.234]
For each m = (m1, . . . , md) with ||m||∞ ≤ t − 1, compute: Su_m(D) = (1/n) Σ_{x∈D} cos(m1 θ1(x)) · · · cos(md θd(x)); [sent-139, score-0.263]
Su_m(D) ← Su_m(D) + Lap(t^d/(nε)); let Su(D) = (Su_m(D))_{||m||∞ ≤ t−1} be a t^d-dimensional vector; Return: Su(D). [sent-142, score-0.307]
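A minimal sketch of the noisy-summary step in the spirit of Algorithm 1, under the assumption θi(x) = arccos(xi); the function name and the dictionary representation of the summary vector are our own:

```python
import itertools
import math
import numpy as np

def summary(data, t, eps, rng=None):
    """Noisy sums Su_m(D) = (1/n) * sum over x of prod_i cos(m_i * theta_i(x))
    for every multi-index m with max_i m_i <= t - 1, where
    theta_i(x) = arccos(x_i).  Each sum gets Lap(t^d / (n * eps)) noise."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = len(data), len(data[0])
    su = {}
    for m in itertools.product(range(t), repeat=d):
        s = sum(math.prod(math.cos(m[i] * math.acos(x[i])) for i in range(d))
                for x in data) / n
        su[m] = s + rng.laplace(scale=t ** d / (n * eps))
    return su
```

Note that the noise scale t^d/(nε) splits the privacy budget ε across all t^d noisy sums, which is what makes releasing the whole vector ε-differentially private.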
Input: A query qf, where f : [−1, 1]^d → R and f ∈ C_B^K; summary Su(D) (a t^d-dimensional vector). [sent-144, score-0.642]
Let θ = (θ1, . . . , θd) ∈ [−π, π]^d; compute a trigonometric polynomial approximation pt(θ) of gf(θ), where the degree of each θi is t; // see Section 4 for details of the computation. [sent-152, score-0.474]
Algorithm 2: Answering a query. Let |k| := k1 + · · · + kd. [sent-157, score-0.368]
We will study the set C_B^K which contains all smooth functions whose derivatives up to order K have ∞-norm sup_{|k|≤K} sup_{x∈[−1,1]^d} |Dk f(x)| upper bounded by a constant B > 0. [sent-162, score-0.183]
The set of queries specified by C_B^K, denoted Q_{C_B^K}, is our focus. [sent-164, score-0.266]
It says that if the query class is specified by smooth functions, then there is a very efficient mechanism which preserves ε-differential privacy and good accuracy. [sent-167, score-1.086]
79 The mechanism consists of two parts: One for outputting a summary of the database, the other for answering a query. [sent-168, score-0.83]
80 The second part of the mechanism contains no private information of the database. [sent-170, score-0.468]
Let the query set be Q_{C_B^K} = {qf = (1/n) Σ_{x∈D} f(x) : f ∈ C_B^K}, where K ∈ N and B > 0 are constants. [sent-173, score-0.368]
Let the data universe be [−1, 1]^d, where d ∈ N is a constant. [sent-174, score-0.229]
Then the mechanism S given in Algorithm 1 and Algorithm 2 satisfies, for any ε > 0, the following: 1) The mechanism is ε-differentially private. [sent-175, score-0.642]
2) For any β ≥ 10 · e^{−(1/5) n^{d/(2d+K)}}, the mechanism is (α, β)-accurate, where α = O(n^{−K/(2d+K)}/ε) and the hidden constant depends only on d, K and B. [sent-176, score-0.321]
4) The running time for S to answer a query is O(n^{(d+2+2d/K)/(2d+K)} · polylog(n)). [sent-179, score-0.548]
, i.e., the query functions have only first-order derivatives. [sent-185, score-0.408]
87 2) The running time for outputting the summary does not change too much, because reading through the database requires Ω(n) time. [sent-191, score-0.556]
3) The running time for answering a query reduces significantly, from roughly O(n^{3/2}) to nearly O(n^0) (i.e., almost constant), as K gets large. [sent-192, score-0.603]
In practice, the speed of answering a query may be more important than that of outputting the summary, since the sanitizer outputs the summary only once. [sent-194, score-1.179]
Thus having an n^c-time (c ≪ 1) algorithm for query answering is appealing. [sent-195, score-0.565]
It also transforms the data universe from [−1, 1]^d to [−π, π]^d. [sent-204, score-0.229]
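The change of variables can be illustrated as follows; the transformed function gf(θ) = f(cos θ1, . . . , cos θd) is even in each variable, which is what makes approximation by even trigonometric polynomials possible. The helper `g_f` and the sample query `f` are illustrative:

```python
import math

def g_f(f, theta):
    """Transformed query g_f(theta) = f(cos(theta_1), ..., cos(theta_d)).
    Since cosine is even, g_f is even in each variable."""
    return f([math.cos(t) for t in theta])

f = lambda x: x[0] ** 2 + 0.5 * x[1]   # an illustrative smooth query on [-1, 1]^2

# Evenness in every variable: flipping the sign of theta leaves g_f unchanged.
for theta in ((0.3, 1.1), (2.0, -0.4)):
    flipped = tuple(-t for t in theta)
    assert abs(g_f(f, theta) - g_f(f, flipped)) < 1e-12
```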
To compute the summary, the mechanism just gives noisy answers to queries specified by even trigonometric monomials cos(m1 θ1) · · · cos(md θd). [sent-206, score-0.865]
For each trigonometric monomial, the highest degree of any variable is t := max_d m_d = O(n^{1/(2d+K)}). [sent-210, score-0.354]
94 To answer a query specified by a smooth function f , the mechanism computes a trigonometric polynomial approximation of gf . [sent-212, score-1.322]
The answer to the query qf is a linear combination of the summary entries, with coefficients taken from the approximating trigonometric polynomial. [sent-213, score-1.158]
If these conditions hold, then the mechanism just outputs, as the summary, noisy answers to the set of queries specified by the basis functions. [sent-216, score-0.733]
When answering a query, the mechanism computes the coefficients with which the linear combination of the basis functions approximates the query function. [sent-217, score-0.958]
The answer to the query is simply the inner product of the coefficient vector and the summary vector. [sent-218, score-0.646]
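The inner-product step can be sketched in one line; the coefficient and summary values below are purely illustrative placeholders, not output of the actual mechanism:

```python
import numpy as np

# Final step of answering a query: pair the coefficient vector c of the
# approximating trigonometric polynomial with the private summary vector su.
c = np.array([0.5, -0.2, 0.1])    # illustrative coefficients of p_t
su = np.array([1.0, -0.5, 0.25])  # illustrative noisy monomial sums Su_m(D)
answer = float(np.dot(c, su))     # approximate answer to q_f(D)
```

Since the summary is released once and the evaluation is a single dot product, answering each additional query costs no further privacy budget.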
The following theorem guarantees that, after the change of variables and with even trigonometric polynomials as basis functions, the class of smooth functions has all three properties described above. [sent-219, score-0.491]
Then, there is an even trigonometric polynomial p, whose degree in each variable is t(γ), such that p(θ1, …
wordName wordTfidf (topN-words)
[('query', 0.368), ('mechanism', 0.321), ('qf', 0.274), ('queries', 0.266), ('privacy', 0.23), ('universe', 0.229), ('trigonometric', 0.216), ('cos', 0.201), ('outputting', 0.199), ('answering', 0.169), ('database', 0.154), ('private', 0.147), ('laplace', 0.143), ('summary', 0.141), ('sanitizer', 0.141), ('answer', 0.137), ('gf', 0.129), ('differentially', 0.113), ('peking', 0.113), ('mechanisms', 0.103), ('cb', 0.1), ('differential', 0.099), ('smooth', 0.098), ('moe', 0.092), ('lap', 0.075), ('kd', 0.071), ('databases', 0.067), ('eecs', 0.065), ('smoothness', 0.062), ('md', 0.062), ('su', 0.061), ('coef', 0.059), ('rectangle', 0.057), ('qcb', 0.056), ('cients', 0.056), ('polynomial', 0.053), ('poly', 0.053), ('polynomials', 0.053), ('td', 0.053), ('perception', 0.052), ('degree', 0.051), ('laboratory', 0.048), ('al', 0.047), ('releasing', 0.046), ('derivatives', 0.045), ('school', 0.044), ('bl', 0.043), ('running', 0.043), ('xd', 0.042), ('preserves', 0.042), ('functions', 0.04), ('user', 0.039), ('said', 0.038), ('submit', 0.038), ('ld', 0.038), ('basis', 0.038), ('answers', 0.037), ('accuracy', 0.037), ('public', 0.036), ('evaluation', 0.034), ('outputs', 0.031), ('sensitivity', 0.031), ('integers', 0.031), ('runs', 0.03), ('hardness', 0.029), ('nc', 0.028), ('adds', 0.027), ('class', 0.027), ('nontrivial', 0.027), ('dk', 0.026), ('pt', 0.025), ('releases', 0.025), ('attracts', 0.025), ('monomials', 0.025), ('attentions', 0.025), ('maxd', 0.025), ('monomial', 0.025), ('maps', 0.025), ('performances', 0.024), ('ideally', 0.024), ('preserving', 0.024), ('individual', 0.024), ('preserve', 0.023), ('arccos', 0.023), ('roughly', 0.023), ('combination', 0.022), ('privately', 0.022), ('ric', 0.022), ('studying', 0.021), ('nonnegative', 0.021), ('adversely', 0.02), ('polylog', 0.02), ('partial', 0.02), ('output', 0.02), ('extensive', 0.02), ('considers', 0.02), ('sup', 0.019), ('change', 0.019), ('lie', 0.019), ('attack', 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 102 nips-2013-Efficient Algorithm for Privately Releasing Smooth Queries
Author: Ziteng Wang, Kai Fan, Jiaqi Zhang, Liwei Wang
Abstract: We study differentially private mechanisms for answering smooth queries on databases consisting of data points in R^d. A K-smooth query is specified by a function whose partial derivatives up to order K are all bounded. We develop an ε-differentially private mechanism which, for the class of K-smooth queries, has accuracy O(n^{-K/(2d+K)}/ε). The mechanism first outputs a summary of the database. To obtain an answer to a query, the user runs a public evaluation algorithm which contains no information about the database. Outputting the summary runs in time O(n^{1+d/(2d+K)}), and the evaluation algorithm for answering a query runs in time Õ(n^{(d+2+2d/K)/(2d+K)}). Our mechanism is based on L∞-approximation of (transformed) smooth functions by low-degree even trigonometric polynomials with small and efficiently computable coefficients.
2 0.29226398 177 nips-2013-Local Privacy and Minimax Bounds: Sharp Rates for Probability Estimation
Author: John Duchi, Martin J. Wainwright, Michael Jordan
Abstract: We provide a detailed study of the estimation of probability distributions— discrete and continuous—in a stringent setting in which data is kept private even from the statistician. We give sharp minimax rates of convergence for estimation in these locally private settings, exhibiting fundamental trade-offs between privacy and convergence rate, as well as providing tools to allow movement along the privacy-statistical efficiency continuum. One of the consequences of our results is that Warner’s classical work on randomized response is an optimal way to perform survey sampling while maintaining privacy of the respondents. 1
3 0.27167112 14 nips-2013-A Stability-based Validation Procedure for Differentially Private Machine Learning
Author: Kamalika Chaudhuri, Staal A. Vinterbo
Abstract: Differential privacy is a cryptographically motivated definition of privacy which has gained considerable attention in the algorithms, machine-learning and datamining communities. While there has been an explosion of work on differentially private machine learning algorithms, a major barrier to achieving end-to-end differential privacy in practical machine learning applications is the lack of an effective procedure for differentially private parameter tuning, or, determining the parameter value, such as a bin size in a histogram, or a regularization parameter, that is suitable for a particular application. In this paper, we introduce a generic validation procedure for differentially private machine learning algorithms that apply when a certain stability condition holds on the training algorithm and the validation performance metric. The training data size and the privacy budget used for training in our procedure is independent of the number of parameter values searched over. We apply our generic procedure to two fundamental tasks in statistics and machine-learning – training a regularized linear classifier and building a histogram density estimator that result in end-toend differentially private solutions for these problems. 1
4 0.18189195 309 nips-2013-Statistical Active Learning Algorithms
Author: Maria-Florina Balcan, Vitaly Feldman
Abstract: We describe a framework for designing efficient active learning algorithms that are tolerant to random classification noise and differentially-private. The framework is based on active learning algorithms that are statistical in the sense that they rely on estimates of expectations of functions of filtered random examples. It builds on the powerful statistical query framework of Kearns [30]. We show that any efficient active statistical learning algorithm can be automatically converted to an efficient active learning algorithm which is tolerant to random classification noise as well as other forms of “uncorrelated” noise. We show that commonly studied concept classes including thresholds, rectangles, and linear separators can be efficiently actively learned in our framework. These results combined with our generic conversion lead to the first computationally-efficient algorithms for actively learning some of these concept classes in the presence of random classification noise that provide exponential improvement in the dependence on the error over their passive counterparts. In addition, we show that our algorithms can be automatically converted to efficient active differentially-private algorithms. This leads to the first differentially-private active learning algorithms with exponential label savings over the passive case. 1
5 0.17887317 2 nips-2013-(Nearly) Optimal Algorithms for Private Online Learning in Full-information and Bandit Settings
Author: Abhradeep Guha Thakurta, Adam Smith
Abstract: We give differentially private algorithms for a large class of online learning algorithms, in both the full information and bandit settings. Our algorithms aim to minimize a convex loss function which is a sum of smaller convex loss terms, one for each data point. To design our algorithms, we modify the popular mirror descent approach, or rather a variant called follow the approximate leader. The technique leads to the first nonprivate algorithms for private online learning in the bandit setting. In the full information setting, our algorithms improve over the regret bounds of previous work (due to Dwork, Naor, Pitassi and Rothblum (2010) and Jain, Kothari and Thakurta (2012)). In many cases, our algorithms (in both settings) match the dependence on the input length, T , of the optimal nonprivate regret bounds up to logarithmic factors in T . Our algorithms require logarithmic space and update time. 1
6 0.13863096 57 nips-2013-Beyond Pairwise: Provably Fast Algorithms for Approximate $k$-Way Similarity Search
7 0.091186091 25 nips-2013-Adaptive Anonymity via $b$-Matching
8 0.082166761 192 nips-2013-Minimax Theory for High-dimensional Gaussian Mixtures with Sparse Mean Separation
9 0.075679407 169 nips-2013-Learning to Prune in Metric and Non-Metric Spaces
10 0.073655762 349 nips-2013-Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies
11 0.068626344 355 nips-2013-Which Space Partitioning Tree to Use for Search?
12 0.064507127 107 nips-2013-Embed and Project: Discrete Sampling with Universal Hashing
13 0.062676832 149 nips-2013-Latent Structured Active Learning
14 0.060969759 293 nips-2013-Sign Cauchy Projections and Chi-Square Kernel
15 0.054017548 326 nips-2013-The Power of Asymmetry in Binary Hashing
16 0.049002841 42 nips-2013-Auditing: Active Learning with Outcome-Dependent Query Costs
17 0.047638267 359 nips-2013-Σ-Optimality for Active Learning on Gaussian Random Fields
18 0.047472805 137 nips-2013-High-Dimensional Gaussian Process Bandits
19 0.042841487 201 nips-2013-Multi-Task Bayesian Optimization
20 0.038771663 60 nips-2013-Buy-in-Bulk Active Learning
topicId topicWeight
[(0, 0.112), (1, 0.023), (2, 0.052), (3, -0.076), (4, 0.074), (5, 0.046), (6, -0.023), (7, -0.033), (8, 0.027), (9, 0.36), (10, 0.106), (11, -0.063), (12, -0.155), (13, 0.268), (14, -0.039), (15, 0.036), (16, -0.018), (17, 0.019), (18, 0.034), (19, -0.017), (20, 0.088), (21, -0.069), (22, 0.026), (23, 0.026), (24, 0.057), (25, -0.09), (26, -0.004), (27, 0.028), (28, 0.025), (29, 0.031), (30, 0.005), (31, -0.011), (32, -0.009), (33, 0.075), (34, -0.065), (35, 0.006), (36, -0.008), (37, 0.083), (38, -0.026), (39, -0.079), (40, 0.015), (41, 0.037), (42, -0.0), (43, 0.007), (44, 0.031), (45, 0.033), (46, -0.036), (47, 0.012), (48, -0.024), (49, -0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.96214122 102 nips-2013-Efficient Algorithm for Privately Releasing Smooth Queries
Author: Ziteng Wang, Kai Fan, Jiaqi Zhang, Liwei Wang
Abstract: We study differentially private mechanisms for answering smooth queries on databases consisting of data points in R^d. A K-smooth query is specified by a function whose partial derivatives up to order K are all bounded. We develop an ε-differentially private mechanism which, for the class of K-smooth queries, has accuracy O(n^{-K/(2d+K)}/ε). The mechanism first outputs a summary of the database. To obtain an answer to a query, the user runs a public evaluation algorithm which contains no information about the database. Outputting the summary runs in time O(n^{1+d/(2d+K)}), and the evaluation algorithm for answering a query runs in time Õ(n^{(d+2+2d/K)/(2d+K)}). Our mechanism is based on L∞-approximation of (transformed) smooth functions by low-degree even trigonometric polynomials with small and efficiently computable coefficients.
2 0.75723249 14 nips-2013-A Stability-based Validation Procedure for Differentially Private Machine Learning
Author: Kamalika Chaudhuri, Staal A. Vinterbo
Abstract: Differential privacy is a cryptographically motivated definition of privacy which has gained considerable attention in the algorithms, machine-learning and datamining communities. While there has been an explosion of work on differentially private machine learning algorithms, a major barrier to achieving end-to-end differential privacy in practical machine learning applications is the lack of an effective procedure for differentially private parameter tuning, or, determining the parameter value, such as a bin size in a histogram, or a regularization parameter, that is suitable for a particular application. In this paper, we introduce a generic validation procedure for differentially private machine learning algorithms that apply when a certain stability condition holds on the training algorithm and the validation performance metric. The training data size and the privacy budget used for training in our procedure is independent of the number of parameter values searched over. We apply our generic procedure to two fundamental tasks in statistics and machine-learning – training a regularized linear classifier and building a histogram density estimator that result in end-toend differentially private solutions for these problems. 1
3 0.73353326 177 nips-2013-Local Privacy and Minimax Bounds: Sharp Rates for Probability Estimation
Author: John Duchi, Martin J. Wainwright, Michael Jordan
Abstract: We provide a detailed study of the estimation of probability distributions— discrete and continuous—in a stringent setting in which data is kept private even from the statistician. We give sharp minimax rates of convergence for estimation in these locally private settings, exhibiting fundamental trade-offs between privacy and convergence rate, as well as providing tools to allow movement along the privacy-statistical efficiency continuum. One of the consequences of our results is that Warner’s classical work on randomized response is an optimal way to perform survey sampling while maintaining privacy of the respondents. 1
4 0.51192933 25 nips-2013-Adaptive Anonymity via $b$-Matching
Author: Krzysztof M. Choromanski, Tony Jebara, Kui Tang
Abstract: The adaptive anonymity problem is formalized where each individual shares their data along with an integer value to indicate their personal level of desired privacy. This problem leads to a generalization of k-anonymity to the b-matching setting. Novel algorithms and theory are provided to implement this type of anonymity. The relaxation achieves better utility, admits theoretical privacy guarantees that are as strong, and, most importantly, accommodates a variable level of anonymity for each individual. Empirical results confirm improved utility on benchmark and social data-sets.
5 0.49984139 2 nips-2013-(Nearly) Optimal Algorithms for Private Online Learning in Full-information and Bandit Settings
Author: Abhradeep Guha Thakurta, Adam Smith
Abstract: We give differentially private algorithms for a large class of online learning algorithms, in both the full information and bandit settings. Our algorithms aim to minimize a convex loss function which is a sum of smaller convex loss terms, one for each data point. To design our algorithms, we modify the popular mirror descent approach, or rather a variant called follow the approximate leader. The technique leads to the first nonprivate algorithms for private online learning in the bandit setting. In the full information setting, our algorithms improve over the regret bounds of previous work (due to Dwork, Naor, Pitassi and Rothblum (2010) and Jain, Kothari and Thakurta (2012)). In many cases, our algorithms (in both settings) match the dependence on the input length, T , of the optimal nonprivate regret bounds up to logarithmic factors in T . Our algorithms require logarithmic space and update time. 1
6 0.48536259 309 nips-2013-Statistical Active Learning Algorithms
7 0.4537372 57 nips-2013-Beyond Pairwise: Provably Fast Algorithms for Approximate $k$-Way Similarity Search
8 0.36574858 42 nips-2013-Auditing: Active Learning with Outcome-Dependent Query Costs
9 0.35384873 293 nips-2013-Sign Cauchy Projections and Chi-Square Kernel
10 0.33666807 60 nips-2013-Buy-in-Bulk Active Learning
11 0.31758624 326 nips-2013-The Power of Asymmetry in Binary Hashing
12 0.31510553 199 nips-2013-More data speeds up training time in learning halfspaces over sparse vectors
13 0.29739138 107 nips-2013-Embed and Project: Discrete Sampling with Universal Hashing
14 0.28549665 169 nips-2013-Learning to Prune in Metric and Non-Metric Spaces
15 0.28212613 149 nips-2013-Latent Structured Active Learning
16 0.25849569 142 nips-2013-Information-theoretic lower bounds for distributed statistical estimation with communication constraints
17 0.25817308 355 nips-2013-Which Space Partitioning Tree to Use for Search?
18 0.2366606 359 nips-2013-Σ-Optimality for Active Learning on Gaussian Random Fields
19 0.22681102 192 nips-2013-Minimax Theory for High-dimensional Gaussian Mixtures with Sparse Mean Separation
20 0.21844883 80 nips-2013-Data-driven Distributionally Robust Polynomial Optimization
topicId topicWeight
[(2, 0.012), (16, 0.022), (33, 0.11), (34, 0.116), (41, 0.038), (49, 0.047), (54, 0.215), (56, 0.104), (70, 0.03), (76, 0.031), (85, 0.039), (89, 0.022), (93, 0.03), (95, 0.096)]
simIndex simValue paperId paperTitle
1 0.85385269 139 nips-2013-How to Hedge an Option Against an Adversary: Black-Scholes Pricing is Minimax Optimal
Author: Jacob Abernethy, Peter Bartlett, Rafael Frongillo, Andre Wibisono
Abstract: We consider a popular problem in finance, option pricing, through the lens of an online learning game between Nature and an Investor. In the Black-Scholes option pricing model from 1973, the Investor can continuously hedge the risk of an option by trading the underlying asset, assuming that the asset’s price fluctuates according to Geometric Brownian Motion (GBM). We consider a worst-case model, in which Nature chooses a sequence of price fluctuations under a cumulative quadratic volatility constraint, and the Investor can make a sequence of hedging decisions. Our main result is to show that the value of our proposed game, which is the “regret” of the hedging strategy, converges to the Black-Scholes option price. We use significantly weaker assumptions than previous work—for instance, we allow large jumps in the asset price—and show that the Black-Scholes hedging strategy is near-optimal for the Investor even in this non-stochastic framework.
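For reference, the Black-Scholes price that the game value converges to is the standard closed form; a minimal implementation (using only the standard library) is:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(S, K, sigma, T, r=0.0):
    """European call price under geometric Brownian motion.

    The paper shows this same value arises as the minimax regret of a
    worst-case hedging game under a quadratic volatility budget
    sigma**2 * T, with no stochastic assumption on the price path.
    """
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)
```

For example, an at-the-money call (S = K = 100) with sigma = 0.2, T = 1, r = 0 prices at about 7.97.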
same-paper 2 0.80369651 102 nips-2013-Efficient Algorithm for Privately Releasing Smooth Queries
Author: Ziteng Wang, Kai Fan, Jiaqi Zhang, Liwei Wang
Abstract: We study differentially private mechanisms for answering smooth queries on databases consisting of data points in R^d. A K-smooth query is specified by a function whose partial derivatives up to order K are all bounded. We develop an ε-differentially private mechanism which for the class of K-smooth queries has accuracy O(n^{-K/(2d+K)}/ε). The mechanism first outputs a summary of the database. To obtain an answer to a query, the user runs a public evaluation algorithm which contains no information about the database. Outputting the summary runs in time O(n^{1+d/(2d+K)}), and the evaluation algorithm for answering a query runs in time Õ(n^{(d+2+2d/K)/(2d+K)}). Our mechanism is based on L∞-approximation of (transformed) smooth functions by low-degree even trigonometric polynomials with small and efficiently computable coefficients.
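The summary-then-evaluate structure can be illustrated in one dimension: the private summary is a vector of noisy empirical cosine moments, and a smooth query is answered publicly by pairing its cosine-expansion coefficients with those moments. This is a simplified sketch under assumed parameters, not the paper's exact mechanism (which handles R^d and tunes the polynomial degree to K); all function names are hypothetical.

```python
import numpy as np

def private_summary(data, degree, eps, rng=None):
    """Release noisy empirical cosine moments m_j = mean(cos(j*pi*x_i)).

    Changing one data point moves each moment by at most 2/n, so adding
    Laplace noise of scale 2*(degree+1)/(n*eps) to the whole vector
    gives eps-differential privacy (sketch; scale is illustrative).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(data)
    moments = np.array([np.mean(np.cos(j * np.pi * data))
                        for j in range(degree + 1)])
    return moments + rng.laplace(scale=2.0 * (degree + 1) / (n * eps),
                                 size=degree + 1)

def answer_query(summary, f, grid=2048):
    """Public evaluation: approximate mean_i f(x_i) from the summary.

    Expand f in even trigonometric (cosine) polynomials on a grid and
    pair the coefficients with the released moments; no database access.
    """
    x = (np.arange(grid) + 0.5) / grid
    fx = f(x)
    coeffs = [np.mean(fx)]  # c_0 of the cosine expansion
    for j in range(1, len(summary)):
        coeffs.append(2.0 * np.mean(fx * np.cos(j * np.pi * x)))
    return float(np.dot(coeffs, summary))
```

The answer error has two parts: the truncation error of the trigonometric approximation (which shrinks as the degree grows when f is smooth) and the Laplace noise on the moments (which grows with the degree), mirroring the accuracy trade-off in the stated bound.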
3 0.71166867 268 nips-2013-Reflection methods for user-friendly submodular optimization
Author: Stefanie Jegelka, Francis Bach, Suvrit Sra
Abstract: Recently, it has become evident that submodularity naturally captures widely occurring concepts in machine learning, signal processing and computer vision. Consequently, there is need for efficient optimization procedures for submodular functions, especially for minimization problems. While general submodular minimization is challenging, we propose a new method that exploits existing decomposability of submodular functions. In contrast to previous approaches, our method is neither approximate, nor impractical, nor does it need any cumbersome parameter tuning. Moreover, it is easy to implement and parallelize. A key component of our method is a formulation of the discrete submodular minimization problem as a continuous best approximation problem that is solved through a sequence of reflections, and its solution can be easily thresholded to obtain an optimal discrete solution. This method solves both the continuous and discrete formulations of the problem, and therefore has applications in learning, inference, and reconstruction. In our experiments, we illustrate the benefits of our method on two image segmentation tasks.
4 0.69660217 220 nips-2013-On the Complexity and Approximation of Binary Evidence in Lifted Inference
Author: Guy van den Broeck, Adnan Darwiche
Abstract: Lifted inference algorithms exploit symmetries in probabilistic models to speed up inference. They show impressive performance when calculating unconditional probabilities in relational models, but often resort to non-lifted inference when computing conditional probabilities. The reason is that conditioning on evidence breaks many of the model’s symmetries, which can preempt standard lifting techniques. Recent theoretical results show, for example, that conditioning on evidence which corresponds to binary relations is #P-hard, suggesting that no lifting is to be expected in the worst case. In this paper, we balance this negative result by identifying the Boolean rank of the evidence as a key parameter for characterizing the complexity of conditioning in lifted inference. In particular, we show that conditioning on binary evidence with bounded Boolean rank is efficient. This opens up the possibility of approximating evidence by a low-rank Boolean matrix factorization, which we investigate both theoretically and empirically.
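The Boolean rank mentioned above is the smallest r such that the evidence matrix E factors as a Boolean product of an m×r and an r×n binary matrix. A tiny sketch of the certificate (illustrative only; the function names are hypothetical):

```python
import numpy as np

def boolean_product(B, C):
    """Boolean matrix product: entry (i, j) is OR_k (B[i, k] AND C[k, j])."""
    return ((B.astype(int) @ C.astype(int)) > 0).astype(int)

def boolean_rank_at_most(E, B, C):
    """Check that the factor pair (B, C) reproduces the evidence matrix E,
    certifying that E's Boolean rank is at most B.shape[1]."""
    return bool(np.array_equal(boolean_product(E * 0 + B if False else B, C), E))
```

For instance, evidence of the form E[i, j] = u[i] AND v[j] (one "block") has Boolean rank 1, the regime where the paper's conditioning result is cheapest.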
Author: Po-Ling Loh, Martin J. Wainwright
Abstract: We establish theoretical results concerning local optima of regularized M-estimators, where both loss and penalty functions are allowed to be nonconvex. Our results show that as long as the loss satisfies restricted strong convexity and the penalty satisfies suitable regularity conditions, any local optimum of the composite objective lies within statistical precision of the true parameter vector. Our theory covers a broad class of nonconvex objective functions, including corrected versions of the Lasso for errors-in-variables linear models and regression in generalized linear models using nonconvex regularizers such as SCAD and MCP. On the optimization side, we show that a simple adaptation of composite gradient descent may be used to compute a global optimum up to the statistical precision ε_stat in O(log(1/ε_stat)) iterations, the fastest possible rate for any first-order method. We provide simulations to illustrate the sharpness of our theoretical predictions.
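Composite gradient descent alternates a gradient step on the smooth loss with a proximal step on the penalty. A minimal sketch for the convex l1 case is below (the paper's nonconvex SCAD/MCP cases replace the soft-threshold prox with the corresponding nonconvex prox); all names are hypothetical:

```python
import numpy as np

def composite_gradient_descent(X, y, lam, step, iters=500):
    """Proximal (composite) gradient descent for
        (1/2n) * ||y - X b||^2 + lam * ||b||_1.

    Each iteration: gradient step on the quadratic loss, then the prox
    of lam*||.||_1, i.e. coordinatewise soft-thresholding. Linear
    convergence holds when the loss is (restricted) strongly convex,
    matching the log(1/eps) iteration count in the abstract.
    """
    n, d = X.shape
    b = np.zeros(d)
    for _ in range(iters):
        grad = X.T @ (X @ b - y) / n          # gradient of the smooth part
        z = b - step * grad                   # forward (gradient) step
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # prox step
    return b
```

With a step size below the inverse smoothness constant of the loss, the iterates contract geometrically toward the regularized optimum, which is the mechanism behind the stated O(log(1/ε_stat)) rate.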
6 0.67886609 154 nips-2013-Learning Gaussian Graphical Models with Observed or Latent FVSs
7 0.67109001 309 nips-2013-Statistical Active Learning Algorithms
8 0.66866249 184 nips-2013-Marginals-to-Models Reducibility
9 0.6646713 359 nips-2013-Σ-Optimality for Active Learning on Gaussian Random Fields
10 0.65965974 78 nips-2013-Curvature and Optimal Algorithms for Learning and Minimizing Submodular Functions
11 0.65932941 149 nips-2013-Latent Structured Active Learning
12 0.6571511 97 nips-2013-Distributed Submodular Maximization: Identifying Representative Elements in Massive Data
13 0.65638644 239 nips-2013-Optimistic policy iteration and natural actor-critic: A unifying view and a non-optimality result
14 0.65491974 101 nips-2013-EDML for Learning Parameters in Directed and Undirected Graphical Models
15 0.65442812 116 nips-2013-Fantope Projection and Selection: A near-optimal convex relaxation of sparse PCA
16 0.65385228 319 nips-2013-Submodular Optimization with Submodular Cover and Submodular Knapsack Constraints
17 0.65306062 333 nips-2013-Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent
18 0.65218985 282 nips-2013-Robust Multimodal Graph Matching: Sparse Coding Meets Graph Matching
19 0.65213674 45 nips-2013-BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables
20 0.65184945 249 nips-2013-Polar Operators for Structured Sparse Estimation