nips nips2010 nips2010-175 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Manas Pathak, Shantanu Rane, Bhiksha Raj
Abstract: As increasing amounts of sensitive personal information finds its way into data repositories, it is important to develop analysis mechanisms that can derive aggregate information from these repositories without revealing information about individual data instances. Though the differential privacy model provides a framework to analyze such mechanisms for databases belonging to a single party, this framework has not yet been considered in a multi-party setting. In this paper, we propose a privacy-preserving protocol for composing a differentially private aggregate classifier using classifiers trained locally by separate mutually untrusting parties. The protocol allows these parties to interact with an untrusted curator to construct additive shares of a perturbed aggregate classifier. We also present a detailed theoretical analysis containing a proof of differential privacy of the perturbed aggregate classifier and a bound on the excess risk introduced by the perturbation. We verify the bound with an experimental evaluation on a real dataset. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract As increasing amounts of sensitive personal information finds its way into data repositories, it is important to develop analysis mechanisms that can derive aggregate information from these repositories without revealing information about individual data instances. [sent-7, score-0.353]
2 Though the differential privacy model provides a framework to analyze such mechanisms for databases belonging to a single party, this framework has not yet been considered in a multi-party setting. [sent-8, score-0.457]
3 In this paper, we propose a privacy-preserving protocol for composing a differentially private aggregate classifier using classifiers trained locally by separate mutually untrusting parties. [sent-9, score-0.92]
4 The protocol allows these parties to interact with an untrusted curator to construct additive shares of a perturbed aggregate classifier. [sent-10, score-1.33]
5 We also present a detailed theoretical analysis containing a proof of differential privacy of the perturbed aggregate classifier and a bound on the excess risk introduced by the perturbation. [sent-11, score-1.021]
6 In the process, however, they risk compromising the privacy of the individuals by releasing sensitive information such as their medical or financial records, addresses and telephone numbers, and preferences of various kinds that the individuals may not want exposed. [sent-15, score-0.445]
7 In this paper, we address the problem of learning a classifier from a multi-party collection of such private data. [sent-17, score-0.229]
8 The conditions we impose are that (a) None of the parties are willing to share the data with one another or with any third party (e. [sent-30, score-0.693]
9 Within SMC (secure multiparty computation), individual parties use a combination of cryptographic techniques and oblivious transfer to jointly compute a function of their private data [3, 4, 5]. [sent-35, score-0.782]
10 The techniques typically provide guarantees that none of the parties learn anything about the individual data besides what may be inferred from the final result of the computation. [sent-36, score-0.577]
11 Moreover, for all but the simplest computational problems, SMC protocols tend to be highly expensive, requiring iterated encryption and decryption and repeated communication of encrypted partial results between participating parties. [sent-39, score-0.213]
12 An alternative theoretical model for protecting the privacy of individual data instances is differential privacy [6]. [sent-40, score-0.821]
13 A mechanism evaluated over a database is said to satisfy differential privacy if the probability of the mechanism producing a particular output is almost the same regardless of the presence or absence of any individual data instance in the database. [sent-42, score-0.597]
14 Differential privacy provides statistical guarantees that the output of the computation does not carry information about individual data instances. [sent-43, score-0.335]
15 We provide an alternative solution: within our approach the individual parties locally compute an optimal classifier with their data. [sent-45, score-0.577]
16 The individual classifiers are then averaged to obtain the final aggregate classifier. [sent-46, score-0.306]
17 The aggregation is performed through a secure protocol that also adds a stochastic component to the averaged classifier, such that the resulting aggregate classifier is differentially private, i. [sent-47, score-0.691]
18 We prove that the addition of the noise does indeed result in a differentially private classifier. [sent-54, score-0.438]
19 We also provide a bound on the true excess risk of the differentially private averaged classifier compared to the optimal classifier trained on the combined data. [sent-55, score-0.682]
20 2 Differential Privacy In this paper, we consider the differential privacy model introduced by Dwork [6]. [sent-57, score-0.43]
21 Given any two databases D and D′ differing by one element, which we will refer to as adjacent databases, a randomized query function M is said to be differentially private if the probability that M produces a response S on D is close to the probability that M produces the same response S on D′. [sent-58, score-0.427]
22 Definition A randomized function M with a well-defined probability density P satisfies ε-differential privacy if, for all adjacent databases D and D′ and for any S ∈ range(M), log[P(M(D) = S) / P(M(D′) = S)] ≤ ε. [sent-60, score-0.457]
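For readability, the same condition in display form (the ε parameter was dropped by the PDF extraction throughout this page; this restates the standard definition from Dwork [6]):

```latex
% \epsilon-differential privacy: for all adjacent databases D, D'
% and any outcome S in range(M),
\log \frac{P\big(M(D) = S\big)}{P\big(M(D') = S\big)} \;\le\; \epsilon
```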
23 A classifier satisfying differential privacy implies that no additional details about the individual training data instances can be obtained with certainty from the output of the learning algorithm, beyond the a priori background knowledge. [sent-62, score-0.588]
24 Differential privacy provides an ad omnia guarantee as opposed to most other models that provide ad hoc guarantees against a specific set of attacks and adversarial behaviors. [sent-63, score-0.286]
25 By evaluating the differentially private classifier over a large number of test instances, an adversary cannot learn the exact form of the training data. [sent-64, score-0.452]
26 [7] proposed the exponential mechanism for creating functions satisfying differential privacy by adding a perturbation term from the Laplace distribution scaled by the sensitivity of the function. [sent-67, score-0.591]
27 Chaudhuri and Monteleoni [8] use the exponential mechanism [7] to create a differentially private logistic regression classifier by perturbing the estimated parameters with multivariate Laplacian noise scaled by the sensitivity of the classifier. [sent-68, score-0.538]
28 They also propose another method to learn classifiers satisfying differential privacy by adding a linear perturbation term to the objective function which is scaled by Laplacian noise. [sent-69, score-0.514]
29 [9] show that we can create a differentially private function by adding noise from a Laplace distribution scaled by the smooth sensitivity of the function. [sent-71, score-0.506]
30 They also propose the sample and aggregate framework for replacing the original function with a related function for which the smooth sensitivity can be easily computed. [sent-73, score-0.302]
31 Smith [10] presents a method for differentially private unbiased MLE using this framework. [sent-74, score-0.4]
32 All the previous methods are inherently designed for the case where a single curator has access to the entire data and is interested in releasing a differentially private function computed over the data. [sent-75, score-0.541]
33 To the best of our knowledge and belief, ours is the first method designed for releasing a differentially private classifier computed over training data owned by different parties who do not wish to disclose the data to each other. [sent-76, score-1.024]
34 Our technique was principally motivated by the sample and aggregate framework, where we considered the samples to be owned by individual parties. [sent-77, score-0.364]
35 Similar to [10], we choose a simple average as the aggregation function and the parties together release the perturbed aggregate classifier which satisfies differential privacy. [sent-78, score-1.065]
36 In the multi-party case, however, adding the perturbation to the classifier is no longer straightforward and it is necessary to provide a secure protocol to do this. [sent-79, score-0.269]
37 3 Multiparty Classification Protocol The problem we address is as follows: a number of parties P1 , . [sent-80, score-0.504]
38 1 Training Local Classifiers on Individual Datasets Each party Pj uses its data set (x, y)|j to learn an ℓ2-regularized logistic regression classifier with weights ŵj. [sent-89, score-0.245]
39 The parties then collaborate to compute an aggregate classifier given by ŵs = (1/K) Σj ŵj + η, where η is a d-dimensional random variable sampled from a Laplace distribution scaled with the parameter 2/(n(1)ελ) and n(1) = minj nj. [sent-94, score-1.237]
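A minimal sketch of this aggregation rule computed in the clear (for intuition only; in the actual protocol the sum is never assembled by any single participant, and the function and variable names below are illustrative assumptions rather than the authors' code):

```python
# Non-secure sketch of the perturbed aggregate classifier:
# w_s = (1/K) * sum_j w_j + eta, with eta ~ Lap(2 / (n_(1) * eps * lam)).
import numpy as np

def perturbed_aggregate(local_weights, dataset_sizes, eps, lam, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    W = np.stack(local_weights)            # K x d matrix of locally trained w_j
    n_min = min(dataset_sizes)             # n_(1) = min_j n_j
    scale = 2.0 / (n_min * eps * lam)      # Laplace scale parameter
    eta = rng.laplace(0.0, scale, size=W.shape[1])
    return W.mean(axis=0) + eta            # the released classifier w_s
```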
40 As we shall see later, composing an aggregate classifier in this manner incurs only a well-bounded excess risk over training a classifier directly on the union of all data, while enabling the parties to maintain their privacy. [sent-95, score-0.978]
41 1 that the noise term η ensures that the classifier ŵs satisfies differential privacy, i. [sent-97, score-0.475]
42 , that individual data instances cannot be discerned from the aggregate classifier. [sent-99, score-0.362]
43 We note that the parties Pj cannot simply take their individually trained classifiers ŵj, perturb them with a noise vector, and publish the perturbed classifiers, because aggregating such classifiers will not give the correct η ∼ Lap(2/(n(1)ελ)) in general. [sent-103, score-0.814]
44 Since individual parties cannot simply add noise to their classifiers to impose differential privacy, the actual averaging operation must be performed such that the individual parties do not expose their own classifiers or the number of data instances they possess. [sent-104, score-1.344]
45 We therefore use a private multiparty protocol, interacting with an untrusted curator “Charlie” to perform the averaging. [sent-105, score-0.429]
46 The outcome of the protocol is such that each of the parties obtains additive shares of the final classifier ŵs, such that these shares must be added to obtain ŵs. [sent-106, score-1.5]
47 Privacy-Preserving Protocol We use asymmetric key additively homomorphic encryption [11]. [sent-108, score-0.213]
48 For an additively homomorphic encryption function ξ(·), ξ(a)·ξ(b) = ξ(a + b) and ξ(a)^b = ξ(ab). [sent-110, score-0.213]
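As a concrete illustration of the two identities above, here is a toy Paillier-style scheme with small hardcoded primes (a sketch for intuition only, not necessarily the construction of [11] and not secure; a real deployment would use a vetted library and large keys):

```python
# Toy additively homomorphic encryption (Paillier): xi(a) * xi(b) = xi(a+b)
# and xi(a)^b = xi(a*b), all modulo n.  Insecure toy parameters.
import random
from math import gcd

p, q = 293, 433                     # toy primes; real keys are >= 2048 bits
n, n2 = p * q, (p * q) ** 2
g = n + 1                           # standard generator choice
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # lcm(p-1, q-1)

def L(x):                           # the Paillier "L" function
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)

def encrypt(m):
    r = random.randrange(2, n)
    while gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

a, b = 17, 25
assert decrypt(encrypt(a) * encrypt(b) % n2) == (a + b) % n   # xi(a) xi(b) = xi(a+b)
assert decrypt(pow(encrypt(a), b, n2)) == (a * b) % n         # xi(a)^b  = xi(ab)
```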
49 For the ensuing protocol, encryption keys are considered public and decryption keys are privately owned by the specified parties. [sent-114, score-0.231]
50 Assuming the parties to be honest-but-curious, the steps of the protocol are as follows. [sent-115, score-0.632]
51 Each party Pj computes nj = aj + bj , where aj and bj are integers representing additive shares of the database lengths nj for j = 1, 2, . [sent-119, score-0.634]
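One way to realize these additive shares in code (a sketch; the paper does not fix a share representation, so the modulus below is an assumption):

```python
# Sketch of Stage I sharing: party P_j splits its private record count n_j
# into a_j + b_j = n_j (mod M); neither integer alone reveals n_j.
import secrets

M = 2**32                       # assumed modulus, large enough for any n_j

def split_count(n_j):
    b_j = secrets.randbelow(M)
    a_j = (n_j - b_j) % M
    return a_j, b_j

a_j, b_j = split_count(6512)    # hypothetical dataset size
assert (a_j + b_j) % M == 6512
```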
52 The parties Pj mutually agree on a permutation π1 on the index vector (1, 2, . [sent-125, score-0.564]
53 Then, each party Pj sends its share aj to party Pπ1 (j) , and sends its share bj to Charlie with the index changed according to the permutation. [sent-130, score-0.535]
54 Thus, after this step, the parties have permuted additive shares given by π1 (a) while Charlie has permuted additive shares π1 (b). [sent-131, score-0.958]
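A sketch of the rerouting under π1 (the seed-based agreement and the list layout are illustrative assumptions; the point is only that Charlie receives the b-shares in an order he cannot map back to the owning parties):

```python
# Sketch: the parties agree on a secret permutation pi1, then reroute the
# a-shares among themselves and send the b-shares to Charlie, both in
# permuted order pi1(a) and pi1(b).
import random

def agree_on_permutation(K, shared_seed):
    order = list(range(K))
    random.Random(shared_seed).shuffle(order)   # known to the parties, not Charlie
    return order

def apply_permutation(order, shares):
    return [shares[i] for i in order]

pi1 = agree_on_permutation(K=3, shared_seed=20100906)   # hypothetical shared seed
a_perm = apply_permutation(pi1, [11, 22, 33])           # kept among the parties
b_perm = apply_permutation(pi1, [101, 202, 303])        # sent to Charlie
```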
55 The parties Pj generate a key pair (pk,sk) where pk is a public key for homomorphic encryption and sk is the secret decryption key known only to the parties but not to Charlie. [sent-133, score-1.349]
56 The parties send ξ(π1 (a)) = π1 (ξ(a)) to Charlie. [sent-135, score-0.504]
57 He sends ξ(π2 (π1 (a) + r)) to the individual parties in the following order: First element to P1 , second element to P2 ,. [sent-142, score-0.579]
58 Note also that, adding the vector collectively owned by the parties and the vector owned by Charlie would give π2 (π1 (a) + r) + π2 (π1 (b) − r) = π2 (π1 (a + b)) = π2 (π1 (n)). [sent-153, score-0.62]
59 , K}, these comparisons can be solved by any implementation of a secure millionaire protocol [2]. [sent-161, score-0.234]
60 Charlie holds only an additive share of minj nj and thus cannot know the true length of the smallest database. [sent-164, score-0.229]
61 He generates a key-pair (pk′, sk′) for an additively homomorphic encryption function ζ(·) where only the encryption key pk′ is publicly available to the parties Pj. [sent-170, score-0.806]
62 Charlie then transmits ζ(π2⁻¹(u)) = π2⁻¹(ζ(u)) to the parties Pj. [sent-171, score-0.527]
63 The parties mutually obtain a permuted vector π1⁻¹(π2⁻¹(ζ(u))) = ζ(v) where π1⁻¹ inverts the permutation π1 originally applied by the parties Pj in Stage I. [sent-173, score-1.115]
64 However, since the parties Pj cannot decrypt ζ(·), they cannot find out this index. [sent-175, score-0.504]
65 All parties Pj now compute a d-dimensional noise vector ψ such that, for i = 1, . [sent-186, score-0.542]
66 At this stage, the parties and Charlie have the following d-dimensional vectors: Charlie has K(η + s), P1 has ŵ1 − Ks, and all other parties Pj , j = 2, . [sent-207, score-1.008]
67 None of the K + 1 participants can share this data for fear of compromising differential privacy. [sent-211, score-0.226]
68 Finally, Charlie and the K database-owning parties run a simple secure function evaluation protocol [13], at the end of which each of the K + 1 participants obtains an additive share of Kŵs. [sent-213, score-1.191]
69 This protocol is provably private against honest-but-curious participants when there is no collusion. [sent-214, score-0.39]
70 3 Testing Phase A test participant Dave, having a test data instance x ∈ Rd, is interested in applying the trained classifier; he adds the published shares and divides by K to get the differentially private classifier ŵs. [sent-219, score-0.855]
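A sketch of the test participant's side (assuming the K+1 additive shares of Kŵs have been published; the logistic decision rule is the standard one and is not spelled out in this excerpt):

```python
# Sketch of the testing phase: Dave sums the published additive shares of
# K * w_s, divides by K to recover the perturbed classifier, and labels x.
import numpy as np

def recover_classifier(published_shares, K):
    return np.sum(published_shares, axis=0) / K      # the perturbed w_s

def predict(w_s, x):
    p = 1.0 / (1.0 + np.exp(-np.dot(w_s, x)))        # logistic regression score
    return 1 if p >= 0.5 else -1
```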
71 1 Proof of Differential Privacy We show that the perturbed aggregate classifier satisfies differential privacy. [sent-222, score-0.532]
72 This would imply a change in one element in the training dataset of one party Pj and thereby a change in the corresponding learned vector ŵs. [sent-231, score-0.508]
73 Assuming that the change is in the dataset of the party Pj, the change in the learned vector is only going to be in ŵj; let us denote the new classifier by ŵj′. [sent-232, score-0.356]
74 1, we bound the sensitivity of ŵj as ‖ŵj − ŵj′‖1 ≤ 2/(nj λ). [sent-234, score-0.337]
75 We first establish a bound on the ℓ2 norm of the difference between the aggregate classifier ŵ and the classifier w∗ trained over the entire training data. [sent-239, score-0.381]
76 Given the aggregate classifier ŵ, the classifier w∗ trained over the entire training data, and n(1) the size of the smallest training dataset, ‖ŵ − w∗‖2 ≤ (K − 1)/(n(1) λ). [sent-245, score-0.421]
77 The largest possible value for n(1) is n/K, in which case all parties have an equal amount of training data and ŵ will be closest to w∗. [sent-248, score-0.531]
78 In the one-party case K = 1, the bound indicates that the norm of the difference would be upper bounded by zero, which is a valid sanity check as the aggregate classifier ŵ is the same as w∗. [sent-249, score-0.458]
79 We use this result to establish a bound on the empirical risk of the perturbed aggregate classifier ŵs = ŵ + η over the empirical risk of the unperturbed classifier w∗ in the following theorem. [sent-250, score-0.892]
80 The bound increases for smaller values of ε, implying a tighter definition of differential privacy and indicating a clear trade-off between privacy and utility. [sent-256, score-0.47]
81 The bound is also inversely proportional to n(1)², implying an increase in excess risk when the parties have training datasets of disparate sizes. [sent-257, score-0.799]
82 In the limiting case ε → ∞, we are adding a perturbation term η sampled from a Laplace distribution of infinitesimally small variance, resulting in the perturbed classifier being almost the same as the unperturbed aggregate classifier ŵ, satisfying a very loose definition of differential privacy. [sent-258, score-0.646]
83 2, the excess error in using an aggregate classifier is inversely proportional to the size of the smallest dataset n(1), and in the one-party case K = 1 the bound becomes zero as the aggregate classifier ŵ is the same as w∗. [sent-261, score-0.94]
84 Also, for a small value of ε in the one-party case K = 1 and n(1) = n, our bound reduces to that in Lemma 3 of [8], J(ŵs) ≤ J(w∗) + 2d²(λ + 1) log²(d/δ) / (n²ε²λ²). (4) [sent-262, score-0.201]
85 While the previous theorem gives us a bound on the empirical excess risk over a given training dataset, it is important to consider a bound on the true excess risk of ŵs over w∗. [sent-263, score-0.726]
86 Let us denote the true risk of the classifier ŵs by J̃(ŵs) = E[J(ŵs)] and, similarly, the true risk of the classifier w∗ by J̃(w∗) = E[J(w∗)]. [sent-264, score-0.411]
87 In the following theorem, we apply the result from [14] which uses the bound on the empirical excess risk to form a bound on the true excess risk. [sent-265, score-0.347]
88 Experiments We perform an empirical evaluation of the proposed differentially private classifier to obtain a characterization of the increase in the error due to perturbation. [sent-270, score-0.4]
89 [Figure 2 legend: non-private all data; DP all data; DP aggregate n(1)=6512; DP aggregate n(1)=4884; DP aggregate n(1)=3256] [sent-272, score-0.293]
90 Figure 2: Classifier performance evaluated for w∗, w∗ + η, and ŵs for different data splits vs. ε. [sent-286, score-0.293]
91 The choice of the dataset is motivated as a realistic example for application of data privacy techniques. [sent-288, score-0.313]
92 1 In Figure 2, we compare the test error of perturbed aggregate classifiers trained over data from five parties for different values of ε. [sent-293, score-0.949]
93 We also compare with the error of the classifier trained using combined training data and its perturbed version satisfying differential privacy. [sent-295, score-0.407]
94 The perturbed aggregate classifier trained using the maximum n(1) = 6512 consistently does better than those trained with lower values of n(1), which is what our theory suggested. [sent-297, score-0.445]
95 Also, the test error for all perturbed aggregate classifiers drops with ε, but comparatively faster for the even split, and converges to the test error of the classifier trained over the combined data. [sent-298, score-0.467]
96 As expected, the differentially private classifier trained over the entire training data does much better than the perturbed aggregate classifiers, with an error equal to that of the unperturbed classifier except for small values of ε. [sent-299, score-0.925]
97 The lower error of this classifier is at the cost of the loss in privacy of the parties as they would need to share the data in order to train the classifier over combined data. [sent-300, score-0.84]
98 6 Conclusion We proposed a method for composing an aggregate classifier satisfying ε-differential privacy from classifiers locally trained by multiple untrusting parties. [sent-301, score-0.704]
99 The upper bound on the excess risk of the perturbed aggregate classifier as compared to the optimal classifier trained over the complete data without privacy constraints is inversely proportional to the privacy parameter ε, suggesting an inherent tradeoff between privacy and utility. [sent-302, score-1.547]
100 In future work, we seek to generalize the theoretical analysis of the perturbed aggregate classifier to the setting in which each party has data generated from a different distribution. [sent-305, score-0.549]
wordName wordTfidf (topN-words)
[('parties', 0.504), ('ws', 0.293), ('privacy', 0.286), ('aggregate', 0.257), ('charlie', 0.24), ('private', 0.229), ('er', 0.208), ('differentially', 0.171), ('party', 0.161), ('classi', 0.149), ('differential', 0.144), ('perturbed', 0.131), ('protocol', 0.128), ('encryption', 0.12), ('pj', 0.12), ('curator', 0.106), ('secure', 0.106), ('shares', 0.105), ('excess', 0.104), ('wj', 0.084), ('additive', 0.072), ('homomorphic', 0.067), ('multiparty', 0.067), ('risk', 0.059), ('ers', 0.058), ('owned', 0.058), ('secret', 0.058), ('trained', 0.057), ('instances', 0.056), ('database', 0.054), ('smallest', 0.053), ('decryption', 0.053), ('smc', 0.053), ('unperturbed', 0.053), ('permuted', 0.05), ('individual', 0.049), ('aj', 0.049), ('nj', 0.046), ('sensitivity', 0.045), ('pk', 0.043), ('stage', 0.042), ('inversely', 0.041), ('bound', 0.04), ('encrypted', 0.04), ('jaideep', 0.04), ('vaidya', 0.04), ('noise', 0.038), ('releasing', 0.035), ('nissim', 0.035), ('perturbation', 0.035), ('laplace', 0.033), ('participants', 0.033), ('dwork', 0.032), ('mechanism', 0.032), ('adult', 0.032), ('permutation', 0.03), ('minj', 0.03), ('index', 0.03), ('aggregation', 0.029), ('share', 0.028), ('obtains', 0.027), ('composing', 0.027), ('dataset', 0.027), ('training', 0.027), ('databases', 0.027), ('atallah', 0.027), ('bhiksha', 0.027), ('decrypts', 0.027), ('inverts', 0.027), ('kantarcioglu', 0.027), ('kobbi', 0.027), ('plaintext', 0.027), ('rane', 0.027), ('restated', 0.027), ('turbed', 0.027), ('untrusted', 0.027), ('untrusting', 0.027), ('additively', 0.026), ('sends', 0.026), ('satisfying', 0.026), ('bj', 0.026), ('adversary', 0.025), ('dp', 0.025), ('possess', 0.024), ('none', 0.024), ('datasets', 0.024), ('personal', 0.024), ('locally', 0.024), ('murat', 0.023), ('repositories', 0.023), ('obliviously', 0.023), ('cynthia', 0.023), ('obfuscated', 0.023), ('transmits', 0.023), ('scaled', 0.023), ('please', 0.023), ('combined', 0.022), ('individuals', 0.022), ('uci', 0.022), ('compromising', 0.021)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 175 nips-2010-Multiparty Differential Privacy via Aggregation of Locally Trained Classifiers
Author: Manas Pathak, Shantanu Rane, Bhiksha Raj
Abstract: As increasing amounts of sensitive personal information finds its way into data repositories, it is important to develop analysis mechanisms that can derive aggregate information from these repositories without revealing information about individual data instances. Though the differential privacy model provides a framework to analyze such mechanisms for databases belonging to a single party, this framework has not yet been considered in a multi-party setting. In this paper, we propose a privacy-preserving protocol for composing a differentially private aggregate classifier using classifiers trained locally by separate mutually untrusting parties. The protocol allows these parties to interact with an untrusted curator to construct additive shares of a perturbed aggregate classifier. We also present a detailed theoretical analysis containing a proof of differential privacy of the perturbed aggregate classifier and a bound on the excess risk introduced by the perturbation. We verify the bound with an experimental evaluation on a real dataset. 1
2 0.34919572 216 nips-2010-Probabilistic Inference and Differential Privacy
Author: Oliver Williams, Frank Mcsherry
Abstract: We identify and investigate a strong connection between probabilistic inference and differential privacy, the latter being a recent privacy definition that permits only indirect observation of data through noisy measurement. Previous research on differential privacy has focused on designing measurement processes whose output is likely to be useful on its own. We consider the potential of applying probabilistic inference to the measurements and measurement process to derive posterior distributions over the data sets and model parameters thereof. We find that probabilistic inference can improve accuracy, integrate multiple observations, measure uncertainty, and even provide posterior distributions over quantities that were not directly measured. 1
3 0.1039341 272 nips-2010-Towards Holistic Scene Understanding: Feedback Enabled Cascaded Classification Models
Author: Congcong Li, Adarsh Kowdle, Ashutosh Saxena, Tsuhan Chen
Abstract: In many machine learning domains (such as scene understanding), several related sub-tasks (such as scene categorization, depth estimation, object detection) operate on the same raw data and provide correlated outputs. Each of these tasks is often notoriously hard, and state-of-the-art classifiers already exist for many subtasks. It is desirable to have an algorithm that can capture such correlation without requiring to make any changes to the inner workings of any classifier. We propose Feedback Enabled Cascaded Classification Models (FE-CCM), that maximizes the joint likelihood of the sub-tasks, while requiring only a ‘black-box’ interface to the original classifier for each sub-task. We use a two-layer cascade of classifiers, which are repeated instantiations of the original ones, with the output of the first layer fed into the second layer as input. Our training method involves a feedback step that allows later classifiers to provide earlier classifiers information about what error modes to focus on. We show that our method significantly improves performance in all the sub-tasks in two different domains: (i) scene understanding, where we consider depth estimation, scene categorization, event categorization, object detection, geometric labeling and saliency detection, and (ii) robotic grasping, where we consider grasp point detection and object classification. 1
4 0.097097903 89 nips-2010-Factorized Latent Spaces with Structured Sparsity
Author: Yangqing Jia, Mathieu Salzmann, Trevor Darrell
Abstract: Recent approaches to multi-view learning have shown that factorizing the information into parts that are shared across all views and parts that are private to each view could effectively account for the dependencies and independencies between the different input modalities. Unfortunately, these approaches involve minimizing non-convex objective functions. In this paper, we propose an approach to learning such factorized representations inspired by sparse coding techniques. In particular, we show that structured sparsity allows us to address the multiview learning problem by alternately solving two convex optimization problems. Furthermore, the resulting factorized latent spaces generalize over existing approaches in that they allow having latent dimensions shared between any subset of the views instead of between all the views only. We show that our approach outperforms state-of-the-art methods on the task of human pose estimation. 1
5 0.083017275 243 nips-2010-Smoothness, Low Noise and Fast Rates
Author: Nathan Srebro, Karthik Sridharan, Ambuj Tewari
Abstract: We establish an excess risk bound of Õ(HR²n + √(HL∗)Rn) for ERM with an H-smooth loss function and a hypothesis class with Rademacher complexity Rn, where L∗ is the best risk achievable by the hypothesis class. For typical hypothesis classes where Rn = √(R/n), this translates to a learning rate of Õ(RH/n) in the separable (L∗ = 0) case and Õ(RH/n + √(L∗RH/n)) more generally. We also provide similar guarantees for online and stochastic convex optimization of a smooth non-negative objective. 1
6 0.067646869 132 nips-2010-Joint Cascade Optimization Using A Product Of Boosted Classifiers
7 0.06409429 282 nips-2010-Variable margin losses for classifier design
8 0.062042717 192 nips-2010-Online Classification with Specificity Constraints
9 0.060110517 228 nips-2010-Reverse Multi-Label Learning
10 0.057217196 15 nips-2010-A Theory of Multiclass Boosting
11 0.055952031 52 nips-2010-Convex Multiple-Instance Learning by Estimating Likelihood Ratio
12 0.054854859 24 nips-2010-Active Learning Applied to Patient-Adaptive Heartbeat Classification
13 0.054137919 173 nips-2010-Multi-View Active Learning in the Non-Realizable Case
14 0.053774267 94 nips-2010-Feature Set Embedding for Incomplete Data
15 0.049343556 31 nips-2010-An analysis on negative curvature induced by singularity in multi-layer neural-network learning
16 0.045701481 86 nips-2010-Exploiting weakly-labeled Web images to improve object classification: a domain adaptation approach
17 0.042242471 260 nips-2010-Sufficient Conditions for Generating Group Level Sparsity in a Robust Minimax Framework
18 0.04194406 22 nips-2010-Active Estimation of F-Measures
19 0.041842248 135 nips-2010-Label Embedding Trees for Large Multi-Class Tasks
20 0.040402781 104 nips-2010-Generative Local Metric Learning for Nearest Neighbor Classification
topicId topicWeight
[(0, 0.126), (1, 0.04), (2, 0.041), (3, -0.056), (4, 0.03), (5, 0.063), (6, -0.08), (7, -0.017), (8, -0.051), (9, -0.069), (10, -0.013), (11, 0.01), (12, 0.007), (13, -0.027), (14, -0.045), (15, 0.036), (16, 0.034), (17, 0.154), (18, -0.034), (19, -0.016), (20, -0.047), (21, -0.022), (22, -0.11), (23, -0.026), (24, 0.023), (25, 0.01), (26, 0.03), (27, 0.143), (28, -0.054), (29, -0.062), (30, 0.218), (31, -0.18), (32, -0.042), (33, -0.117), (34, -0.129), (35, -0.112), (36, 0.159), (37, -0.001), (38, -0.219), (39, -0.206), (40, -0.229), (41, -0.178), (42, -0.051), (43, 0.093), (44, -0.106), (45, -0.081), (46, -0.039), (47, -0.265), (48, 0.109), (49, -0.005)]
simIndex simValue paperId paperTitle
same-paper 1 0.92211699 175 nips-2010-Multiparty Differential Privacy via Aggregation of Locally Trained Classifiers
Author: Manas Pathak, Shantanu Rane, Bhiksha Raj
Abstract: As increasing amounts of sensitive personal information finds its way into data repositories, it is important to develop analysis mechanisms that can derive aggregate information from these repositories without revealing information about individual data instances. Though the differential privacy model provides a framework to analyze such mechanisms for databases belonging to a single party, this framework has not yet been considered in a multi-party setting. In this paper, we propose a privacy-preserving protocol for composing a differentially private aggregate classifier using classifiers trained locally by separate mutually untrusting parties. The protocol allows these parties to interact with an untrusted curator to construct additive shares of a perturbed aggregate classifier. We also present a detailed theoretical analysis containing a proof of differential privacy of the perturbed aggregate classifier and a bound on the excess risk introduced by the perturbation. We verify the bound with an experimental evaluation on a real dataset. 1
2 0.80241835 216 nips-2010-Probabilistic Inference and Differential Privacy
Author: Oliver Williams, Frank Mcsherry
Abstract: We identify and investigate a strong connection between probabilistic inference and differential privacy, the latter being a recent privacy definition that permits only indirect observation of data through noisy measurement. Previous research on differential privacy has focused on designing measurement processes whose output is likely to be useful on its own. We consider the potential of applying probabilistic inference to the measurements and measurement process to derive posterior distributions over the data sets and model parameters thereof. We find that probabilistic inference can improve accuracy, integrate multiple observations, measure uncertainty, and even provide posterior distributions over quantities that were not directly measured. 1
3 0.35247204 272 nips-2010-Towards Holistic Scene Understanding: Feedback Enabled Cascaded Classification Models
Author: Congcong Li, Adarsh Kowdle, Ashutosh Saxena, Tsuhan Chen
Abstract: In many machine learning domains (such as scene understanding), several related sub-tasks (such as scene categorization, depth estimation, object detection) operate on the same raw data and provide correlated outputs. Each of these tasks is often notoriously hard, and state-of-the-art classifiers already exist for many subtasks. It is desirable to have an algorithm that can capture such correlation without requiring to make any changes to the inner workings of any classifier. We propose Feedback Enabled Cascaded Classification Models (FE-CCM), that maximizes the joint likelihood of the sub-tasks, while requiring only a ‘black-box’ interface to the original classifier for each sub-task. We use a two-layer cascade of classifiers, which are repeated instantiations of the original ones, with the output of the first layer fed into the second layer as input. Our training method involves a feedback step that allows later classifiers to provide earlier classifiers information about what error modes to focus on. We show that our method significantly improves performance in all the sub-tasks in two different domains: (i) scene understanding, where we consider depth estimation, scene categorization, event categorization, object detection, geometric labeling and saliency detection, and (ii) robotic grasping, where we consider grasp point detection and object classification. 1
4 0.3488867 24 nips-2010-Active Learning Applied to Patient-Adaptive Heartbeat Classification
Author: Jenna Wiens, John V. Guttag
Abstract: While clinicians can accurately identify different types of heartbeats in electrocardiograms (ECGs) from different patients, researchers have had limited success in applying supervised machine learning to the same task. The problem is made challenging by the variety of tasks, inter- and intra-patient differences, an often severe class imbalance, and the high cost of getting cardiologists to label data for individual patients. We address these difficulties using active learning to perform patient-adaptive and task-adaptive heartbeat classification. When tested on a benchmark database of cardiologist annotated ECG recordings, our method had considerably better performance than other recently proposed methods on the two primary classification tasks recommended by the Association for the Advancement of Medical Instrumentation. Additionally, our method required over 90% less patient-specific training data than the methods to which we compared it. 1
5 0.30902204 125 nips-2010-Inference and communication in the game of Password
Author: Yang Xu, Charles Kemp
Abstract: Communication between a speaker and hearer will be most efficient when both parties make accurate inferences about the other. We study inference and communication in a television game called Password, where speakers must convey secret words to hearers by providing one-word clues. Our working hypothesis is that human communication is relatively efficient, and we use game show data to examine three predictions. First, we predict that speakers and hearers are both considerate, and that both take the other’s perspective into account. Second, we predict that speakers and hearers are calibrated, and that both make accurate assumptions about the strategy used by the other. Finally, we predict that speakers and hearers are collaborative, and that they tend to share the cognitive burden of communication equally. We find evidence in support of all three predictions, and demonstrate in addition that efficient communication tends to break down when speakers and hearers are placed under time pressure.
6 0.29501072 173 nips-2010-Multi-View Active Learning in the Non-Realizable Case
7 0.28727648 132 nips-2010-Joint Cascade Optimization Using A Product Of Boosted Classifiers
8 0.28543541 52 nips-2010-Convex Multiple-Instance Learning by Estimating Likelihood Ratio
9 0.27712563 94 nips-2010-Feature Set Embedding for Incomplete Data
10 0.2740244 89 nips-2010-Factorized Latent Spaces with Structured Sparsity
11 0.27055201 15 nips-2010-A Theory of Multiclass Boosting
12 0.26930377 31 nips-2010-An analysis on negative curvature induced by singularity in multi-layer neural-network learning
13 0.25379601 192 nips-2010-Online Classification with Specificity Constraints
14 0.24901913 284 nips-2010-Variational bounds for mixed-data factor analysis
15 0.23209655 282 nips-2010-Variable margin losses for classifier design
16 0.23138791 205 nips-2010-Permutation Complexity Bound on Out-Sample Error
17 0.2243751 9 nips-2010-A New Probabilistic Model for Rank Aggregation
18 0.2217242 86 nips-2010-Exploiting weakly-labeled Web images to improve object classification: a domain adaptation approach
19 0.22119188 62 nips-2010-Discriminative Clustering by Regularized Information Maximization
20 0.21487407 278 nips-2010-Universal Consistency of Multi-Class Support Vector Classification
topicId topicWeight
[(13, 0.037), (17, 0.011), (27, 0.06), (30, 0.049), (35, 0.02), (45, 0.15), (50, 0.059), (52, 0.019), (60, 0.029), (77, 0.033), (78, 0.018), (81, 0.349), (90, 0.074)]
simIndex simValue paperId paperTitle
same-paper 1 0.73824018 175 nips-2010-Multiparty Differential Privacy via Aggregation of Locally Trained Classifiers
Author: Manas Pathak, Shantanu Rane, Bhiksha Raj
Abstract: As increasing amounts of sensitive personal information finds its way into data repositories, it is important to develop analysis mechanisms that can derive aggregate information from these repositories without revealing information about individual data instances. Though the differential privacy model provides a framework to analyze such mechanisms for databases belonging to a single party, this framework has not yet been considered in a multi-party setting. In this paper, we propose a privacy-preserving protocol for composing a differentially private aggregate classifier using classifiers trained locally by separate mutually untrusting parties. The protocol allows these parties to interact with an untrusted curator to construct additive shares of a perturbed aggregate classifier. We also present a detailed theoretical analysis containing a proof of differential privacy of the perturbed aggregate classifier and a bound on the excess risk introduced by the perturbation. We verify the bound with an experimental evaluation on a real dataset. 1
Author: Li-jia Li, Hao Su, Li Fei-fei, Eric P. Xing
Abstract: Robust low-level image features have been proven to be effective representations for a variety of visual recognition tasks such as object recognition and scene classification; but pixels, or even local image patches, carry little semantic meanings. For high level visual tasks, such low-level image representations are potentially not enough. In this paper, we propose a high-level image representation, called the Object Bank, where an image is represented as a scale-invariant response map of a large number of pre-trained generic object detectors, blind to the testing dataset or visual task. Leveraging on the Object Bank representation, superior performances on high level visual recognition tasks can be achieved with simple off-the-shelf classifiers such as logistic regression and linear SVM. Sparsity algorithms make our representation more efficient and scalable for large scene datasets, and reveal semantically meaningful feature patterns.
3 0.48085138 178 nips-2010-Multivariate Dyadic Regression Trees for Sparse Learning Problems
Author: Han Liu, Xi Chen
Abstract: We propose a new nonparametric learning method based on multivariate dyadic regression trees (MDRTs). Unlike traditional dyadic decision trees (DDTs) or classification and regression trees (CARTs), MDRTs are constructed using penalized empirical risk minimization with a novel sparsity-inducing penalty. Theoretically, we show that MDRTs can simultaneously adapt to the unknown sparsity and smoothness of the true regression functions, and achieve the nearly optimal rates of convergence (in a minimax sense) for the class of (α, C)-smooth functions. Empirically, MDRTs can simultaneously conduct function estimation and variable selection in high dimensions. To make MDRTs applicable for large-scale learning problems, we propose a greedy heuristics. The superior performance of MDRTs are demonstrated on both synthetic and real datasets. 1
4 0.48069519 117 nips-2010-Identifying graph-structured activation patterns in networks
Author: James Sharpnack, Aarti Singh
Abstract: We consider the problem of identifying an activation pattern in a complex, largescale network that is embedded in very noisy measurements. This problem is relevant to several applications, such as identifying traces of a biochemical spread by a sensor network, expression levels of genes, and anomalous activity or congestion in the Internet. Extracting such patterns is a challenging task specially if the network is large (pattern is very high-dimensional) and the noise is so excessive that it masks the activity at any single node. However, typically there are statistical dependencies in the network activation process that can be leveraged to fuse the measurements of multiple nodes and enable reliable extraction of highdimensional noisy patterns. In this paper, we analyze an estimator based on the graph Laplacian eigenbasis, and establish the limits of mean square error recovery of noisy patterns arising from a probabilistic (Gaussian or Ising) model based on an arbitrary graph structure. We consider both deterministic and probabilistic network evolution models, and our results indicate that by leveraging the network interaction structure, it is possible to consistently recover high-dimensional patterns even when the noise variance increases with network size. 1
5 0.47985765 51 nips-2010-Construction of Dependent Dirichlet Processes based on Poisson Processes
Author: Dahua Lin, Eric Grimson, John W. Fisher
Abstract: We present a novel method for constructing dependent Dirichlet processes. The approach exploits the intrinsic relationship between Dirichlet and Poisson processes in order to create a Markov chain of Dirichlet processes suitable for use as a prior over evolving mixture models. The method allows for the creation, removal, and location variation of component models over time while maintaining the property that the random measures are marginally DP distributed. Additionally, we derive a Gibbs sampling algorithm for model inference and test it on both synthetic and real data. Empirical results demonstrate that the approach is effective in estimating dynamically varying mixture models. 1
6 0.47827211 238 nips-2010-Short-term memory in neuronal networks through dynamical compressed sensing
7 0.47625029 132 nips-2010-Joint Cascade Optimization Using A Product Of Boosted Classifiers
8 0.47502697 199 nips-2010-Optimal learning rates for Kernel Conjugate Gradient regression
9 0.47472101 282 nips-2010-Variable margin losses for classifier design
10 0.4737756 87 nips-2010-Extended Bayesian Information Criteria for Gaussian Graphical Models
11 0.47091511 63 nips-2010-Distributed Dual Averaging In Networks
12 0.4705449 80 nips-2010-Estimation of Renyi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs
13 0.46933603 260 nips-2010-Sufficient Conditions for Generating Group Level Sparsity in a Robust Minimax Framework
14 0.46923319 148 nips-2010-Learning Networks of Stochastic Differential Equations
15 0.46915838 158 nips-2010-Learning via Gaussian Herding
16 0.4689756 250 nips-2010-Spectral Regularization for Support Estimation
17 0.46843877 155 nips-2010-Learning the context of a category
18 0.46839619 280 nips-2010-Unsupervised Kernel Dimension Reduction
19 0.46829391 70 nips-2010-Efficient Optimization for Discriminative Latent Class Models
20 0.46803787 265 nips-2010-The LASSO risk: asymptotic results and real world examples