nips nips2010 nips2010-175 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Manas Pathak, Shantanu Rane, Bhiksha Raj
Abstract: As increasing amounts of sensitive personal information finds its way into data repositories, it is important to develop analysis mechanisms that can derive aggregate information from these repositories without revealing information about individual data instances. Though the differential privacy model provides a framework to analyze such mechanisms for databases belonging to a single party, this framework has not yet been considered in a multi-party setting. In this paper, we propose a privacy-preserving protocol for composing a differentially private aggregate classifier using classifiers trained locally by separate mutually untrusting parties. The protocol allows these parties to interact with an untrusted curator to construct additive shares of a perturbed aggregate classifier. We also present a detailed theoretical analysis containing a proof of differential privacy of the perturbed aggregate classifier and a bound on the excess risk introduced by the perturbation. We verify the bound with an experimental evaluation on a real dataset. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract As increasing amounts of sensitive personal information finds its way into data repositories, it is important to develop analysis mechanisms that can derive aggregate information from these repositories without revealing information about individual data instances. [sent-7, score-0.353]
2 Though the differential privacy model provides a framework to analyze such mechanisms for databases belonging to a single party, this framework has not yet been considered in a multi-party setting. [sent-8, score-0.457]
3 In this paper, we propose a privacy-preserving protocol for composing a differentially private aggregate classifier using classifiers trained locally by separate mutually untrusting parties. [sent-9, score-0.92]
4 The protocol allows these parties to interact with an untrusted curator to construct additive shares of a perturbed aggregate classifier. [sent-10, score-1.33]
5 We also present a detailed theoretical analysis containing a proof of differential privacy of the perturbed aggregate classifier and a bound on the excess risk introduced by the perturbation. [sent-11, score-1.021]
6 In the process, however, they risk compromising the privacy of the individuals by releasing sensitive information such as their medical or financial records, addresses and telephone numbers, and preferences of various kinds that the individuals may not want exposed. [sent-15, score-0.445]
7 In this paper, we address the problem of learning a classifier from a multi-party collection of such private data. [sent-17, score-0.229]
8 The conditions we impose are that (a) None of the parties are willing to share the data with one another or with any third party (e. [sent-30, score-0.693]
9 Within SMC (secure multiparty computation), individual parties use a combination of cryptographic techniques and oblivious transfer to jointly compute a function of their private data [3, 4, 5]. [sent-35, score-0.782]
10 The techniques typically provide guarantees that none of the parties learn anything about the individual data besides what may be inferred from the final result of the computation. [sent-36, score-0.577]
11 Moreover, for all but the simplest computational problems, SMC protocols tend to be highly expensive, requiring iterated encryption and decryption and repeated communication of encrypted partial results between participating parties. [sent-39, score-0.213]
12 An alternative theoretical model for protecting the privacy of individual data instances is differential privacy [6]. [sent-40, score-0.821]
13 A mechanism evaluated over a database is said to satisfy differential privacy if the probability of the mechanism producing a particular output is almost the same regardless of the presence or absence of any individual data instance in the database. [sent-42, score-0.597]
14 Differential privacy provides statistical guarantees that the output of the computation does not carry information about individual data instances. [sent-43, score-0.335]
15 We provide an alternative solution: within our approach the individual parties locally compute an optimal classifier with their data. [sent-45, score-0.577]
16 The individual classifiers are then averaged to obtain the final aggregate classifier. [sent-46, score-0.306]
17 The aggregation is performed through a secure protocol that also adds a stochastic component to the averaged classifier, such that the resulting aggregate classifier is differentially private, i. [sent-47, score-0.691]
18 We prove that the addition of the noise does indeed result in a differentially private classifier. [sent-54, score-0.438]
19 We also provide a bound on the true excess risk of the differentially private averaged classifier compared to the optimal classifier trained on the combined data. [sent-55, score-0.682]
20 2 Differential Privacy In this paper, we consider the differential privacy model introduced by Dwork [6]. [sent-57, score-0.43]
21 Given any two databases D and D′ differing by one element, which we will refer to as adjacent databases, a randomized query function M is said to be differentially private if the probability that M produces a response S on D is close to the probability that M produces the same response S on D′. [sent-58, score-0.427]
22 Definition A randomized function M with a well-defined probability density P satisfies ε-differential privacy if, for all adjacent databases D and D′ and for any S ∈ range(M), log[P(M(D) = S) / P(M(D′) = S)] ≤ ε. [sent-60, score-0.457]
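For readability, the same condition in display form (the ε parameter was dropped by the PDF extraction throughout this page; this restates the standard definition from Dwork [6]):

```latex
% \epsilon-differential privacy: for all adjacent databases D, D'
% and any outcome S in range(M),
\log \frac{P\big(M(D) = S\big)}{P\big(M(D') = S\big)} \;\le\; \epsilon
```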
23 A classifier satisfying differential privacy implies that no additional details about the individual training data instances can be obtained with certainty from the output of the learning algorithm, beyond the a priori background knowledge. [sent-62, score-0.588]
24 Differential privacy provides an ad omnia guarantee as opposed to most other models that provide ad hoc guarantees against a specific set of attacks and adversarial behaviors. [sent-63, score-0.286]
25 By evaluating the differentially private classifier over a large number of test instances, an adversary cannot learn the exact form of the training data. [sent-64, score-0.452]
26 [7] proposed the exponential mechanism for creating functions satisfying differential privacy by adding a perturbation term from the Laplace distribution scaled by the sensitivity of the function. [sent-67, score-0.591]
27 Chaudhuri and Monteleoni [8] use the exponential mechanism [7] to create a differentially private logistic regression classifier by perturbing the estimated parameters with multivariate Laplacian noise scaled by the sensitivity of the classifier. [sent-68, score-0.538]
28 They also propose another method to learn classifiers satisfying differential privacy by adding a linear perturbation term to the objective function which is scaled by Laplacian noise. [sent-69, score-0.514]
29 [9] show that we can create a differentially private function by adding noise from a Laplace distribution scaled by the smooth sensitivity of the function. [sent-71, score-0.506]
30 They also propose the sample and aggregate framework for replacing the original function with a related function for which the smooth sensitivity can be easily computed. [sent-73, score-0.302]
31 Smith [10] presents a method for differentially private unbiased MLE using this framework. [sent-74, score-0.4]
32 All the previous methods are inherently designed for the case where a single curator has access to the entire data and is interested in releasing a differentially private function computed over the data. [sent-75, score-0.541]
33 To the best of our knowledge and belief, ours is the first method designed for releasing a differentially private classifier computed over training data owned by different parties who do not wish to disclose the data to each other. [sent-76, score-1.024]
34 Our technique was principally motivated by the sample and aggregate framework, where we considered the samples to be owned by individual parties. [sent-77, score-0.364]
35 Similar to [10], we choose a simple average as the aggregation function and the parties together release the perturbed aggregate classifier which satisfies differential privacy. [sent-78, score-1.065]
36 In the multi-party case, however, adding the perturbation to the classifier is no longer straightforward and it is necessary to provide a secure protocol to do this. [sent-79, score-0.269]
37 3 Multiparty Classification Protocol The problem we address is as follows: a number of parties P1 , . [sent-80, score-0.504]
38 1 Training Local Classifiers on Individual Datasets Each party Pj uses its data set (x, y)|j to learn an ℓ2-regularized logistic regression classifier with weights ŵj. [sent-89, score-0.245]
39 The parties then collaborate to compute an aggregate classifier given by ŵs = (1/K) Σj ŵj + η, where η is a d-dimensional random variable sampled from a Laplace distribution scaled with the parameter 2/(n(1)ελ) and n(1) = minj nj. [sent-94, score-1.237]
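A minimal sketch of this aggregation rule computed in the clear (for intuition only; in the actual protocol the sum is never assembled by any single participant, and the function and variable names below are illustrative assumptions rather than the authors' code):

```python
# Non-secure sketch of the perturbed aggregate classifier:
# w_s = (1/K) * sum_j w_j + eta, with eta ~ Lap(2 / (n_(1) * eps * lam)).
import numpy as np

def perturbed_aggregate(local_weights, dataset_sizes, eps, lam, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    W = np.stack(local_weights)            # K x d matrix of locally trained w_j
    n_min = min(dataset_sizes)             # n_(1) = min_j n_j
    scale = 2.0 / (n_min * eps * lam)      # Laplace scale parameter
    eta = rng.laplace(0.0, scale, size=W.shape[1])
    return W.mean(axis=0) + eta            # the released classifier w_s
```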
40 As we shall see later, composing an aggregate classifier in this manner incurs only a well-bounded excess risk over training a classifier directly on the union of all data, while enabling the parties to maintain their privacy. [sent-95, score-0.978]
41 1 that the noise term η ensures that the classifier ŵs satisfies differential privacy, i. [sent-97, score-0.475]
42 , that individual data instances cannot be discerned from the aggregate classifier. [sent-99, score-0.362]
43 We note that the parties Pj cannot simply take their individually trained classifiers ŵj, perturb them with a noise vector, and publish the perturbed classifiers, because aggregating such classifiers will not give the correct η ∼ Lap(2/(n(1)ελ)) in general. [sent-103, score-0.814]
44 Since individual parties cannot simply add noise to their classifiers to impose differential privacy, the actual averaging operation must be performed such that the individual parties do not expose their own classifiers or the number of data instances they possess. [sent-104, score-1.344]
45 We therefore use a private multiparty protocol, interacting with an untrusted curator “Charlie” to perform the averaging. [sent-105, score-0.429]
46 The outcome of the protocol is such that each of the parties obtains additive shares of the final classifier ŵs, such that these shares must be added to obtain ŵs. [sent-106, score-1.5]
47 Privacy-Preserving Protocol We use asymmetric key additively homomorphic encryption [11]. [sent-108, score-0.213]
48 For an additively homomorphic encryption function ξ(·), ξ(a)·ξ(b) = ξ(a + b) and ξ(a)^b = ξ(ab). [sent-110, score-0.213]
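As a concrete illustration of the two identities above, here is a toy Paillier-style scheme with small hardcoded primes (a sketch for intuition only, not necessarily the construction of [11] and not secure; a real deployment would use a vetted library and large keys):

```python
# Toy additively homomorphic encryption (Paillier): xi(a) * xi(b) = xi(a+b)
# and xi(a)^b = xi(a*b), all modulo n.  Insecure toy parameters.
import random
from math import gcd

p, q = 293, 433                     # toy primes; real keys are >= 2048 bits
n, n2 = p * q, (p * q) ** 2
g = n + 1                           # standard generator choice
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # lcm(p-1, q-1)

def L(x):                           # the Paillier "L" function
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)

def encrypt(m):
    r = random.randrange(2, n)
    while gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

a, b = 17, 25
assert decrypt(encrypt(a) * encrypt(b) % n2) == (a + b) % n   # xi(a) xi(b) = xi(a+b)
assert decrypt(pow(encrypt(a), b, n2)) == (a * b) % n         # xi(a)^b  = xi(ab)
```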
49 For the ensuing protocol, encryption keys are considered public and decryption keys are privately owned by the specified parties. [sent-114, score-0.231]
50 Assuming the parties to be honest-but-curious, the steps of the protocol are as follows. [sent-115, score-0.632]
51 Each party Pj computes nj = aj + bj , where aj and bj are integers representing additive shares of the database lengths nj for j = 1, 2, . [sent-119, score-0.634]
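One way to realize these additive shares in code (a sketch; the paper does not fix a share representation, so the modulus below is an assumption):

```python
# Sketch of Stage I sharing: party P_j splits its private record count n_j
# into a_j + b_j = n_j (mod M); neither integer alone reveals n_j.
import secrets

M = 2**32                       # assumed modulus, large enough for any n_j

def split_count(n_j):
    b_j = secrets.randbelow(M)
    a_j = (n_j - b_j) % M
    return a_j, b_j

a_j, b_j = split_count(6512)    # hypothetical dataset size
assert (a_j + b_j) % M == 6512
```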
52 The parties Pj mutually agree on a permutation π1 on the index vector (1, 2, . [sent-125, score-0.564]
53 Then, each party Pj sends its share aj to party Pπ1 (j) , and sends its share bj to Charlie with the index changed according to the permutation. [sent-130, score-0.535]
54 Thus, after this step, the parties have permuted additive shares given by π1 (a) while Charlie has permuted additive shares π1 (b). [sent-131, score-0.958]
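A sketch of the rerouting under π1 (the seed-based agreement and the list layout are illustrative assumptions; the point is only that Charlie receives the b-shares in an order he cannot map back to the owning parties):

```python
# Sketch: the parties agree on a secret permutation pi1, then reroute the
# a-shares among themselves and send the b-shares to Charlie, both in
# permuted order pi1(a) and pi1(b).
import random

def agree_on_permutation(K, shared_seed):
    order = list(range(K))
    random.Random(shared_seed).shuffle(order)   # known to the parties, not Charlie
    return order

def apply_permutation(order, shares):
    return [shares[i] for i in order]

pi1 = agree_on_permutation(K=3, shared_seed=20100906)   # hypothetical shared seed
a_perm = apply_permutation(pi1, [11, 22, 33])           # kept among the parties
b_perm = apply_permutation(pi1, [101, 202, 303])        # sent to Charlie
```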
55 The parties Pj generate a key pair (pk,sk) where pk is a public key for homomorphic encryption and sk is the secret decryption key known only to the parties but not to Charlie. [sent-133, score-1.349]
56 The parties send ξ(π1 (a)) = π1 (ξ(a)) to Charlie. [sent-135, score-0.504]
57 He sends ξ(π2 (π1 (a) + r)) to the individual parties in the following order: First element to P1 , second element to P2 ,. [sent-142, score-0.579]
58 Note also that, adding the vector collectively owned by the parties and the vector owned by Charlie would give π2 (π1 (a) + r) + π2 (π1 (b) − r) = π2 (π1 (a + b)) = π2 (π1 (n)). [sent-153, score-0.62]
59 , K}, these comparisons can be solved by any implementation of a secure millionaire protocol [2]. [sent-161, score-0.234]
60 Charlie holds only an additive share of minj nj and thus cannot know the true length of the smallest database. [sent-164, score-0.229]
61 He generates a key-pair (pk′, sk′) for an additively homomorphic encryption function ζ(·) where only the encryption key pk′ is publicly available to the parties Pj. [sent-170, score-0.806]
62 Charlie then transmits ζ(π2⁻¹(u)) = π2⁻¹(ζ(u)) to the parties Pj. [sent-171, score-0.527]
63 The parties mutually obtain a permuted vector π1⁻¹(π2⁻¹(ζ(u))) = ζ(v) where π1⁻¹ inverts the permutation π1 originally applied by the parties Pj in Stage I. [sent-173, score-1.115]
64 However, since the parties Pj cannot decrypt ζ(·), they cannot find out this index. [sent-175, score-0.504]
65 All parties Pj now compute a d-dimensional noise vector ψ such that, for i = 1, . [sent-186, score-0.542]
66 At this stage, the parties and Charlie have the following d-dimensional vectors: Charlie has K(η + s), P1 has ŵ1 − Ks, and all other parties Pj , j = 2, . [sent-207, score-1.008]
67 None of the K + 1 participants can share this data for fear of compromising differential privacy. [sent-211, score-0.226]
68 Finally, Charlie and the K database-owning parties run a simple secure function evaluation protocol [13], at the end of which each of the K + 1 participants obtains an additive share of Kŵs. [sent-213, score-1.191]
69 This protocol is provably private against honest-but-curious participants when there is no collusion. [sent-214, score-0.39]
70 3 Testing Phase A test participant Dave, having a test data instance x ∈ Rd, is interested in applying the trained classifier; he adds the published shares and divides by K to get the differentially private classifier ŵs. [sent-219, score-0.855]
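A sketch of the test participant's side (assuming the K+1 additive shares of Kŵs have been published; the logistic decision rule is the standard one and is not spelled out in this excerpt):

```python
# Sketch of the testing phase: Dave sums the published additive shares of
# K * w_s, divides by K to recover the perturbed classifier, and labels x.
import numpy as np

def recover_classifier(published_shares, K):
    return np.sum(published_shares, axis=0) / K      # the perturbed w_s

def predict(w_s, x):
    p = 1.0 / (1.0 + np.exp(-np.dot(w_s, x)))        # logistic regression score
    return 1 if p >= 0.5 else -1
```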
71 1 Proof of Differential Privacy We show that the perturbed aggregate classifier satisfies differential privacy. [sent-222, score-0.532]
72 This would imply a change in one element in the training dataset of one party Pj and thereby a change in the corresponding learned vector ŵs. [sent-231, score-0.508]
73 Assuming that the change is in the dataset of the party Pj, the change in the learned vector is only going to be in ŵj; let us denote the new classifier by ŵj′. [sent-232, score-0.356]
74 1, we bound the sensitivity of ŵj as ‖ŵj − ŵj′‖1 ≤ 2/(nj λ). [sent-234, score-0.337]
75 We first establish a bound on the ℓ2 norm of the difference between the aggregate classifier ŵ and the classifier w∗ trained over the entire training data. [sent-239, score-0.381]
76 Given the aggregate classifier ŵ, the classifier w∗ trained over the entire training data, and n(1) the size of the smallest training dataset, ‖ŵ − w∗‖2 ≤ (K − 1)/(n(1) λ). [sent-245, score-0.421]
77 The largest possible value for n(1) is n/K, in which case all parties have an equal amount of training data and ŵ will be closest to w∗. [sent-248, score-0.531]
78 In the one-party case K = 1, the bound indicates that the norm of the difference would be upper bounded by zero, which is a valid sanity check as the aggregate classifier ŵ is the same as w∗. [sent-249, score-0.458]
79 We use this result to establish a bound on the empirical risk of the perturbed aggregate classifier ŵs = ŵ + η over the empirical risk of the unperturbed classifier w∗ in the following theorem. [sent-250, score-0.892]
80 The bound increases for smaller values of ε, implying a tighter definition of differential privacy and indicating a clear trade-off between privacy and utility. [sent-256, score-0.47]
81 The bound is also inversely proportional to n(1)², implying an increase in excess risk when the parties have training datasets of disparate sizes. [sent-257, score-0.799]
82 In the limiting case ε → ∞, we are adding a perturbation term η sampled from a Laplace distribution of infinitesimally small variance, resulting in the perturbed classifier being almost the same as the unperturbed aggregate classifier ŵ, satisfying a very loose definition of differential privacy. [sent-258, score-0.646]
83 2, the excess error in using an aggregate classifier is inversely proportional to the size of the smallest dataset n(1), and in the one-party case K = 1 the bound becomes zero as the aggregate classifier ŵ is the same as w∗. [sent-261, score-0.94]
84 Also, for a small value of ε in the one-party case K = 1 and n(1) = n, our bound reduces to that in Lemma 3 of [8], J(ŵs) ≤ J(w∗) + 2d²(λ + 1) log²(d/δ) / (n²ε²λ²). (4) [sent-262, score-0.201]
85 While the previous theorem gives us a bound on the empirical excess risk over a given training dataset, it is important to consider a bound on the true excess risk of ŵs over w∗. [sent-263, score-0.726]
86 Let us denote the true risk of the classifier ŵs by J̃(ŵs) = E[J(ŵs)] and, similarly, the true risk of the classifier w∗ by J̃(w∗) = E[J(w∗)]. [sent-264, score-0.411]
87 In the following theorem, we apply the result from [14] which uses the bound on the empirical excess risk to form a bound on the true excess risk. [sent-265, score-0.347]
88 Experiments We perform an empirical evaluation of the proposed differentially private classifier to obtain a characterization of the increase in the error due to perturbation. [sent-270, score-0.4]
89 [Figure 2 legend: non-private all data; DP all data; DP aggregate n(1)=6512; DP aggregate n(1)=4884; DP aggregate n(1)=3256] [sent-272, score-0.293]
90 Figure 2: Classifier performance evaluated for w∗, w∗ + η, and ŵs for different data splits vs. ε. [sent-286, score-0.293]
91 The choice of the dataset is motivated as a realistic example for application of data privacy techniques. [sent-288, score-0.313]
92 1 In Figure 2, we compare the test error of perturbed aggregate classifiers trained over data from five parties for different values of ε. [sent-293, score-0.949]
93 We also compare with the error of the classifier trained using combined training data and its perturbed version satisfying differential privacy. [sent-295, score-0.407]
94 The perturbed aggregate classifier trained using the maximum n(1) = 6512 consistently does better than those trained with lower values of n(1), which is what our theory suggested. [sent-297, score-0.445]
95 Also, the test error for all perturbed aggregate classifiers drops with ε, but comparatively faster for the even split, and converges to the test error of the classifier trained over the combined data. [sent-298, score-0.467]
96 As expected, the differentially private classifier trained over the entire training data does much better than the perturbed aggregate classifiers, with an error equal to that of the unperturbed classifier except for small values of ε. [sent-299, score-0.925]
97 The lower error of this classifier is at the cost of the loss in privacy of the parties as they would need to share the data in order to train the classifier over combined data. [sent-300, score-0.84]
98 6 Conclusion We proposed a method for composing an aggregate classifier satisfying ε-differential privacy from classifiers locally trained by multiple untrusting parties. [sent-301, score-0.704]
99 The upper bound on the excess risk of the perturbed aggregate classifier as compared to the optimal classifier trained over the complete data without privacy constraints is inversely proportional to the privacy parameter ε, suggesting an inherent tradeoff between privacy and utility. [sent-302, score-1.547]
100 In future work, we seek to generalize the theoretical analysis of the perturbed aggregate classifier to the setting in which each party has data generated from a different distribution. [sent-305, score-0.549]
wordName wordTfidf (topN-words)
[('parties', 0.504), ('ws', 0.293), ('privacy', 0.286), ('aggregate', 0.257), ('charlie', 0.24), ('private', 0.229), ('er', 0.208), ('differentially', 0.171), ('party', 0.161), ('classi', 0.149), ('differential', 0.144), ('perturbed', 0.131), ('protocol', 0.128), ('encryption', 0.12), ('pj', 0.12), ('curator', 0.106), ('secure', 0.106), ('shares', 0.105), ('excess', 0.104), ('wj', 0.084), ('additive', 0.072), ('homomorphic', 0.067), ('multiparty', 0.067), ('risk', 0.059), ('ers', 0.058), ('owned', 0.058), ('secret', 0.058), ('trained', 0.057), ('instances', 0.056), ('database', 0.054), ('smallest', 0.053), ('decryption', 0.053), ('smc', 0.053), ('unperturbed', 0.053), ('permuted', 0.05), ('individual', 0.049), ('aj', 0.049), ('nj', 0.046), ('sensitivity', 0.045), ('pk', 0.043), ('stage', 0.042), ('inversely', 0.041), ('bound', 0.04), ('encrypted', 0.04), ('jaideep', 0.04), ('vaidya', 0.04), ('noise', 0.038), ('releasing', 0.035), ('nissim', 0.035), ('perturbation', 0.035), ('laplace', 0.033), ('participants', 0.033), ('dwork', 0.032), ('mechanism', 0.032), ('adult', 0.032), ('permutation', 0.03), ('minj', 0.03), ('index', 0.03), ('aggregation', 0.029), ('share', 0.028), ('obtains', 0.027), ('composing', 0.027), ('dataset', 0.027), ('training', 0.027), ('databases', 0.027), ('atallah', 0.027), ('bhiksha', 0.027), ('decrypts', 0.027), ('inverts', 0.027), ('kantarcioglu', 0.027), ('kobbi', 0.027), ('plaintext', 0.027), ('rane', 0.027), ('restated', 0.027), ('turbed', 0.027), ('untrusted', 0.027), ('untrusting', 0.027), ('additively', 0.026), ('sends', 0.026), ('satisfying', 0.026), ('bj', 0.026), ('adversary', 0.025), ('dp', 0.025), ('possess', 0.024), ('none', 0.024), ('datasets', 0.024), ('personal', 0.024), ('locally', 0.024), ('murat', 0.023), ('repositories', 0.023), ('obliviously', 0.023), ('cynthia', 0.023), ('obfuscated', 0.023), ('transmits', 0.023), ('scaled', 0.023), ('please', 0.023), ('combined', 0.022), ('individuals', 0.022), ('uci', 0.022), ('compromising', 0.021)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 175 nips-2010-Multiparty Differential Privacy via Aggregation of Locally Trained Classifiers
Author: Manas Pathak, Shantanu Rane, Bhiksha Raj
Abstract: As increasing amounts of sensitive personal information finds its way into data repositories, it is important to develop analysis mechanisms that can derive aggregate information from these repositories without revealing information about individual data instances. Though the differential privacy model provides a framework to analyze such mechanisms for databases belonging to a single party, this framework has not yet been considered in a multi-party setting. In this paper, we propose a privacy-preserving protocol for composing a differentially private aggregate classifier using classifiers trained locally by separate mutually untrusting parties. The protocol allows these parties to interact with an untrusted curator to construct additive shares of a perturbed aggregate classifier. We also present a detailed theoretical analysis containing a proof of differential privacy of the perturbed aggregate classifier and a bound on the excess risk introduced by the perturbation. We verify the bound with an experimental evaluation on a real dataset. 1
2 0.34919572 216 nips-2010-Probabilistic Inference and Differential Privacy
Author: Oliver Williams, Frank Mcsherry
Abstract: We identify and investigate a strong connection between probabilistic inference and differential privacy, the latter being a recent privacy definition that permits only indirect observation of data through noisy measurement. Previous research on differential privacy has focused on designing measurement processes whose output is likely to be useful on its own. We consider the potential of applying probabilistic inference to the measurements and measurement process to derive posterior distributions over the data sets and model parameters thereof. We find that probabilistic inference can improve accuracy, integrate multiple observations, measure uncertainty, and even provide posterior distributions over quantities that were not directly measured. 1
3 0.1039341 272 nips-2010-Towards Holistic Scene Understanding: Feedback Enabled Cascaded Classification Models
Author: Congcong Li, Adarsh Kowdle, Ashutosh Saxena, Tsuhan Chen
Abstract: In many machine learning domains (such as scene understanding), several related sub-tasks (such as scene categorization, depth estimation, object detection) operate on the same raw data and provide correlated outputs. Each of these tasks is often notoriously hard, and state-of-the-art classifiers already exist for many subtasks. It is desirable to have an algorithm that can capture such correlation without requiring to make any changes to the inner workings of any classifier. We propose Feedback Enabled Cascaded Classification Models (FE-CCM), that maximizes the joint likelihood of the sub-tasks, while requiring only a ‘black-box’ interface to the original classifier for each sub-task. We use a two-layer cascade of classifiers, which are repeated instantiations of the original ones, with the output of the first layer fed into the second layer as input. Our training method involves a feedback step that allows later classifiers to provide earlier classifiers information about what error modes to focus on. We show that our method significantly improves performance in all the sub-tasks in two different domains: (i) scene understanding, where we consider depth estimation, scene categorization, event categorization, object detection, geometric labeling and saliency detection, and (ii) robotic grasping, where we consider grasp point detection and object classification. 1
4 0.097097903 89 nips-2010-Factorized Latent Spaces with Structured Sparsity
Author: Yangqing Jia, Mathieu Salzmann, Trevor Darrell
Abstract: Recent approaches to multi-view learning have shown that factorizing the information into parts that are shared across all views and parts that are private to each view could effectively account for the dependencies and independencies between the different input modalities. Unfortunately, these approaches involve minimizing non-convex objective functions. In this paper, we propose an approach to learning such factorized representations inspired by sparse coding techniques. In particular, we show that structured sparsity allows us to address the multiview learning problem by alternately solving two convex optimization problems. Furthermore, the resulting factorized latent spaces generalize over existing approaches in that they allow having latent dimensions shared between any subset of the views instead of between all the views only. We show that our approach outperforms state-of-the-art methods on the task of human pose estimation. 1
5 0.083017275 243 nips-2010-Smoothness, Low Noise and Fast Rates
Author: Nathan Srebro, Karthik Sridharan, Ambuj Tewari
Abstract: We establish an excess risk bound of Õ(HR²n + √(HL∗)Rn) for ERM with an H-smooth loss function and a hypothesis class with Rademacher complexity Rn, where L∗ is the best risk achievable by the hypothesis class. For typical hypothesis classes where Rn = √(R/n), this translates to a learning rate of Õ(RH/n) in the separable (L∗ = 0) case and Õ(RH/n + √(L∗RH/n)) more generally. We also provide similar guarantees for online and stochastic convex optimization of a smooth non-negative objective. 1
6 0.067646869 132 nips-2010-Joint Cascade Optimization Using A Product Of Boosted Classifiers
7 0.06409429 282 nips-2010-Variable margin losses for classifier design
8 0.062042717 192 nips-2010-Online Classification with Specificity Constraints
9 0.060110517 228 nips-2010-Reverse Multi-Label Learning
10 0.057217196 15 nips-2010-A Theory of Multiclass Boosting
11 0.055952031 52 nips-2010-Convex Multiple-Instance Learning by Estimating Likelihood Ratio
12 0.054854859 24 nips-2010-Active Learning Applied to Patient-Adaptive Heartbeat Classification
13 0.054137919 173 nips-2010-Multi-View Active Learning in the Non-Realizable Case
14 0.053774267 94 nips-2010-Feature Set Embedding for Incomplete Data
15 0.049343556 31 nips-2010-An analysis on negative curvature induced by singularity in multi-layer neural-network learning
16 0.045701481 86 nips-2010-Exploiting weakly-labeled Web images to improve object classification: a domain adaptation approach
17 0.042242471 260 nips-2010-Sufficient Conditions for Generating Group Level Sparsity in a Robust Minimax Framework
18 0.04194406 22 nips-2010-Active Estimation of F-Measures
19 0.041842248 135 nips-2010-Label Embedding Trees for Large Multi-Class Tasks
20 0.040402781 104 nips-2010-Generative Local Metric Learning for Nearest Neighbor Classification
topicId topicWeight
[(0, 0.126), (1, 0.04), (2, 0.041), (3, -0.056), (4, 0.03), (5, 0.063), (6, -0.08), (7, -0.017), (8, -0.051), (9, -0.069), (10, -0.013), (11, 0.01), (12, 0.007), (13, -0.027), (14, -0.045), (15, 0.036), (16, 0.034), (17, 0.154), (18, -0.034), (19, -0.016), (20, -0.047), (21, -0.022), (22, -0.11), (23, -0.026), (24, 0.023), (25, 0.01), (26, 0.03), (27, 0.143), (28, -0.054), (29, -0.062), (30, 0.218), (31, -0.18), (32, -0.042), (33, -0.117), (34, -0.129), (35, -0.112), (36, 0.159), (37, -0.001), (38, -0.219), (39, -0.206), (40, -0.229), (41, -0.178), (42, -0.051), (43, 0.093), (44, -0.106), (45, -0.081), (46, -0.039), (47, -0.265), (48, 0.109), (49, -0.005)]
simIndex simValue paperId paperTitle
same-paper 1 0.92211699 175 nips-2010-Multiparty Differential Privacy via Aggregation of Locally Trained Classifiers
Author: Manas Pathak, Shantanu Rane, Bhiksha Raj
Abstract: As increasing amounts of sensitive personal information finds its way into data repositories, it is important to develop analysis mechanisms that can derive aggregate information from these repositories without revealing information about individual data instances. Though the differential privacy model provides a framework to analyze such mechanisms for databases belonging to a single party, this framework has not yet been considered in a multi-party setting. In this paper, we propose a privacy-preserving protocol for composing a differentially private aggregate classifier using classifiers trained locally by separate mutually untrusting parties. The protocol allows these parties to interact with an untrusted curator to construct additive shares of a perturbed aggregate classifier. We also present a detailed theoretical analysis containing a proof of differential privacy of the perturbed aggregate classifier and a bound on the excess risk introduced by the perturbation. We verify the bound with an experimental evaluation on a real dataset. 1
2 0.80241835 216 nips-2010-Probabilistic Inference and Differential Privacy
Author: Oliver Williams, Frank Mcsherry
Abstract: We identify and investigate a strong connection between probabilistic inference and differential privacy, the latter being a recent privacy definition that permits only indirect observation of data through noisy measurement. Previous research on differential privacy has focused on designing measurement processes whose output is likely to be useful on its own. We consider the potential of applying probabilistic inference to the measurements and measurement process to derive posterior distributions over the data sets and model parameters thereof. We find that probabilistic inference can improve accuracy, integrate multiple observations, measure uncertainty, and even provide posterior distributions over quantities that were not directly measured. 1
3 0.35247204 272 nips-2010-Towards Holistic Scene Understanding: Feedback Enabled Cascaded Classification Models
Author: Congcong Li, Adarsh Kowdle, Ashutosh Saxena, Tsuhan Chen
Abstract: In many machine learning domains (such as scene understanding), several related sub-tasks (such as scene categorization, depth estimation, object detection) operate on the same raw data and provide correlated outputs. Each of these tasks is often notoriously hard, and state-of-the-art classifiers already exist for many subtasks. It is desirable to have an algorithm that can capture such correlation without requiring to make any changes to the inner workings of any classifier. We propose Feedback Enabled Cascaded Classification Models (FE-CCM), that maximizes the joint likelihood of the sub-tasks, while requiring only a ‘black-box’ interface to the original classifier for each sub-task. We use a two-layer cascade of classifiers, which are repeated instantiations of the original ones, with the output of the first layer fed into the second layer as input. Our training method involves a feedback step that allows later classifiers to provide earlier classifiers information about what error modes to focus on. We show that our method significantly improves performance in all the sub-tasks in two different domains: (i) scene understanding, where we consider depth estimation, scene categorization, event categorization, object detection, geometric labeling and saliency detection, and (ii) robotic grasping, where we consider grasp point detection and object classification. 1
4 0.3488867 24 nips-2010-Active Learning Applied to Patient-Adaptive Heartbeat Classification
Author: Jenna Wiens, John V. Guttag
Abstract: While clinicians can accurately identify different types of heartbeats in electrocardiograms (ECGs) from different patients, researchers have had limited success in applying supervised machine learning to the same task. The problem is made challenging by the variety of tasks, inter- and intra-patient differences, an often severe class imbalance, and the high cost of getting cardiologists to label data for individual patients. We address these difficulties using active learning to perform patient-adaptive and task-adaptive heartbeat classification. When tested on a benchmark database of cardiologist annotated ECG recordings, our method had considerably better performance than other recently proposed methods on the two primary classification tasks recommended by the Association for the Advancement of Medical Instrumentation. Additionally, our method required over 90% less patient-specific training data than the methods to which we compared it. 1
5 0.30902204 125 nips-2010-Inference and communication in the game of Password
Author: Yang Xu, Charles Kemp
Abstract: Communication between a speaker and hearer will be most efficient when both parties make accurate inferences about the other. We study inference and communication in a television game called Password, where speakers must convey secret words to hearers by providing one-word clues. Our working hypothesis is that human communication is relatively efficient, and we use game show data to examine three predictions. First, we predict that speakers and hearers are both considerate, and that both take the other’s perspective into account. Second, we predict that speakers and hearers are calibrated, and that both make accurate assumptions about the strategy used by the other. Finally, we predict that speakers and hearers are collaborative, and that they tend to share the cognitive burden of communication equally. We find evidence in support of all three predictions, and demonstrate in addition that efficient communication tends to break down when speakers and hearers are placed under time pressure.
6 0.29501072 173 nips-2010-Multi-View Active Learning in the Non-Realizable Case
7 0.28727648 132 nips-2010-Joint Cascade Optimization Using A Product Of Boosted Classifiers
8 0.28543541 52 nips-2010-Convex Multiple-Instance Learning by Estimating Likelihood Ratio
9 0.27712563 94 nips-2010-Feature Set Embedding for Incomplete Data
10 0.2740244 89 nips-2010-Factorized Latent Spaces with Structured Sparsity
11 0.27055201 15 nips-2010-A Theory of Multiclass Boosting
12 0.26930377 31 nips-2010-An analysis on negative curvature induced by singularity in multi-layer neural-network learning
13 0.25379601 192 nips-2010-Online Classification with Specificity Constraints
14 0.24901913 284 nips-2010-Variational bounds for mixed-data factor analysis
15 0.23209655 282 nips-2010-Variable margin losses for classifier design
16 0.23138791 205 nips-2010-Permutation Complexity Bound on Out-Sample Error
17 0.2243751 9 nips-2010-A New Probabilistic Model for Rank Aggregation
18 0.2217242 86 nips-2010-Exploiting weakly-labeled Web images to improve object classification: a domain adaptation approach
19 0.22119188 62 nips-2010-Discriminative Clustering by Regularized Information Maximization
20 0.21487407 278 nips-2010-Universal Consistency of Multi-Class Support Vector Classification
topicId topicWeight
[(13, 0.037), (17, 0.011), (27, 0.06), (30, 0.049), (35, 0.02), (45, 0.15), (50, 0.059), (52, 0.019), (60, 0.029), (77, 0.033), (78, 0.018), (81, 0.349), (90, 0.074)]
simIndex simValue paperId paperTitle
same-paper 1 0.73824018 175 nips-2010-Multiparty Differential Privacy via Aggregation of Locally Trained Classifiers
Author: Manas Pathak, Shantanu Rane, Bhiksha Raj
Abstract: As increasing amounts of sensitive personal information finds its way into data repositories, it is important to develop analysis mechanisms that can derive aggregate information from these repositories without revealing information about individual data instances. Though the differential privacy model provides a framework to analyze such mechanisms for databases belonging to a single party, this framework has not yet been considered in a multi-party setting. In this paper, we propose a privacy-preserving protocol for composing a differentially private aggregate classifier using classifiers trained locally by separate mutually untrusting parties. The protocol allows these parties to interact with an untrusted curator to construct additive shares of a perturbed aggregate classifier. We also present a detailed theoretical analysis containing a proof of differential privacy of the perturbed aggregate classifier and a bound on the excess risk introduced by the perturbation. We verify the bound with an experimental evaluation on a real dataset. 1
Author: Li-jia Li, Hao Su, Li Fei-fei, Eric P. Xing
Abstract: Robust low-level image features have been proven to be effective representations for a variety of visual recognition tasks such as object recognition and scene classification; but pixels, or even local image patches, carry little semantic meanings. For high level visual tasks, such low-level image representations are potentially not enough. In this paper, we propose a high-level image representation, called the Object Bank, where an image is represented as a scale-invariant response map of a large number of pre-trained generic object detectors, blind to the testing dataset or visual task. Leveraging on the Object Bank representation, superior performances on high level visual recognition tasks can be achieved with simple off-the-shelf classifiers such as logistic regression and linear SVM. Sparsity algorithms make our representation more efficient and scalable for large scene datasets, and reveal semantically meaningful feature patterns.
3 0.48085138 178 nips-2010-Multivariate Dyadic Regression Trees for Sparse Learning Problems
Author: Han Liu, Xi Chen
Abstract: We propose a new nonparametric learning method based on multivariate dyadic regression trees (MDRTs). Unlike traditional dyadic decision trees (DDTs) or classification and regression trees (CARTs), MDRTs are constructed using penalized empirical risk minimization with a novel sparsity-inducing penalty. Theoretically, we show that MDRTs can simultaneously adapt to the unknown sparsity and smoothness of the true regression functions, and achieve the nearly optimal rates of convergence (in a minimax sense) for the class of (α, C)-smooth functions. Empirically, MDRTs can simultaneously conduct function estimation and variable selection in high dimensions. To make MDRTs applicable for large-scale learning problems, we propose a greedy heuristics. The superior performance of MDRTs are demonstrated on both synthetic and real datasets. 1
4 0.48069519 117 nips-2010-Identifying graph-structured activation patterns in networks
Author: James Sharpnack, Aarti Singh
Abstract: We consider the problem of identifying an activation pattern in a complex, largescale network that is embedded in very noisy measurements. This problem is relevant to several applications, such as identifying traces of a biochemical spread by a sensor network, expression levels of genes, and anomalous activity or congestion in the Internet. Extracting such patterns is a challenging task specially if the network is large (pattern is very high-dimensional) and the noise is so excessive that it masks the activity at any single node. However, typically there are statistical dependencies in the network activation process that can be leveraged to fuse the measurements of multiple nodes and enable reliable extraction of highdimensional noisy patterns. In this paper, we analyze an estimator based on the graph Laplacian eigenbasis, and establish the limits of mean square error recovery of noisy patterns arising from a probabilistic (Gaussian or Ising) model based on an arbitrary graph structure. We consider both deterministic and probabilistic network evolution models, and our results indicate that by leveraging the network interaction structure, it is possible to consistently recover high-dimensional patterns even when the noise variance increases with network size. 1
5 0.47985765 51 nips-2010-Construction of Dependent Dirichlet Processes based on Poisson Processes
Author: Dahua Lin, Eric Grimson, John W. Fisher
Abstract: We present a novel method for constructing dependent Dirichlet processes. The approach exploits the intrinsic relationship between Dirichlet and Poisson processes in order to create a Markov chain of Dirichlet processes suitable for use as a prior over evolving mixture models. The method allows for the creation, removal, and location variation of component models over time while maintaining the property that the random measures are marginally DP distributed. Additionally, we derive a Gibbs sampling algorithm for model inference and test it on both synthetic and real data. Empirical results demonstrate that the approach is effective in estimating dynamically varying mixture models. 1
6 0.47827211 238 nips-2010-Short-term memory in neuronal networks through dynamical compressed sensing
7 0.47625029 132 nips-2010-Joint Cascade Optimization Using A Product Of Boosted Classifiers
8 0.47502697 199 nips-2010-Optimal learning rates for Kernel Conjugate Gradient regression
9 0.47472101 282 nips-2010-Variable margin losses for classifier design
10 0.4737756 87 nips-2010-Extended Bayesian Information Criteria for Gaussian Graphical Models
11 0.47091511 63 nips-2010-Distributed Dual Averaging In Networks
12 0.4705449 80 nips-2010-Estimation of Renyi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs
13 0.46933603 260 nips-2010-Sufficient Conditions for Generating Group Level Sparsity in a Robust Minimax Framework
14 0.46923319 148 nips-2010-Learning Networks of Stochastic Differential Equations
15 0.46915838 158 nips-2010-Learning via Gaussian Herding
16 0.4689756 250 nips-2010-Spectral Regularization for Support Estimation
17 0.46843877 155 nips-2010-Learning the context of a category
18 0.46839619 280 nips-2010-Unsupervised Kernel Dimension Reduction
19 0.46829391 70 nips-2010-Efficient Optimization for Discriminative Latent Class Models
20 0.46803787 265 nips-2010-The LASSO risk: asymptotic results and real world examples