nips nips2008 nips2008-159 knowledge-graph by maker-knowledge-mining

159 nips-2008-On Bootstrapping the ROC Curve


Source: pdf

Author: Patrice Bertail, Stéphan J. Clémençon, Nicolas Vayatis

Abstract: This paper is devoted to thoroughly investigating how to bootstrap the ROC curve, a widely used visual tool for evaluating the accuracy of test/scoring statistics in the bipartite setup. The issue of confidence bands for the ROC curve is considered and a resampling procedure based on a smooth version of the empirical distribution called the "smoothed bootstrap" is introduced. Theoretical arguments and simulation results are presented to show that the "smoothed bootstrap" is preferable to a "naive" bootstrap in order to construct accurate confidence bands.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: This paper is devoted to thoroughly investigating how to bootstrap the ROC curve, a widely used visual tool for evaluating the accuracy of test/scoring statistics in the bipartite setup. [sent-6, score-0.535]

2 The issue of confidence bands for the ROC curve is considered and a resampling procedure based on a smooth version of the empirical distribution called the "smoothed bootstrap" is introduced. [sent-7, score-0.458]

3 Theoretical arguments and simulation results are presented to show that the "smoothed bootstrap" is preferable to a "naive" bootstrap in order to construct accurate confidence bands. [sent-8, score-0.505]

4 The latter consists of determining, based on training data, a test statistic s(X) (also called a scoring function) with a ROC curve "as high as possible" at all points of the ROC space. [sent-11, score-0.291]

5 Given a candidate s(X), it is thus of prime importance to assess its performance by computing a confidence band for the corresponding ROC curve, preferably in a data-driven fashion. [sent-12, score-0.079]

6 Indeed, in such a functional setup, resampling-based procedures should naturally be preferred to those relying on computing/simulating the (Gaussian) limiting distribution, as first observed in [19, 21, 20], where the use of the bootstrap is promoted for building confidence bands in the ROC space. [sent-13, score-0.68]

7 Building on recent works [17, 12], the purpose of this paper is to investigate how the bootstrap approach should be practically implemented, based on a thorough analysis of the asymptotic properties of empirical ROC curves. [sent-14, score-0.618]

8 Beyond the pointwise analysis developed in the studies mentioned above, here we tackle the problem from a functional angle, considering the entire ROC curve or parts of it. [sent-15, score-0.212]

9 This viewpoint indeed appears as particularly relevant in scoring applications. [sent-16, score-0.1]

10 Although the asymptotic results established in this paper are of a theoretical nature, they are highly meaningful from a computational perspective. [sent-17, score-0.084]

11 It turns out indeed that smoothing is the key ingredient for the bootstrap confidence band to be accurate, whereas a naive bootstrap approach would yield bands of low coverage probability in this case and should consequently be avoided by practitioners for analyzing ROC curves. [sent-18, score-1.403]

12 The smoothed bootstrap algorithm is presented in Section 3, together with the theoretical results establishing its asymptotic accuracy as well as preliminary simulation results illustrating the impact of smoothing on the bootstrap performance. [sent-22, score-1.271]

13 In Section 4, the gain in terms of convergence rate acquired by the smoothing step is thoroughly discussed. [sent-23, score-0.13]

14 2 Background. Here we briefly recall basic concepts of the bipartite ranking problem as well as key results related to the statistical estimation of ROC curves. [sent-25, score-0.086]

15 We also set out the notations that shall be needed throughout the paper. [sent-26, score-0.061]

16 1 Assumptions and notation. In the bipartite ranking problem, the goal is to order all the elements X of a set X by degree of relevance, when relevance may be observed through some binary indicator variable Y. [sent-29, score-0.086]

17 The challenge is thus to build a scoring function s : X → R from sampling data, so as to rank the observations x by increasing order of their score s(x) as accurately as possible: the higher the score s(X) is, the more likely one should observe Y = +1. [sent-34, score-0.073]

18 A standard way of measuring the ranking performance consists of plotting the ROC curve, namely the graph of the mapping ROC_s : α ∈ (0, 1) → 1 − G_s ∘ H_s^{-1}(1 − α), where G_s (respectively H_s) denotes the cdf of s(X) conditioned on Y = +1 (resp. [sent-36, score-0.149]

19 conditioned on Y = −1) and F^{-1}(α) = inf{x ∈ R : F(x) ≥ α} denotes the generalized inverse of any cdf F on R. [sent-37, score-0.072]
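
To make the definition concrete, here is a minimal numpy sketch (ours, not the authors'; the function name and interface are illustrative) that evaluates the empirical ROC curve ROC(α) = 1 − G(H^{-1}(1 − α)) from scores and binary labels:

```python
import numpy as np

def empirical_roc(scores, labels, alphas):
    """Empirical ROC curve evaluated at the false-positive rates `alphas`."""
    pos = np.sort(scores[labels == +1])    # class-conditional sample for G
    neg = scores[labels == -1]             # class-conditional sample for H
    # empirical quantile of the negative scores, approximating H^{-1}(1 - alpha)
    thresholds = np.quantile(neg, 1.0 - np.asarray(alphas))
    # 1 - G(t): proportion of positive scores strictly above each threshold
    return 1.0 - np.searchsorted(pos, thresholds, side="right") / pos.size
```

Note that np.quantile interpolates between order statistics rather than taking the generalized inverse exactly; for a sketch the difference is immaterial.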

20 It boils down to plotting the true positive rate versus the false positive rate when testing the assumption "H_0 : Y = −1" based on the statistic s(X). [sent-38, score-0.228]

21 ∀α ∈ (0, 1), ROC_η(α) ≥ ROC_s(α), for any scoring function s(x). [sent-41, score-0.073]

22 Practical learning strategies for selecting a good scoring function are based on training data Dn = {(Xi , Yi )}1≤i≤n and should thus rely on accurate empirical estimates of the true ROC curves. [sent-43, score-0.157]

23 In order to obtain smoothed versions G̃_s(x) and F̃_s(x) of the latter cdfs, a typical choice consists of picking instead a function K̃(u) of the form ∫_{v≥0} K_h(u − v) dv, with K_h(u) = h^{-1}·K(h^{-1}·u), where K ≥ 0 is a regularizing Parzen-Rosenblatt kernel (i.e. [sent-46, score-0.222]

24 a bounded, square-integrable function such that ∫ K(v) dv = 1) and h > 0 is the smoothing bandwidth; see Remark 1 for a practical view of smoothing. [sent-48, score-0.06]
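
As a concrete illustration of this smoothing step, the sketch below (our construction under a Gaussian-kernel assumption) uses the fact that the integrated Gaussian kernel is the standard normal cdf Φ, so the smoothed cdf is F̃(x) = n^{-1} Σ_i Φ((x − X_i)/h); the n^{-1/5} bandwidth rate follows Remark 2, while the scale factor is an assumption:

```python
import numpy as np
from scipy.stats import norm

def smoothed_cdf(sample):
    """Kernel-smoothed empirical cdf with a Gaussian kernel."""
    sample = np.asarray(sample, dtype=float)
    h = sample.std() * sample.size ** (-1.0 / 5.0)   # bandwidth ~ n^{-1/5}
    def cdf(x):
        u = (np.asarray(x, dtype=float)[..., None] - sample) / h
        return norm.cdf(u).mean(axis=-1)             # (1/n) sum_i Phi((x - X_i)/h)
    return cdf
```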

25 When it comes to measuring closeness between curves in the ROC space, various metrics may be used, see [9]. [sent-51, score-0.082]

26 Viewing the ROC space as a subset of the Skorohod space D([0, 1]) of càdlàg functions f : [0, 1] → R, the standard metric is the one induced by the sup norm ‖·‖∞. [sent-52, score-0.107]

27 As shall be seen below, asymptotic arguments for grounding the bootstrapping of the empirical ROC curve fluctuations, when measured in terms of the sup norm ‖·‖∞, are available. [sent-54, score-0.47]

28 However, given the geometry of empirical ROC curves, this metric is not always convenient for our purpose and may produce very wide, and thus non-informative, confidence bands. [sent-56, score-0.084]

29 For analyzing stepwise graphs, such as empirical ROC curves, we shall consider the closely related pseudo-metric defined as follows: ∀(f_1, f_2) ∈ D([0, 1])², d_B(f_1, f_2) = sup_{t∈[0,1]} d_B(f_1, f_2; t), where d_B(f_1, f_2; t) = min{|f_1(t) − f_2(t)|, |f_2^{-1} ∘ f_1(t) − t|, |f_1^{-1} ∘ f_2(t) − t|}. [sent-57, score-0.202]

30 The major advantage of considering this pseudo-metric is that it provides control over the vertical and horizontal jumps of ROC curves at the same time, treating both types of error in a symmetric fashion. [sent-59, score-0.08]

31 Equipped with this pseudo-metric, two piecewise constant ROC curves may be close to each other, even if their jumps do not exactly match. [sent-60, score-0.08]

32 This is clearly appropriate for describing the fluctuations of the empirical ROC curve (and the deviation between the latter and its bootstrap counterpart as well). [sent-61, score-0.732]

33 This way, d_B permits the construction of bands of reasonable size, well adapted to the stepwise shape of empirical ROC curves, with better coverage probabilities. [sent-62, score-0.353]
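
For illustration, a grid-based sketch of d_B (ours; the supremum and the generalized inverses are approximated on a finite grid, and the curves are assumed nondecreasing, as ROC curves are):

```python
import numpy as np

def generalized_inverse(grid, values, y):
    """F^{-1}(y) = inf{t in grid : F(t) >= y}, for nondecreasing `values`."""
    idx = np.searchsorted(values, y, side="left")
    return grid[np.clip(idx, 0, grid.size - 1)]

def d_B(grid, f1, f2):
    """Grid approximation of the pseudo-metric d_B between two curves."""
    vertical = np.abs(f1 - f2)
    horiz_12 = np.abs(generalized_inverse(grid, f2, f1) - grid)  # |f2^{-1}(f1(t)) - t|
    horiz_21 = np.abs(generalized_inverse(grid, f1, f2) - grid)  # |f1^{-1}(f2(t)) - t|
    return np.min(np.stack([vertical, horiz_12, horiz_21]), axis=0).max()
```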

34 Finally, denote by P the joint distribution of (Z, Y) on R × {−1, +1} and by P_n its empirical version based on the sample D_n = {(Z_i, Y_i)}_{1≤i≤n}. [sent-71, score-0.067]

35 This (Gaussian) approximation plays a crucial role in understanding the asymptotic behavior of the empirical ROC curve and of its bootstrap counterpart. [sent-75, score-0.784]

36 H1 The slope of the ROC curve is bounded: sup_{α∈[0,1]} g(H^{-1}(α))/h(H^{-1}(α)) < ∞. [sent-77, score-0.142]

37 Then, (i) the empirical ROC curve is strongly consistent: sup_{α} |ROC_n(α) − ROC(α)| → 0 a.s. [sent-82, score-0.278]

38 ρ_1(γ) = γ, ρ_2(γ) = γ − 1 + ε (ε > 0) if γ > 1. These results may be immediately derived from classical strong approximations for the empirical and quantile processes; see [5, 18]. [sent-85, score-0.165]

39 Incidentally, we mention that the approximation rate is not always log²(n)/√n, contrary to what is claimed in [18]. [sent-86, score-0.075]

40 To avoid explicit computation of density estimates, bootstrap confidence sets should certainly be preferred in practice. [sent-88, score-0.485]

41 We shall also consider d*_n = √n·d_B(ROC*, ROC_n), (3) whose random fluctuations, given D_n, are expected to mimic those of d_n = √n·d_B(ROC_n, ROC). [sent-94, score-0.227]

42 Firstly, the target of the bootstrap procedure is here a distribution on a path space, the ROC space being viewed as a subspace of D([0, 1]) equipped with either the sup norm ‖·‖∞ or the pseudo-metric d_B. [sent-96, score-0.517]

43 Secondly, both r_n and d_n are functionals of the quantile process {H^{-1}(α)}_{α∈[0,1]}. [sent-100, score-0.347]

44 The naive bootstrap (i.e. resampling from the raw empirical distribution) generally provides bad approximations of the distribution of empirical quantiles in practice: the rate of convergence for a given quantile is indeed of order O_P(n^{-1/4}), see [7], whereas the rate of the Gaussian approximation is n^{-1/2}. [sent-103, score-0.437]

45 In a similar fashion to what is generally recommended for empirical quantiles, we suggest implementing a smoothed version of the bootstrap algorithm in order to improve the approximation rate of the distribution of ‖r_n‖∞ (respectively, of d_n). [sent-105, score-0.989]

46 In short, this boils down to resampling the data from a smoothed version of the empirical distribution P_n. [sent-106, score-0.287]
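
As a toy illustration of this point (our construction, not the paper's experiment): with a Gaussian kernel, drawing from the smoothed empirical distribution amounts to a naive resample plus kernel-distributed noise, so the two schemes differ by one line of code:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)
h = x.std() * x.size ** (-1.0 / 5.0)   # Gaussian-kernel bandwidth (Remark 2 rate)

# naive bootstrap draw: resample the data with replacement
naive = [np.median(rng.choice(x, size=x.size, replace=True)) for _ in range(2000)]
# smoothed bootstrap draw: naive draw plus kernel noise of scale h
smooth = [np.median(rng.choice(x, size=x.size, replace=True)
                    + h * rng.normal(size=x.size)) for _ in range(2000)]
```

Comparing the two Monte-Carlo distributions of the median shows why smoothing helps for quantile-type functionals: the naive replicates are supported on the observed data values only, whereas the smoothed replicates vary continuously.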

47 1 The Algorithm. Here we describe the algorithm for building a confidence band at level 1 − ε in the ROC space from sampling data D_n = {(Z_i, Y_i); 1 ≤ i ≤ n}. [sent-108, score-0.061]

48 Based on D_n, compute the empirical class cdf estimates G and H, as well as their smoothed versions G̃ and H̃. [sent-112, score-0.343]

49 Plot the ROC curve estimate: ROC(α) = 1 − G ∘ H^{-1}(1 − α), α ∈ [0, 1]. [sent-113, score-0.142]

50 From the smooth distribution estimate P̃_n(dz, y) = (n_+/n)·I{y=+1}·G̃(dz) + (n_−/n)·I{y=−1}·H̃(dz), draw a bootstrap sample D*_n = {(Z*_i, Y*_i)}_{1≤i≤n} conditioned on D_n. [sent-115, score-0.522]

51 Based on D*_n, compute the bootstrap versions G* and H* of the empirical class cdf estimates. [sent-117, score-0.671]

52 Plot the bootstrap ROC curve ROC*(α) = 1 − G* ∘ H*^{-1}(1 − α), α ∈ [0, 1]. [sent-118, score-0.627]

53 Finally, get the bootstrap confidence bands at level 1 − ε, defined by the ball of center ROC and radius δ_ε/√n in D([0, 1]), where δ_ε is defined by P*(‖r*_n‖∞ ≤ δ_ε) = 1 − ε in the case of the sup norm, or by P*(d*_n ≤ δ_ε) = 1 − ε when considering the d_B distance, denoting by P*(·) the bootstrap probability conditioned on D_n. [sent-120, score-0.745]
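
Putting the five steps together, a condensed sketch under stated assumptions (Gaussian kernel, sup-norm radius, and class sizes held fixed across resamples as a simplification of drawing from P̃_n; it reuses the illustrative `empirical_roc` helper sketched earlier):

```python
import numpy as np

def smoothed_bootstrap_band(scores, labels, alphas, B=500, eps=0.05, seed=0):
    """Sup-norm smoothed-bootstrap confidence band at level 1 - eps."""
    rng = np.random.default_rng(seed)
    n = scores.size
    roc_hat = empirical_roc(scores, labels, alphas)            # step 1
    pos, neg = scores[labels == +1], scores[labels == -1]
    h_pos = pos.std() * pos.size ** (-1.0 / 5.0)               # Remark 2 rate
    h_neg = neg.std() * neg.size ** (-1.0 / 5.0)
    sup_devs = np.empty(B)
    for b in range(B):
        # step 2: smoothed resampling = naive resample + Gaussian kernel noise
        pos_star = rng.choice(pos, pos.size) + h_pos * rng.normal(size=pos.size)
        neg_star = rng.choice(neg, neg.size) + h_neg * rng.normal(size=neg.size)
        s_star = np.concatenate([pos_star, neg_star])
        y_star = np.concatenate([np.ones(pos.size), -np.ones(neg.size)])
        roc_star = empirical_roc(s_star, y_star, alphas)       # steps 3-4
        sup_devs[b] = np.sqrt(n) * np.abs(roc_star - roc_hat).max()
    delta = np.quantile(sup_devs, 1.0 - eps)                   # step 5: delta_eps
    return roc_hat - delta / np.sqrt(n), roc_hat + delta / np.sqrt(n)
```

The d_B variant of step 5 would replace the sup deviation by √n·d_B(ROC*, ROC), computed with a routine such as the `d_B` sketch above.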

54 Remark 1 (MONTE-CARLO APPROXIMATION) From a computational angle, the true smoothed bootstrap distribution must be approximated in its turn, using a Monte-Carlo approximation scheme. [sent-123, score-0.666]

55 Regarding the choice of the number of bootstrap replications, picking B = n does not modify the rate of convergence. [sent-125, score-0.536]

56 However, choosing B of magnitude comparable to n so that (1 + B)·ε is an integer may be more appropriate: the ε-quantile of the approximate bootstrap distribution is then uniquely defined, and this will not modify the rate of convergence either; see [15]. [sent-126, score-0.536]

57 Remark 2 (ON TUNING PARAMETERS) The primary tuning parameters of the Algorithm are those related to the smoothing stage. [sent-127, score-0.06]

58 When using a Gaussian regularizing kernel, one should typically choose a bandwidth h_n of order n^{-1/5} in order to minimize the mean square error. [sent-128, score-0.082]

59 Remark 3 (ON RECENTERING) From the asymptotic analysis viewpoint, it would be fairly equivalent to recenter by a smoothed version of the original empirical curve ROC(·). [sent-129, score-0.432]

60 However, numerically speaking, computing the sup norm of the estimate (2) is much more tractable, insofar as it solely requires evaluating the distance between piecewise constant curves over the pooled set of jump points. [sent-132, score-0.186]

61 It should also be noticed that smoothing the original curve, as proposed in [17], should be avoided in practice, since it hides the jump locations, which constitute the essential part of the information. [sent-133, score-0.112]

62 2 Asymptotic analysis. We now investigate the accuracy of the bootstrap estimate output by the Algorithm. [sent-135, score-0.485]

63 The functional nature of the approximation result below is essential, since it should be emphasized that, in most ranking applications, assessing the uncertainty about the whole estimated ROC curve, or at least some part of it, is what really matters. [sent-137, score-0.104]

64 Assume further that smoothed versions of the cdfs G and H are computed at step 1 using a scaled kernel K_{h_n}(u), with h_n ↓ 0 as n → ∞ in such a way that n·h_n³ → ∞ and n·h_n⁵·log²(n) → 0. [sent-143, score-0.212]

65 Although its rate is slower than that of the Gaussian approximation (1), the smoothed bootstrap method remains very appealing from a computational perspective, the construction of confidence bands from simulated Brownian bridges being very difficult to implement in practice. [sent-145, score-0.931]

66 As shall be seen below, the rate reached by the smoothed bootstrap distribution is nevertheless a great improvement, compared to the naive bootstrap approach (see the discussion below). [sent-146, score-1.27]

67 For instance, it could be applied to summary statistics involving a specific piece of the ROC curve only, in order to focus on the "best instances" [3], or more classically to the area under the ROC curve (AUC). [sent-148, score-0.284]

68 However, in the latter case, due to the fact that this particular summary statistic is of the form of a U-statistic [2], the naive bootstrap rate (of order n^{-1}) is faster than the one we obtained here. [sent-149, score-0.664]

69 3 Simulation results. The striking advantage of the smoothed bootstrap is the improved rate of convergence of the resulting estimator. [sent-151, score-0.693]

70 Furthermore, choosing d_B for measuring the order of magnitude of curve fluctuations has an even larger impact on the accuracy of the empirical bands. [sent-152, score-0.209]

71 As an illustration of this theoretical result, we now display simulation results, emphasizing the gain acquired by smoothing and considering the pseudo-metric dB . [sent-153, score-0.078]

72 We present confidence bands for a single trajectory and the estimation of the coverage probability of the bands for a simple binormal model: Y_i = +1 if β_0 + β_1·X + ε > 0, and Y_i = −1 otherwise, where ε and X are independent standard normal r.v.'s. [sent-154, score-0.43]

73 In this example, the scoring function s(x) is the maximum likelihood estimator of the probit model on the training set. [sent-157, score-0.073]
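
A sketch of this simulation design (β_0, β_1 and the sample size are free parameters here; the paper's exact values are not recoverable from this extraction, and for simplicity the true linear score β_0 + β_1·x stands in for the probit maximum-likelihood fit):

```python
import numpy as np

def simulate_binormal(n, beta0=0.0, beta1=1.0, seed=0):
    """Draw (scores, labels) from the binormal design described above."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    eps = rng.normal(size=n)
    y = np.where(beta0 + beta1 * x + eps > 0, 1, -1)
    scores = beta0 + beta1 * x   # stand-in for the probit MLE score s(x)
    return scores, y
```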

74 The sup norm ‖·‖∞ yields very large bands with coverage probability close to 1! [sent-162, score-0.26]

75 Though still large, bands based on the pseudo-metric dB are clearly much more informative (see Fig. [sent-163, score-0.17]

76 It should be noticed that the coverage improvement obtained by smoothing is clearer in the pointwise estimation setup (here α = 0. [sent-165, score-0.168]

77 Table 1: Empirical coverage probabilities for 95% empirical bands/intervals. [sent-167, score-0.157]

78 METHOD (table rows): NAIVE BOOTSTRAP ‖r_n‖∞; SMOOTHED BOOTSTRAP ‖r_n‖∞; NAIVE BOOTSTRAP d_n; SMOOTHED BOOTSTRAP d_n; NAIVE BOOTSTRAP r_n(0. [sent-168, score-0.455]

79 Figure 1: ROC confidence bands. Figure 3: Pointwise smooth bootstrap confidence interval (axis: false positive rate). [sent-227, score-0.573]

80 4 Discussion. Let us now give an insight into the reason why the smoothed bootstrap procedure outperforms the bootstrap without smoothing. [sent-228, score-1.127]

81 In most statistical problems where the nonparametric bootstrap is useful, there is no particular reason for implementing it from a smoothed version of the empirical df rather than from the raw empirical distribution itself, see [22]. [sent-229, score-0.793]

82 However, in the present case, smoothing affects the rate of convergence. [sent-230, score-0.111]

83 Suppose indeed that the bootstrap process (2) is built by drawing from the raw cdfs G and H instead of their smoothed versions at step 2 of the Algorithm. [sent-231, score-0.707]

84 Hence, the naive bootstrap induces an error of order O(n^{-1/4}) which cannot be improved, whereas it may be shown that the rate n^{-2/5} is attained by the smoothed bootstrap (in a similar fashion to the functional setup), provided that the amount of smoothing is properly chosen. [sent-233, score-1.384]

85 A classical way of improving the pointwise approximation rate consists of bootstrapping a standardized version of the r.v. [sent-236, score-0.165]

86 It is natural to consider, as standardization factor, the square root of an estimate of the asymptotic variance: σ²(α) = var(z^{(n)}(α)) = [α(1 − α)·g(H^{-1}(1 − α))²] / [(1 − p)·h(H^{-1}(1 − α))²] + ROC(α)(1 − ROC(α))/p. (4) [sent-239, score-0.096]

87 An estimate σ²_n of plug-in type could be considered, obtained by plugging n_+/n, ROC and the smoothed density estimators h̃ and g̃ (the derivatives of the smoothed cdfs H̃ and G̃) into (4) instead of their (unknown) theoretical counterparts. [sent-240, score-0.175]
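
A hedged plug-in sketch of (4) as reconstructed above (ours; `gaussian_kde` with its default bandwidth stands in for the kernel density estimates g̃ and h̃, whereas the studentized discussion below calls for a bandwidth of order n^{-1/3}):

```python
import numpy as np
from scipy.stats import gaussian_kde

def sigma2_plugin(scores, labels, alpha):
    """Plug-in estimate of the asymptotic variance (4) at a fixed alpha."""
    pos, neg = scores[labels == +1], scores[labels == -1]
    p_hat = pos.size / scores.size                 # estimate of p = P(Y = +1)
    q = np.quantile(neg, 1.0 - alpha)              # approximates H^{-1}(1 - alpha)
    g_tilde = gaussian_kde(pos)(q)[0]              # kernel density estimate of g at q
    h_tilde = gaussian_kde(neg)(q)[0]              # kernel density estimate of h at q
    roc_alpha = np.mean(pos > q)                   # empirical ROC(alpha)
    return (alpha * (1 - alpha) * g_tilde ** 2 / ((1 - p_hat) * h_tilde ** 2)
            + roc_alpha * (1 - roc_alpha) / p_hat)
```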

88 More interestingly, from a computational viewpoint, a bootstrap estimator of the variance could also be used. [sent-241, score-0.485]

89 Following the argument used in [17] for a smoothed original estimate of the ROC curve, one may show that a smoothed bootstrap of the studentized statistic r_n(α)/σ_n(α) yields a better pointwise rate of convergence than the 1/√n rate of the Gaussian approximation in the Central Limit Theorem. [sent-242, score-1.089]

90 Precisely, for a given α ∈ ]0, 1[, if the bandwidth used in the computation of σ²_n(α) is chosen of order n^{-1/3}, we have: sup_{t∈R} | P*(r*_n(α)/σ*_n(α) ≤ t) − P(r_n(α)/σ_n(α) ≤ t) | = O_P(n^{-2/3}), (5) denoting by σ*²_n(α) the bootstrap counterpart of σ²_n(α). [sent-243, score-0.776]

91 Notice that the bandwidth used in the standardization step (i. [sent-244, score-0.069]

92 This time, the smoothed (studentized) bootstrap method widely outperforms the Gaussian approach when the matter is to build confidence intervals for the ordinate ROC(α) of a point of abscissa α on the empirical ROC curve. [sent-248, score-0.709]

93 However, it is not clear yet whether this result remains true for confidence bands, when considering the whole ROC curve (this would actually require establishing an Edgeworth expansion for the supremum ‖r_n/σ_n‖∞). [sent-249, score-0.142]

94 On constructing accurate confidence bands for ROC curves through smooth resampling, http://hal. [sent-255, score-0.037]

95 Weak convergence of smoothed and nonsmoothed bootstrap quantile estimates. [sent-290, score-0.721]

96 The geometry of ROC space: understanding machine learning metrics through ROC isometrics. [sent-298, score-1.346]

97 Bayesian ROC curve estimation under binormality using a partial likelihood based on ranks. [sent-313, score-0.142]

98 Strong approximations for resample quantile process and application to ROC methodology. [sent-318, score-0.098]

99 On the number of bootstrap simulations required to construct a confidence interval. [sent-331, score-0.485]

100 Confidence bands for ROC curves: methods and an empirical study. [sent-352, score-0.237]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('roc', 0.663), ('bootstrap', 0.485), ('dn', 0.187), ('bands', 0.17), ('smoothed', 0.157), ('curve', 0.142), ('dence', 0.127), ('ootstrap', 0.105), ('db', 0.096), ('coverage', 0.09), ('rn', 0.081), ('quantile', 0.079), ('scoring', 0.073), ('cdf', 0.072), ('sup', 0.069), ('empirical', 0.067), ('asymptotic', 0.066), ('dz', 0.065), ('curves', 0.062), ('band', 0.061), ('smoothing', 0.06), ('uctuation', 0.06), ('moothed', 0.06), ('statistic', 0.059), ('ranking', 0.055), ('gs', 0.053), ('rocs', 0.052), ('con', 0.052), ('naive', 0.052), ('rate', 0.051), ('pn', 0.048), ('mencon', 0.048), ('bootstrapping', 0.045), ('pointwise', 0.045), ('macskassy', 0.045), ('resampling', 0.042), ('uctuations', 0.042), ('hs', 0.04), ('shall', 0.04), ('bandwidth', 0.039), ('remark', 0.038), ('smooth', 0.037), ('op', 0.036), ('jump', 0.034), ('cl', 0.033), ('equipped', 0.032), ('bipartite', 0.031), ('bertail', 0.03), ('ghosal', 0.03), ('ndb', 0.03), ('rocn', 0.03), ('skorohod', 0.03), ('standardization', 0.03), ('studentized', 0.03), ('versions', 0.03), ('yi', 0.029), ('kh', 0.029), ('viewpoint', 0.027), ('stepwise', 0.026), ('provost', 0.026), ('hn', 0.025), ('functional', 0.025), ('false', 0.024), ('hausdorff', 0.024), ('vayatis', 0.024), ('replications', 0.024), ('approximation', 0.024), ('zi', 0.023), ('brownian', 0.022), ('bridges', 0.022), ('plotting', 0.022), ('umr', 0.022), ('counterpart', 0.021), ('boils', 0.021), ('annals', 0.021), ('notations', 0.021), ('norm', 0.021), ('law', 0.021), ('publication', 0.02), ('quantiles', 0.02), ('metrics', 0.02), ('angle', 0.02), ('arguments', 0.02), ('thoroughly', 0.019), ('owing', 0.019), ('approximations', 0.019), ('jumps', 0.018), ('noticed', 0.018), ('regularizing', 0.018), ('diagnosis', 0.018), ('theoretical', 0.018), ('fashion', 0.018), ('drawing', 0.018), ('relevance', 0.018), ('dv', 0.017), ('latter', 0.017), ('raw', 0.017), ('metric', 0.017), ('estimates', 0.017), ('ful', 0.017)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 159 nips-2008-On Bootstrapping the ROC Curve

Author: Patrice Bertail, Stéphan J. Clémençon, Nicolas Vayatis

Abstract: This paper is devoted to thoroughly investigating how to bootstrap the ROC curve, a widely used visual tool for evaluating the accuracy of test/scoring statistics in the bipartite setup. The issue of confidence bands for the ROC curve is considered and a resampling procedure based on a smooth version of the empirical distribution called the "smoothed bootstrap" is introduced. Theoretical arguments and simulation results are presented to show that the "smoothed bootstrap" is preferable to a "naive" bootstrap in order to construct accurate confidence bands.

2 0.67855823 174 nips-2008-Overlaying classifiers: a practical approach for optimal ranking

Author: Stéphan J. Clémençon, Nicolas Vayatis

Abstract: ROC curves are one of the most widely used displays to evaluate performance of scoring functions. In the paper, we propose a statistical method for directly optimizing the ROC curve. The target is known to be the regression function up to an increasing transformation and this boils down to recovering the level sets of the latter. We propose to use classifiers obtained by empirical risk minimization of a weighted classification error and then to construct a scoring rule by overlaying these classifiers. We show the consistency and rate of convergence to the optimal ROC curve of this procedure in terms of supremum norm and also, as a byproduct of the analysis, we derive an empirical estimate of the optimal ROC curve. 1

3 0.28159666 72 nips-2008-Empirical performance maximization for linear rank statistics

Author: Stéphan J. Clémençon, Nicolas Vayatis

Abstract: The ROC curve is known to be the golden standard for measuring performance of a test/scoring statistic regarding its capacity of discrimination between two populations in a wide variety of applications, ranging from anomaly detection in signal processing to information retrieval, through medical diagnosis. Most practical performance measures used in scoring applications such as the AUC, the local AUC, the p-norm push, the DCG and others, can be seen as summaries of the ROC curve. This paper highlights the fact that many of these empirical criteria can be expressed as (conditional) linear rank statistics. We investigate the properties of empirical maximizers of such performance criteria and provide preliminary results for the concentration properties of a novel class of random variables that we will call a linear rank process. 1

4 0.069154121 82 nips-2008-Fast Computation of Posterior Mode in Multi-Level Hierarchical Models

Author: Liang Zhang, Deepak Agarwal

Abstract: Multi-level hierarchical models provide an attractive framework for incorporating correlations induced in a response variable that is organized hierarchically. Model fitting is challenging, especially for a hierarchy with a large number of nodes. We provide a novel algorithm based on a multi-scale Kalman filter that is both scalable and easy to implement. For Gaussian response, we show our method provides the maximum a-posteriori (MAP) parameter estimates; for non-Gaussian response, parameter estimation is performed through a Laplace approximation. However, the Laplace approximation provides biased parameter estimates that is corrected through a parametric bootstrap procedure. We illustrate through simulation studies and analyses of real world data sets in health care and online advertising.

5 0.053153936 142 nips-2008-Multi-Level Active Prediction of Useful Image Annotations for Recognition

Author: Sudheendra Vijayanarasimhan, Kristen Grauman

Abstract: We introduce a framework for actively learning visual categories from a mixture of weakly and strongly labeled image examples. We propose to allow the categorylearner to strategically choose what annotations it receives—based on both the expected reduction in uncertainty as well as the relative costs of obtaining each annotation. We construct a multiple-instance discriminative classifier based on the initial training data. Then all remaining unlabeled and weakly labeled examples are surveyed to actively determine which annotation ought to be requested next. After each request, the current classifier is incrementally updated. Unlike previous work, our approach accounts for the fact that the optimal use of manual annotation may call for a combination of labels at multiple levels of granularity (e.g., a full segmentation on some images and a present/absent flag on others). As a result, it is possible to learn more accurate category models with a lower total expenditure of manual annotation effort. 1

6 0.053147607 93 nips-2008-Global Ranking Using Continuous Conditional Random Fields

7 0.052972414 97 nips-2008-Hierarchical Fisher Kernels for Longitudinal Data

8 0.045255173 39 nips-2008-Bounding Performance Loss in Approximate MDP Homomorphisms

9 0.04276238 76 nips-2008-Estimation of Information Theoretic Measures for Continuous Random Variables

10 0.041778274 224 nips-2008-Structured ranking learning using cumulative distribution networks

11 0.039825097 111 nips-2008-Kernel Change-point Analysis

12 0.037661579 190 nips-2008-Reconciling Real Scores with Binary Comparisons: A New Logistic Based Model for Ranking

13 0.036785744 177 nips-2008-Particle Filter-based Policy Gradient in POMDPs

14 0.036416061 225 nips-2008-Supervised Bipartite Graph Inference

15 0.034707729 135 nips-2008-Model Selection in Gaussian Graphical Models: High-Dimensional Consistency of \boldmath$\ell 1$-regularized MLE

16 0.033679143 44 nips-2008-Characteristic Kernels on Groups and Semigroups

17 0.033473279 117 nips-2008-Learning Taxonomies by Dependence Maximization

18 0.03312853 173 nips-2008-Optimization on a Budget: A Reinforcement Learning Approach

19 0.033049244 78 nips-2008-Exact Convex Confidence-Weighted Learning

20 0.033020332 151 nips-2008-Non-parametric Regression Between Manifolds


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.125), (1, -0.016), (2, -0.077), (3, 0.038), (4, -0.053), (5, -0.059), (6, 0.435), (7, -0.139), (8, 0.28), (9, 0.409), (10, -0.1), (11, -0.178), (12, 0.1), (13, -0.064), (14, -0.098), (15, -0.051), (16, -0.127), (17, -0.005), (18, -0.228), (19, -0.168), (20, -0.059), (21, -0.009), (22, -0.016), (23, 0.069), (24, -0.071), (25, -0.004), (26, 0.054), (27, -0.018), (28, -0.095), (29, 0.07), (30, -0.072), (31, 0.051), (32, 0.013), (33, -0.045), (34, -0.02), (35, 0.007), (36, 0.074), (37, -0.004), (38, -0.05), (39, 0.031), (40, 0.069), (41, -0.027), (42, 0.0), (43, -0.011), (44, -0.003), (45, 0.012), (46, -0.026), (47, 0.021), (48, -0.039), (49, -0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.98524737 159 nips-2008-On Bootstrapping the ROC Curve

Author: Patrice Bertail, Stéphan J. Clémençon, Nicolas Vayatis

Abstract: This paper is devoted to thoroughly investigating how to bootstrap the ROC curve, a widely used visual tool for evaluating the accuracy of test/scoring statistics in the bipartite setup. The issue of confidence bands for the ROC curve is considered and a resampling procedure based on a smooth version of the empirical distribution called the "smoothed bootstrap" is introduced. Theoretical arguments and simulation results are presented to show that the "smoothed bootstrap" is preferable to a "naive" bootstrap in order to construct accurate confidence bands.

2 0.97297037 174 nips-2008-Overlaying classifiers: a practical approach for optimal ranking

Author: Stéphan J. Clémençon, Nicolas Vayatis

Abstract: ROC curves are one of the most widely used displays to evaluate performance of scoring functions. In the paper, we propose a statistical method for directly optimizing the ROC curve. The target is known to be the regression function up to an increasing transformation and this boils down to recovering the level sets of the latter. We propose to use classifiers obtained by empirical risk minimization of a weighted classification error and then to construct a scoring rule by overlaying these classifiers. We show the consistency and rate of convergence to the optimal ROC curve of this procedure in terms of supremum norm and also, as a byproduct of the analysis, we derive an empirical estimate of the optimal ROC curve. 1

3 0.77384257 72 nips-2008-Empirical performance maximization for linear rank statistics

Author: Stéphan J. Clémençon, Nicolas Vayatis

Abstract: The ROC curve is known to be the golden standard for measuring performance of a test/scoring statistic regarding its capacity of discrimination between two populations in a wide variety of applications, ranging from anomaly detection in signal processing to information retrieval, through medical diagnosis. Most practical performance measures used in scoring applications such as the AUC, the local AUC, the p-norm push, the DCG and others, can be seen as summaries of the ROC curve. This paper highlights the fact that many of these empirical criteria can be expressed as (conditional) linear rank statistics. We investigate the properties of empirical maximizers of such performance criteria and provide preliminary results for the concentration properties of a novel class of random variables that we will call a linear rank process. 1

4 0.23672849 39 nips-2008-Bounding Performance Loss in Approximate MDP Homomorphisms

Author: Jonathan Taylor, Doina Precup, Prakash Panagaden

Abstract: We define a metric for measuring behavior similarity between states in a Markov decision process (MDP), which takes action similarity into account. We show that the kernel of our metric corresponds exactly to the classes of states defined by MDP homomorphisms (Ravindran & Barto, 2003). We prove that the difference in the optimal value function of different states can be upper-bounded by the value of this metric, and that the bound is tighter than previous bounds provided by bisimulation metrics (Ferns et al. 2004, 2005). Our results hold both for discrete and for continuous actions. We provide an algorithm for constructing approximate homomorphisms, by using this metric to identify states that can be grouped together, as well as actions that can be matched. Previous research on this topic is based mainly on heuristics.

5 0.18226509 225 nips-2008-Supervised Bipartite Graph Inference

Author: Yoshihiro Yamanishi

Abstract: We formulate the problem of bipartite graph inference as a supervised learning problem, and propose a new method to solve it from the viewpoint of distance metric learning. The method involves the learning of two mappings of the heterogeneous objects to a unified Euclidean space representing the network topology of the bipartite graph, where the graph is easy to infer. The algorithm can be formulated as an optimization problem in a reproducing kernel Hilbert space. We report encouraging results on the problem of compound-protein interaction network reconstruction from chemical structure data and genomic sequence data. 1

6 0.17256846 97 nips-2008-Hierarchical Fisher Kernels for Longitudinal Data

7 0.15682542 13 nips-2008-Adapting to a Market Shock: Optimal Sequential Market-Making

8 0.15502132 178 nips-2008-Performance analysis for L\ 2 kernel classification

9 0.15166011 142 nips-2008-Multi-Level Active Prediction of Useful Image Annotations for Recognition

10 0.13558757 82 nips-2008-Fast Computation of Posterior Mode in Multi-Level Hierarchical Models

11 0.13269232 30 nips-2008-Bayesian Experimental Design of Magnetic Resonance Imaging Sequences

12 0.13238269 173 nips-2008-Optimization on a Budget: A Reinforcement Learning Approach

13 0.13161048 111 nips-2008-Kernel Change-point Analysis

14 0.13070479 74 nips-2008-Estimating the Location and Orientation of Complex, Correlated Neural Activity using MEG

15 0.13012764 85 nips-2008-Fast Rates for Regularized Objectives

16 0.12710342 66 nips-2008-Dynamic visual attention: searching for coding length increments

17 0.12506008 188 nips-2008-QUIC-SVD: Fast SVD Using Cosine Trees

18 0.12327798 93 nips-2008-Global Ranking Using Continuous Conditional Random Fields

19 0.12313157 76 nips-2008-Estimation of Information Theoretic Measures for Continuous Random Variables

20 0.12063941 107 nips-2008-Influence of graph construction on graph-based clustering measures


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(6, 0.069), (7, 0.048), (12, 0.023), (15, 0.023), (28, 0.313), (47, 0.213), (57, 0.032), (59, 0.013), (63, 0.037), (71, 0.027), (77, 0.036), (78, 0.034), (83, 0.04)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.92328322 160 nips-2008-On Computational Power and the Order-Chaos Phase Transition in Reservoir Computing

Author: Benjamin Schrauwen, Lars Buesing, Robert A. Legenstein

Abstract: Randomly connected recurrent neural circuits have proven to be very powerful models for online computations when a trained memoryless readout function is appended. Such Reservoir Computing (RC) systems are commonly used in two flavors: with analog or binary (spiking) neurons in the recurrent circuits. Previous work showed a fundamental difference between these two incarnations of the RC idea. The performance of a RC system built from binary neurons seems to depend strongly on the network connectivity structure. In networks of analog neurons such dependency has not been observed. In this article we investigate this apparent dichotomy in terms of the in-degree of the circuit nodes. Our analyses based amongst others on the Lyapunov exponent reveal that the phase transition between ordered and chaotic network behavior of binary circuits qualitatively differs from the one in analog circuits. This explains the observed decreased computational performance of binary circuits of high node in-degree. Furthermore, a novel mean-field predictor for computational performance is introduced and shown to accurately predict the numerically obtained results. 1

2 0.91213882 146 nips-2008-Multi-task Gaussian Process Learning of Robot Inverse Dynamics

Author: Christopher Williams, Stefan Klanke, Sethu Vijayakumar, Kian M. Chai

Abstract: The inverse dynamics problem for a robotic manipulator is to compute the torques needed at the joints to drive it along a given trajectory; it is beneficial to be able to learn this function for adaptive control. A robotic manipulator will often need to be controlled while holding different loads in its end effector, giving rise to a multi-task learning problem. By placing independent Gaussian process priors over the latent functions of the inverse dynamics, we obtain a multi-task Gaussian process prior for handling multiple loads, where the inter-task similarity depends on the underlying inertial parameters. Experiments demonstrate that this multi-task formulation is effective in sharing information among the various loads, and generally improves performance over either learning only on single tasks or pooling the data over all tasks. 1

same-paper 3 0.90724719 159 nips-2008-On Bootstrapping the ROC Curve

Author: Patrice Bertail, Stéphan J. Clémençon, Nicolas Vayatis

Abstract: This paper is devoted to thoroughly investigating how to bootstrap the ROC curve, a widely used visual tool for evaluating the accuracy of test/scoring statistics in the bipartite setup. The issue of confidence bands for the ROC curve is considered and a resampling procedure based on a smooth version of the empirical distribution called the "smoothed bootstrap" is introduced. Theoretical arguments and simulation results are presented to show that the "smoothed bootstrap" is preferable to a "naive" bootstrap in order to construct accurate confidence bands.

4 0.8333602 174 nips-2008-Overlaying classifiers: a practical approach for optimal ranking

Author: Stéphan J. Clémençon, Nicolas Vayatis

Abstract: ROC curves are one of the most widely used displays to evaluate performance of scoring functions. In the paper, we propose a statistical method for directly optimizing the ROC curve. The target is known to be the regression function up to an increasing transformation and this boils down to recovering the level sets of the latter. We propose to use classifiers obtained by empirical risk minimization of a weighted classification error and then to construct a scoring rule by overlaying these classifiers. We show the consistency and rate of convergence to the optimal ROC curve of this procedure in terms of supremum norm and also, as a byproduct of the analysis, we derive an empirical estimate of the optimal ROC curve. 1

5 0.82991064 110 nips-2008-Kernel-ARMA for Hand Tracking and Brain-Machine interfacing During 3D Motor Control

Author: Lavi Shpigelman, Hagai Lalazar, Eilon Vaadia

Abstract: Using machine learning algorithms to decode intended behavior from neural activity serves a dual purpose. First, these tools allow patients to interact with their environment through a Brain-Machine Interface (BMI). Second, analyzing the characteristics of such methods can reveal the relative significance of various features of neural activity, task stimuli, and behavior. In this study we adapted, implemented and tested a machine learning method called Kernel Auto-Regressive Moving Average (KARMA), for the task of inferring movements from neural activity in primary motor cortex. Our version of this algorithm is used in an online learning setting and is updated after a sequence of inferred movements is completed. We first used it to track real hand movements executed by a monkey in a standard 3D reaching task. We then applied it in a closed-loop BMI setting to infer intended movement, while the monkey’s arms were comfortably restrained, thus performing the task using the BMI alone. KARMA is a recurrent method that learns a nonlinear model of output dynamics. It uses similarity functions (termed kernels) to compare between inputs. These kernels can be structured to incorporate domain knowledge into the method. We compare KARMA to various state-of-the-art methods by evaluating tracking performance and present results from the KARMA based BMI experiments. 1

6 0.82657504 126 nips-2008-Localized Sliced Inverse Regression

7 0.82607323 222 nips-2008-Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning

8 0.82570601 117 nips-2008-Learning Taxonomies by Dependence Maximization

9 0.82487601 115 nips-2008-Learning Bounded Treewidth Bayesian Networks

10 0.82372928 101 nips-2008-Human Active Learning

11 0.82262349 77 nips-2008-Evaluating probabilities under high-dimensional latent variable models

12 0.82107025 96 nips-2008-Hebbian Learning of Bayes Optimal Decisions

13 0.82020682 206 nips-2008-Sequential effects: Superstition or rational behavior?

14 0.81971389 190 nips-2008-Reconciling Real Scores with Binary Comparisons: A New Logistic Based Model for Ranking

15 0.81865656 72 nips-2008-Empirical performance maximization for linear rank statistics

16 0.81720841 224 nips-2008-Structured ranking learning using cumulative distribution networks

17 0.81709641 93 nips-2008-Global Ranking Using Continuous Conditional Random Fields

18 0.81548071 48 nips-2008-Clustering via LP-based Stabilities

19 0.81396389 223 nips-2008-Structure Learning in Human Sequential Decision-Making

20 0.81367397 231 nips-2008-Temporal Dynamics of Cognitive Control