jmlr2012-107: knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Xinhua Zhang, Ankan Saha, S.V.N. Vishwanathan
Abstract: Optimizing multivariate performance measures is an important task in Machine Learning. Joachims (2005) introduced a Support Vector Method whose underlying optimization problem is commonly solved by cutting plane methods (CPMs) such as SVM-Perf and BMRM. It can be shown that CPMs converge to an ε-accurate solution in O(1/(λε)) iterations, where λ is the trade-off parameter between the regularizer and the loss function. Motivated by the impressive convergence rate of CPM on a number of practical problems, it was conjectured that these rates can be further improved. We disprove this conjecture in this paper by constructing counter examples. However, surprisingly, we further discover that these problems are not inherently hard, and we develop a novel smoothing strategy, which in conjunction with Nesterov’s accelerated gradient method, can find an ε-accurate solution in O∗(min{1/ε, 1/√(λε)}) iterations. Computationally, our smoothing technique is also particularly advantageous for optimizing multivariate performance scores such as precision/recall break-even point and ROCArea; the cost per iteration remains the same as that of CPMs. Empirical evaluation on some of the largest publicly available data sets shows that our method converges significantly faster than CPMs without sacrificing generalization ability. Keywords: non-smooth optimization, max-margin methods, multivariate performance measures, Support Vector Machines, smoothing
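To make the rates in the abstract concrete, here is a hedged LaTeX sketch of the quantities involved. The generic notation (A, Z, and the prox-function d) is ours, not the paper's, and the smoothed surrogate shown is the standard Nesterov construction rather than a restatement of the paper's exact formulation.

% Regularized risk minimization; R_emp is piecewise linear, hence non-smooth
\min_{w}\; J(w) = \frac{\lambda}{2}\|w\|_2^2 + R_{\mathrm{emp}}(w)

% Cutting plane methods (SVM-Perf, BMRM) reach an \varepsilon-accurate solution in
O\!\left(\frac{1}{\lambda\varepsilon}\right) \text{ iterations.}

% Standard Nesterov smoothing replaces the inner max over discrete labelings z \in Z
% by a surrogate with smoothing parameter \mu > 0 and strongly convex prox-function d,
R_{\mu}(w) = \max_{z \in Z}\, \big\{ \langle A w,\, z \rangle - \mu\, d(z) \big\},

% after which accelerated gradient descent on the smoothed objective needs only
O^{*}\!\left(\min\!\left\{\frac{1}{\varepsilon},\; \frac{1}{\sqrt{\lambda\varepsilon}}\right\}\right) \text{ iterations.}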
Reference: text
sentIndex sentText sentNum sentScore
[Figure residue: the extracted sentences in this block consisted only of axis ticks, axis labels, legend entries, running headers, and panel titles from the paper's experimental plots. Recoverable information: each figure compares BMRM against SMS on one data set, plotting (a)-(c) the objective value of the regularized risk for PRBEP, (e)-(f) test PRBEP (%), and (g)-(i) the objective value for ROCArea, each against wall-clock time (seconds) at λ = 10−2, 10−4, and 10−6. The interleaved running headers read "Smoothing Multivariate Performance Measures" and "Zhang, Saha and Vishwanathan".]
wordName wordTfidf (topN-words)
[('bmrm', 0.413), ('prbep', 0.368), ('rocarea', 0.362), ('sms', 0.362), ('lo', 0.351), ('tim', 0.257), ('ec', 0.244), ('lariz', 0.206), ('ris', 0.206), ('valu', 0.206), ('obj', 0.146), ('reg', 0.136), ('remp', 0.057), ('ishwanathan', 0.038), ('moothing', 0.037), ('aha', 0.033), ('ultivariate', 0.032), ('easures', 0.028), ('boldface', 0.025), ('hang', 0.024), ('erformance', 0.023), ('test', 0.023), ('panels', 0.017), ('highlighted', 0.016), ('nesterov', 0.016), ('dna', 0.014), ('smoothing', 0.013), ('seconds', 0.013), ('wk', 0.012), ('cpms', 0.012), ('xinhua', 0.012), ('ai', 0.011), ('es', 0.01), ('ankan', 0.009), ('saha', 0.009), ('mi', 0.009), ('teo', 0.009), ('zi', 0.008), ('gure', 0.008), ('kdda', 0.008), ('risk', 0.007), ('subgradient', 0.007), ('agm', 0.007), ('smoothed', 0.006), ('objective', 0.006), ('panel', 0.006), ('minw', 0.006), ('covertype', 0.006), ('yi', 0.006), ('regularized', 0.006), ('value', 0.005), ('multivariate', 0.005), ('cpm', 0.005), ('kddb', 0.005), ('petsc', 0.005), ('accelerated', 0.005), ('franc', 0.005), ('cutting', 0.005), ('gradient', 0.005), ('jk', 0.005), ('contingency', 0.005), ('ocr', 0.005), ('optimal', 0.005), ('string', 0.005), ('sonnenburg', 0.005), ('yurii', 0.005), ('head', 0.004), ('delta', 0.004), ('ed', 0.004), ('zhang', 0.004), ('ak', 0.004), ('max', 0.004), ('joachims', 0.004), ('cne', 0.004), ('tero', 0.004), ('zeta', 0.004), ('smooth', 0.004), ('convex', 0.004), ('alberta', 0.004), ('bound', 0.004), ('plane', 0.004), ('wi', 0.004), ('nocedal', 0.003), ('beck', 0.003), ('decrement', 0.003), ('epsilon', 0.003), ('maximizer', 0.003), ('purdue', 0.003), ('maxn', 0.003), ('tao', 0.003), ('rates', 0.003), ('misclassi', 0.003), ('argmin', 0.003), ('lemar', 0.003), ('alpha', 0.003), ('worm', 0.003), ('beta', 0.003), ('fd', 0.003), ('culties', 0.003), ('chapelle', 0.003)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 107 jmlr-2012-Smoothing Multivariate Performance Measures
2 0.05848781 118 jmlr-2012-Variational Multinomial Logit Gaussian Process
Author: Kian Ming A. Chai
Abstract: A Gaussian process prior with an appropriate likelihood function is a flexible non-parametric model for a variety of learning tasks. One important and standard task is multi-class classification, which is the categorization of an item into one of several fixed classes. A usual likelihood function for this is the multinomial logistic likelihood function. However, exact inference with this model has proved to be difficult because high-dimensional integrations are required. In this paper, we propose a variational approximation to this model, and we describe the optimization of the variational parameters. Experiments have shown our approximation to be tight. In addition, we provide data-independent bounds on the marginal likelihood of the model, one of which is shown to be much tighter than the existing variational mean-field bound in the experiments. We also derive a proper lower bound on the predictive likelihood that involves the Kullback-Leibler divergence between the approximating and the true posterior. We combine our approach with a recently proposed sparse approximation to give a variational sparse approximation to the Gaussian process multi-class model. We also derive criteria which can be used to select the inducing set, and we show the effectiveness of these criteria over random selection in an experiment. Keywords: Gaussian process, probabilistic classification, multinomial logistic, variational approximation, sparse approximation
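For context, the likelihood this abstract refers to and the generic variational bound it tightens, in textbook form; this is not the paper's specific approximation, only the standard starting point.

% Multinomial logistic (softmax) likelihood over K classes, with latent GP values f_1,...,f_K:
p(y = c \mid f) = \frac{\exp(f_c)}{\sum_{k=1}^{K} \exp(f_k)}

% Exact inference requires integrating this softmax against the GP posterior, which has no
% closed form; a variational posterior q(f) instead maximizes the generic lower bound
\log p(y) \;\ge\; \mathbb{E}_{q(f)}\!\left[\log p(y \mid f)\right] - \mathrm{KL}\!\left(q(f)\,\|\,p(f)\right).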
3 0.044576302 31 jmlr-2012-DEAP: Evolutionary Algorithms Made Easy
Author: Félix-Antoine Fortin, François-Michel De Rainville, Marc-André Gardner, Marc Parizeau, Christian Gagné
Abstract: DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas. Its design departs from most other existing frameworks in that it seeks to make algorithms explicit and data structures transparent, as opposed to the more common black-box frameworks. Freely available with extensive documentation at http://deap.gel.ulaval.ca, DEAP is an open source project under an LGPL license. Keywords: distributed evolutionary algorithms, software tools
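The explicit-algorithm style this abstract describes is easiest to see in DEAP's standard OneMax toy problem. The sketch below follows the library's documented example; exact behavior may vary across DEAP versions.

import random
from deap import base, creator, tools, algorithms

# Declare the fitness (maximize one objective) and the individual type explicitly
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("attr_bool", random.randint, 0, 1)
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.attr_bool, n=100)             # 100-bit individuals
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("evaluate", lambda ind: (sum(ind),))  # fitness must be a tuple
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)
toolbox.register("select", tools.selTournament, tournsize=3)

pop = toolbox.population(n=300)
pop, _ = algorithms.eaSimple(pop, toolbox, cxpb=0.5, mutpb=0.2,
                             ngen=40, verbose=False)
best = tools.selBest(pop, k=1)[0]
print(best.fitness.values)  # approaches (100.0,) as evolution proceeds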
4 0.0098798228 69 jmlr-2012-Mixability is Bayes Risk Curvature Relative to Log Loss
Author: Tim van Erven, Mark D. Reid, Robert C. Williamson
Abstract: Mixability of a loss characterizes fast rates in the online learning setting of prediction with expert advice. The determination of the mixability constant for binary losses is straightforward but opaque. In the binary case we make this transparent and simpler by characterising mixability in terms of the second derivative of the Bayes risk of proper losses. We then extend this result to multiclass proper losses where there are few existing results. We show that mixability is governed by the maximum eigenvalue of the Hessian of the Bayes risk, relative to the Hessian of the Bayes risk for log loss. We conclude by comparing our result to other work that bounds prediction performance in terms of the geometry of the Bayes risk. Although all calculations are for proper losses, we also show how to carry the results across to improper losses. Keywords: mixability, multiclass, prediction with expert advice, proper loss, learning rates
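A worked reading of the binary-case claim, hedged: writing L_ℓ for the (concave) Bayes risk of a proper loss ℓ, the characterization the abstract sketches amounts to curvature relative to log loss. The formula below is our paraphrase, not the paper's exact statement, but the square-loss sanity check recovers the classical constant.

% Mixability constant as Bayes-risk curvature relative to log loss (paraphrase):
\eta_{\ell} \;=\; \inf_{v \in (0,1)} \frac{L_{\log}''(v)}{L_{\ell}''(v)},
\qquad L_{\log}(v) = -v\log v - (1-v)\log(1-v),
\quad L_{\log}''(v) = -\frac{1}{v(1-v)}.

% Sanity check with square loss: L_{\mathrm{sq}}(v) = v(1-v), so L_{\mathrm{sq}}''(v) = -2 and
\eta_{\mathrm{sq}} = \inf_{v \in (0,1)} \frac{-1/(v(1-v))}{-2}
 = \inf_{v \in (0,1)} \frac{1}{2\,v(1-v)} = 2 \quad \text{(attained at } v = \tfrac{1}{2}\text{)},

% recovering the classical fact that square loss is 2-mixable, while log loss gives \eta_{\log} = 1.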
5 0.0091789644 26 jmlr-2012-Coherence Functions with Applications in Large-Margin Classification Methods
Author: Zhihua Zhang, Dehua Liu, Guang Dai, Michael I. Jordan
Abstract: Support vector machines (SVMs) naturally embody sparseness due to their use of hinge loss functions. However, SVMs cannot directly estimate conditional class probabilities. In this paper we propose and study a family of coherence functions, which are convex and differentiable, as surrogates of the hinge function. The coherence function is derived by using the maximum-entropy principle and is characterized by a temperature parameter. It bridges the hinge function and the logit function in logistic regression. The limit of the coherence function at zero temperature corresponds to the hinge function, and the limit of the minimizer of its expected error is the minimizer of the expected error of the hinge loss. We refer to the use of the coherence function in large-margin classification as “C-learning,” and we present efficient coordinate descent algorithms for the training of regularized C-learning models. Keywords: large-margin classifiers, hinge functions, logistic functions, coherence functions, C-learning
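One concrete temperature-indexed function with exactly the limiting behavior this abstract describes is the softplus-smoothed hinge below. We offer it purely as an illustration of the hinge-to-logit bridge, with no claim that it matches the paper's coherence function term for term.

% Temperature-smoothed bridge between hinge and logit (illustrative form):
V_T(z) \;=\; T \log\!\left(1 + e^{(1-z)/T}\right), \qquad T > 0.

% V_T is convex and differentiable for every T > 0; as the temperature vanishes,
\lim_{T \to 0^+} V_T(z) = \max(0,\, 1 - z) \quad \text{(the hinge loss)},

% while at T = 1 it is a shifted logistic loss, V_1(z) = \log(1 + e^{1-z}).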
6 0.0087212194 98 jmlr-2012-Regularized Bundle Methods for Convex and Non-Convex Risks
7 0.0079854177 116 jmlr-2012-Transfer in Reinforcement Learning via Shared Features
8 0.0077446834 59 jmlr-2012-Linear Regression With Random Projections
9 0.0076986435 87 jmlr-2012-PAC-Bayes Bounds with Data Dependent Priors
10 0.0071217483 18 jmlr-2012-An Improved GLMNET for L1-regularized Logistic Regression
11 0.0065963808 85 jmlr-2012-Optimal Distributed Online Prediction Using Mini-Batches
12 0.0064277002 28 jmlr-2012-Confidence-Weighted Linear Classification for Text Categorization
13 0.0063392008 96 jmlr-2012-Refinement of Operator-valued Reproducing Kernels
14 0.0063189222 112 jmlr-2012-Structured Sparsity via Alternating Direction Methods
15 0.0061709397 44 jmlr-2012-Feature Selection via Dependence Maximization
16 0.0060857506 39 jmlr-2012-Estimation and Selection via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications
17 0.0056766099 80 jmlr-2012-On Ranking and Generalization Bounds
18 0.0055816011 2 jmlr-2012-A Comparison of the Lasso and Marginal Regression
19 0.0055796071 97 jmlr-2012-Regularization Techniques for Learning with Matrices
20 0.0053152656 49 jmlr-2012-Hope and Fear for Discriminative Training of Statistical Translation Models
topicId topicWeight
[(0, -0.03), (1, 0.007), (2, 0.015), (3, -0.002), (4, 0.02), (5, 0.003), (6, 0.019), (7, 0.013), (8, -0.061), (9, -0.009), (10, -0.02), (11, 0.022), (12, -0.017), (13, 0.043), (14, -0.01), (15, -0.005), (16, -0.0), (17, 0.019), (18, -0.098), (19, -0.008), (20, 0.031), (21, 0.057), (22, -0.113), (23, -0.051), (24, -0.033), (25, -0.141), (26, -0.079), (27, -0.048), (28, 0.103), (29, -0.022), (30, -0.201), (31, 0.023), (32, -0.085), (33, -0.371), (34, -0.043), (35, 0.038), (36, 0.367), (37, -0.073), (38, 0.006), (39, -0.173), (40, -0.173), (41, 0.111), (42, 0.024), (43, -0.087), (44, 0.186), (45, -0.056), (46, -0.075), (47, 0.116), (48, 0.148), (49, 0.164)]
simIndex simValue paperId paperTitle
same-paper 1 0.99373305 107 jmlr-2012-Smoothing Multivariate Performance Measures
2 0.55071497 31 jmlr-2012-DEAP: Evolutionary Algorithms Made Easy
3 0.3555128 118 jmlr-2012-Variational Multinomial Logit Gaussian Process
Author: Alain Hauser, Peter Bühlmann
Abstract: The investigation of directed acyclic graphs (DAGs) encoding the same Markov property, that is, the same conditional independence relations of multivariate observational distributions, has a long tradition; many algorithms exist for model selection and structure learning in Markov equivalence classes. In this paper, we extend the notion of Markov equivalence of DAGs to the case of interventional distributions arising from multiple intervention experiments. We show that under reasonable assumptions on the intervention experiments, interventional Markov equivalence defines a finer partitioning of DAGs than observational Markov equivalence and hence improves the identifiability of causal models. We give a graph theoretic criterion for two DAGs being Markov equivalent under interventions and show that each interventional Markov equivalence class can, analogously to the observational case, be uniquely represented by a chain graph called interventional essential graph (also known as CPDAG in the observational case). These are key insights for deriving a generalization of the Greedy Equivalence Search algorithm aimed at structure learning from interventional data. This new algorithm is evaluated in a simulation study. Keywords: causal inference, interventions, graphical model, Markov equivalence, greedy equivalence search
5 0.096303716 59 jmlr-2012-Linear Regression With Random Projections
Author: Odalric-Ambrym Maillard, Rémi Munos
Abstract: We investigate a method for regression that makes use of a randomly generated subspace GP ⊂ F (of finite dimension P) of a given large (possibly infinite) dimensional function space F, for example, L2([0,1]^d; R). GP is defined as the span of P random features that are linear combinations of basis functions of F weighted by random Gaussian i.i.d. coefficients. We show practical motivation for the use of this approach, detail the link that this random projections method shares with RKHS and Gaussian objects theory and prove, both in deterministic and random design, approximation error bounds when searching for the best regression function in GP rather than in F, and derive excess risk bounds for a specific regression algorithm (least squares regression in GP). This paper stresses the motivation to study such methods, thus the analysis developed is kept simple for explanatory purposes and leaves room for future developments. Keywords: regression, random matrices, dimension reduction
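A minimal numpy sketch of the construction this abstract describes, with two caveats: the raw input coordinates stand in for the basis functions of F, and the helper name is ours.

import numpy as np

def random_projection_regression(X, y, P, seed=None):
    """Least squares in a random P-dimensional subspace (a stand-in for G_P).

    Each random feature is a Gaussian linear combination of the original
    coordinates; regression is then ordinary least squares in that subspace.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    A = rng.normal(0.0, 1.0 / np.sqrt(P), size=(d, P))  # random Gaussian weights
    Z = X @ A                                           # projected features, n x P
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)        # least squares in the subspace
    return lambda X_new: (X_new @ A) @ beta             # predictor on fresh inputs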
6 0.093144536 41 jmlr-2012-Exploration in Relational Domains for Model-based Reinforcement Learning
7 0.090803578 27 jmlr-2012-Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection
8 0.090547606 18 jmlr-2012-An Improved GLMNET for L1-regularized Logistic Regression
9 0.07861381 112 jmlr-2012-Structured Sparsity via Alternating Direction Methods
10 0.073389143 96 jmlr-2012-Refinement of Operator-valued Reproducing Kernels
11 0.071065851 29 jmlr-2012-Consistent Model Selection Criteria on High Dimensions
12 0.068648316 69 jmlr-2012-Mixability is Bayes Risk Curvature Relative to Log Loss
13 0.068069808 52 jmlr-2012-Iterative Reweighted Algorithms for Matrix Rank Minimization
14 0.067514151 104 jmlr-2012-Security Analysis of Online Centroid Anomaly Detection
15 0.064914323 26 jmlr-2012-Coherence Functions with Applications in Large-Margin Classification Methods
16 0.064310253 116 jmlr-2012-Transfer in Reinforcement Learning via Shared Features
17 0.063637383 16 jmlr-2012-Algorithms for Learning Kernels Based on Centered Alignment
18 0.06249861 83 jmlr-2012-Online Learning in the Embedded Manifold of Low-rank Matrices
19 0.057827588 66 jmlr-2012-Metric and Kernel Learning Using a Linear Transformation
20 0.054002598 86 jmlr-2012-Optimistic Bayesian Sampling in Contextual-Bandit Problems
topicId topicWeight
[(21, 0.017), (26, 0.019), (29, 0.017), (64, 0.638), (75, 0.026), (92, 0.024), (96, 0.046)]
simIndex simValue paperId paperTitle
same-paper 1 0.8798821 107 jmlr-2012-Smoothing Multivariate Performance Measures
2 0.63017911 94 jmlr-2012-Query Strategies for Evading Convex-Inducing Classifiers
Author: Blaine Nelson, Benjamin I. P. Rubinstein, Ling Huang, Anthony D. Joseph, Steven J. Lee, Satish Rao, J. D. Tygar
Abstract: Classifiers are often used to detect miscreant activities. We study how an adversary can systematically query a classifier to elicit information that allows the attacker to evade detection while incurring a near-minimal cost of modifying their intended malfeasance. We generalize the theory of Lowd and Meek (2005) to the family of convex-inducing classifiers that partition their feature space into two sets, one of which is convex. We present query algorithms for this family that construct undetected instances of approximately minimal cost using only polynomially many queries in the dimension of the space and in the level of approximation. Our results demonstrate that near-optimal evasion can be accomplished for this family without reverse engineering the classifier’s decision boundary. We also consider general ℓp costs and show that near-optimal evasion on the family of convex-inducing classifiers is generally efficient for both positive and negative convexity for all levels of approximation if p = 1. Keywords: query algorithms, evasion, reverse engineering, adversarial learning
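The basic membership-query primitive underlying this line of work is a binary search along a segment between an undetected and a detected instance. The sketch below (names ours) shows only that primitive, not the paper's full algorithms for convex-inducing classifiers.

def bisect_boundary(classifier, x_neg, x_pos, tol=1e-6):
    """Locate the decision boundary along the segment x_neg -> x_pos using only
    membership queries to `classifier` (True means detected / "positive").
    Returns the last undetected point found, within tol of the boundary."""
    def point(t):
        return [a + t * (b - a) for a, b in zip(x_neg, x_pos)]
    lo, hi = 0.0, 1.0  # point(lo) is undetected, point(hi) is detected
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if classifier(point(mid)):
            hi = mid   # still detected: boundary lies below mid
        else:
            lo = mid   # undetected: boundary lies above mid
    return point(lo)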
3 0.57686657 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development
Author: Stephen Gould
Abstract: We present an open-source platform-independent C++ framework for machine learning and computer vision research. The framework includes a wide range of standard machine learning and graphical models algorithms as well as reference implementations for many machine learning and computer vision applications. The framework contains Matlab wrappers for core components of the library and an experimental graphical user interface for developing and visualizing machine learning data flows. Keywords: machine learning, graphical models, computer vision, open-source software
4 0.55573779 33 jmlr-2012-Distance Metric Learning with Eigenvalue Optimization
Author: Yiming Ying, Peng Li
Abstract: The main theme of this paper is to develop a novel eigenvalue optimization framework for learning a Mahalanobis metric. Within this context, we introduce a novel metric learning approach called DML-eig, which is shown to be equivalent to a well-known eigenvalue optimization problem called minimizing the maximal eigenvalue of a symmetric matrix (Overton, 1988; Lewis and Overton, 1996). Moreover, we formulate LMNN (Weinberger et al., 2005), one of the state-of-the-art metric learning methods, as a similar eigenvalue optimization problem. This novel framework not only provides new insights into metric learning but also opens new avenues to the design of efficient metric learning algorithms. Indeed, first-order algorithms are developed for DML-eig and LMNN which only need the computation of the largest eigenvector of a matrix per iteration. Their convergence characteristics are rigorously established. Various experiments on benchmark data sets show the competitive performance of our new approaches. In addition, we report an encouraging result on a difficult and challenging face verification data set called Labeled Faces in the Wild (LFW). Keywords: metric learning, convex optimization, semi-definite programming, first-order methods, eigenvalue optimization, matrix factorization, face verification
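The per-iteration primitive this abstract highlights, computing only the largest eigenvector of a symmetric matrix, can be sketched as plain power iteration; this is a generic sketch, not the paper's solver.

import numpy as np

def leading_eigenvector(M, iters=200, seed=0):
    """Power iteration for the dominant eigenvector of a symmetric matrix M.
    Converges to the eigenvector of the largest-magnitude eigenvalue; for the
    positive semidefinite matrices arising in metric learning this coincides
    with the largest eigenvalue."""
    v = np.random.default_rng(seed).normal(size=M.shape[0])
    for _ in range(iters):
        v = M @ v
        v /= np.linalg.norm(v)  # renormalize to avoid overflow/underflow
    return v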
5 0.22172375 92 jmlr-2012-Positive Semidefinite Metric Learning Using Boosting-like Algorithms
Author: Chunhua Shen, Junae Kim, Lei Wang, Anton van den Hengel
Abstract: The success of many machine learning and pattern recognition methods relies heavily upon the identification of an appropriate distance metric on the input data. It is often beneficial to learn such a metric from the input training data, instead of using a default one such as the Euclidean distance. In this work, we propose a boosting-based technique, termed BoostMetric, for learning a quadratic Mahalanobis distance metric. Learning a valid Mahalanobis distance metric requires enforcing the constraint that the matrix parameter to the metric remains positive semidefinite. Semidefinite programming is often used to enforce this constraint, but does not scale well and is not easy to implement. BoostMetric is instead based on the observation that any positive semidefinite matrix can be decomposed into a linear combination of trace-one rank-one matrices. BoostMetric thus uses rank-one positive semidefinite matrices as weak learners within an efficient and scalable boosting-based learning process. The resulting methods are easy to implement, efficient, and can accommodate various types of constraints. We extend traditional boosting algorithms in that the weak learner is a positive semidefinite matrix with trace and rank being one rather than a classifier or regressor. Experiments on various data sets demonstrate that the proposed algorithms compare favorably to those state-of-the-art methods in terms of classification accuracy and running time. Keywords: Mahalanobis distance, semidefinite programming, column generation, boosting, Lagrange duality, large margin nearest neighbor
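The representation at the heart of this abstract, a PSD matrix written as a nonnegative combination of trace-one rank-one matrices, makes distance evaluation cheap without ever forming the matrix. A small sketch follows (names ours; it illustrates the representation, not the boosting loop).

import numpy as np

def mahalanobis_sq(weights, units, x, y):
    """Squared distance under M = sum_i w_i * u_i u_i^T with w_i >= 0 and
    ||u_i|| = 1 (each u_i u_i^T is trace-one and rank-one, so M is PSD).
    Uses d^T M d = sum_i w_i (u_i . d)^2, avoiding the explicit matrix."""
    d = np.asarray(x) - np.asarray(y)
    return float(sum(w * np.dot(u, d) ** 2 for w, u in zip(weights, units)))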
6 0.14600082 84 jmlr-2012-Online Submodular Minimization
8 0.14379944 8 jmlr-2012-A Primal-Dual Convergence Analysis of Boosting
9 0.14281704 105 jmlr-2012-Selective Sampling and Active Learning from Single and Multiple Teachers
10 0.14263868 18 jmlr-2012-An Improved GLMNET for L1-regularized Logistic Regression
11 0.14176491 65 jmlr-2012-MedLDA: Maximum Margin Supervised Topic Models
12 0.13976569 36 jmlr-2012-Efficient Methods for Robust Classification Under Uncertainty in Kernel Matrices
14 0.13866115 64 jmlr-2012-Manifold Identification in Dual Averaging for Regularized Stochastic Online Learning
15 0.13520034 11 jmlr-2012-A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction: Insights and New Models
16 0.13465968 104 jmlr-2012-Security Analysis of Online Centroid Anomaly Detection
17 0.13366479 115 jmlr-2012-Trading Regret for Efficiency: Online Convex Optimization with Long Term Constraints
18 0.12815782 86 jmlr-2012-Optimistic Bayesian Sampling in Contextual-Bandit Problems
19 0.12735976 87 jmlr-2012-PAC-Bayes Bounds with Data Dependent Priors
20 0.12681919 77 jmlr-2012-Non-Sparse Multiple Kernel Fisher Discriminant Analysis