jmlr jmlr2009 jmlr2009-80 knowledge-graph by maker-knowledge-mining

80 jmlr-2009-Reproducing Kernel Banach Spaces for Machine Learning


Source: pdf

Author: Haizhang Zhang, Yuesheng Xu, Jun Zhang

Abstract: We introduce the notion of reproducing kernel Banach spaces (RKBS) and study special semi-inner-product RKBS by making use of semi-inner-products and the duality mapping. Properties of an RKBS and its reproducing kernel are investigated. As applications, we develop in the framework of RKBS standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. In particular, existence, uniqueness and representer theorems are established. Keywords: reproducing kernel Banach spaces, reproducing kernels, learning theory, semi-inner-products, representer theorems

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA. Editor: Ingo Steinwart. Abstract: We introduce the notion of reproducing kernel Banach spaces (RKBS) and study special semi-inner-product RKBS by making use of semi-inner-products and the duality mapping. [sent-4, score-0.472]

2 Properties of an RKBS and its reproducing kernel are investigated. [sent-5, score-0.372]

3 Keywords: reproducing kernel Banach spaces, reproducing kernels, learning theory, semi-inner-products, representer theorems 1. [sent-8, score-0.701]

4 The reason for using positive definite kernels to measure similarity lies in the celebrated theoretical fact due to Mercer (1909) that there is a bijective correspondence between them and reproducing kernel Hilbert spaces (RKHS). [sent-17, score-0.516]
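
As a quick numerical illustration of this correspondence (a minimal sketch of my own, not from the paper; the Gaussian kernel and the sample points are arbitrary choices), positive definiteness means that every finite kernel matrix is Hermitian with nonnegative eigenvalues:

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # A classical positive definite (reproducing) kernel on R.
    return np.exp(-(x - y) ** 2 / (2 * sigma ** 2))

# Arbitrary finite set of input points; any finite subset of X works.
pts = np.array([-1.3, 0.0, 0.4, 2.1])
K = np.array([[gaussian_kernel(a, b) for b in pts] for a in pts])

# Positive definiteness: every such matrix is Hermitian (here real
# symmetric) with nonnegative eigenvalues.
print(np.allclose(K, K.T))
print(np.linalg.eigvalsh(K).min() >= -1e-12)
```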

5 In light of this bijective correspondence, positive definite kernels are usually called reproducing kernels. [sent-22, score-0.347]

6 Based on the theory of reproducing kernels, many effective schemes have been developed for learning from finite samples (Evgeniou et al. [sent-27, score-0.27]

7 The above discussion indicates that there is a need to introduce the notion of reproducing kernel Banach spaces for the systematic study of learning in Banach spaces. [sent-56, score-0.439]

8 We shall introduce the notion of reproducing kernel Banach spaces in Section 2, and a general construction in Section 3. [sent-60, score-0.511]

9 It will become clear that the lack of an inner product may cause arbitrariness in the properties of the associated reproducing kernel. [sent-61, score-0.3]

10 reproducing kernel Banach spaces by making use of semi-inner-products for normed vector spaces first defined by Lumer (1961) and further developed by Giles (1967). [sent-65, score-0.649]

11 Here the availability of a semi-inner-product enables us to study basic properties of reproducing kernel Banach spaces and their reproducing kernels. [sent-67, score-0.709]

12 In Section 5, we shall develop in the framework of reproducing kernel Banach spaces standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. [sent-68, score-0.721]

13 A normed vector space B is called a Banach space of functions on X if it is a Banach space whose elements are functions on X, and for each f ∈ B, its norm ‖f‖_B in B vanishes if and only if f, as a function, vanishes everywhere on X. [sent-74, score-0.282]

14 Influenced by the definition of RKHS, our first intuition is to define a reproducing kernel Banach space (RKBS) as a Banach space of functions on X on which point evaluations are continuous linear functionals. [sent-76, score-0.468]

15 However, since for each f ∈ C[0, 1], f (x) = δx ( f ), x ∈ [0, 1], the reproducing kernel for C[0, 1] would have to be the delta distribution, which is not a function that can be evaluated. [sent-79, score-0.372]

16 Recall that two normed vector spaces V1 and V2 are said to be isometric if there is a bijective linear norm-preserving mapping between them. [sent-81, score-0.299]

17 Definition 1 A reproducing kernel Banach space (RKBS) on X is a reflexive Banach space B of functions on X for which B ∗ is isometric to a Banach space B # of functions on X and the point evaluation is continuous on both B and B # . [sent-87, score-0.534]

18 We shall show that there indeed exists a reproducing kernel for an RKBS. [sent-105, score-0.467]

19 (c) The linear span of {K(x, ·) : x ∈ X} is dense in B, namely, span{K(x, ·) : x ∈ X} = B (8). [sent-113, score-0.273]

20 (d) The linear span of {K(·, x) : x ∈ X} is dense in B∗, namely, span{K(·, x) : x ∈ X} = B∗ (9). [sent-114, score-0.273]

21 (e) For all x, y ∈ X, … Proof For every x ∈ X, since δx is a continuous linear functional on B, there exists gx ∈ B∗ such that f(x) = (f, gx)_B, f ∈ B. [sent-116, score-0.437]

22 We call the function K in Theorem 2 the reproducing kernel for the RKBS B . [sent-134, score-0.372]

23 By Theorem 2, an RKBS has exactly one reproducing kernel. [sent-135, score-0.27]

24 However, different RKBS may have the same reproducing kernel. [sent-136, score-0.27]

25 Consequently, although we have at hand a reproducing kernel K for an RKBS B, the relationship (13), and the denseness conditions (8), (9), we still cannot determine the norm on B. [sent-144, score-0.475]

26 Construction of Reproducing Kernels via Feature Maps. In this section, we shall characterize reproducing kernels for RKBS. [sent-146, score-0.392]

27 The characterization will at the same time provide a convenient way of constructing reproducing kernels and their corresponding RKBS. [sent-147, score-0.32]

28 Suppose that there exists Φ : X → W , and Φ∗ : X → W ∗ such that span Φ(X) = W , span Φ∗ (X) = W ∗ . [sent-150, score-0.271]

29 Moreover, the reproducing kernel K for B is K(x, y) := (Φ(x), Φ∗(y))_W, x, y ∈ X (17). [sent-152, score-0.372]

30 These facts show that K is the reproducing kernel for B and complete the proof. [sent-176, score-0.372]

31 We call the mappings Φ, Φ∗ in Theorem 3 a pair of feature maps for the reproducing kernel K. [sent-177, score-0.372]

32 As a corollary to Theorem 3, we obtain the following characterization of reproducing kernels for RKBS. [sent-179, score-0.32]

33 Theorem 4 A function K : X × X → C is the reproducing kernel of an RKBS on X if and only if it is of the form (17), where W is a reflexive Banach space, and mappings Φ : X → W , Φ∗ : X → W ∗ satisfy (14). [sent-180, score-0.372]

34 For the necessity, we assume that K is the reproducing kernel of an RKBS B on X, and set W := B , W ∗ := B ∗ , Φ(x) := K(x, ·), Φ∗ (x) := K(·, x), x ∈ X. [sent-182, score-0.372]

35 To demonstrate how we get RKBS and their reproducing kernels by Theorem 3, we now present a nontrivial example of RKBS. [sent-184, score-0.34]

36 However, we see that they all have the sinc function as the reproducing kernel. [sent-199, score-0.27]

37 In fact, if no further conditions are imposed on an RKBS, its reproducing kernel can be rather arbitrary. [sent-200, score-0.372]

38 Proposition 5 If the input space X is a finite set, then any nontrivial function K on X × X is the reproducing kernel of some RKBS on X. [sent-202, score-0.424]

39 By Theorem 4, K is a reproducing kernel for some RKBS on X. [sent-213, score-0.372]

40 Proposition 5 reveals that due to the lack of an inner product, the reproducing kernel for a general RKBS can be an arbitrary function on X × X. [sent-214, score-0.402]

41 In order for reproducing kernels of RKBS to have desirable properties like those of RKHS, we may need to impose certain structures on RKBS, which in some sense serve as substitutes for the inner product of an RKHS. [sent-216, score-0.35]

42 We call a normed vector space V Gâteaux differentiable if for all x, y ∈ V \ {0} the limit lim_{t∈R, t→0} (‖x + ty‖_V − ‖x‖_V)/t exists. [sent-263, score-0.406]

43 It is called uniformly Fréchet differentiable if the limit is approached uniformly on S(V) × S(V). [sent-264, score-0.387]

44 If a normed vector space V is Gâteaux differentiable then for all x, y ∈ V with x ≠ 0, lim_{t∈R, t→0} (‖x + ty‖_V − ‖x‖_V)/t = Re [y, x]_V / ‖x‖_V (26). [sent-270, score-0.263]

45 The above lemma indicates that a Gâteaux differentiable normed vector space has a unique semi-inner-product. [sent-271, score-0.34]

46 In fact, we have by (26) that [x, y]_V = ‖y‖_V ( lim_{t∈R, t→0} (‖y + tx‖_V − ‖y‖_V)/t + i lim_{t∈R, t→0} (‖iy + tx‖_V − ‖y‖_V)/t ), x, y ∈ V \ {0} (27). [sent-272, score-0.403]
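
Formula (27) lends itself to a quick numerical sanity check: in a Hilbert space the semi-inner-product it defines must reduce to the ordinary inner product. The sketch below is my own illustration, not code from the paper; a crude forward difference stands in for the limit, and the Euclidean norm plays the role of the norm on V:

```python
import numpy as np

def sip_from_norm(x, y, norm, t=1e-7):
    # Approximate [x, y]_V via the two limits in (27), replacing
    # lim_{t -> 0} by a forward difference with a small t.
    re_part = (norm(y + t * x) - norm(y)) / t
    im_part = (norm(1j * y + t * x) - norm(y)) / t
    return norm(y) * (re_part + 1j * im_part)

rng = np.random.default_rng(0)
x = rng.standard_normal(6) + 1j * rng.standard_normal(6)
y = rng.standard_normal(6) + 1j * rng.standard_normal(6)

# For the Euclidean norm, (27) should recover the usual inner product
# <x, y> = sum_j x_j * conj(y_j), up to finite-difference error.
print(sip_from_norm(x, y, np.linalg.norm))
print(np.vdot(y, x))
```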

47 For this reason, if V is a Gâteaux differentiable normed vector space we always assume that it is an s.i.p. space with the semi-inner-product given by (27). [sent-273, score-0.273]

48 A normed vector space V is uniformly convex if for all ε > 0 there exists a δ > 0 such that ‖x + y‖_V ≤ 2 − δ for all x, y ∈ S(V) with ‖x − y‖_V ≥ ε. [sent-284, score-0.328]

49 It states that a normed vector space is uniformly Fréchet differentiable if and only if its dual is uniformly convex. [sent-289, score-0.602]

50 Therefore, if B is a uniformly convex and uniformly Fréchet differentiable Banach space then so is B∗, since B is reflexive. [sent-290, score-0.471]

51 Lemma 8 (Riesz Representation Theorem) Suppose that B is a uniformly convex, uniformly Fréchet differentiable Banach space. Then for each f ∈ B∗ there exists a unique x ∈ B such that f = x∗. [sent-292, score-0.455]

52 By Lemma 8 and the discussion right before it, we shall investigate in the next subsection RKBS which are both uniformly convex and uniformly Fréchet differentiable. [sent-296, score-0.448]

53 Let B be a uniformly convex and uniformly Fréchet differentiable Banach space. [sent-297, score-0.439]

54 Since B∗ is uniformly Fréchet differentiable, it has a unique semi-inner-product, which is given by [x∗, y∗]_B∗ = [y, x]_B, x, y ∈ B (28). [sent-301, score-0.291]

55 We close this subsection with a concrete example of uniformly convex and uniformly Fréchet differentiable Banach spaces: L^p(Ω, µ) for some p ∈ (1, ∞). [sent-302, score-0.439]

56 It is uniformly convex and uniformly Fréchet differentiable with dual B∗ = L^q(Ω, µ). [sent-304, score-0.479]
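
For concreteness, here is a small sketch of my own, under the assumption that the semi-inner-product on ℓ^p / L^p is the standard Giles one, [x, y] = ∑_j x_j ȳ_j |y_j|^{p−2} / ‖y‖_p^{p−2}. It checks the defining property [x, x] = ‖x‖² and that the semi-inner-product collapses to the ordinary inner product at p = 2:

```python
import numpy as np

def lp_norm(v, p):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def giles_sip(x, y, p):
    # Giles semi-inner-product on l^p / L^p (assumed s.i.p. here).
    return np.sum(x * np.conj(y) * np.abs(y) ** (p - 2)) / lp_norm(y, p) ** (p - 2)

rng = np.random.default_rng(1)
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)

for p in (1.5, 2.0, 3.0):
    # [x, x] = ||x||^2 holds for every p.
    print(p, np.isclose(giles_sip(x, x, p), lp_norm(x, p) ** 2))

# At p = 2 the semi-inner-product is the usual inner product <x, y>.
print(np.isclose(giles_sip(x, y, 2.0), np.vdot(y, x)))
```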

57 We call a uniformly convex and uniformly Fréchet differentiable RKBS on X an s.i.p. RKBS. [sent-313, score-0.439]

58 An s.i.p. RKBS B is by definition uniformly Fréchet differentiable. [sent-334, score-0.246]

59 This leads to a more specific representation of the reproducing kernel. [sent-336, score-0.27]

60 To prove the remaining claims, we recall from Theorem 2 that the reproducing kernel K satisfies for each f ∈ B that f (x) = ( f , K(·, x))B , x ∈ X. [sent-348, score-0.372]

61 It coincides with the reproducing kernel K when B is an RKHS. [sent-361, score-0.372]

62 reproducing kernel G satisfies the following generalization of (3) G(x, y) = [G(x, ·), G(y, ·)]B , x, y ∈ X. [sent-369, score-0.372]

63 reproducing kernel in terms of its corresponding feature map. [sent-373, score-0.372]

64 To this end, for a mapping Φ from X to a uniformly convex and uniformly Fréchet differentiable Banach space W, we denote by Φ∗ the mapping from X to W∗ defined as Φ∗(x) := (Φ(x))∗, x ∈ X. [sent-374, score-0.464]

65 Theorem 10 Let W be a uniformly convex and uniformly Fréchet differentiable Banach space and Φ a mapping from X to W such that span Φ(X) = W, span Φ∗(X) = W∗ (36). [sent-375, score-0.747]

66 Then B := {[u, Φ(·)]_W : u ∈ W} equipped with [[u, Φ(·)]_W, [v, Φ(·)]_W]_B := [u, v]_W (37), and B∗ := {[Φ(·), u]_W : u ∈ W} with [[Φ(·), u]_W, [Φ(·), v]_W]_B∗ := [v, u]_W, are uniformly convex and uniformly Fréchet differentiable Banach spaces. [sent-376, score-0.458]

67 The s.i.p. reproducing kernel G of B is given by G(x, y) = [Φ(x), Φ(y)]_W, x, y ∈ X (39), which coincides with its reproducing kernel K. [sent-381, score-0.474]
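
To make the construction tangible, the following sketch evaluates G(x, y) = [Φ(x), Φ(y)]_W on a few points. Both the quadratic feature map Φ(x) = (1, x, x²) and the choice W = ℓ^p(N3) are illustrative assumptions of mine, not the paper's example, and the Giles semi-inner-product is assumed; note that G is generally not symmetric unless p = 2:

```python
import numpy as np

def lp_norm(v, p):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def giles_sip(u, v, p):
    # Giles semi-inner-product on l^p (assumed).
    return np.sum(u * np.conj(v) * np.abs(v) ** (p - 2)) / lp_norm(v, p) ** (p - 2)

def phi(x):
    # Illustrative feature map into W = l^p(N_3).
    return np.array([1.0, x, x ** 2])

def sip_kernel(x, y, p):
    # G(x, y) = [Phi(x), Phi(y)]_W as in (39).
    return giles_sip(phi(x), phi(y), p)

xs = [0.5, 1.0, 2.0]
G = np.array([[sip_kernel(a, b, p=3.0) for b in xs] for a in xs])
print(G)
print(np.allclose(G, G.T))  # typically False when p != 2
```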

68 reproducing kernel if and only if it is of the form (39), where Φ is a mapping from X to a uniformly convex and uniformly Fréchet differentiable Banach space W satisfying (36). [sent-400, score-0.871]

69 By Theorem 11, G is identical with the reproducing kernel K for B if and only if span {G(x, ·) : x ∈ X} = B . [sent-428, score-0.496]

70 ℓ^p(N_n) is uniformly convex and uniformly Fréchet differentiable. [sent-447, score-0.376]

71 Therefore, span{e1, e2, e3} = ℓ^3(N3) (44) while span{e1∗, e2∗, e3∗} ≠ ℓ^{3/2}(N3). [sent-452, score-0.248]

72 Furthermore, since the linear mapping Φu → u∗ is isometrically isomorphic from B to ℓ^{3/2}(N3), B is a uniformly convex and uniformly Fréchet differentiable Banach space. [sent-456, score-0.512]

73 Recall that the reproducing kernel K for B satisfies the denseness condition (8). [sent-465, score-0.432]

74 Reproducing Kernels The existence of a semi-inner-product makes it possible to study properties of RKBS and their reproducing kernels. [sent-477, score-0.27]

75 The s.i.p. reproducing kernel G is constructed as G(x, y) = [Φ(x), Φ(y)]_W, x, y ∈ X; written out, it is the explicit expression (51), whose numerator involves 1 + xy with the exponent p − 1 and whose denominator carries the exponent p − 2. [sent-503, score-0.372]

76 For the s.i.p. reproducing kernel G defined by (51), the matrix G[x] is positive semidefinite for all x = {x, y, z} ⊆ X if and only if p = 2. [sent-507, score-0.372]
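
The failure of positive semidefiniteness away from p = 2 is easy to observe numerically. The sketch below is a self-contained illustration built on the same kind of feature-map kernel as in the previous snippet (again a stand-in of mine, not the exact kernel (51)); for several values of p it checks whether the 3 × 3 matrix G[x] is symmetric with nonnegative eigenvalues:

```python
import numpy as np

def sip(u, v, p):
    # Giles semi-inner-product on l^p (assumed throughout).
    return (u * v * np.abs(v) ** (p - 2)).sum() / (np.abs(v) ** p).sum() ** ((p - 2) / p)

phi = lambda x: np.array([1.0, x, x ** 2])   # illustrative feature map
xs = [0.5, 1.0, 2.0]                          # the three points x, y, z

for p in (1.5, 2.0, 3.0):
    G = np.array([[sip(phi(a), phi(b), p) for b in xs] for a in xs])
    symmetric = np.allclose(G, G.T)
    psd = symmetric and np.linalg.eigvalsh(G).min() >= -1e-10
    print(p, symmetric, psd)   # expect both True only at p = 2
```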

77 reproducing kernels for RKBS that distinguishes them from reproducing kernels for RKHS. [sent-522, score-0.64]

78 Universality of a kernel ensures that it can approximate any continuous target function uniformly on compact subsets of the input space. [sent-541, score-0.234]

79 For this purpose, we call a continuous kernel G on a compact metric space X weakly universal if span {G(x, ·) : x ∈ X} is dense in C(X). [sent-545, score-0.337]

80 reproducing kernel G defined by (39) is continuous on X × X, and there holds in C(X) the equality of subspaces span {G(x, ·) : x ∈ X} = span {[u, Φ(·)]W : u ∈ W }. [sent-554, score-0.652]

81 Now since G(x, ·) = [Φ(x), Φ(·)]W ∈ {[u, Φ(·)]W : u ∈ W }, we have the inclusion span {G(x, ·) : x ∈ X} ⊆ span {[u, Φ(·)]W : u ∈ W }. [sent-558, score-0.248]

82 Noting that [vn , Φ(·)]W ∈ span {G(x, ·) : x ∈ X}, n ∈ N, we have the reverse inclusion span {[u, Φ(·)]W : u ∈ W } ⊆ span {G(x, ·) : x ∈ X}, which proves the result. [sent-563, score-0.372]

83 (2006), that is, for each compact subset Z ⊆ X span {G(x, ·) : x ∈ Z } = span {[u, Φ(·)]W : u ∈ W }, where the two closures are taken in C(Z ). [sent-566, score-0.27]

84 reproducing kernels will be treated specially in future work. [sent-574, score-0.34]

85 reproducing kernel G defined by a feature map Φ : X → W as in (39). [sent-590, score-0.372]

86 We shall develop in this framework several standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. [sent-591, score-0.282]
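
In the familiar Hilbert-space special case (p = 2) the representer theorem turns minimal norm interpolation and the regularization network into finite linear systems in the kernel matrix. The sketch below shows only that RKHS special case, as a point of reference of my own; it is not the paper's general RKBS scheme, which goes through the semi-inner-product and duality mapping and is generally nonlinear:

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    return np.exp(-(x - y) ** 2 / (2 * sigma ** 2))

# Toy training sample (arbitrary).
X = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = np.sin(X)
K = np.array([[gaussian_kernel(a, b) for b in X] for a in X])

# Minimal norm interpolation: f = sum_j c_j K(x_j, .) with K c = y.
c_interp = np.linalg.solve(K, y)

# Regularization network (kernel ridge): (K + n * lam * I) c = y.
lam = 1e-3
c_ridge = np.linalg.solve(K + len(X) * lam * np.eye(len(X)), y)

def predict(c, x):
    return sum(cj * gaussian_kernel(xj, x) for cj, xj in zip(c, X))

print(predict(c_interp, 0.75), predict(c_ridge, 0.75), np.sin(0.75))
```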

87 reproducing kernel G: f(x) = [f, G(x, ·)]_B = [G(·, x), f∗]_B∗, x ∈ X, f ∈ B (53). [sent-602, score-0.372]

88 Using the reproducing property (53), we have for each (c_j : j ∈ N_n) ∈ C^n that ∑_{j∈N_n} c_j f(x_j) = ∑_{j∈N_n} c_j [f, G(x_j, ·)]_B = [∑_{j∈N_n} c_j G(·, x_j), f∗]_B∗. Thus, ∑_{j∈N_n} c_j f(x_j) = 0, … [sent-605, score-0.27]

89 Lemma 15 If V is a uniformly convex Banach space, then for any nonempty closed convex subset A ⊆ V and any x ∈ V there exists a unique x0 ∈ A such that ‖x − x0‖_V = inf{‖x − a‖_V : a ∈ A}. [sent-611, score-0.273]

90 Lemma 17 If V is a uniformly Fréchet differentiable normed vector space, then ‖x + λy‖_V ≥ ‖x‖_V for all λ ∈ C if and only if [y, x]_V = 0. [sent-622, score-0.452]

91 Note that a uniformly convex normed vector space is automatically strictly convex. [sent-686, score-0.305]

92 Conclusion We have introduced the notion of reproducing kernel Banach spaces and generalized the correspondence between an RKHS and its reproducing kernel to the setting of RKBS. [sent-826, score-0.811]

93 An immediate challenge is to construct a class of useful RKBS and the corresponding reproducing kernels. [sent-837, score-0.27]

94 By the classical theory of RKHS, a function K is a reproducing kernel if and only if the finite matrix (1) is always hermitian and positive semi-definite. [sent-838, score-0.372]

95 Thus, we ask what characteristics a function must possess so that it is a reproducing kernel for some RKBS. [sent-840, score-0.372]

96 Properties of RKBS and their reproducing kernels also deserve a systematic study. [sent-841, score-0.32]

97 If for all x, y ∈ V \ {0} the limit lim_{t∈R, t→0} (‖x + ty‖_V − ‖x‖_V)/t exists, then [·, ·]_V : V × V → C defined by [x, y]_V := ‖y‖_V ( lim_{t∈R, t→0} (‖y + tx‖_V − ‖y‖_V)/t + i lim_{t∈R, t→0} (‖iy + tx‖_V − ‖y‖_V)/t ) if x, y ≠ 0, and [x, y]_V := 0 if x = 0 or y = 0, is a semi-inner-product on V (77). [sent-870, score-0.559]

98 Proof First, we obtain for x ≠ 0 that [x, x]_V = ‖x‖_V ( lim_{t∈R, t→0} (‖(1 + t)x‖_V − ‖x‖_V)/t + i lim_{t∈R, t→0} (‖(i + t)x‖_V − ‖x‖_V)/t ) = ‖x‖_V^2 ( lim_{t∈R, t→0} (|1 + t| − 1)/t + i lim_{t∈R, t→0} (|i + t| − 1)/t ) = ‖x‖_V^2 (1 + i·0) = ‖x‖_V^2 > 0. [sent-871, score-0.372]

99 We start with the estimate: ‖z‖_V lim_{t∈R, t→0} (‖z + tx + ty‖_V − ‖z‖_V)/t ≤ ‖z‖_V lim_{t∈R, t→0} (‖z/2 + tx‖_V + ‖z/2 + ty‖_V − ‖z‖_V)/t = ‖z‖_V ( lim_{t∈R, t→0} (‖z/2 + tx‖_V − ‖z/2‖_V)/t + lim_{t∈R, t→0} (‖z/2 + ty‖_V − ‖z/2‖_V)/t ) = ‖z‖_V ( lim_{t∈R, t→0} (‖z + 2tx‖_V − ‖z‖_V)/(2t) + lim_{t∈R, t→0} (‖z + 2ty‖_V − ‖z‖_V)/(2t) ). [sent-875, score-0.708]

100 Support vector machines, reproducing kernel Hilbert spaces and the randomized GACV. [sent-1138, score-0.439]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('rkbs', 0.477), ('banach', 0.423), ('nn', 0.329), ('reproducing', 0.27), ('hang', 0.195), ('gx', 0.177), ('chet', 0.168), ('normed', 0.143), ('iy', 0.135), ('eproducing', 0.128), ('paces', 0.128), ('ez', 0.128), ('span', 0.124), ('achine', 0.105), ('kernel', 0.102), ('lim', 0.093), ('micchelli', 0.086), ('rkhs', 0.083), ('ernel', 0.079), ('minimizer', 0.078), ('cn', 0.078), ('uniformly', 0.078), ('fr', 0.074), ('shall', 0.072), ('exive', 0.071), ('spaces', 0.067), ('differentiable', 0.063), ('giles', 0.061), ('nm', 0.06), ('denseness', 0.06), ('representer', 0.059), ('fk', 0.059), ('hilbert', 0.056), ('lumer', 0.053), ('convex', 0.052), ('kernels', 0.05), ('bilinear', 0.047), ('unique', 0.045), ('riesz', 0.044), ('lq', 0.043), ('interpolation', 0.043), ('re', 0.043), ('norm', 0.043), ('tx', 0.041), ('dual', 0.04), ('ty', 0.04), ('earning', 0.038), ('xu', 0.038), ('pca', 0.037), ('teaux', 0.035), ('isometric', 0.034), ('tg', 0.034), ('duality', 0.033), ('space', 0.032), ('theorem', 0.032), ('continuous', 0.032), ('inner', 0.03), ('conway', 0.03), ('yuesheng', 0.03), ('im', 0.029), ('lkopf', 0.028), ('mapping', 0.028), ('functional', 0.028), ('bijective', 0.027), ('inf', 0.027), ('spanv', 0.027), ('sch', 0.026), ('proposition', 0.025), ('yk', 0.025), ('dense', 0.025), ('co', 0.025), ('fn', 0.025), ('xk', 0.024), ('linearly', 0.024), ('id', 0.023), ('principal', 0.023), ('exists', 0.023), ('nonempty', 0.023), ('ay', 0.023), ('gz', 0.023), ('isometrically', 0.023), ('lemma', 0.022), ('uniqueness', 0.022), ('isomorphic', 0.022), ('compact', 0.022), ('functionals', 0.022), ('rd', 0.021), ('minimal', 0.021), ('smola', 0.021), ('regularization', 0.021), ('bk', 0.02), ('convexity', 0.02), ('nontrivial', 0.02), ('fourier', 0.02), ('specially', 0.02), ('equipped', 0.019), ('mx', 0.019), ('der', 0.019), ('nite', 0.019), ('cristianini', 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9999997 80 jmlr-2009-Reproducing Kernel Banach Spaces for Machine Learning

Author: Haizhang Zhang, Yuesheng Xu, Jun Zhang

Abstract: We introduce the notion of reproducing kernel Banach spaces (RKBS) and study special semi-inner-product RKBS by making use of semi-inner-products and the duality mapping. Properties of an RKBS and its reproducing kernel are investigated. As applications, we develop in the framework of RKBS standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. In particular, existence, uniqueness and representer theorems are established. Keywords: reproducing kernel Banach spaces, reproducing kernels, learning theory, semi-inner-products, representer theorems

2 0.19725911 78 jmlr-2009-Refinement of Reproducing Kernels

Author: Yuesheng Xu, Haizhang Zhang

Abstract: We continue our recent study on constructing a refinement kernel for a given kernel so that the reproducing kernel Hilbert space associated with the refinement kernel contains that with the original kernel as a subspace. To motivate this study, we first develop a refinement kernel method for learning, which gives an efficient algorithm for updating a learning predictor. Several characterizations of refinement kernels are then presented. It is shown that a nontrivial refinement kernel for a given kernel always exists if the input space has an infinite cardinal number. Refinement kernels for translation invariant kernels and Hilbert-Schmidt kernels are investigated. Various concrete examples are provided. Keywords: reproducing kernels, reproducing kernel Hilbert spaces, learning with kernels, refinement kernels, translation invariant kernels, Hilbert-Schmidt kernels

3 0.1734544 100 jmlr-2009-When Is There a Representer Theorem? Vector Versus Matrix Regularizers

Author: Andreas Argyriou, Charles A. Micchelli, Massimiliano Pontil

Abstract: We consider a general class of regularization methods which learn a vector of parameters on the basis of linear measurements. It is well known that if the regularizer is a nondecreasing function of the L2 norm, then the learned vector is a linear combination of the input data. This result, known as the representer theorem, lies at the basis of kernel-based methods in machine learning. In this paper, we prove the necessity of the above condition, in the case of differentiable regularizers. We further extend our analysis to regularization methods which learn a matrix, a problem which is motivated by the application to multi-task learning. In this context, we study a more general representer theorem, which holds for a larger class of regularizers. We provide a necessary and sufficient condition characterizing this class of matrix regularizers and we highlight some concrete examples of practical importance. Our analysis uses basic principles from matrix theory, especially the useful notion of matrix nondecreasing functions. Keywords: kernel methods, matrix learning, minimal norm interpolation, multi-task learning, regularization

4 0.098693043 90 jmlr-2009-Structure Spaces

Author: Brijnesh J. Jain, Klaus Obermayer

Abstract: Finite structures such as point patterns, strings, trees, and graphs occur as ”natural” representations of structured data in different application areas of machine learning. We develop the theory of structure spaces and derive geometrical and analytical concepts such as the angle between structures and the derivative of functions on structures. In particular, we show that the gradient of a differentiable structural function is a well-defined structure pointing in the direction of steepest ascent. Exploiting the properties of structure spaces, it will turn out that a number of problems in structural pattern recognition such as central clustering or learning in structured output spaces can be formulated as optimization problems with cost functions that are locally Lipschitz. Hence, methods from nonsmooth analysis are applicable to optimize those cost functions. Keywords: graphs, graph matching, learning in structured domains, nonsmooth optimization

5 0.088348597 22 jmlr-2009-Deterministic Error Analysis of Support Vector Regression and Related Regularized Kernel Methods

Author: Christian Rieger, Barbara Zwicknagl

Abstract: We introduce a new technique for the analysis of kernel-based regression problems. The basic tools are sampling inequalities which apply to all machine learning problems involving penalty terms induced by kernels related to Sobolev spaces. They lead to explicit deterministic results concerning the worst case behaviour of ε- and ν-SVRs. Using these, we show how to adjust regularization parameters to get best possible approximation orders for regression. The results are illustrated by some numerical examples. Keywords: sampling inequality, radial basis functions, approximation theory, reproducing kernel Hilbert space, Sobolev space

6 0.084768213 2 jmlr-2009-A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization

7 0.06845706 29 jmlr-2009-Estimating Labels from Label Proportions

8 0.061070304 16 jmlr-2009-Classification with Gaussians and Convex Loss

9 0.056274943 98 jmlr-2009-Universal Kernel-Based Learning with Applications to Regular Languages    (Special Topic on Mining and Learning with Graphs and Relations)

10 0.055885591 59 jmlr-2009-Nearest Neighbor Clustering: A Baseline Method for Consistent Clustering with Arbitrary Objective Functions

11 0.050737545 68 jmlr-2009-Online Learning with Samples Drawn from Non-identical Distributions

12 0.036847953 51 jmlr-2009-Low-Rank Kernel Learning with Bregman Matrix Divergences

13 0.036505092 87 jmlr-2009-Sparse Online Learning via Truncated Gradient

14 0.035081021 52 jmlr-2009-Margin-based Ranking and an Equivalence between AdaBoost and RankBoost

15 0.033984814 86 jmlr-2009-Similarity-based Classification: Concepts and Algorithms

16 0.033552203 38 jmlr-2009-Hash Kernels for Structured Data

17 0.033504754 61 jmlr-2009-Nonextensive Information Theoretic Kernels on Measures

18 0.031312529 66 jmlr-2009-On the Consistency of Feature Selection using Greedy Least Squares Regression

19 0.030620961 23 jmlr-2009-Discriminative Learning Under Covariate Shift

20 0.030452797 82 jmlr-2009-Robustness and Regularization of Support Vector Machines


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.195), (1, 0.03), (2, 0.111), (3, -0.12), (4, -0.312), (5, 0.177), (6, 0.142), (7, -0.273), (8, -0.037), (9, 0.001), (10, 0.029), (11, 0.077), (12, 0.206), (13, 0.039), (14, 0.078), (15, -0.02), (16, -0.024), (17, 0.066), (18, -0.103), (19, -0.062), (20, -0.019), (21, 0.008), (22, 0.057), (23, 0.127), (24, 0.01), (25, -0.016), (26, 0.024), (27, -0.046), (28, -0.122), (29, -0.003), (30, 0.029), (31, 0.015), (32, -0.037), (33, -0.075), (34, -0.002), (35, 0.001), (36, 0.034), (37, -0.111), (38, -0.027), (39, -0.033), (40, -0.02), (41, 0.11), (42, 0.093), (43, -0.026), (44, 0.053), (45, -0.125), (46, -0.007), (47, 0.062), (48, -0.05), (49, -0.105)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96586418 80 jmlr-2009-Reproducing Kernel Banach Spaces for Machine Learning

Author: Haizhang Zhang, Yuesheng Xu, Jun Zhang

Abstract: We introduce the notion of reproducing kernel Banach spaces (RKBS) and study special semi-inner-product RKBS by making use of semi-inner-products and the duality mapping. Properties of an RKBS and its reproducing kernel are investigated. As applications, we develop in the framework of RKBS standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. In particular, existence, uniqueness and representer theorems are established. Keywords: reproducing kernel Banach spaces, reproducing kernels, learning theory, semi-inner-products, representer theorems

2 0.77520359 78 jmlr-2009-Refinement of Reproducing Kernels

Author: Yuesheng Xu, Haizhang Zhang

Abstract: We continue our recent study on constructing a refinement kernel for a given kernel so that the reproducing kernel Hilbert space associated with the refinement kernel contains that with the original kernel as a subspace. To motivate this study, we first develop a refinement kernel method for learning, which gives an efficient algorithm for updating a learning predictor. Several characterizations of refinement kernels are then presented. It is shown that a nontrivial refinement kernel for a given kernel always exists if the input space has an infinite cardinal number. Refinement kernels for translation invariant kernels and Hilbert-Schmidt kernels are investigated. Various concrete examples are provided. Keywords: reproducing kernels, reproducing kernel Hilbert spaces, learning with kernels, refinement kernels, translation invariant kernels, Hilbert-Schmidt kernels

3 0.66706026 100 jmlr-2009-When Is There a Representer Theorem? Vector Versus Matrix Regularizers

Author: Andreas Argyriou, Charles A. Micchelli, Massimiliano Pontil

Abstract: We consider a general class of regularization methods which learn a vector of parameters on the basis of linear measurements. It is well known that if the regularizer is a nondecreasing function of the L2 norm, then the learned vector is a linear combination of the input data. This result, known as the representer theorem, lies at the basis of kernel-based methods in machine learning. In this paper, we prove the necessity of the above condition, in the case of differentiable regularizers. We further extend our analysis to regularization methods which learn a matrix, a problem which is motivated by the application to multi-task learning. In this context, we study a more general representer theorem, which holds for a larger class of regularizers. We provide a necessary and sufficient condition characterizing this class of matrix regularizers and we highlight some concrete examples of practical importance. Our analysis uses basic principles from matrix theory, especially the useful notion of matrix nondecreasing functions. Keywords: kernel methods, matrix learning, minimal norm interpolation, multi-task learning, regularization

4 0.57446617 22 jmlr-2009-Deterministic Error Analysis of Support Vector Regression and Related Regularized Kernel Methods

Author: Christian Rieger, Barbara Zwicknagl

Abstract: We introduce a new technique for the analysis of kernel-based regression problems. The basic tools are sampling inequalities which apply to all machine learning problems involving penalty terms induced by kernels related to Sobolev spaces. They lead to explicit deterministic results concerning the worst case behaviour of ε- and ν-SVRs. Using these, we show how to adjust regularization parameters to get best possible approximation orders for regression. The results are illustrated by some numerical examples. Keywords: sampling inequality, radial basis functions, approximation theory, reproducing kernel Hilbert space, Sobolev space

5 0.43365356 2 jmlr-2009-A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization

Author: Jacob Abernethy, Francis Bach, Theodoros Evgeniou, Jean-Philippe Vert

Abstract: We present a general approach for collaborative filtering (CF) using spectral regularization to learn linear operators mapping a set of “users” to a set of possibly desired “objects”. In particular, several recent low-rank type matrix-completion methods for CF are shown to be special cases of our proposed framework. Unlike existing regularization-based CF, our approach can be used to incorporate additional information such as attributes of the users/objects—a feature currently lacking in existing regularization-based CF approaches—using popular and well-known kernel methods. We provide novel representer theorems that we use to develop new estimation methods. We then provide learning algorithms based on low-rank decompositions and test them on a standard CF data set. The experiments indicate the advantages of generalizing the existing regularization-based CF methods to incorporate related information about users and objects. Finally, we show that certain multi-task learning methods can be also seen as special cases of our proposed approach. Keywords: collaborative filtering, matrix completion, kernel methods, spectral regularization

6 0.34107983 90 jmlr-2009-Structure Spaces

7 0.29497615 29 jmlr-2009-Estimating Labels from Label Proportions

8 0.29012874 98 jmlr-2009-Universal Kernel-Based Learning with Applications to Regular Languages    (Special Topic on Mining and Learning with Graphs and Relations)

9 0.28068158 59 jmlr-2009-Nearest Neighbor Clustering: A Baseline Method for Consistent Clustering with Arbitrary Objective Functions

10 0.26418218 51 jmlr-2009-Low-Rank Kernel Learning with Bregman Matrix Divergences

11 0.23221359 86 jmlr-2009-Similarity-based Classification: Concepts and Algorithms

12 0.21840426 61 jmlr-2009-Nonextensive Information Theoretic Kernels on Measures

13 0.21145448 38 jmlr-2009-Hash Kernels for Structured Data

14 0.20079079 66 jmlr-2009-On the Consistency of Feature Selection using Greedy Least Squares Regression

15 0.19572787 16 jmlr-2009-Classification with Gaussians and Convex Loss

16 0.1953197 68 jmlr-2009-Online Learning with Samples Drawn from Non-identical Distributions

17 0.17342359 48 jmlr-2009-Learning Nondeterministic Classifiers

18 0.14990185 40 jmlr-2009-Identification of Recurrent Neural Networks by Bayesian Interrogation Techniques

19 0.14680915 67 jmlr-2009-Online Learning with Sample Path Constraints

20 0.14447774 95 jmlr-2009-The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(7, 0.282), (8, 0.011), (11, 0.065), (19, 0.035), (21, 0.034), (38, 0.032), (47, 0.025), (52, 0.033), (55, 0.032), (58, 0.031), (66, 0.147), (68, 0.018), (90, 0.124), (96, 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.77598882 80 jmlr-2009-Reproducing Kernel Banach Spaces for Machine Learning

Author: Haizhang Zhang, Yuesheng Xu, Jun Zhang

Abstract: We introduce the notion of reproducing kernel Banach spaces (RKBS) and study special semi-inner-product RKBS by making use of semi-inner-products and the duality mapping. Properties of an RKBS and its reproducing kernel are investigated. As applications, we develop in the framework of RKBS standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. In particular, existence, uniqueness and representer theorems are established. Keywords: reproducing kernel Banach spaces, reproducing kernels, learning theory, semi-inner-products, representer theorems

2 0.58266836 78 jmlr-2009-Refinement of Reproducing Kernels

Author: Yuesheng Xu, Haizhang Zhang

Abstract: We continue our recent study on constructing a refinement kernel for a given kernel so that the reproducing kernel Hilbert space associated with the refinement kernel contains that with the original kernel as a subspace. To motivate this study, we first develop a refinement kernel method for learning, which gives an efficient algorithm for updating a learning predictor. Several characterizations of refinement kernels are then presented. It is shown that a nontrivial refinement kernel for a given kernel always exists if the input space has an infinite cardinal number. Refinement kernels for translation invariant kernels and Hilbert-Schmidt kernels are investigated. Various concrete examples are provided. Keywords: reproducing kernels, reproducing kernel Hilbert spaces, learning with kernels, refinement kernels, translation invariant kernels, Hilbert-Schmidt kernels

3 0.57499301 100 jmlr-2009-When Is There a Representer Theorem? Vector Versus Matrix Regularizers

Author: Andreas Argyriou, Charles A. Micchelli, Massimiliano Pontil

Abstract: We consider a general class of regularization methods which learn a vector of parameters on the basis of linear measurements. It is well known that if the regularizer is a nondecreasing function of the L2 norm, then the learned vector is a linear combination of the input data. This result, known as the representer theorem, lies at the basis of kernel-based methods in machine learning. In this paper, we prove the necessity of the above condition, in the case of differentiable regularizers. We further extend our analysis to regularization methods which learn a matrix, a problem which is motivated by the application to multi-task learning. In this context, we study a more general representer theorem, which holds for a larger class of regularizers. We provide a necessary and sufficient condition characterizing this class of matrix regularizers and we highlight some concrete examples of practical importance. Our analysis uses basic principles from matrix theory, especially the useful notion of matrix nondecreasing functions. Keywords: kernel methods, matrix learning, minimal norm interpolation, multi-task learning, regularization

4 0.56268549 13 jmlr-2009-Bounded Kernel-Based Online Learning

Author: Francesco Orabona, Joseph Keshet, Barbara Caputo

Abstract: A common problem of kernel-based online algorithms, such as the kernel-based Perceptron algorithm, is the amount of memory required to store the online hypothesis, which may increase without bound as the algorithm progresses. Furthermore, the computational load of such algorithms grows linearly with the amount of memory used to store the hypothesis. To attack these problems, most previous work has focused on discarding some of the instances, in order to keep the memory bounded. In this paper we present a new algorithm, in which the instances are not discarded, but are instead projected onto the space spanned by the previous online hypothesis. We call this algorithm Projectron. While the memory size of the Projectron solution cannot be predicted before training, we prove that its solution is guaranteed to be bounded. We derive a relative mistake bound for the proposed algorithm, and deduce from it a slightly different algorithm which outperforms the Perceptron. We call this second algorithm Projectron++. We show that this algorithm can be extended to handle the multiclass and the structured output settings, resulting, as far as we know, in the first online bounded algorithm that can learn complex classification tasks. The method of bounding the hypothesis representation can be applied to any conservative online algorithm and to other online algorithms, as it is demonstrated for ALMA2 . Experimental results on various data sets show the empirical advantage of our technique compared to various bounded online algorithms, both in terms of memory and accuracy. Keywords: online learning, kernel methods, support vector machines, bounded support set

5 0.56001776 52 jmlr-2009-Margin-based Ranking and an Equivalence between AdaBoost and RankBoost

Author: Cynthia Rudin, Robert E. Schapire

Abstract: We study boosting algorithms for learning to rank. We give a general margin-based bound for ranking based on covering numbers for the hypothesis space. Our bound suggests that algorithms that maximize the ranking margin will generalize well. We then describe a new algorithm, smooth margin ranking, that precisely converges to a maximum ranking-margin solution. The algorithm is a modification of RankBoost, analogous to “approximate coordinate ascent boosting.” Finally, we prove that AdaBoost and RankBoost are equally good for the problems of bipartite ranking and classification in terms of their asymptotic behavior on the training set. Under natural conditions, AdaBoost achieves an area under the ROC curve that is equally as good as RankBoost’s; furthermore, RankBoost, when given a specific intercept, achieves a misclassification error that is as good as AdaBoost’s. This may help to explain the empirical observations made by Cortes and Mohri, and Caruana and Niculescu-Mizil, about the excellent performance of AdaBoost as a bipartite ranking algorithm, as measured by the area under the ROC curve. Keywords: ranking, RankBoost, generalization bounds, AdaBoost, area under the ROC curve

6 0.55255103 22 jmlr-2009-Deterministic Error Analysis of Support Vector Regression and Related Regularized Kernel Methods

7 0.55045974 93 jmlr-2009-The Hidden Life of Latent Variables: Bayesian Learning with Mixed Graph Models

8 0.54344904 85 jmlr-2009-Settable Systems: An Extension of Pearl's Causal Model with Optimization, Equilibrium, and Learning

9 0.54202783 19 jmlr-2009-Controlling the False Discovery Rate of the Association Causality Structure Learned with the PC Algorithm    (Special Topic on Mining and Learning with Graphs and Relations)

10 0.54114819 36 jmlr-2009-Fourier Theoretic Probabilistic Inference over Permutations

11 0.5409494 32 jmlr-2009-Exploiting Product Distributions to Identify Relevant Variables of Correlation Immune Functions

12 0.54038966 95 jmlr-2009-The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List

13 0.54009801 82 jmlr-2009-Robustness and Regularization of Support Vector Machines

14 0.53921008 27 jmlr-2009-Efficient Online and Batch Learning Using Forward Backward Splitting

15 0.53887957 18 jmlr-2009-Consistency and Localizability

16 0.53561151 47 jmlr-2009-Learning Linear Ranking Functions for Beam Search with Application to Planning

17 0.53078181 94 jmlr-2009-The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs

18 0.5307188 37 jmlr-2009-Generalization Bounds for Ranking Algorithms via Algorithmic Stability

19 0.53002441 2 jmlr-2009-A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization

20 0.52905834 9 jmlr-2009-Analysis of Perceptron-Based Active Learning