jmlr jmlr2009 jmlr2009-80 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Haizhang Zhang, Yuesheng Xu, Jun Zhang
Abstract: We introduce the notion of reproducing kernel Banach spaces (RKBS) and study special semi-inner-product RKBS by making use of semi-inner-products and the duality mapping. Properties of an RKBS and its reproducing kernel are investigated. As applications, we develop in the framework of RKBS standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. In particular, existence, uniqueness and representer theorems are established. Keywords: reproducing kernel Banach spaces, reproducing kernels, learning theory, semi-inner-products, representer theorems
Reference: text
sentIndex sentText sentNum sentScore
1 Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA. Editor: Ingo Steinwart. Abstract We introduce the notion of reproducing kernel Banach spaces (RKBS) and study special semi-inner-product RKBS by making use of semi-inner-products and the duality mapping. [sent-4, score-0.472]
2 Properties of an RKBS and its reproducing kernel are investigated. [sent-5, score-0.372]
3 Keywords: reproducing kernel Banach spaces, reproducing kernels, learning theory, semi-inner-products, representer theorems. [sent-8, score-0.701]
4 The reason for using positive definite kernels to measure similarity lies in the celebrated theoretical fact, due to Mercer (1909), that there is a bijective correspondence between them and reproducing kernel Hilbert spaces (RKHS). [sent-17, score-0.516]
5 In light of this bijective correspondence, positive definite kernels are usually called reproducing kernels. [sent-22, score-0.347]
6 Based on the theory of reproducing kernels, many effective schemes have been developed for learning from finite samples (Evgeniou et al. [sent-27, score-0.27]
7 The above discussion indicates that there is a need to introduce the notion of reproducing kernel Banach spaces for the systematic study of learning in Banach spaces. [sent-56, score-0.439]
8 We shall introduce the notion of reproducing kernel Banach spaces in Section 2, and a general construction in Section 3. [sent-60, score-0.511]
9 It will become clear that the lack of an inner product may cause arbitrariness in the properties of the associated reproducing kernel. [sent-61, score-0.3]
10 reproducing kernel Banach spaces by making use of semi-inner-products for normed vector spaces first defined by Lumer (1961) and further developed by Giles (1967). [sent-65, score-0.649]
11 Here the availability of a semi-inner-product enables us to study basic properties of reproducing kernel Banach spaces and their reproducing kernels. [sent-67, score-0.709]
12 In Section 5, we shall develop in the framework of reproducing kernel Banach spaces standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. [sent-68, score-0.721]
13 A normed vector space B is called a Banach space of functions on X if it is a Banach space whose elements are functions on X, and for each f ∈ B, its norm ‖f‖_B in B vanishes if and only if f, as a function, vanishes everywhere on X. [sent-74, score-0.282]
14 Influenced by the definition of RKHS, our first intuition is to define a reproducing kernel Banach space (RKBS) as a Banach space of functions on X on which point evaluations are continuous linear functionals. [sent-76, score-0.468]
15 However, since for each f ∈ C[0, 1], f (x) = δx ( f ), x ∈ [0, 1], the reproducing kernel for C[0, 1] would have to be the delta distribution, which is not a function that can be evaluated. [sent-79, score-0.372]
16 Recall that two normed vector spaces V1 and V2 are said to be isometric if there is a bijective linear norm-preserving mapping between them. [sent-81, score-0.299]
17 Definition 1 A reproducing kernel Banach space (RKBS) on X is a reflexive Banach space B of functions on X for which B ∗ is isometric to a Banach space B # of functions on X and the point evaluation is continuous on both B and B # . [sent-87, score-0.534]
18 We shall show that there indeed exists a reproducing kernel for an RKBS. [sent-105, score-0.467]
19 (7) (c) The linear span of {K(x, ·) : x ∈ X} is dense in B, namely, the closure of span{K(x, ·) : x ∈ X} equals B. [sent-113, score-0.273]
20 (8) (d) The linear span of {K(·, x) : x ∈ X} is dense in B*, namely, the closure of span{K(·, x) : x ∈ X} equals B*. [sent-114, score-0.273]
21 (10) (e) For all x, y ∈ X Proof For every x ∈ X, since δx is a continuous linear functional on B , there exists gx ∈ B ∗ such that f (x) = ( f , gx )B , f ∈ B . [sent-116, score-0.437]
22 We call the function K in Theorem 2 the reproducing kernel for the RKBS B . [sent-134, score-0.372]
23 By Theorem 2, an RKBS has exactly one reproducing kernel. [sent-135, score-0.27]
24 However, different RKBS may have the same reproducing kernel. [sent-136, score-0.27]
25 Consequently, although we have at hand a reproducing kernel K for an RKBS B, the relationship (13), and the denseness conditions (8), (9), we still cannot determine the norm on B. [sent-144, score-0.475]
26 Construction of Reproducing Kernels via Feature Maps In this section, we shall characterize reproducing kernels for RKBS. [sent-146, score-0.392]
27 The characterization will at the same time provide a convenient way of constructing reproducing kernels and their corresponding RKBS. [sent-147, score-0.32]
28 Suppose that there exist Φ : X → W and Φ* : X → W* such that span Φ(X) is dense in W and span Φ*(X) is dense in W*. [sent-150, score-0.271]
29 (16) Moreover, the reproducing kernel K for B is K(x, y) := (Φ(x), Φ∗ (y))W , x, y ∈ X. [sent-152, score-0.372]
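A minimal numerical sketch of the construction K(x, y) := (Φ(x), Φ*(y))_W, assuming a finite-dimensional W = R^3 paired with W* by the ordinary dot product; the feature maps phi and phi_star below are illustrative choices, not the paper's example.

```python
import numpy as np

def phi(x):
    # Phi : X -> W = R^3 (illustrative assumption)
    return np.array([1.0, x, x ** 2])

def phi_star(y):
    # Phi* : X -> W* = R^3 (illustrative assumption)
    return np.array([1.0, y, y ** 3])

def K(x, y):
    # K(x, y) := (Phi(x), Phi*(y))_W, the pairing of W with W*
    return float(np.dot(phi(x), phi_star(y)))

print(K(0.5, -1.0), K(-1.0, 0.5))  # generally K(x, y) != K(y, x), unlike an RKHS kernel
```

With Φ = Φ* and W a Hilbert space, this reduces to the usual RKHS feature-map construction.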
30 These facts show that K is the reproducing kernel for B and complete the proof. [sent-176, score-0.372]
31 We call the mappings Φ, Φ∗ in Theorem 3 a pair of feature maps for the reproducing kernel K. [sent-177, score-0.372]
32 As a corollary to Theorem 3, we obtain the following characterization of reproducing kernels for RKBS. [sent-179, score-0.32]
33 Theorem 4 A function K : X × X → C is the reproducing kernel of an RKBS on X if and only if it is of the form (17), where W is a reflexive Banach space, and mappings Φ : X → W , Φ∗ : X → W ∗ satisfy (14). [sent-180, score-0.372]
34 For the necessity, we assume that K is the reproducing kernel of an RKBS B on X, and set W := B , W ∗ := B ∗ , Φ(x) := K(x, ·), Φ∗ (x) := K(·, x), x ∈ X. [sent-182, score-0.372]
35 To demonstrate how we get RKBS and their reproducing kernels by Theorem 3, we now present a nontrivial example of RKBS. [sent-184, score-0.34]
36 However, we see that they all have the sinc function as the reproducing kernel. [sent-199, score-0.27]
37 In fact, if no further conditions are imposed on an RKBS, its reproducing kernel can be rather arbitrary. [sent-200, score-0.372]
38 Proposition 5 If the input space X is a finite set, then any nontrivial function K on X × X is the reproducing kernel of some RKBS on X. [sent-202, score-0.424]
39 By Theorem 4, K is a reproducing kernel for some RKBS on X. [sent-213, score-0.372]
40 Proposition 5 reveals that due to the lack of an inner product, the reproducing kernel for a general RKBS can be an arbitrary function on X × X. [sent-214, score-0.402]
41 In order for reproducing kernels of RKBS to have desirable properties like those of RKHS, we may need to impose certain structures on RKBS, which in some sense serve as substitutes for the inner product of an RKHS. [sent-216, score-0.35]
42 We call a normed vector space V Gâteaux differentiable if for all x, y ∈ V \ {0} the limit lim_{t∈R, t→0} (‖x + ty‖_V − ‖x‖_V)/t exists. [sent-263, score-0.406]
43 It is called uniformly Fréchet differentiable if the limit is approached uniformly on S(V) × S(V). [sent-264, score-0.387]
44 If a normed vector space V is Gâteaux differentiable, then for all x, y ∈ V with x ≠ 0, lim_{t∈R, t→0} (‖x + ty‖_V − ‖x‖_V)/t = Re[y, x]_V / ‖x‖_V. (26) [sent-270, score-0.263]
45 The above lemma indicates that a Gâteaux differentiable normed vector space has a unique semi-inner-product. [sent-271, score-0.34]
46 In fact, we have by (26) that [x, y]_V = ‖y‖_V ( lim_{t∈R, t→0} (‖y + tx‖_V − ‖y‖_V)/t + i lim_{t∈R, t→0} (‖iy + tx‖_V − ‖y‖_V)/t ), x, y ∈ V \ {0}. (27) [sent-272, score-0.403]
47 For this reason, if V is a Gâteaux differentiable normed vector space we always assume that it is an s.i.p. space. [sent-273, score-0.273]
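The limit formula (27) can be checked numerically. The sketch below approximates the semi-inner-product of the ℓ^p norm by difference quotients and compares it with the closed-form Giles semi-inner-product on ℓ^p; the step size t, the exponent p, and the test vectors are illustrative choices.

```python
import numpy as np

def sip_from_norm(norm, x, y, t=1e-6):
    # Numerical version of (27):
    # [x, y] = ||y|| * ( lim (||y + t x|| - ||y||)/t + i * lim (||i y + t x|| - ||y||)/t )
    re = (norm(y + t * x) - norm(y)) / t
    im = (norm(1j * y + t * x) - norm(y)) / t
    return norm(y) * (re + 1j * im)

p = 3.0
norm_p = lambda v: np.sum(np.abs(v) ** p) ** (1.0 / p)

x = np.array([1.0, -2.0, 0.5])
y = np.array([0.3, 1.0, 2.0])

# Closed-form Giles semi-inner-product on real l^p, for comparison:
# [x, y]_p = sum_i x_i |y_i|^{p-2} y_i / ||y||_p^{p-2}
closed_form = np.sum(x * np.abs(y) ** (p - 2) * y) / norm_p(y) ** (p - 2)

print(sip_from_norm(norm_p, x, y))   # difference-quotient approximation
print(closed_form)                   # should agree up to O(t)
```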
48 A normed vector space V is uniformly convex if for all ε > 0 there exists a δ > 0 such that ‖x + y‖_V ≤ 2 − δ for all x, y ∈ S(V) with ‖x − y‖_V ≥ ε. [sent-284, score-0.328]
49 It states that a normed vector space is uniformly Fréchet differentiable if and only if its dual is uniformly convex. [sent-289, score-0.602]
50 Therefore, if B is a uniformly convex and uniformly Fréchet differentiable Banach space then so is B*, since B is reflexive. [sent-290, score-0.471]
51 Lemma 8 (Riesz Representation Theorem) Suppose that B is a uniformly convex, uniformly Fréchet differentiable Banach space. Then for each f ∈ B* there exists a unique x ∈ B such that f = x*. [sent-292, score-0.455]
52 By Lemma 8 and the discussion right before it, we shall investigate in the next subsection RKBS which are both uniformly convex and uniformly Fréchet differentiable. [sent-296, score-0.448]
53 Let B be a uniformly convex and uniformly Fréchet differentiable Banach space. [sent-297, score-0.439]
54 Since B* is uniformly Fréchet differentiable, it has a unique semi-inner-product, which is given by [x*, y*]_{B*} = [y, x]_B, x, y ∈ B. [sent-301, score-0.291]
55 (28) We close this subsection with a concrete example of uniformly convex and uniformly Fréchet differentiable Banach spaces: L^p(Ω, µ) for some p ∈ (1, +∞). [sent-302, score-0.439]
56 It is uniformly convex and uniformly Fréchet differentiable with dual B* = L^q(Ω, µ), where 1/p + 1/q = 1. [sent-304, score-0.479]
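As a concrete sketch of the duality mapping on this example, in its finite-dimensional form ℓ^p(N_n), the dual element x* of x has entries x_i |x_i|^{p−2} / ‖x‖_p^{p−2}. The code below, with illustrative p and x, checks that the mapping is norm-preserving and that (x, x*) = ‖x‖_p^2.

```python
import numpy as np

def dual_element(x, p):
    # Duality mapping on real l^p(N_n), 1 < p < infinity:
    # (x*)_i = x_i |x_i|^{p-2} / ||x||_p^{p-2}, which lies in l^q with 1/p + 1/q = 1
    norm_x = np.sum(np.abs(x) ** p) ** (1.0 / p)
    return np.abs(x) ** (p - 2) * x / norm_x ** (p - 2)

p = 1.5
q = p / (p - 1.0)
x = np.array([2.0, -1.0, 0.5])        # illustrative vector with nonzero entries

x_star = dual_element(x, p)
norm_p = np.sum(np.abs(x) ** p) ** (1.0 / p)
norm_q = np.sum(np.abs(x_star) ** q) ** (1.0 / q)

print(norm_p, norm_q)                  # equal: the duality mapping preserves the norm
print(np.dot(x, x_star), norm_p ** 2)  # (x, x*) = ||x||_p^2
```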
57 We call a uniformly convex and uniformly Fréchet differentiable RKBS on X an s.i.p. RKBS. [sent-313, score-0.439]
58 An s.i.p. RKBS B is by definition uniformly Fréchet differentiable. [sent-334, score-0.246]
59 This leads to a more specific representation of the reproducing kernel. [sent-336, score-0.27]
60 To prove the remaining claims, we recall from Theorem 2 that the reproducing kernel K satisfies for each f ∈ B that f (x) = ( f , K(·, x))B , x ∈ X. [sent-348, score-0.372]
61 It coincides with the reproducing kernel K when B is an RKHS. [sent-361, score-0.372]
62 The s.i.p. reproducing kernel G satisfies the following generalization of (3): G(x, y) = [G(x, ·), G(y, ·)]_B, x, y ∈ X. [sent-369, score-0.372]
63 reproducing kernel in terms of its corresponding feature map. [sent-373, score-0.372]
64 To this end, for a mapping Φ from X to a uniformly convex and uniformly Fréchet differentiable Banach space W, we denote by Φ* the mapping from X to W* defined as Φ*(x) := (Φ(x))*, x ∈ X. [sent-374, score-0.464]
65 Theorem 10 Let W be a uniformly convex and uniformly Fréchet differentiable Banach space and Φ a mapping from X to W such that span Φ(X) is dense in W and span Φ*(X) is dense in W*. [sent-375, score-0.747]
66 (36) Then B := {[u, Φ(·)]_W : u ∈ W}, equipped with [[u, Φ(·)]_W, [v, Φ(·)]_W]_B := [u, v]_W, (37) and B* := {[Φ(·), u]_W : u ∈ W}, equipped with [[Φ(·), u]_W, [Φ(·), v]_W]_{B*} := [v, u]_W, are uniformly convex and uniformly Fréchet differentiable Banach spaces. [sent-376, score-0.458]
67 The s.i.p. kernel G of B is given by G(x, y) = [Φ(x), Φ(y)]_W, x, y ∈ X, (39) which coincides with its reproducing kernel K. [sent-381, score-0.474]
68 A function is an s.i.p. reproducing kernel if and only if it is of the form (39), where Φ is a mapping from X to a uniformly convex and uniformly Fréchet differentiable Banach space W satisfying (36). [sent-400, score-0.871]
69 By Theorem 11, G is identical with the reproducing kernel K for B if and only if span{G(x, ·) : x ∈ X} is dense in B. [sent-428, score-0.496]
70 1, ℓ^p(N_n) is uniformly convex and uniformly Fréchet differentiable. [sent-447, score-0.376]
71 Therefore, span{e_1, e_2, e_3} = ℓ^3(N_3) (44), while span{e_1*, e_2*, e_3*} ≠ ℓ^{3/2}(N_3). [sent-452, score-0.248]
72 Furthermore, since the linear mapping Φ_u → u* is an isometric isomorphism from B to ℓ^{3/2}(N_3), B is a uniformly convex and uniformly Fréchet differentiable Banach space. [sent-456, score-0.512]
73 (47) Recall that the reproducing kernel K for B satisfies the denseness condition (8). [sent-465, score-0.432]
74 Reproducing Kernels The existence of a semi-inner-product makes it possible to study properties of RKBS and their reproducing kernels. [sent-477, score-0.27]
75 The s.i.p. reproducing kernel G is constructed as G(x, y) = [Φ(x), Φ(y)]_W = (1 + xy)^{p−1} / (…)^{p−2}, x, y ∈ X. (51) [sent-503, score-0.372]
76 For the s.i.p. reproducing kernel G defined by (51), the matrix G[x] is positive semidefinite for all x = {x, y, z} ⊆ X if and only if p = 2. [sent-507, score-0.372]
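The phenomenon can be illustrated numerically with an assumed feature map Φ(x) = (1, x) into ℓ^p(N_2) and the Giles semi-inner-product (an illustrative choice, not necessarily the map behind (51)): for p ≠ 2 the resulting 3 × 3 matrix G[x] need not be symmetric and its symmetric part can fail to be positive semidefinite, while p = 2 recovers an ordinary positive semidefinite Gram matrix.

```python
import numpy as np

def sip_lp(u, v, p):
    # Giles semi-inner-product [u, v] on real l^p
    nv = np.sum(np.abs(v) ** p) ** (1.0 / p)
    return np.sum(u * np.abs(v) ** (p - 2) * v) / nv ** (p - 2)

def G(x, y, p):
    # s.i.p. kernel induced by the assumed feature map Phi(x) = (1, x)
    return sip_lp(np.array([1.0, x]), np.array([1.0, y]), p)

pts = [0.0, 1.0, -2.0]                 # an illustrative triple of points
for p in (2.0, 4.0):
    Gmat = np.array([[G(s, t, p) for t in pts] for s in pts])
    symmetric = np.allclose(Gmat, Gmat.T)
    min_eig = np.linalg.eigvalsh((Gmat + Gmat.T) / 2.0).min()
    print(f"p={p}: symmetric={symmetric}, min eigenvalue of symmetric part={min_eig:.4f}")
```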
77 This is a property of reproducing kernels for RKBS that distinguishes them from reproducing kernels for RKHS. [sent-522, score-0.64]
78 Universality of a kernel ensures that it can approximate any continuous target function uniformly on compact subsets of the input space. [sent-541, score-0.234]
79 For this purpose, we call a continuous kernel G on a compact metric space X weakly universal if span {G(x, ·) : x ∈ X} is dense in C(X). [sent-545, score-0.337]
80 The s.i.p. reproducing kernel G defined by (39) is continuous on X × X, and the closures in C(X) of span{G(x, ·) : x ∈ X} and span{[u, Φ(·)]_W : u ∈ W} coincide. [sent-554, score-0.652]
81 Now since G(x, ·) = [Φ(x), Φ(·)]W ∈ {[u, Φ(·)]W : u ∈ W }, we have the inclusion span {G(x, ·) : x ∈ X} ⊆ span {[u, Φ(·)]W : u ∈ W }. [sent-558, score-0.248]
82 Noting that [vn , Φ(·)]W ∈ span {G(x, ·) : x ∈ X}, n ∈ N, we have the reverse inclusion span {[u, Φ(·)]W : u ∈ W } ⊆ span {G(x, ·) : x ∈ X}, which proves the result. [sent-563, score-0.372]
83 (2006), that is, for each compact subset Z ⊆ X span {G(x, ·) : x ∈ Z } = span {[u, Φ(·)]W : u ∈ W }, where the two closures are taken in C(Z ). [sent-566, score-0.27]
84 reproducing kernels will be treated specially in future work. [sent-574, score-0.34]
85 reproducing kernel G defined by a feature map Φ : X → W as in (39). [sent-590, score-0.372]
86 We shall develop in this framework several standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. [sent-591, score-0.282]
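A sketch of minimal norm interpolation under the feature-map construction of Theorem 10, assuming W = ℓ^p(N_4): since ‖[u, Φ(·)]_W‖_B = ‖u‖_W and u ↦ [u, Φ(x)]_W is linear, the problem reduces to minimizing ‖u‖_p subject to linear interpolation constraints. The feature map, the data, and the use of scipy's SLSQP solver are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

p = 3.0
xs = np.array([-1.0, 0.0, 2.0])     # interpolation points (illustrative)
ys = np.array([0.5, 1.0, -1.0])     # prescribed values (illustrative)

def phi(x):
    # assumed feature map into W = l^p(N_4)
    return np.array([1.0, x, x ** 2, x ** 3])

def sip_row(v):
    # coefficient vector of the linear functional u -> [u, v]_W on real l^p
    nv = np.sum(np.abs(v) ** p) ** (1.0 / p)
    return np.abs(v) ** (p - 2) * v / nv ** (p - 2)

A = np.array([sip_row(phi(x)) for x in xs])   # f_u(x_j) = [u, Phi(x_j)]_W = (A u)_j

norm_p = lambda u: np.sum(np.abs(u) ** p) ** (1.0 / p)
result = minimize(norm_p,
                  x0=np.linalg.lstsq(A, ys, rcond=None)[0],
                  constraints=[{"type": "eq", "fun": lambda u: A @ u - ys}])

u = result.x
print("interpolation residual:", A @ u - ys)           # ~ 0
print("minimal norm ||u||_p  :", norm_p(u))
print("interpolant at x = 1  :", float(sip_row(phi(1.0)) @ u))
```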
87 The following reproducing property holds for the s.i.p. reproducing kernel G: f(x) = [f, G(x, ·)]_B = [G(·, x), f*]_{B*}, x ∈ X, f ∈ B. (53) [sent-602, score-0.372]
88 Using the reproducing property (53), we have for each (c_j : j ∈ N_n) ∈ C^n that Σ_{j∈N_n} c_j f(x_j) = Σ_{j∈N_n} c_j [f, G(x_j, ·)]_B = [ Σ_{j∈N_n} c_j G(·, x_j), f* ]_{B*}. Thus, Σ_{j∈N_n} c_j f(x_j) = 0, … [sent-605, score-0.27]
89 Lemma 15 If V is a uniformly convex Banach space, then for any nonempty closed convex subset A ⊆ V and any x ∈ V there exists a unique x_0 ∈ A such that ‖x − x_0‖_V = inf{‖x − a‖_V : a ∈ A}. [sent-611, score-0.273]
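A numerical sketch of Lemma 15: the metric projection of a point onto a nonempty closed convex set in the uniformly convex space ℓ^p(N_3), computed by constrained minimization; the set (the probability simplex), the point x, and the solver are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

# Project x onto A = {a : a_i >= 0, sum_i a_i = 1} in the l^p norm (Lemma 15).
p = 1.5
x = np.array([1.0, -0.5, 2.0])
norm_p = lambda v: np.sum(np.abs(v) ** p) ** (1.0 / p)

result = minimize(lambda a: norm_p(x - a),
                  x0=np.ones(3) / 3.0,
                  bounds=[(0.0, None)] * 3,
                  constraints=[{"type": "eq", "fun": lambda a: np.sum(a) - 1.0}])

print("unique best approximation x0:", result.x)
print("distance inf{||x - a||_p}   :", norm_p(x - result.x))
```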
90 Lemma 17 If V is a uniformly Fréchet differentiable normed vector space, then ‖x + λy‖_V ≥ ‖x‖_V for all λ ∈ C if and only if [y, x]_V = 0. [sent-622, score-0.452]
91 Note that a uniformly convex normed vector space is automatically strictly convex. [sent-686, score-0.305]
92 Conclusion We have introduced the notion of reproducing kernel Banach spaces and generalized the correspondence between an RKHS and its reproducing kernel to the setting of RKBS. [sent-826, score-0.811]
93 An immediate challenge is to construct a class of useful RKBS and the corresponding reproducing kernels. [sent-837, score-0.27]
94 By the classical theory of RKHS, a function K is a reproducing kernel if and only if the finite matrix (1) is always hermitian and positive semi-definite. [sent-838, score-0.372]
95 Thus, we ask what characteristics a function must possess so that it is a reproducing kernel for some RKBS. [sent-840, score-0.372]
96 Properties of RKBS and their reproducing kernels also deserve a systematic study. [sent-841, score-0.32]
97 If for all x, y ∈ V \ {0} the limit lim_{t∈R, t→0} (‖x + ty‖_V − ‖x‖_V)/t exists, then [·, ·]_V : V × V → C defined by [x, y]_V := ‖y‖_V ( lim_{t∈R, t→0} (‖y + tx‖_V − ‖y‖_V)/t + i lim_{t∈R, t→0} (‖iy + tx‖_V − ‖y‖_V)/t ) if x, y ≠ 0, (77) and [x, y]_V := 0 if x = 0 or y = 0, is a semi-inner-product on V. [sent-870, score-0.559]
98 Proof First, we obtain for x ≠ 0 that [x, x]_V = ‖x‖_V ( lim_{t∈R, t→0} (‖(1 + t)x‖_V − ‖x‖_V)/t + i lim_{t∈R, t→0} (‖(i + t)x‖_V − ‖x‖_V)/t ) = ‖x‖_V^2 ( lim_{t∈R, t→0} (|1 + t| − 1)/t + i lim_{t∈R, t→0} (|i + t| − 1)/t ) = ‖x‖_V^2 (1 + i·0) = ‖x‖_V^2 > 0. [sent-871, score-0.372]
99 We start with the estimate: ‖z‖_V lim_{t∈R, t→0} (‖z + tx + ty‖_V − ‖z‖_V)/t ≤ ‖z‖_V lim_{t∈R, t→0} (‖z/2 + tx‖_V + ‖z/2 + ty‖_V − ‖z‖_V)/t = ‖z‖_V lim_{t∈R, t→0} ( (‖z/2 + tx‖_V − ‖z/2‖_V)/t + (‖z/2 + ty‖_V − ‖z/2‖_V)/t ) = ‖z‖_V ( lim_{t∈R, t→0} (‖z + 2tx‖_V − ‖z‖_V)/(2t) + lim_{t∈R, t→0} (‖z + 2ty‖_V − ‖z‖_V)/(2t) ). [sent-875, score-0.708]
100 Support vector machines, reproducing kernel Hilbert spaces and the randomized GACV. [sent-1138, score-0.439]
wordName wordTfidf (topN-words)
[('rkbs', 0.477), ('banach', 0.423), ('nn', 0.329), ('reproducing', 0.27), ('hang', 0.195), ('gx', 0.177), ('chet', 0.168), ('normed', 0.143), ('iy', 0.135), ('eproducing', 0.128), ('paces', 0.128), ('ez', 0.128), ('span', 0.124), ('achine', 0.105), ('kernel', 0.102), ('lim', 0.093), ('micchelli', 0.086), ('rkhs', 0.083), ('ernel', 0.079), ('minimizer', 0.078), ('cn', 0.078), ('uniformly', 0.078), ('fr', 0.074), ('shall', 0.072), ('exive', 0.071), ('spaces', 0.067), ('differentiable', 0.063), ('giles', 0.061), ('nm', 0.06), ('denseness', 0.06), ('representer', 0.059), ('fk', 0.059), ('hilbert', 0.056), ('lumer', 0.053), ('convex', 0.052), ('kernels', 0.05), ('bilinear', 0.047), ('unique', 0.045), ('riesz', 0.044), ('lq', 0.043), ('interpolation', 0.043), ('re', 0.043), ('norm', 0.043), ('tx', 0.041), ('dual', 0.04), ('ty', 0.04), ('earning', 0.038), ('xu', 0.038), ('pca', 0.037), ('teaux', 0.035), ('isometric', 0.034), ('tg', 0.034), ('duality', 0.033), ('space', 0.032), ('theorem', 0.032), ('continuous', 0.032), ('inner', 0.03), ('conway', 0.03), ('yuesheng', 0.03), ('im', 0.029), ('lkopf', 0.028), ('mapping', 0.028), ('functional', 0.028), ('bijective', 0.027), ('inf', 0.027), ('spanv', 0.027), ('sch', 0.026), ('proposition', 0.025), ('yk', 0.025), ('dense', 0.025), ('co', 0.025), ('fn', 0.025), ('xk', 0.024), ('linearly', 0.024), ('id', 0.023), ('principal', 0.023), ('exists', 0.023), ('nonempty', 0.023), ('ay', 0.023), ('gz', 0.023), ('isometrically', 0.023), ('lemma', 0.022), ('uniqueness', 0.022), ('isomorphic', 0.022), ('compact', 0.022), ('functionals', 0.022), ('rd', 0.021), ('minimal', 0.021), ('smola', 0.021), ('regularization', 0.021), ('bk', 0.02), ('convexity', 0.02), ('nontrivial', 0.02), ('fourier', 0.02), ('specially', 0.02), ('equipped', 0.019), ('mx', 0.019), ('der', 0.019), ('nite', 0.019), ('cristianini', 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.9999997 80 jmlr-2009-Reproducing Kernel Banach Spaces for Machine Learning
Author: Haizhang Zhang, Yuesheng Xu, Jun Zhang
Abstract: We introduce the notion of reproducing kernel Banach spaces (RKBS) and study special semi-inner-product RKBS by making use of semi-inner-products and the duality mapping. Properties of an RKBS and its reproducing kernel are investigated. As applications, we develop in the framework of RKBS standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. In particular, existence, uniqueness and representer theorems are established. Keywords: reproducing kernel Banach spaces, reproducing kernels, learning theory, semi-inner-products, representer theorems
2 0.19725911 78 jmlr-2009-Refinement of Reproducing Kernels
Author: Yuesheng Xu, Haizhang Zhang
Abstract: We continue our recent study on constructing a refinement kernel for a given kernel so that the reproducing kernel Hilbert space associated with the refinement kernel contains that with the original kernel as a subspace. To motivate this study, we first develop a refinement kernel method for learning, which gives an efficient algorithm for updating a learning predictor. Several characterizations of refinement kernels are then presented. It is shown that a nontrivial refinement kernel for a given kernel always exists if the input space has an infinite cardinal number. Refinement kernels for translation invariant kernels and Hilbert-Schmidt kernels are investigated. Various concrete examples are provided. Keywords: reproducing kernels, reproducing kernel Hilbert spaces, learning with kernels, refinement kernels, translation invariant kernels, Hilbert-Schmidt kernels
3 0.1734544 100 jmlr-2009-When Is There a Representer Theorem? Vector Versus Matrix Regularizers
Author: Andreas Argyriou, Charles A. Micchelli, Massimiliano Pontil
Abstract: We consider a general class of regularization methods which learn a vector of parameters on the basis of linear measurements. It is well known that if the regularizer is a nondecreasing function of the L2 norm, then the learned vector is a linear combination of the input data. This result, known as the representer theorem, lies at the basis of kernel-based methods in machine learning. In this paper, we prove the necessity of the above condition, in the case of differentiable regularizers. We further extend our analysis to regularization methods which learn a matrix, a problem which is motivated by the application to multi-task learning. In this context, we study a more general representer theorem, which holds for a larger class of regularizers. We provide a necessary and sufficient condition characterizing this class of matrix regularizers and we highlight some concrete examples of practical importance. Our analysis uses basic principles from matrix theory, especially the useful notion of matrix nondecreasing functions. Keywords: kernel methods, matrix learning, minimal norm interpolation, multi-task learning, regularization
4 0.098693043 90 jmlr-2009-Structure Spaces
Author: Brijnesh J. Jain, Klaus Obermayer
Abstract: Finite structures such as point patterns, strings, trees, and graphs occur as ”natural” representations of structured data in different application areas of machine learning. We develop the theory of structure spaces and derive geometrical and analytical concepts such as the angle between structures and the derivative of functions on structures. In particular, we show that the gradient of a differentiable structural function is a well-defined structure pointing in the direction of steepest ascent. Exploiting the properties of structure spaces, it will turn out that a number of problems in structural pattern recognition such as central clustering or learning in structured output spaces can be formulated as optimization problems with cost functions that are locally Lipschitz. Hence, methods from nonsmooth analysis are applicable to optimize those cost functions. Keywords: graphs, graph matching, learning in structured domains, nonsmooth optimization
5 0.088348597 22 jmlr-2009-Deterministic Error Analysis of Support Vector Regression and Related Regularized Kernel Methods
Author: Christian Rieger, Barbara Zwicknagl
Abstract: We introduce a new technique for the analysis of kernel-based regression problems. The basic tools are sampling inequalities which apply to all machine learning problems involving penalty terms induced by kernels related to Sobolev spaces. They lead to explicit deterministic results concerning the worst case behaviour of ε- and ν-SVRs. Using these, we show how to adjust regularization parameters to get best possible approximation orders for regression. The results are illustrated by some numerical examples. Keywords: sampling inequality, radial basis functions, approximation theory, reproducing kernel Hilbert space, Sobolev space
6 0.084768213 2 jmlr-2009-A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization
7 0.06845706 29 jmlr-2009-Estimating Labels from Label Proportions
8 0.061070304 16 jmlr-2009-Classification with Gaussians and Convex Loss
10 0.055885591 59 jmlr-2009-Nearest Neighbor Clustering: A Baseline Method for Consistent Clustering with Arbitrary Objective Functions
11 0.050737545 68 jmlr-2009-Online Learning with Samples Drawn from Non-identical Distributions
12 0.036847953 51 jmlr-2009-Low-Rank Kernel Learning with Bregman Matrix Divergences
13 0.036505092 87 jmlr-2009-Sparse Online Learning via Truncated Gradient
14 0.035081021 52 jmlr-2009-Margin-based Ranking and an Equivalence between AdaBoost and RankBoost
15 0.033984814 86 jmlr-2009-Similarity-based Classification: Concepts and Algorithms
16 0.033552203 38 jmlr-2009-Hash Kernels for Structured Data
17 0.033504754 61 jmlr-2009-Nonextensive Information Theoretic Kernels on Measures
18 0.031312529 66 jmlr-2009-On the Consistency of Feature Selection using Greedy Least Squares Regression
19 0.030620961 23 jmlr-2009-Discriminative Learning Under Covariate Shift
20 0.030452797 82 jmlr-2009-Robustness and Regularization of Support Vector Machines
topicId topicWeight
[(0, 0.195), (1, 0.03), (2, 0.111), (3, -0.12), (4, -0.312), (5, 0.177), (6, 0.142), (7, -0.273), (8, -0.037), (9, 0.001), (10, 0.029), (11, 0.077), (12, 0.206), (13, 0.039), (14, 0.078), (15, -0.02), (16, -0.024), (17, 0.066), (18, -0.103), (19, -0.062), (20, -0.019), (21, 0.008), (22, 0.057), (23, 0.127), (24, 0.01), (25, -0.016), (26, 0.024), (27, -0.046), (28, -0.122), (29, -0.003), (30, 0.029), (31, 0.015), (32, -0.037), (33, -0.075), (34, -0.002), (35, 0.001), (36, 0.034), (37, -0.111), (38, -0.027), (39, -0.033), (40, -0.02), (41, 0.11), (42, 0.093), (43, -0.026), (44, 0.053), (45, -0.125), (46, -0.007), (47, 0.062), (48, -0.05), (49, -0.105)]
simIndex simValue paperId paperTitle
same-paper 1 0.96586418 80 jmlr-2009-Reproducing Kernel Banach Spaces for Machine Learning
Author: Haizhang Zhang, Yuesheng Xu, Jun Zhang
Abstract: We introduce the notion of reproducing kernel Banach spaces (RKBS) and study special semi-inner-product RKBS by making use of semi-inner-products and the duality mapping. Properties of an RKBS and its reproducing kernel are investigated. As applications, we develop in the framework of RKBS standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. In particular, existence, uniqueness and representer theorems are established. Keywords: reproducing kernel Banach spaces, reproducing kernels, learning theory, semi-inner-products, representer theorems
2 0.77520359 78 jmlr-2009-Refinement of Reproducing Kernels
Author: Yuesheng Xu, Haizhang Zhang
Abstract: We continue our recent study on constructing a refinement kernel for a given kernel so that the reproducing kernel Hilbert space associated with the refinement kernel contains that with the original kernel as a subspace. To motivate this study, we first develop a refinement kernel method for learning, which gives an efficient algorithm for updating a learning predictor. Several characterizations of refinement kernels are then presented. It is shown that a nontrivial refinement kernel for a given kernel always exists if the input space has an infinite cardinal number. Refinement kernels for translation invariant kernels and Hilbert-Schmidt kernels are investigated. Various concrete examples are provided. Keywords: reproducing kernels, reproducing kernel Hilbert spaces, learning with kernels, refinement kernels, translation invariant kernels, Hilbert-Schmidt kernels
3 0.66706026 100 jmlr-2009-When Is There a Representer Theorem? Vector Versus Matrix Regularizers
Author: Andreas Argyriou, Charles A. Micchelli, Massimiliano Pontil
Abstract: We consider a general class of regularization methods which learn a vector of parameters on the basis of linear measurements. It is well known that if the regularizer is a nondecreasing function of the L2 norm, then the learned vector is a linear combination of the input data. This result, known as the representer theorem, lies at the basis of kernel-based methods in machine learning. In this paper, we prove the necessity of the above condition, in the case of differentiable regularizers. We further extend our analysis to regularization methods which learn a matrix, a problem which is motivated by the application to multi-task learning. In this context, we study a more general representer theorem, which holds for a larger class of regularizers. We provide a necessary and sufficient condition characterizing this class of matrix regularizers and we highlight some concrete examples of practical importance. Our analysis uses basic principles from matrix theory, especially the useful notion of matrix nondecreasing functions. Keywords: kernel methods, matrix learning, minimal norm interpolation, multi-task learning, regularization
4 0.57446617 22 jmlr-2009-Deterministic Error Analysis of Support Vector Regression and Related Regularized Kernel Methods
Author: Christian Rieger, Barbara Zwicknagl
Abstract: We introduce a new technique for the analysis of kernel-based regression problems. The basic tools are sampling inequalities which apply to all machine learning problems involving penalty terms induced by kernels related to Sobolev spaces. They lead to explicit deterministic results concerning the worst case behaviour of ε- and ν-SVRs. Using these, we show how to adjust regularization parameters to get best possible approximation orders for regression. The results are illustrated by some numerical examples. Keywords: sampling inequality, radial basis functions, approximation theory, reproducing kernel Hilbert space, Sobolev space
5 0.43365356 2 jmlr-2009-A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization
Author: Jacob Abernethy, Francis Bach, Theodoros Evgeniou, Jean-Philippe Vert
Abstract: We present a general approach for collaborative filtering (CF) using spectral regularization to learn linear operators mapping a set of “users” to a set of possibly desired “objects”. In particular, several recent low-rank type matrix-completion methods for CF are shown to be special cases of our proposed framework. Unlike existing regularization-based CF, our approach can be used to incorporate additional information such as attributes of the users/objects—a feature currently lacking in existing regularization-based CF approaches—using popular and well-known kernel methods. We provide novel representer theorems that we use to develop new estimation methods. We then provide learning algorithms based on low-rank decompositions and test them on a standard CF data set. The experiments indicate the advantages of generalizing the existing regularization-based CF methods to incorporate related information about users and objects. Finally, we show that certain multi-task learning methods can be also seen as special cases of our proposed approach. Keywords: collaborative filtering, matrix completion, kernel methods, spectral regularization
6 0.34107983 90 jmlr-2009-Structure Spaces
7 0.29497615 29 jmlr-2009-Estimating Labels from Label Proportions
10 0.26418218 51 jmlr-2009-Low-Rank Kernel Learning with Bregman Matrix Divergences
11 0.23221359 86 jmlr-2009-Similarity-based Classification: Concepts and Algorithms
12 0.21840426 61 jmlr-2009-Nonextensive Information Theoretic Kernels on Measures
13 0.21145448 38 jmlr-2009-Hash Kernels for Structured Data
14 0.20079079 66 jmlr-2009-On the Consistency of Feature Selection using Greedy Least Squares Regression
15 0.19572787 16 jmlr-2009-Classification with Gaussians and Convex Loss
16 0.1953197 68 jmlr-2009-Online Learning with Samples Drawn from Non-identical Distributions
17 0.17342359 48 jmlr-2009-Learning Nondeterministic Classifiers
18 0.14990185 40 jmlr-2009-Identification of Recurrent Neural Networks by Bayesian Interrogation Techniques
19 0.14680915 67 jmlr-2009-Online Learning with Sample Path Constraints
20 0.14447774 95 jmlr-2009-The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List
topicId topicWeight
[(7, 0.282), (8, 0.011), (11, 0.065), (19, 0.035), (21, 0.034), (38, 0.032), (47, 0.025), (52, 0.033), (55, 0.032), (58, 0.031), (66, 0.147), (68, 0.018), (90, 0.124), (96, 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 0.77598882 80 jmlr-2009-Reproducing Kernel Banach Spaces for Machine Learning
Author: Haizhang Zhang, Yuesheng Xu, Jun Zhang
Abstract: We introduce the notion of reproducing kernel Banach spaces (RKBS) and study special semi-inner-product RKBS by making use of semi-inner-products and the duality mapping. Properties of an RKBS and its reproducing kernel are investigated. As applications, we develop in the framework of RKBS standard learning schemes including minimal norm interpolation, regularization network, support vector machines, and kernel principal component analysis. In particular, existence, uniqueness and representer theorems are established. Keywords: reproducing kernel Banach spaces, reproducing kernels, learning theory, semi-inner-products, representer theorems
2 0.58266836 78 jmlr-2009-Refinement of Reproducing Kernels
Author: Yuesheng Xu, Haizhang Zhang
Abstract: We continue our recent study on constructing a refinement kernel for a given kernel so that the reproducing kernel Hilbert space associated with the refinement kernel contains that with the original kernel as a subspace. To motivate this study, we first develop a refinement kernel method for learning, which gives an efficient algorithm for updating a learning predictor. Several characterizations of refinement kernels are then presented. It is shown that a nontrivial refinement kernel for a given kernel always exists if the input space has an infinite cardinal number. Refinement kernels for translation invariant kernels and Hilbert-Schmidt kernels are investigated. Various concrete examples are provided. Keywords: reproducing kernels, reproducing kernel Hilbert spaces, learning with kernels, refinement kernels, translation invariant kernels, Hilbert-Schmidt kernels
3 0.57499301 100 jmlr-2009-When Is There a Representer Theorem? Vector Versus Matrix Regularizers
Author: Andreas Argyriou, Charles A. Micchelli, Massimiliano Pontil
Abstract: We consider a general class of regularization methods which learn a vector of parameters on the basis of linear measurements. It is well known that if the regularizer is a nondecreasing function of the L2 norm, then the learned vector is a linear combination of the input data. This result, known as the representer theorem, lies at the basis of kernel-based methods in machine learning. In this paper, we prove the necessity of the above condition, in the case of differentiable regularizers. We further extend our analysis to regularization methods which learn a matrix, a problem which is motivated by the application to multi-task learning. In this context, we study a more general representer theorem, which holds for a larger class of regularizers. We provide a necessary and sufficient condition characterizing this class of matrix regularizers and we highlight some concrete examples of practical importance. Our analysis uses basic principles from matrix theory, especially the useful notion of matrix nondecreasing functions. Keywords: kernel methods, matrix learning, minimal norm interpolation, multi-task learning, regularization
4 0.56268549 13 jmlr-2009-Bounded Kernel-Based Online Learning
Author: Francesco Orabona, Joseph Keshet, Barbara Caputo
Abstract: A common problem of kernel-based online algorithms, such as the kernel-based Perceptron algorithm, is the amount of memory required to store the online hypothesis, which may increase without bound as the algorithm progresses. Furthermore, the computational load of such algorithms grows linearly with the amount of memory used to store the hypothesis. To attack these problems, most previous work has focused on discarding some of the instances, in order to keep the memory bounded. In this paper we present a new algorithm, in which the instances are not discarded, but are instead projected onto the space spanned by the previous online hypothesis. We call this algorithm Projectron. While the memory size of the Projectron solution cannot be predicted before training, we prove that its solution is guaranteed to be bounded. We derive a relative mistake bound for the proposed algorithm, and deduce from it a slightly different algorithm which outperforms the Perceptron. We call this second algorithm Projectron++. We show that this algorithm can be extended to handle the multiclass and the structured output settings, resulting, as far as we know, in the first online bounded algorithm that can learn complex classification tasks. The method of bounding the hypothesis representation can be applied to any conservative online algorithm and to other online algorithms, as it is demonstrated for ALMA2 . Experimental results on various data sets show the empirical advantage of our technique compared to various bounded online algorithms, both in terms of memory and accuracy. Keywords: online learning, kernel methods, support vector machines, bounded support set
5 0.56001776 52 jmlr-2009-Margin-based Ranking and an Equivalence between AdaBoost and RankBoost
Author: Cynthia Rudin, Robert E. Schapire
Abstract: We study boosting algorithms for learning to rank. We give a general margin-based bound for ranking based on covering numbers for the hypothesis space. Our bound suggests that algorithms that maximize the ranking margin will generalize well. We then describe a new algorithm, smooth margin ranking, that precisely converges to a maximum ranking-margin solution. The algorithm is a modification of RankBoost, analogous to “approximate coordinate ascent boosting.” Finally, we prove that AdaBoost and RankBoost are equally good for the problems of bipartite ranking and classification in terms of their asymptotic behavior on the training set. Under natural conditions, AdaBoost achieves an area under the ROC curve that is equally as good as RankBoost’s; furthermore, RankBoost, when given a specific intercept, achieves a misclassification error that is as good as AdaBoost’s. This may help to explain the empirical observations made by Cortes and Mohri, and Caruana and Niculescu-Mizil, about the excellent performance of AdaBoost as a bipartite ranking algorithm, as measured by the area under the ROC curve. Keywords: ranking, RankBoost, generalization bounds, AdaBoost, area under the ROC curve
6 0.55255103 22 jmlr-2009-Deterministic Error Analysis of Support Vector Regression and Related Regularized Kernel Methods
7 0.55045974 93 jmlr-2009-The Hidden Life of Latent Variables: Bayesian Learning with Mixed Graph Models
8 0.54344904 85 jmlr-2009-Settable Systems: An Extension of Pearl's Causal Model with Optimization, Equilibrium, and Learning
10 0.54114819 36 jmlr-2009-Fourier Theoretic Probabilistic Inference over Permutations
11 0.5409494 32 jmlr-2009-Exploiting Product Distributions to Identify Relevant Variables of Correlation Immune Functions
12 0.54038966 95 jmlr-2009-The P-Norm Push: A Simple Convex Ranking Algorithm that Concentrates at the Top of the List
13 0.54009801 82 jmlr-2009-Robustness and Regularization of Support Vector Machines
14 0.53921008 27 jmlr-2009-Efficient Online and Batch Learning Using Forward Backward Splitting
15 0.53887957 18 jmlr-2009-Consistency and Localizability
16 0.53561151 47 jmlr-2009-Learning Linear Ranking Functions for Beam Search with Application to Planning
17 0.53078181 94 jmlr-2009-The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs
18 0.5307188 37 jmlr-2009-Generalization Bounds for Ranking Algorithms via Algorithmic Stability
19 0.53002441 2 jmlr-2009-A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization
20 0.52905834 9 jmlr-2009-Analysis of Perceptron-Based Active Learning