jmlr jmlr2007 jmlr2007-71 knowledge-graph by maker-knowledge-mining

71 jmlr-2007-Refinable Kernels


Source: pdf

Author: Yuesheng Xu, Haizhang Zhang

Abstract: Motivated by mathematical learning from training data, we introduce the notion of refinable kernels. Various characterizations of refinable kernels are presented. The concept of refinable kernels leads to the introduction of wavelet-like reproducing kernels. We also investigate a refinable kernel that forms a Riesz basis. In particular, we characterize refinable translation invariant kernels, and refinable kernels defined by refinable functions. This study leads to multiresolution analysis of reproducing kernel Hilbert spaces. Keywords: refinable kernels, refinable feature maps, wavelet-like reproducing kernels, dual kernels, learning with kernels, reproducing kernel Hilbert spaces, Riesz bases

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 The concept of refinable kernels leads to the introduction of wavelet-like reproducing kernels. [sent-5, score-0.113]

2 In particular, we characterize refinable translation invariant kernels, and refinable kernels defined by refinable functions. [sent-7, score-0.126]

3 This study leads to multiresolution analysis of reproducing kernel Hilbert spaces. [sent-8, score-0.123]

4 Keywords: refinable kernels, refinable feature maps, wavelet-like reproducing kernels, dual kernels, learning with kernels, reproducing kernel Hilbert spaces, Riesz bases 1. [sent-9, score-0.149]

5 Introduction The main purpose of this paper is to introduce the notion of refinable kernels, wavelet-like reproducing kernels and multiresolution analysis of a reproducing kernel Hilbert space. [sent-10, score-0.236]

6 , 1991; Daubechies, 1992), multiresolution analysis of L2 (R) (Mallat, 1989; Meyer, 1992) and kernels constructed by wavelet functions (Amato et al. [sent-12, score-0.132]

7 A kernel K on X corresponds to a Hilbert space HK := span {K(·, y) : y ∈ X} (3) of functions on X with an inner product determined by (K(·, y), K(·, x))HK = K(x, y), x, y ∈ X. [sent-32, score-0.104]

8 (4) The space HK is a reproducing kernel Hilbert space (RKHS), that is, point evaluations are continuous linear functionals on HK (Aronszajn, 1950). [sent-33, score-0.102]
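As a concrete illustration of the inner product (4) and of the continuity of point evaluations, the sketch below works on a finite span of kernel sections; the Gaussian kernel, the sample points, and all variable names are assumptions made purely for illustration and are not taken from the paper.

```python
import numpy as np

# Minimal numerical sketch (not from the paper): for f = sum_i c_i K(., y_i),
# the RKHS norm is ||f||_HK^2 = c^T G c with Gram matrix G_ij = K(y_i, y_j),
# and continuity of point evaluation gives |f(x)| <= ||f||_HK * sqrt(K(x, x)).
def kernel(x, y, sigma=1.0):
    return np.exp(-(x - y) ** 2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
y = rng.uniform(-1.0, 1.0, size=6)                 # centers y_i
c = rng.normal(size=6)                             # coefficients c_i
G = kernel(y[:, None], y[None, :])                 # Gram matrix K(y_i, y_j)

f_norm = np.sqrt(c @ G @ c)                        # ||f||_HK
for x in np.linspace(-1.0, 1.0, 11):
    f_x = np.sum(c * kernel(x, y))                 # f(x)
    assert abs(f_x) <= f_norm * np.sqrt(kernel(x, x)) + 1e-12
```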

9 (5) Due to Equation (5), K is often interpreted as the reproducing kernel of HK. [sent-35, score-0.09]

10 A RKHS has exactly one reproducing kernel (Aronszajn, 1950). [sent-36, score-0.09]

11 Specifically, we demand kernels K having the feature that there is a cheap way of updating K to a new kernel K′ such that HK ⊆ HK′. [sent-50, score-0.125]

12 We shall study characterizations of a refinable kernel, fundamental properties of refinable kernels and wavelet-like reproducing kernels, and multiscale structures of a RKHS induced by refinable kernels. [sent-58, score-0.172]

13 It is important to note that the concept of wavelet-like reproducing kernels differs from that of “wavelet kernels” in Amato et al. [sent-59, score-0.113]

14 The former means the kernels defined by the difference of kernels at two consecutive scales, while the latter means the kernels defined by a linear combination of dilations and translations of wavelet functions. [sent-62, score-0.247]

15 Section 3 is devoted to wavelet-like reproducing kernels and a multiscale decomposition of the RKHS of a refinable kernel. [sent-65, score-0.133]

16 In Section 4, we investigate refinable kernels of a Riesz type and we also introduce the notion of a multiresolution analysis for a RKHS. [sent-66, score-0.107]

17 As concrete examples of refinable kernels, we formulate in Sections 5 and 6, respectively, conditions for translation invariant kernels and kernels defined by a refinable function to be refinable. [sent-67, score-0.2]

18 Theorem 1 If K is a kernel on X, then for each n ∈ Z, the RKHS of the kernel Kn is HKn = {f ◦ γn : f ∈ HK} with inner product (f, g)HKn := λ−n (f ◦ γ−n, g ◦ γ−n)HK, f, g ∈ HKn. (9) [sent-80, score-0.119]

19 This implies that Kn is the reproducing kernel for Hn . [sent-87, score-0.09]
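A short worked check of this claim, assuming (consistently with the update rule K1 := λK(2·, 2·) quoted in sentence 98 below) that Kn := λ^n K(γn(·), γn(·)) and that γ−n is the inverse of γn; under these assumptions the inner product (9) reproduces point evaluation on HKn:

```latex
% Assumes K_n(x,y) = \lambda^{n} K(\gamma_n(x), \gamma_n(y)) and \gamma_{-n} = \gamma_n^{-1}.
\begin{aligned}
(f\circ\gamma_n,\;K_n(\cdot,y))_{H_{K_n}}
  &= \lambda^{-n}\bigl(f\circ\gamma_n\circ\gamma_{-n},\;K_n(\cdot,y)\circ\gamma_{-n}\bigr)_{H_K}
   && \text{by (9)}\\
  &= \lambda^{-n}\bigl(f,\;\lambda^{n}K(\cdot,\gamma_n(y))\bigr)_{H_K}
   = f(\gamma_n(y)) = (f\circ\gamma_n)(y)
   && \text{by (5),}
\end{aligned}
```

so Kn(·, y) indeed acts as the reproducing kernel on HKn = {f ◦ γn : f ∈ HK}.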

20 A direct consequence of Theorem 1 is that if K is a γ-refinable kernel then for each n ∈ Z, the kernel Kn is γ-refinable. [sent-89, score-0.102]

21 Finally, we verify by (12) for each f , g ∈ HK that ( f ◦ γ−1 , g ◦ γ−1 )HK = ( f−1 , g−1 )HK = lim ( fn ◦ γ−1 , gn ◦ γ−1 )HK = λ lim ( fn , gn )HK = λ( f , g)HK . [sent-121, score-0.13]

22 Another characterization of γ-refinable kernels K is in terms of a feature map for K. [sent-124, score-0.11]

23 To state the result, we denote by Φ(X) the image of X under Φ, by span Φ(X) the closure of the linear span of Φ(X) in W, and by PΦ the orthogonal projection from W onto span Φ(X). [sent-129, score-0.122]

24 In the application of Lemma 5, it is always convenient to assume that there holds span Φ(X) = W (18) since otherwise W can be replaced by span Φ(X). [sent-137, score-0.085]

25 Recall that a linear operator A on W is isometric if for all u ∈ W, ‖Au‖W = ‖u‖W. [sent-142, score-0.088]

26 One can see that A is isometric if and only if A∗ A is equal to the identity operator on W , where A∗ denotes the adjoint operator of A. [sent-143, score-0.16]

27 2088 R EFINABLE K ERNELS Proof Suppose that Φ is γ-refinable, that is, it satisfies (20) for some bounded operator T on W , and suppose that T ∗ is isometric. [sent-147, score-0.084]

28 Let A denote the map u → vu and observe that A is a linear operator on W . [sent-161, score-0.097]

29 Then HK−1 is a proper subspace of HK if and only if the operator T in (20) is not injective. [sent-180, score-0.096]

30 By a property of reproducing kernels (see, Aronszajn, 1950, page 345), the sum of Kn and the kernel of Wn is equal to Kn+1 . [sent-199, score-0.164]

31 Therefore, G is the kernel of W0 , and it is observed 2090 R EFINABLE K ERNELS by (8) that the kernel of Wn is Gn . [sent-200, score-0.102]

32 We call Gn the wavelet-like reproducing kernels and, in particular, G the initial wavelet-like kernel. [sent-205, score-0.113]

33 It is clear that the initial wavelet-like kernel G is nontrivial if and only if HK−1 is a proper subspace of HK . [sent-206, score-0.126]

34 (30) One should notice the difference between wavelet-like reproducing kernels that we introduce here and the “wavelet kernels” studied in Amato et al. [sent-208, score-0.113]

35 The latter are a class of Hilbert-Schmidt kernels defined as a superposition of dilations and translations of a wavelet function. [sent-211, score-0.099]

36 Examples of nontrivial refinable kernels on Rd will be given in Sections 5 and 6. [sent-257, score-0.11]

37 This leads to consideration of requiring KX to be a frame or a Riesz basis for HK . [sent-261, score-0.118]

38 A family of elements {ϕj : j ∈ J} in a Hilbert space W forms a frame if there exist 0 < α ≤ β < ∞ such that for all f ∈ W, α‖f‖²W ≤ ∑_{j∈J} |(f, ϕj)W|² ≤ β‖f‖²W. (36) [sent-265, score-0.084]

39 The constants α, β are called the frame bounds for {ϕj : j ∈ J}. [sent-266, score-0.084]

40 We call a frame {ϕ j : j ∈ J} a tight frame when its two frame bounds are equal, that is, α = β. [sent-267, score-0.252]

41 A Riesz basis {ϕ j : j ∈ J} is equivalent to an orthonormal basis {ψ j : j ∈ J} for W , namely, there exists a bounded linear operator L on W having a bounded inverse such that L(ψ j ) = ϕ j , j ∈ J. [sent-269, score-0.14]

42 An arbitrary element in W can be expressed as a linear combination of a frame {ϕ j : j ∈ J} for W . [sent-270, score-0.084]

43 Define the frame operator F from W to ℓ²(J) by setting for all f ∈ W, (F f)j := (f, ϕj)W, for all j ∈ J. [sent-272, score-0.141]

44 Then F∗F is a bounded positive self-adjoint operator on W with a bounded inverse, and {ϕ̃j := (F∗F)−1 ϕj : j ∈ J} is a frame for W. [sent-273, score-0.141]

45 We shall refer to it as the dual frame of {ϕj : j ∈ J} since we have for all f ∈ W, f = ∑_{j∈J} (f, ϕ̃j)W ϕj = ∑_{j∈J} (f, ϕj)W ϕ̃j. [sent-274, score-0.106]
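A minimal finite-dimensional sketch of the frame operator and of the dual-frame reconstruction identity above; the ambient space, the random frame vectors, and all names are assumptions introduced only for this illustration.

```python
import numpy as np

# Finite-dimensional sketch (not from the paper): columns of Phi are a frame
# {phi_j} of R^d, S = Phi @ Phi.T plays the role of F*F, the canonical dual
# frame is {S^{-1} phi_j}, and every f is reconstructed by both expansions above.
rng = np.random.default_rng(1)
d, m = 3, 7
Phi = rng.normal(size=(d, m))            # m frame vectors in R^d (full rank a.s.)

S = Phi @ Phi.T                          # frame operator F*F on R^d
Dual = np.linalg.solve(S, Phi)           # dual frame vectors (F*F)^{-1} phi_j, as columns

f = rng.normal(size=d)
recon1 = Phi @ (Dual.T @ f)              # sum_j (f, dual_phi_j) phi_j
recon2 = Dual @ (Phi.T @ f)              # sum_j (f, phi_j) dual_phi_j
assert np.allclose(recon1, f) and np.allclose(recon2, f)
```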

46 For a kernel K on X, we are interested in the conditions for which the set KZ := {K(·, z j ) : j ∈ J} is a Riesz basis for HK . [sent-278, score-0.085]

47 Proposition 11 The family KZ is a Riesz basis for HK with frame bounds 0 < α ≤ β < ∞ if and only if for every finite subset X′ ⊆ Z, K[X′] is nonsingular, and for all f ∈ HK, α‖f‖²K ≤ ∑_{j∈J} |f(zj)|² ≤ β‖f‖²K. [sent-280, score-0.118]

48 If X is a topological space, we call a kernel K on X a universal kernel if for every compact subset X′ ⊆ X, the linear span of {K(·, y) : y ∈ X′} is dense in C(X′), the Banach space of continuous functions on X′. [sent-282, score-0.173]

49 Various characterizations of universal kernels are studied in Micchelli et al. [sent-283, score-0.091]

50 The third characterization is in terms of a feature map for the kernel K. [sent-296, score-0.087]

51 Then KZ is a Riesz basis for HK if and only if Φ(Z ) is a Riesz basis for span Φ(X). [sent-298, score-0.104]

52 Since the operator Γ defined by (19) is an isomorphism from HK onto W, KZ is a Riesz basis if and only if Γ(KZ) is a Riesz basis for Γ(HK) = W. [sent-300, score-0.125]

53 Proposition 15 Suppose that KZ is a Riesz basis for HK with frame bounds 0 < α ≤ β < ∞. [sent-312, score-0.118]

54 Then for each n ∈ Z, {φn, j : j ∈ J} is a Riesz basis for HKn with the same frame bounds α, β. [sent-313, score-0.118]

55 This assumption implies that the linear operator S on HK defined by S f := ∑_{j∈J} f(zj) K(·, zj), f ∈ HK, (40) is bounded positive self-adjoint and so is its inverse operator S−1 on HK (see, for example, Daubechies, 1992, pages 58–59). [sent-315, score-0.114]
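In coordinates on the span of {K(·, zj)}, the operator (40) acts on coefficient vectors through the Gram matrix, which makes its self-adjointness and positivity easy to check numerically; the Gaussian kernel and the chosen points below are assumptions for illustration only.

```python
import numpy as np

# Coordinate sketch of the operator S in (40): writing f = sum_i c_i K(., z_i)
# gives f(z_j) = (G c)_j with G_ij = K(z_i, z_j), so S f has coefficient vector G c.
# With (u, v)_HK = d^T G c for u ~ c, v ~ d, S is self-adjoint and positive.
def K(x, y, sigma=1.0):
    return np.exp(-(x - y) ** 2 / (2 * sigma ** 2))

z = np.linspace(-1.0, 1.0, 5)
G = K(z[:, None], z[None, :])

c = np.array([0.5, -1.0, 0.2, 0.0, 1.5])    # coefficients of f
d = np.array([1.0, 0.3, -0.7, 0.1, 0.4])    # coefficients of g
Sc = G @ c                                   # coefficients of S f

lhs = d @ G @ Sc                             # (S f, g)_HK
rhs = (G @ d) @ G @ c                        # (f, S g)_HK
assert np.isclose(lhs, rhs)                  # self-adjointness
assert c @ G @ Sc >= 0                       # positivity
```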

56 2095 X U AND Z HANG Proposition 16 Suppose that KZ is a Riesz basis for HK and let S be the operator on HK defined by (40). [sent-318, score-0.091]

57 Theorem 17 If KZ is a Riesz basis for HK, then for each n ∈ Z, the dual Riesz basis {φ̃n,j : j ∈ J} of {φn,j : j ∈ J} for HKn has the form φ̃n,j = ∑_{k∈J} Λ̃k,j φn,k, j ∈ J, (44) and S−1 f = ∑_{j∈J} ( ∑_{k∈J} Λ̃j,k f(zk) ) ∑_{l∈J} Λ̃l,j K(·, zl), f ∈ HK. [sent-328, score-0.09]

58 By (10) and (5), we have for all j, k ∈ J that (φ̃n,j, φn,k)HKn = (φ̃0,j, K(·, zk))HK = φ̃0,j(zk) = ∑_{l∈J} Λ̃l,j K(zk, zl) = ∑_{l∈J} Λk,l Λ̃l,j = δj,k, which shows that {φ̃n,j : j ∈ J} is the dual Riesz basis of {φn,j : j ∈ J} in HKn. [sent-333, score-0.088]

59 (48) The Riesz basis provides a characterization of γ-refinable kernels in terms of the sampling operator, which we present next. [sent-345, score-0.124]

60 We get by (5) for each x ∈ X that g(x) = (g, K(·, x))HK = lim_{n→∞} (fn, K(·, x))HK = lim_{n→∞} fn(x) = lim_{n→∞} (fn, K−1(·, x))HK−1 = f(x). [sent-357, score-0.144]

61 Therefore, f ∈ HK, and by (53) (f, f)HK−1 = lim_{n→∞} (fn, fn)HK−1 = lim_{n→∞} (fn, fn)HK = (g, g)HK = (f, f)HK. [sent-358, score-0.15]

62 In the rest of this section, we construct a frame for the RKHS HG of the wavelet-like kernel G := K1 − K. [sent-365, score-0.135]

63 Lemma 19 Suppose that K is γ-refinable and KZ is a Riesz basis for HK with frame bounds 0 < α ≤ β < ∞. [sent-366, score-0.118]

64 Then ψ0, j := λ−1/2 G(·, γ−1 (z j )), j ∈ J form a frame for HG with the same frame bounds α, β. [sent-367, score-0.168]

65 Applying Proposition 15 and ‖f‖HG = ‖f‖HK yields that {ψ0,j : j ∈ J} is a frame for HG with frame bounds α, β. [sent-371, score-0.168]

66 The frame for the RKHS HG is now translated to a frame for the RKHSs HGn. [sent-372, score-0.168]

67 Proposition 20 Suppose that K is γ-refinable and KZ is a Riesz basis for HK with frame bounds 0 < α ≤ β < ∞. [sent-373, score-0.118]

68 Then for each n ∈ Z, ψn, j := λn/2 ψ0, j ◦ γn , j ∈ J form a frame for HGn with the same frame bounds α, β. [sent-374, score-0.168]

69 For each n ∈ Z, the functions ψ̃n,j := φ̃n+1,j − ∑_{k∈J} Ck,j φ̃n,k, j ∈ J, (58) constitute a frame for HGn and have the representation ψ̃n,j = ∑_{k∈J} D̃j,k φ̃n+1,k. [sent-384, score-0.084]

70 Arguments similar to those in the proof of Proposition 20 yield that ψ̃n,j, j ∈ J, form a frame for HGn with the same frame bounds as those of {φ̃0,j : j ∈ J} for HK. [sent-390, score-0.188]

71 The RKHS HK has a multiresolution analysis if and only if Φ is refinable, that is, it satisfies (20) for some bounded linear operator T on W whose adjoint T ∗ is isometric, T has the property (32) and there exists a countable subset Z of X such that Φ(Z ) is a Riesz basis for W . [sent-401, score-0.139]

72 Refinable Translation Invariant Kernels. In this section, we consider refinable kernels, specializing our input space to Rd, d ∈ N, the mapping γ to the dilation mapping x → 2x in Rd, and the kernels to translation invariant kernels K on Rd, that is, for all x, y, a ∈ Rd, K(x − a, y − a) = K(x, y). [sent-423, score-0.274]

73 In other words, the main purpose of this section is to characterize refinable translation invariant kernels on Rd . [sent-424, score-0.126]

74 Specifically, we shall characterize nonnegative k̂ ∈ L1(Rd) for which the kernel K given by K(x, y) = k(x − y) = ∫_{Rd} e^{i(x−y,t)} k̂(t) dt, x, y ∈ Rd, (67) is refinable, that is, there holds HK−1 ⊆ HK. [sent-434, score-0.1]

75 Theorem 23 Let K be the translation invariant kernel given in (67) through a nonnegative k̂ ∈ L1(Rd) and let Ω be defined by (72). [sent-447, score-0.117]

76 Corollary 24 Let K be a refinable translation invariant kernel defined by (67). [sent-458, score-0.103]

77 Proof Since the Gaussian kernels can be represented as Gσ(x, y) = (1/(2π)^d) ∫_{Rd} (π/σ)^{d/2} e^{i(x−y,t)} e^{−‖t‖²/(4σ)} dt, and e^{−‖·‖²/(4σ)}, supported on the whole Rd, is clearly continuous at t = 0, by Corollary 24, they are not refinable. [sent-471, score-0.086]
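A quick numerical confirmation of this Fourier representation in dimension d = 1, assuming the parametrization Gσ(x, y) = exp(−σ‖x − y‖²) implied by the displayed transform (an assumption, since the extracted text does not spell it out):

```python
import numpy as np

# d = 1 check: exp(-sigma*u^2) = (1/(2*pi)) * integral over R of
# (pi/sigma)^(1/2) * exp(i*u*t) * exp(-t^2/(4*sigma)) dt, with u = x - y.
sigma, u = 0.7, 1.3
t = np.linspace(-50.0, 50.0, 200001)
dt = t[1] - t[0]
integrand = (np.pi / sigma) ** 0.5 * np.exp(1j * u * t) * np.exp(-t ** 2 / (4.0 * sigma))
value = (integrand.sum() * dt).real / (2.0 * np.pi)   # simple Riemann sum
assert np.isclose(value, np.exp(-sigma * u ** 2), atol=1e-6)
```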

78 We now present a nontrivial refinable translation invariant kernel. [sent-472, score-0.088]

79 We next characterize when HK−1 is a proper subspace of HK if K is a refinable translation invariant kernel. [sent-477, score-0.091]

80 We next present a necessary and sufficient condition for KZd to be a Riesz basis for HK if K is a translation invariant kernel defined by (67) through a nonnegative k̂ ∈ L1(Rd). [sent-517, score-0.151]

81 Theorem 30 Let K be a translation invariant kernel defined by (67) through a nonnegative k̂ ∈ L1(Rd) and Ω be defined by (72). [sent-518, score-0.117]

82 In the next theorem, we construct k̂ that satisfy conditions (73), (74), (84) and (85) to obtain a refinable kernel K on Rd with KZd being a Riesz basis for HK. [sent-529, score-0.085]

83 Since (84) and (89) are true, we obtain by direct computation for all m, n ∈ Zd that (1/(η(2π)^d)) (K(·, n), K(·, m))HK = (1/(2π)^d) ∫_Ω e^{i(m−n,t)} dt = (1/(2π)^d) ∫_{Rd} e^{i(m−n,t)} χΩ(t) dt = (1/(2π)^d) ∫_{[0,2π]^d} e^{i(m−n,t)} ∑_{l∈Zd} χΩ(t + 2lπ) dt = (1/(2π)^d) ∫_{[0,2π]^d} e^{i(m−n,t)} dt = δm,n. [sent-545, score-0.092]

84 By noting that (84) can be interpreted as saying that Ω + 2nπ, n ∈ Zd, form a tiling of Rd, we construct examples of refinable kernels K such that KZd are Riesz bases for HK. [sent-547, score-0.094]
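For intuition, the simplest one-dimensional candidate Ω = [−π, π) tiles R under 2π-translates, which is exactly the covering identity behind this interpretation of (84); whether this particular Ω also satisfies the remaining conditions of the theorem is not claimed here, and the snippet below only checks the tiling.

```python
import numpy as np

# Check that the 2*pi-translates of Omega = [-pi, pi) cover R exactly once,
# i.e. sum_l chi_Omega(t + 2*l*pi) = 1 for every t (here sampled on [-20, 20]).
chi = lambda t: ((t >= -np.pi) & (t < np.pi)).astype(float)
t = np.linspace(-20.0, 20.0, 100001)
cover = sum(chi(t + 2.0 * np.pi * l) for l in range(-5, 6))
assert np.allclose(cover, 1.0)
```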

85 Then the kernel K defined by (67) with k̂ having the form (89) is a refinable kernel such that KZd is a Riesz basis for HK. [sent-551, score-0.136]

86 Then the kernel K defined by (67) with k̂ having the form (89) with η = 1 is a refinable kernel such that KZd is a Riesz basis for HK. [sent-556, score-0.136]

87 Let ϕ be a compactly supported continuous function on Rd that is refinable, namely, there exists h := [hn : n ∈ Zd] such that ϕ = ∑_{n∈Zd} hn ϕ(2· − n). [sent-564, score-0.144]
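A standard example of such a refinable ϕ, chosen here purely for illustration (the extracted text does not single it out), is the piecewise-linear hat function, whose mask h has three nonzero entries; the snippet verifies its refinement equation numerically.

```python
import numpy as np

# The hat function phi(x) = max(1 - |x - 1|, 0), supported on [0, 2], satisfies
# phi(x) = 0.5*phi(2x) + 1.0*phi(2x - 1) + 0.5*phi(2x - 2),
# i.e. the mask h = [0.5, 1.0, 0.5] supported on n in {0, 1, 2}.
def phi(x):
    return np.maximum(1.0 - np.abs(x - 1.0), 0.0)

h = {0: 0.5, 1: 1.0, 2: 0.5}
x = np.linspace(-1.0, 3.0, 4001)
lhs = phi(x)
rhs = sum(hn * phi(2.0 * x - n) for n, hn in h.items())
assert np.allclose(lhs, rhs)
```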

88 Kernels of the form ∑_{n∈Zd} ψ(x − n)ψ(y − n) (96) constructed by a refinable function ψ were considered in Opfer (2006), and kernels defined as a superposition of frame elements in RKHS were discussed in Gao et al. [sent-570, score-0.158]

89 The next proposition shows that kernels in the form (95) are more general than those in the degenerate form (96) and in general cannot be written in the degenerate form. [sent-576, score-0.11]

90 Proof Since the operator A satisfies (93), there exists a bounded positive self-adjoint operator A1/2 on ℓ²(Zd) such that A1/2 A1/2 = A (see Conway, 1990, page 240). [sent-598, score-0.114]

91 Proof By Theorem 7, HK−1 is a proper subspace of HK if and only if the null space N(T) of the operator T defined by (101) contains nonzero elements in ℓ²(Zd), which is equivalent to N(H) ≠ {0}. [sent-618, score-0.096]

92 The last proposition provides a class of refinable kernels given by (95) that never degenerate to the form (96). [sent-644, score-0.11]

93 If there exists at least one n ∈ Zd such that the real part Re (hn ) of hn is not zero then A satisfies (97) if and only if a = 0. [sent-646, score-0.113]

94 Then (80) holds true if and only if hn ≠ 0 for at least one n ∈ Zd \ 2Zd. [sent-659, score-0.126]

95 By condition (32) in Theorem 9, (80) holds true if and only if lim_{j→∞} ‖T^j c‖ℓ²(Zd) = 0 for all c ∈ ℓ²(Zd), (108) where T is the operator defined by (101). [sent-663, score-0.091]

96 (112) If hn = 0 for each n ∈ Zd \ 2Zd then m0 (· + πν j ) = m0 for all j ∈ N2d . [sent-669, score-0.113]

97 A Discussion of Applications and Conclusions For the completeness of the paper, in this section we discuss how refinable kernels can be used to efficiently update kernels for learning from increasing training data. [sent-696, score-0.148]

98 If K is refinable on X = R then we update the kernel K to a new kernel K1 := λK(2·, 2·). [sent-716, score-0.102]
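The update itself is cheap to express in code; the sketch below only implements K1 := λK(2·, 2·) and does not decide whether a given kernel is refinable (that is exactly what the paper's characterizations are for). The base kernel shown is a hypothetical stand-in used only for illustration.

```python
import numpy as np

# Sketch of the update rule quoted above: K1(x, y) := lambda_ * K(2x, 2y).
# This only performs the update; refinability of K (so that H_K is contained
# in H_K1) is determined by the paper's characterizations, not by this code.
def refine(K, lambda_):
    def K1(x, y):
        return lambda_ * K(2.0 * x, 2.0 * y)
    return K1

def base_kernel(x, y):
    # Hypothetical base kernel on [0, inf), used only for illustration.
    return np.minimum(x, y)

K1 = refine(base_kernel, lambda_=2.0)
print(base_kernel(0.3, 0.5), K1(0.3, 0.5))   # 0.3 and 2*min(0.6, 1.0) = 1.2
```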

99 Tight frame expansions of multiscale reproducing kernels in Sobolev spaces. [sent-979, score-0.217]

100 Support vector machines, reproducing kernel Hilbert spaces and the randomized GACV. [sent-1060, score-0.09]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('hk', 0.544), ('nable', 0.51), ('zd', 0.451), ('hkn', 0.217), ('riesz', 0.203), ('rd', 0.121), ('hgn', 0.117), ('hn', 0.113), ('frame', 0.084), ('kz', 0.084), ('efinable', 0.081), ('kdt', 0.077), ('kernels', 0.074), ('ernels', 0.069), ('kzd', 0.068), ('hang', 0.065), ('operator', 0.057), ('kn', 0.055), ('re', 0.053), ('kernel', 0.051), ('dt', 0.046), ('rkhs', 0.045), ('rakotomamonjy', 0.041), ('reproducing', 0.039), ('bm', 0.039), ('micchelli', 0.038), ('nontrivial', 0.036), ('proposition', 0.036), ('span', 0.036), ('daubechies', 0.034), ('basis', 0.034), ('multiresolution', 0.033), ('zk', 0.032), ('hg', 0.032), ('ei', 0.032), ('smale', 0.031), ('isometric', 0.031), ('translation', 0.03), ('hilbert', 0.029), ('fn', 0.027), ('suppose', 0.027), ('wavelet', 0.025), ('ah', 0.025), ('theorem', 0.025), ('dense', 0.023), ('walder', 0.023), ('invariant', 0.022), ('shall', 0.022), ('zl', 0.022), ('lim', 0.021), ('vu', 0.02), ('multiscale', 0.02), ('map', 0.02), ('subspace', 0.02), ('bases', 0.02), ('proof', 0.02), ('lemma', 0.02), ('lebesgue', 0.019), ('conversely', 0.019), ('ensures', 0.019), ('compactly', 0.019), ('proper', 0.019), ('amato', 0.018), ('augmentation', 0.018), ('ezd', 0.018), ('nability', 0.018), ('syracuse', 0.018), ('inclusion', 0.018), ('corollary', 0.017), ('characterizations', 0.017), ('mallat', 0.017), ('gn', 0.017), ('inner', 0.017), ('vn', 0.016), ('fourier', 0.016), ('characterization', 0.016), ('ha', 0.015), ('orthonormal', 0.015), ('nm', 0.015), ('adjoint', 0.015), ('predictor', 0.014), ('reconstruction', 0.014), ('pn', 0.014), ('nonnegative', 0.014), ('orthogonal', 0.014), ('cavaretta', 0.014), ('conway', 0.014), ('haizhang', 0.014), ('opfer', 0.014), ('yuesheng', 0.014), ('holds', 0.013), ('satis', 0.013), ('xu', 0.013), ('zhou', 0.013), ('sch', 0.013), ('chen', 0.013), ('cucker', 0.013), ('canu', 0.013), ('equation', 0.012), ('continuous', 0.012)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000005 71 jmlr-2007-Refinable Kernels

Author: Yuesheng Xu, Haizhang Zhang

Abstract: Motivated by mathematical learning from training data, we introduce the notion of refinable kernels. Various characterizations of refinable kernels are presented. The concept of refinable kernels leads to the introduction of wavelet-like reproducing kernels. We also investigate a refinable kernel that forms a Riesz basis. In particular, we characterize refinable translation invariant kernels, and refinable kernels defined by refinable functions. This study leads to multiresolution analysis of reproducing kernel Hilbert spaces. Keywords: refinable kernels, refinable feature maps, wavelet-like reproducing kernels, dual kernels, learning with kernels, reproducing kernel Hilbert spaces, Riesz bases

2 0.26561344 45 jmlr-2007-Learnability of Gaussians with Flexible Variances

Author: Yiming Ying, Ding-Xuan Zhou

Abstract: Gaussian kernels with flexible variances provide a rich family of Mercer kernels for learning algorithms. We show that the union of the unit balls of reproducing kernel Hilbert spaces generated by Gaussian kernels with flexible variances is a uniform Glivenko-Cantelli (uGC) class. This result confirms a conjecture concerning learnability of Gaussian kernels and verifies the uniform convergence of many learning algorithms involving Gaussians with changing variances. Rademacher averages and empirical covering numbers are used to estimate sample errors of multi-kernel regularization schemes associated with general loss functions. It is then shown that the regularization error associated with the least square loss and the Gaussian kernels can be greatly improved when flexible variances are allowed. Finally, for regularization schemes generated by Gaussian kernels with flexible variances we present explicit learning rates for regression with least square loss and classification with hinge loss. Keywords: Gaussian kernel, flexible variances, learning theory, Glivenko-Cantelli class, regularization scheme, empirical covering number

3 0.20223817 18 jmlr-2007-Characterizing the Function Space for Bayesian Kernel Models

Author: Natesh S. Pillai, Qiang Wu, Feng Liang, Sayan Mukherjee, Robert L. Wolpert

Abstract: Kernel methods have been very popular in the machine learning literature in the last ten years, mainly in the context of Tikhonov regularization algorithms. In this paper we study a coherent Bayesian kernel model based on an integral operator defined as the convolution of a kernel with a signed measure. Priors on the random signed measures correspond to prior distributions on the functions mapped by the integral operator. We study several classes of signed measures and their image mapped by the integral operator. In particular, we identify a general class of measures whose image is dense in the reproducing kernel Hilbert space (RKHS) induced by the kernel. A consequence of this result is a function theoretic foundation for using non-parametric prior specifications in Bayesian modeling, such as Gaussian process and Dirichlet process prior distributions. We discuss the construction of priors on spaces of signed measures using Gaussian and Lévy processes, with the Dirichlet processes being a special case of the latter. Computational issues involved with sampling from the posterior distribution are outlined for a univariate regression and a high dimensional classification problem. Keywords: reproducing kernel Hilbert space, non-parametric Bayesian methods, Lévy processes, Dirichlet processes, integral operator, Gaussian processes

4 0.060328469 36 jmlr-2007-Generalization Error Bounds in Semi-supervised Classification Under the Cluster Assumption

Author: Philippe Rigollet

Abstract: We consider semi-supervised classification when part of the available data is unlabeled. These unlabeled data can be useful for the classification problem when we make an assumption relating the behavior of the regression function to that of the marginal distribution. Seeger (2000) proposed the well-known cluster assumption as a reasonable one. We propose a mathematical formulation of this assumption and a method based on density level sets estimation that takes advantage of it to achieve fast rates of convergence both in the number of unlabeled examples and the number of labeled examples. Keywords: semi-supervised learning, statistical learning theory, classification, cluster assumption, generalization bounds

5 0.046992745 46 jmlr-2007-Learning Equivariant Functions with Matrix Valued Kernels

Author: Marco Reisert, Hans Burkhardt

Abstract: This paper presents a new class of matrix valued kernels that are ideally suited to learn vector valued equivariant functions. Matrix valued kernels are a natural generalization of the common notion of a kernel. We set the theoretical foundations of so called equivariant matrix valued kernels. We work out several properties of equivariant kernels, we give an interpretation of their behavior and show relations to scalar kernels. The notion of (ir)reducibility of group representations is transferred into the framework of matrix valued kernels. At the end, two exemplary applications are demonstrated. We design a non-linear rotation and translation equivariant filter for 2D-images and propose an invariant object detector based on the generalized Hough transform. Keywords: kernel methods, matrix kernels, equivariance, group integration, representation theory, Hough transform, signal processing, Volterra series

6 0.045500968 38 jmlr-2007-Graph Laplacians and their Convergence on Random Neighborhood Graphs     (Special Topic on the Conference on Learning Theory 2005)

7 0.043245897 90 jmlr-2007-Value Regularization and Fenchel Duality

8 0.039882269 9 jmlr-2007-AdaBoost is Consistent

9 0.031797666 42 jmlr-2007-Infinitely Imbalanced Logistic Regression

10 0.031514596 89 jmlr-2007-VC Theory of Large Margin Multi-Category Classifiers     (Special Topic on Model Selection)

11 0.029898657 78 jmlr-2007-Statistical Consistency of Kernel Canonical Correlation Analysis

12 0.026193403 61 jmlr-2007-On the Consistency of Multiclass Classification Methods     (Special Topic on the Conference on Learning Theory 2005)

13 0.021865707 20 jmlr-2007-Combining PAC-Bayesian and Generic Chaining Bounds

14 0.020358929 69 jmlr-2007-Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes

15 0.019201368 62 jmlr-2007-On the Effectiveness of Laplacian Normalization for Graph Semi-supervised Learning

16 0.016933322 50 jmlr-2007-Local Discriminant Wavelet Packet Coordinates for Face Recognition

17 0.016057698 12 jmlr-2007-Attribute-Efficient and Non-adaptive Learning of Parities and DNF Expressions     (Special Topic on the Conference on Learning Theory 2005)

18 0.015608085 63 jmlr-2007-On the Representer Theorem and Equivalent Degrees of Freedom of SVR

19 0.015264783 7 jmlr-2007-A Stochastic Algorithm for Feature Selection in Pattern Recognition

20 0.014943683 60 jmlr-2007-Nonlinear Estimators and Tail Bounds for Dimension Reduction inl1Using Cauchy Random Projections


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.157), (1, -0.357), (2, 0.318), (3, 0.149), (4, 0.03), (5, 0.084), (6, 0.263), (7, -0.086), (8, -0.223), (9, -0.049), (10, -0.137), (11, -0.026), (12, 0.067), (13, 0.027), (14, -0.079), (15, -0.023), (16, 0.071), (17, -0.051), (18, 0.054), (19, -0.039), (20, -0.163), (21, -0.074), (22, -0.106), (23, -0.027), (24, -0.007), (25, -0.015), (26, -0.017), (27, -0.006), (28, -0.015), (29, 0.04), (30, 0.119), (31, -0.011), (32, -0.075), (33, -0.072), (34, 0.049), (35, -0.028), (36, -0.058), (37, -0.014), (38, 0.005), (39, 0.021), (40, -0.054), (41, 0.024), (42, 0.007), (43, 0.026), (44, -0.026), (45, 0.007), (46, 0.05), (47, -0.003), (48, 0.036), (49, 0.05)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.98115313 71 jmlr-2007-Refinable Kernels

Author: Yuesheng Xu, Haizhang Zhang

Abstract: Motivated by mathematical learning from training data, we introduce the notion of refinable kernels. Various characterizations of refinable kernels are presented. The concept of refinable kernels leads to the introduction of wavelet-like reproducing kernels. We also investigate a refinable kernel that forms a Riesz basis. In particular, we characterize refinable translation invariant kernels, and refinable kernels defined by refinable functions. This study leads to multiresolution analysis of reproducing kernel Hilbert spaces. Keywords: refinable kernels, refinable feature maps, wavelet-like reproducing kernels, dual kernels, learning with kernels, reproducing kernel Hilbert spaces, Riesz bases

2 0.76309639 45 jmlr-2007-Learnability of Gaussians with Flexible Variances

Author: Yiming Ying, Ding-Xuan Zhou

Abstract: Gaussian kernels with flexible variances provide a rich family of Mercer kernels for learning algorithms. We show that the union of the unit balls of reproducing kernel Hilbert spaces generated by Gaussian kernels with flexible variances is a uniform Glivenko-Cantelli (uGC) class. This result confirms a conjecture concerning learnability of Gaussian kernels and verifies the uniform convergence of many learning algorithms involving Gaussians with changing variances. Rademacher averages and empirical covering numbers are used to estimate sample errors of multi-kernel regularization schemes associated with general loss functions. It is then shown that the regularization error associated with the least square loss and the Gaussian kernels can be greatly improved when flexible variances are allowed. Finally, for regularization schemes generated by Gaussian kernels with flexible variances we present explicit learning rates for regression with least square loss and classification with hinge loss. Keywords: Gaussian kernel, flexible variances, learning theory, Glivenko-Cantelli class, regularization scheme, empirical covering number

3 0.74564934 18 jmlr-2007-Characterizing the Function Space for Bayesian Kernel Models

Author: Natesh S. Pillai, Qiang Wu, Feng Liang, Sayan Mukherjee, Robert L. Wolpert

Abstract: Kernel methods have been very popular in the machine learning literature in the last ten years, mainly in the context of Tikhonov regularization algorithms. In this paper we study a coherent Bayesian kernel model based on an integral operator defined as the convolution of a kernel with a signed measure. Priors on the random signed measures correspond to prior distributions on the functions mapped by the integral operator. We study several classes of signed measures and their image mapped by the integral operator. In particular, we identify a general class of measures whose image is dense in the reproducing kernel Hilbert space (RKHS) induced by the kernel. A consequence of this result is a function theoretic foundation for using non-parametric prior specifications in Bayesian modeling, such as Gaussian process and Dirichlet process prior distributions. We discuss the construction of priors on spaces of signed measures using Gaussian and Lévy processes, with the Dirichlet processes being a special case of the latter. Computational issues involved with sampling from the posterior distribution are outlined for a univariate regression and a high dimensional classification problem. Keywords: reproducing kernel Hilbert space, non-parametric Bayesian methods, Lévy processes, Dirichlet processes, integral operator, Gaussian processes

4 0.20703089 46 jmlr-2007-Learning Equivariant Functions with Matrix Valued Kernels

Author: Marco Reisert, Hans Burkhardt

Abstract: This paper presents a new class of matrix valued kernels that are ideally suited to learn vector valued equivariant functions. Matrix valued kernels are a natural generalization of the common notion of a kernel. We set the theoretical foundations of so called equivariant matrix valued kernels. We work out several properties of equivariant kernels, we give an interpretation of their behavior and show relations to scalar kernels. The notion of (ir)reducibility of group representations is transferred into the framework of matrix valued kernels. At the end, two exemplary applications are demonstrated. We design a non-linear rotation and translation equivariant filter for 2D-images and propose an invariant object detector based on the generalized Hough transform. Keywords: kernel methods, matrix kernels, equivariance, group integration, representation theory, Hough transform, signal processing, Volterra series

5 0.19950743 36 jmlr-2007-Generalization Error Bounds in Semi-supervised Classification Under the Cluster Assumption

Author: Philippe Rigollet

Abstract: We consider semi-supervised classification when part of the available data is unlabeled. These unlabeled data can be useful for the classification problem when we make an assumption relating the behavior of the regression function to that of the marginal distribution. Seeger (2000) proposed the well-known cluster assumption as a reasonable one. We propose a mathematical formulation of this assumption and a method based on density level sets estimation that takes advantage of it to achieve fast rates of convergence both in the number of unlabeled examples and the number of labeled examples. Keywords: semi-supervised learning, statistical learning theory, classification, cluster assumption, generalization bounds

6 0.18361139 90 jmlr-2007-Value Regularization and Fenchel Duality

7 0.17606288 38 jmlr-2007-Graph Laplacians and their Convergence on Random Neighborhood Graphs     (Special Topic on the Conference on Learning Theory 2005)

8 0.14884387 89 jmlr-2007-VC Theory of Large Margin Multi-Category Classifiers     (Special Topic on Model Selection)

9 0.14804152 9 jmlr-2007-AdaBoost is Consistent

10 0.12518838 20 jmlr-2007-Combining PAC-Bayesian and Generic Chaining Bounds

11 0.10536564 42 jmlr-2007-Infinitely Imbalanced Logistic Regression

12 0.098317415 69 jmlr-2007-Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes

13 0.097213365 4 jmlr-2007-A New Probabilistic Approach in Rank Regression with Optimal Bayesian Partitioning     (Special Topic on Model Selection)

14 0.094290443 81 jmlr-2007-The Locally Weighted Bag of Words Framework for Document Representation

15 0.092077032 78 jmlr-2007-Statistical Consistency of Kernel Canonical Correlation Analysis

16 0.091317259 53 jmlr-2007-Maximum Entropy Density Estimation with Generalized Regularization and an Application to Species Distribution Modeling

17 0.087825686 23 jmlr-2007-Concave Learners for Rankboost

18 0.087079316 6 jmlr-2007-A Probabilistic Analysis of EM for Mixtures of Separated, Spherical Gaussians

19 0.078325741 3 jmlr-2007-A Generalized Maximum Entropy Approach to Bregman Co-clustering and Matrix Approximation

20 0.065923773 12 jmlr-2007-Attribute-Efficient and Non-adaptive Learning of Parities and DNF Expressions     (Special Topic on the Conference on Learning Theory 2005)


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(4, 0.017), (8, 0.027), (10, 0.019), (12, 0.045), (28, 0.107), (40, 0.073), (41, 0.39), (48, 0.019), (60, 0.056), (85, 0.032), (98, 0.075)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.72896284 71 jmlr-2007-Refinable Kernels

Author: Yuesheng Xu, Haizhang Zhang

Abstract: Motivated by mathematical learning from training data, we introduce the notion of refinable kernels. Various characterizations of refinable kernels are presented. The concept of refinable kernels leads to the introduction of wavelet-like reproducing kernels. We also investigate a refinable kernel that forms a Riesz basis. In particular, we characterize refinable translation invariant kernels, and refinable kernels defined by refinable functions. This study leads to multiresolution analysis of reproducing kernel Hilbert spaces. Keywords: refinable kernels, refinable feature maps, wavelet-like reproducing kernels, dual kernels, learning with kernels, reproducing kernel Hilbert spaces, Riesz bases

2 0.69032842 84 jmlr-2007-The Pyramid Match Kernel: Efficient Learning with Sets of Features

Author: Kristen Grauman, Trevor Darrell

Abstract: In numerous domains it is useful to represent a single example by the set of the local features or parts that comprise it. However, this representation poses a challenge to many conventional machine learning techniques, since sets may vary in cardinality and elements lack a meaningful ordering. Kernel methods can learn complex functions, but a kernel over unordered set inputs must somehow solve for correspondences—generally a computationally expensive task that becomes impractical for large set sizes. We present a new fast kernel function called the pyramid match that measures partial match similarity in time linear in the number of features. The pyramid match maps unordered feature sets to multi-resolution histograms and computes a weighted histogram intersection in order to find implicit correspondences based on the finest resolution histogram cell where a matched pair first appears. We show the pyramid match yields a Mercer kernel, and we prove bounds on its error relative to the optimal partial matching cost. We demonstrate our algorithm on both classification and regression tasks, including object recognition, 3-D human pose inference, and time of publication estimation for documents, and we show that the proposed method is accurate and significantly more efficient than current approaches. Keywords: kernel, sets of features, histogram intersection, multi-resolution histogram pyramid, approximate matching, object recognition

3 0.36133796 47 jmlr-2007-Learning Horn Expressions with LOGAN-H

Author: Marta Arias, Roni Khardon, Jérôme Maloberti

Abstract: The paper introduces LOGAN-H—a system for learning first-order function-free Horn expressions from interpretations. The system is based on an algorithm that learns by asking questions and that was proved correct in previous work. The current paper shows how the algorithm can be implemented in a practical system, and introduces a new algorithm based on it that avoids interaction and learns from examples only. The LOGAN-H system implements these algorithms and adds several facilities and optimizations that allow efficient applications in a wide range of problems. As one of the important ingredients, the system includes several fast procedures for solving the subsumption problem, an NP-complete problem that needs to be solved many times during the learning process. We describe qualitative and quantitative experiments in several domains. The experiments demonstrate that the system can deal with varied problems, large amounts of data, and that it achieves good classification accuracy. Keywords: inductive logic programming, subsumption, bottom-up learning, learning with queries

4 0.35971043 45 jmlr-2007-Learnability of Gaussians with Flexible Variances

Author: Yiming Ying, Ding-Xuan Zhou

Abstract: Gaussian kernels with flexible variances provide a rich family of Mercer kernels for learning algorithms. We show that the union of the unit balls of reproducing kernel Hilbert spaces generated by Gaussian kernels with flexible variances is a uniform Glivenko-Cantelli (uGC) class. This result confirms a conjecture concerning learnability of Gaussian kernels and verifies the uniform convergence of many learning algorithms involving Gaussians with changing variances. Rademacher averages and empirical covering numbers are used to estimate sample errors of multi-kernel regularization schemes associated with general loss functions. It is then shown that the regularization error associated with the least square loss and the Gaussian kernels can be greatly improved when flexible variances are allowed. Finally, for regularization schemes generated by Gaussian kernels with flexible variances we present explicit learning rates for regression with least square loss and classification with hinge loss. Keywords: Gaussian kernel, flexible variances, learning theory, Glivenko-Cantelli class, regularization scheme, empirical covering number

5 0.34629121 16 jmlr-2007-Boosted Classification Trees and Class Probability Quantile Estimation

Author: David Mease, Abraham J. Wyner, Andreas Buja

Abstract: The standard by which binary classifiers are usually judged, misclassification error, assumes equal costs of misclassifying the two classes or, equivalently, classifying at the 1/2 quantile of the conditional class probability function P[y = 1|x]. Boosted classification trees are known to perform quite well for such problems. In this article we consider the use of standard, off-the-shelf boosting for two more general problems: 1) classification with unequal costs or, equivalently, classification at quantiles other than 1/2, and 2) estimation of the conditional class probability function P[y = 1|x]. We first examine whether the latter problem, estimation of P[y = 1|x], can be solved with LogitBoost, and with AdaBoost when combined with a natural link function. The answer is negative: both approaches are often ineffective because they overfit P[y = 1|x] even though they perform well as classifiers. A major negative point of the present article is the disconnect between class probability estimation and classification. Next we consider the practice of over/under-sampling of the two classes. We present an algorithm that uses AdaBoost in conjunction with Over/Under-Sampling and Jittering of the data (“JOUS-Boost”). This algorithm is simple, yet successful, and it preserves the advantage of relative protection against overfitting, but for arbitrary misclassification costs and, equivalently, arbitrary quantile boundaries. We then use collections of classifiers obtained from a grid of quantiles to form estimators of class probabilities. The estimates of the class probabilities compare favorably to those obtained by a variety of methods across both simulated and real data sets. Keywords: boosting algorithms, LogitBoost, AdaBoost, class probability estimation, over-sampling, under-sampling, stratification, data jittering

6 0.34346822 63 jmlr-2007-On the Representer Theorem and Equivalent Degrees of Freedom of SVR

7 0.34030011 53 jmlr-2007-Maximum Entropy Density Estimation with Generalized Regularization and an Application to Species Distribution Modeling

8 0.33664581 89 jmlr-2007-VC Theory of Large Margin Multi-Category Classifiers     (Special Topic on Model Selection)

9 0.3339158 76 jmlr-2007-Spherical-Homoscedastic Distributions: The Equivalency of Spherical and Normal Distributions in Classification

10 0.33195189 24 jmlr-2007-Consistent Feature Selection for Pattern Recognition in Polynomial Time

11 0.32838428 20 jmlr-2007-Combining PAC-Bayesian and Generic Chaining Bounds

12 0.32468188 37 jmlr-2007-GiniSupport Vector Machine: Quadratic Entropy Based Robust Multi-Class Probability Regression

13 0.32418355 5 jmlr-2007-A Nonparametric Statistical Approach to Clustering via Mode Identification

14 0.32341057 61 jmlr-2007-On the Consistency of Multiclass Classification Methods     (Special Topic on the Conference on Learning Theory 2005)

15 0.32320467 49 jmlr-2007-Learning to Classify Ordinal Data: The Data Replication Method

16 0.32307887 90 jmlr-2007-Value Regularization and Fenchel Duality

17 0.3211827 77 jmlr-2007-Stagewise Lasso

18 0.32077289 12 jmlr-2007-Attribute-Efficient and Non-adaptive Learning of Parities and DNF Expressions     (Special Topic on the Conference on Learning Theory 2005)

19 0.31993926 62 jmlr-2007-On the Effectiveness of Laplacian Normalization for Graph Semi-supervised Learning

20 0.31974149 60 jmlr-2007-Nonlinear Estimators and Tail Bounds for Dimension Reduction inl1Using Cauchy Random Projections