nips nips2004 nips2004-27 knowledge-graph by maker-knowledge-mining

27 nips-2004-Bayesian Regularization and Nonnegative Deconvolution for Time Delay Estimation


Source: pdf

Author: Yuanqing Lin, Daniel D. Lee

Abstract: Bayesian Regularization and Nonnegative Deconvolution (BRAND) is proposed for estimating time delays of acoustic signals in reverberant environments. Sparsity of the nonnegative filter coefficients is enforced using an L1-norm regularization. A probabilistic generative model is used to simultaneously estimate the regularization parameters and filter coefficients from the signal data. Iterative update rules are derived under a Bayesian framework using the Expectation-Maximization procedure. The resulting time delay estimation algorithm is demonstrated on noisy acoustic data.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: Bayesian Regularization and Nonnegative Deconvolution (BRAND) is proposed for estimating time delays of acoustic signals in reverberant environments. [sent-4, score-0.651]

2 Sparsity of the nonnegative filter coefficients is enforced using an L1-norm regularization. [sent-5, score-0.427]

3 A probabilistic generative model is used to simultaneously estimate the regularization parameters and filter coefficients from the signal data. [sent-6, score-0.425]

4 Iterative update rules are derived under a Bayesian framework using the Expectation-Maximization procedure. [sent-7, score-0.081]

5 The resulting time delay estimation algorithm is demonstrated on noisy acoustic data. [sent-8, score-0.589]

6 1 Introduction Estimating the time difference of arrival is crucial for binaural acoustic sound source localization [1]. [sent-9, score-0.523]

7 where the azimuthal angle φ to the sound source is determined by the difference in direct propagation times of the sound to the two microphones. [sent-11, score-0.256]

8 The standard signal processing algorithm for determining the time delay between two signals s(t) and x(t) relies upon computing the cross-correlation function [2]: C(∆t) = ∫ dt x(t) s(t − ∆t), and determining the time delay ∆t that maximizes the cross-correlation. [sent-12, score-1.048]
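
As a concrete illustration, a minimal NumPy sketch of this cross-correlation estimator follows; the toy signals, the 5-sample delay, and the noise level are illustrative assumptions, not values from the paper.

```python
# Minimal cross-correlation time delay estimator; toy data is assumed.
import numpy as np

def crosscorr_delay(x, s):
    """Return the integer lag maximizing C(dt) = sum_t x(t) s(t - dt)."""
    c = np.correlate(x, s, mode="full")        # C(dt) over all lags
    lags = np.arange(-(len(s) - 1), len(x))    # lag associated with each entry of c
    return lags[np.argmax(c)]

rng = np.random.default_rng(0)
s = rng.standard_normal(256)
x = np.roll(s, 5) + 0.1 * rng.standard_normal(256)   # s delayed by 5 plus white noise
print(crosscorr_delay(x, s))                          # -> 5
```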

9 In the presence of uncorrelated white noise, this procedure is equivalent to the optimal matched filter for detection of the time delayed signal. [sent-13, score-0.148]

10 However, a typical room environment is reverberant and the measured signal is contaminated with echoes from multiple paths as shown in Fig. 1. [sent-14, score-0.444]

11 In this case, the cross-correlation and related algorithms may not be optimal for estimating the time delays. [sent-16, score-0.153]

12 An alternative approach would be to estimate the multiple time delays as a linear deconvolution problem: min_α ‖x(t) − Σᵢ αᵢ s(t − ∆tᵢ)‖² (1). Unfortunately, this deconvolution can be ill-conditioned, resulting in very noisy solutions for the coefficients α. [sent-17, score-0.994]

13 Recently, we proposed incorporating nonnegativity constraints α ≥ 0 in the deconvolution to overcome the ill-conditioned linear solutions [3]. [sent-18, score-0.506]

14 The use of these constraints is justified by acoustic models that describe the theoretical room impulse response with nonnegative filter coefficients [4]. [sent-19, score-0.735]

15 The resulting optimization problem can be written as the nonnegative quadratic programming problem: min_{α≥0} ‖x − Sα‖² (2). Figure 1: The typical scenario of a reverberant signal. [sent-20, score-0.88]

16 x2(t) comes from the direct path (∆t2) and echo paths (∆tE). [sent-21, score-0.063]
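
The sketch below contrasts the unconstrained least-squares deconvolution of Eq. 1 with the nonnegative problem of Eq. 2, using SciPy's nnls solver on a toy dictionary of delayed copies; the signal length, candidate delays, and echo amplitudes are assumptions.

```python
# Eq. 1 (unconstrained least squares) vs. Eq. 2 (nonnegative least squares).
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
N, M = 512, 41
s = rng.standard_normal(N)
# Columns of S are (circularly) delayed copies s(t - dt_i).
S = np.stack([np.roll(s, d) for d in range(M)], axis=1)
alpha_true = np.zeros(M)
alpha_true[5], alpha_true[12] = 1.0, 0.4            # direct path and one echo
x = S @ alpha_true + 0.05 * rng.standard_normal(N)

alpha_lin, *_ = np.linalg.lstsq(S, x, rcond=None)   # Eq. 1: noisy for correlated sources
alpha_nn, _ = nnls(S, x)                            # Eq. 2: min ||x - S a||, a >= 0
print(sorted(np.argsort(alpha_nn)[-2:]))            # -> [5, 12]
```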

18 Figure 2: Time delay estimation of a speech signal with a) cross-correlation, b) phase alignment transform, c) linear deconvolution, d) nonnegative deconvolution. [sent-31, score-1.162]

19 where S = {s(t − ∆t1) . . . s(t − ∆tM)} is an N × M matrix, and α is an M × 1 vector of nonnegative coefficients. [sent-42, score-0.427]

20 Figure 2 compares the performance of cross-correlation, phase alignment transform (a generalized cross-correlation algorithm), linear deconvolution, and nonnegative deconvolution for estimating the time delays in a clean speech signal containing an echo. [sent-43, score-1.305]

21 From the structure of the estimated coefficients, it is clear that nonnegative deconvolution can successfully discover the structure of the time delays present in the signal. [sent-44, score-1.01]

22 However, in the presence of large background noise, it may be necessary to regularize the nonnegative quadratic optimization to prevent overfitting. [sent-45, score-0.66]

23 In this case, we propose using an L1-norm regularization to favor sparse solutions [5]: min_{α≥0} ‖x − Sα‖² + λ̂ Σᵢ αᵢ (3). In this formula, the parameter λ̂ (λ̂ ≥ 0) describes the trade-off between fitting the observed data and enforcing sparse solutions. [sent-46, score-0.391]
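
Because α ≥ 0, the L1 penalty reduces to the linear term λ̂ Σᵢ αᵢ, so the objective stays smooth on the feasible set and any bound-constrained solver applies. A hedged sketch using SciPy's L-BFGS-B follows; S and x are assumed to be defined as in the previous sketch, and the solver choice is an illustration, not the paper's method.

```python
# Sketch of Eq. 3 via a bound-constrained smooth solver (the L1 term is
# linear on the nonnegative orthant).
import numpy as np
from scipy.optimize import minimize

def l1_nnls(S, x, lam):
    def f(a):
        r = x - S @ a
        return r @ r + lam * a.sum()
    def grad(a):
        return -2.0 * S.T @ (x - S @ a) + lam
    bounds = [(0.0, None)] * S.shape[1]          # enforce alpha >= 0
    res = minimize(f, np.zeros(S.shape[1]), jac=grad,
                   method="L-BFGS-B", bounds=bounds)
    return res.x
```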

24 The proper choice of this parameter may be crucial in obtaining the optimal time delay estimates. [sent-47, score-0.381]

25 In the rest of this manuscript, we introduce a proper generative model for these regularization parameters and filter coefficients within a probabilistic Bayesian framework. [sent-48, score-0.266]

26 We show how these parameters can be efficiently determined using appropriate iterative estimates. [sent-49, score-0.144]

27 We conclude by demonstrating and discussing the performance of our algorithm on noisy acoustic signals in reverberant environments. [sent-50, score-0.338]

28 2 Bayesian regularization Instead of arbitrarily setting values for the regularization parameters, we show how a Bayesian framework can be used to automatically estimate the correct values from the data. [sent-51, score-0.498]

29 Bayesian regularization has previously been successfully applied to neural network learning [6], model selection, and relevance vector machine (RVM) [7]. [sent-52, score-0.301]

30 In our model, we use L1-norm sparsity regularization, and a Bayesian framework will be used to optimally determine the appropriate regularization parameters. [sent-54, score-0.299]

31 Our probabilistic model assumes the observed data signal is generated by convolving the source signal with a nonnegative filter describing the room impulse response. [sent-55, score-0.885]

32 This signal is then contaminated by additive Gaussian white noise with zero mean and covariance σ²: P(x|S, α, σ²) = (2πσ²)^(−N/2) exp(−‖x − Sα‖²/(2σ²)) (4). [sent-56, score-0.399]

33 This prior only has support in the nonnegative orthant and the sharpness of the distribution is given by the regularization parameter λ: P(α|λ) = λ^M exp(−λ Σᵢ₌₁^M αᵢ), α ≥ 0 (5). [sent-58, score-0.719]

34 In order to infer the optimal settings of the regularization parameters σ² and λ, Bayes' rule is used to maximize the posterior distribution: P(λ, σ²|x, S) = P(x|λ, σ², S) P(λ, σ²) / P(x|S) (6). [sent-59, score-0.266]

35 Assuming that P(λ, σ²) is relatively flat [8], estimating σ² and λ is then equivalent to maximizing the likelihood: P(x|λ, σ², S) = λ^M / (2πσ²)^(N/2) ∫_{α≥0} dα exp[−F(α)] (7), where F(α) = (1/(2σ²))(x − Sα)ᵀ(x − Sα) + λeᵀα (8) and e = [1 1 . . . 1]ᵀ. [sent-60, score-0.09]

36 Previous approaches to Bayesian regularization have used iterative updates heuristically derived from self-consistent fixed-point equations. [sent-66, score-0.387]

37 These updates have guaranteed convergence properties and can be intuitively understood as iteratively re-estimating λ and σ² based upon appropriate expectations over the current estimate for Q(α). [sent-68, score-0.289]

38 Eqs. 9–10 are dominated by α ≈ αML, where the most likely αML is given by: αML = arg min_{α≥0} (1/(2σ²))(x − Sα)ᵀ(x − Sα) + λeᵀα (12). [sent-71, score-0.058]

39 This optimization is equivalent to the nonnegative quadratic programming problem in Eq. 3. [sent-72, score-0.641]

40 The first method is based upon a multiplicative update rule for nonnegative quadratic programming [9]. [sent-76, score-0.761]

41 We first write the problem in the following form: min_{α≥0} (1/2) αᵀAα + bᵀα (13), where A = (1/σ²) SᵀS and b = −(1/σ²) Sᵀx. [sent-77, score-0.058]

42 First, we decompose the matrix A = A⁺ − A⁻ into its positive and negative components such that A⁺ᵢⱼ = Aᵢⱼ if Aᵢⱼ > 0 and 0 otherwise, and A⁻ᵢⱼ = −Aᵢⱼ if Aᵢⱼ < 0 and 0 otherwise (14). Then the following is an auxiliary function that upper bounds Eq. 13 [9]: [sent-78, score-0.255]

43 G(α, α̃) = bᵀα + (1/2) Σᵢ [(A⁺α̃)ᵢ/α̃ᵢ] αᵢ² − (1/2) Σᵢ,ⱼ A⁻ᵢⱼ α̃ᵢα̃ⱼ (1 + ln(αᵢαⱼ/(α̃ᵢα̃ⱼ))) (15). [sent-79, score-0.125]

44 Minimizing Eq. 15 yields the following iterative multiplicative rule with guaranteed convergence to αML: αᵢ ← αᵢ [−bᵢ + √(bᵢ² + 4(A⁺α)ᵢ(A⁻α)ᵢ)] / (2(A⁺α)ᵢ) (16). [sent-81, score-0.25]

45 The update rule in Eq. 16 is used to efficiently compute a reasonable estimate for αML from an arbitrary initialization. [sent-83, score-0.038]

46 However, its convergence is similar to other interior point methods in that small components of αM L will continually decrease but never equal zero. [sent-84, score-0.081]
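
A minimal sketch of the multiplicative update of Eq. 16, with the A⁺/A⁻ split of Eq. 14, is given below; the initialization and iteration count are assumptions. Note that the linear L1 term of Eq. 12 can be absorbed into b, i.e. b = λe − (1/σ²)Sᵀx, so the same routine covers the regularized problem.

```python
# Multiplicative update of Eq. 16, following [9]; assumed start and
# iteration budget.
import numpy as np

def nqp_multiplicative(A, b, n_iter=500):
    Ap = np.where(A > 0, A, 0.0)     # A+ from Eq. 14
    Am = np.where(A < 0, -A, 0.0)    # A- from Eq. 14
    a = np.ones(len(b))              # any strictly positive start
    for _ in range(n_iter):
        pa, ma = Ap @ a, Am @ a
        a = a * (-b + np.sqrt(b * b + 4.0 * pa * ma)) / (2.0 * pa + 1e-300)
    return a
```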

47 In order to truly sparsify the solution, we employ an alternative method based upon the simplex algorithm for linear programming. [sent-85, score-0.161]

48 Our other optimization method is based upon finding a solution αML that satisfies the Karush-Kuhn-Tucker (KKT) conditions for Eq. 13: [sent-86, score-0.087]

49 α ≥ 0, β = Aα + b ≥ 0, αᵢβᵢ = 0 (17). By introducing additional artificial variables a, the KKT conditions can be transformed into the linear optimization min Σᵢ aᵢ subject to the constraints: a ≥ 0 (18), α ≥ 0 (19), β ≥ 0 (20), Aα − β + sign(−b)a = −b (21), αᵢβᵢ = 0, i = 1, 2, . . . , M (22). [sent-91, score-0.091]

50 However, this can be effectively implemented in the simplex procedure by modifying the selection of the pivot element to ensure that αi and βi are never both in the set of basic variables. [sent-96, score-0.107]

51 With this simple modification of the simplex algorithm, the optimal αM L can be efficiently computed. [sent-97, score-0.076]
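
The modified pivot rule is specific to the simplex implementation, but any candidate αML (from either method) can be verified against the KKT conditions numerically. A small sketch follows; the tolerance is an assumption, and β = Aα + b matches constraint 21 with a = 0.

```python
# Numerical check of the KKT conditions (Eq. 17) for a candidate alpha.
import numpy as np

def check_kkt(A, b, alpha, tol=1e-6):
    beta = A @ alpha + b                       # dual slacks
    return (alpha.min() >= -tol                # alpha >= 0
            and beta.min() >= -tol             # beta >= 0
            and np.abs(alpha * beta).max() <= tol)   # complementarity
```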

52 2.2 Approximation of Q(α) Once the most likely αML has been determined, the simplest approach for estimating the new λ and σ² in Eqs. 9–10 is to evaluate the required expectations at αML alone. [sent-99, score-0.09]

53 Unfortunately, this simple approximation will cause λ and σ² to diverge from bad initial estimates. [sent-101, score-0.058]

54 To overcome these difficulties, we use a slightly more sophisticated method of estimating the expectations to properly consider variability in the distribution Q(α). [sent-102, score-0.194]

55 We first note the structure of the solution αML of the nonnegative quadratic optimization in Eq. 12. [sent-103, score-0.562]

56 This solution naturally partitions the elements of the vector α into two distinct subsets αI and αJ, consisting of components i ∈ I such that (αML)ᵢ = 0, and components j ∈ J such that (αML)ⱼ > 0, respectively. [sent-104, score-0.082]

57 It will then be useful to approximate the distribution Q(α) in the factored form Q(α) ≈ QI(αI) QJ(αJ) (23). Consider the components αJ near the maximum likelihood solution αML. [sent-105, score-0.041]

58 For the other components αI, it is important to consider the nonnegativity constraints, since (αML)I = 0 lies on the boundary of the distribution. [sent-107, score-0.127]

59 QI(αI) is then approximated with a factorial exponential distribution Q̂I(αI) so that the integrals in Eqs. 9–10 can be computed. [sent-109, score-0.042]

60 Q̂I(αI) = Πᵢ∈I (1/µᵢ) exp(−αᵢ/µᵢ), αI ≥ 0 (26), which has support only for nonnegative αI ≥ 0. [sent-111, score-0.427]

61 The mean-field parameters µ are optimally obtained by minimizing the KL-divergence: ˆ QI (αI ) ˆ . [sent-112, score-0.068]

62 min_{µ≥0} ∫_{αI≥0} dαI Q̂I(αI) ln[Q̂I(αI)/QI(αI)] (27). This integral can easily be computed in terms of the parameters µ and yields the minimization: min_{µ≥0} −Σᵢ∈I ln µᵢ + b̂Iᵀµ + (1/2) µᵀÂµ (28), where b̂I = (AαML + b)I and Â = AII + diag(AII). [sent-113, score-0.276]

63 To solve this minimization problem, we use an auxiliary function for Eq. 28. [sent-114, score-0.12]

64 Minimization of this auxiliary function yields the following multiplicative update rule for µᵢ: µᵢ ← µᵢ [−b̂ᵢ + √(b̂ᵢ² + 4(Â⁺µ)ᵢ((Â⁻µ)ᵢ + 1/µᵢ))] / (2(Â⁺µ)ᵢ) (30). [sent-116, score-0.281]

65 (30) These iterations are then guaranteed to converge to the optimal mean-field parameters for the distribution QI (αI ). [sent-117, score-0.073]
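
A sketch of the Eq. 30 iteration is given below; the definitions of b̂ and Â follow Eq. 28 as reconstructed above, and the initialization and iteration count are assumptions.

```python
# Mean-field update of Eq. 30 over the zero set I of alpha_ML.
import numpy as np

def meanfield_mu(A, b, alpha_ml, I, n_iter=200):
    A_II = A[np.ix_(I, I)]
    A_hat = A_II + np.diag(np.diag(A_II))      # A_hat = A_II + diag(A_II), Eq. 28
    b_hat = (A @ alpha_ml + b)[I]              # b_hat = (A alpha_ML + b)_I
    Ap = np.where(A_hat > 0, A_hat, 0.0)
    Am = np.where(A_hat < 0, -A_hat, 0.0)
    mu = np.ones(len(I))
    for _ in range(n_iter):
        pa = Ap @ mu
        mu = (mu * (-b_hat + np.sqrt(b_hat**2 + 4.0 * pa * (Am @ mu + 1.0 / mu)))
              / (2.0 * pa + 1e-300))
    return mu
```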

66 Given the factorized approximation Q̂I(αI)QJ(αJ), the expectations in Eqs. 9–10 can then be computed. [sent-118, score-0.071]

67 Determine αML by solving the nonnegative quadratic programming problem in Eq. 12. [sent-123, score-0.608]

68 Calculate the mean α and covariance C for this distribution. [sent-128, score-0.039]

69 Re-estimate the regularization parameters λ and σ² using Eqs. 9–10. [sent-130, score-0.266]
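
Putting the steps together, the sketch below shows one plausible shape of the overall loop. Eqs. 9–10 themselves are not reproduced in this summary, so the λ and σ² re-estimates shown are the standard EM fixed points implied by the model of Eqs. 4–5, evaluated at αML (a delta approximation) rather than with the full mean-field expectations the paper uses; nqp_solver stands in for either optimization method above.

```python
# One plausible shape of the BRAND iteration under the stated assumptions.
import numpy as np

def brand_em(x, S, nqp_solver, n_iter=20):
    """nqp_solver(A, b): nonnegative minimizer of 0.5 a'Aa + b'a,
    e.g. nqp_multiplicative from the sketch above."""
    N, M = S.shape
    lam, sig2 = 1.0, float(np.var(x))          # deliberately rough starting values
    StS, Stx = S.T @ S, S.T @ x
    for _ in range(n_iter):
        A = StS / sig2
        b = lam * np.ones(M) - Stx / sig2      # Eq. 12 written in the form of Eq. 13
        a = nqp_solver(A, b)                   # step 1: alpha_ML
        lam = M / max(a.sum(), 1e-12)          # prior fixed point, cf. Eq. 5
        r = x - S @ a
        sig2 = max(float(r @ r) / N, 1e-12)    # noise fixed point, cf. Eq. 4
    return a, lam, sig2
```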

70 Figure 3: Iterative estimation of λ (M/λ in the figure, indicating the reverberation level) and σ² when x(t) is contaminated by background white noise at -5 dB, -20 dB, and -40 dB levels. [sent-142, score-0.328]

71 3 Results We illustrate the performance of our algorithm in estimating the regularization parameters as well as the nonnegative filter coefficients of a speech source signal s(t). [sent-144, score-1.051]

72 The observed signal x(t) is simulated by a time-delayed version of the source signal mixed with an echo along with additive Gaussian white noise η(t): x(t) = s(t − Ts) + . . . + η(t) (35). [sent-145, score-0.514]

73 We compare the results of the algorithm as the noise level is changed. [sent-148, score-0.094]
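
A hedged recipe for generating such a test signal is sketched below; the echo amplitude (0.5) and echo delay (2·Ts) are assumptions, since the exact values in Eq. 35 are truncated in this summary, and the -20 dB noise level is one of the settings shown in Fig. 3.

```python
# Illustrative test signal in the spirit of Eq. 35; echo parameters assumed.
import numpy as np

rng = np.random.default_rng(2)
n, Ts = 4096, 32                        # assumed length and delay in samples
s = rng.standard_normal(n)              # stand-in for the speech source s(t)
eta = 10 ** (-20.0 / 20.0) * rng.standard_normal(n)   # -20 dB white noise
x = np.roll(s, Ts) + 0.5 * np.roll(s, 2 * Ts) + eta   # delayed source + echo + noise
```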

74 Fig. 3 shows the convergence of the estimates for λ and σ² as the noise level is varied between -5 dB and -40 dB. [sent-150, score-0.134]

75 There is rapid convergence of both parameters even with bad initial estimates. [sent-151, score-0.134]

76 The resulting value of the σ² parameter is very close to the true noise level. [sent-152, score-0.056]

77 Additionally, the estimated λ parameter is inversely related to the reverberation level of the environment, given by the sum of the true filter coefficients. [sent-153, score-0.132]

78 Fig. 4 demonstrates the importance of correctly determining the regularization parameters in estimating the time delay structure in the presence of noise. [sent-155, score-0.81]

79 Using the Bayesian regularization procedure, the resulting estimate for αML correctly models the direct path time delay as well as the secondary echo. [sent-156, score-0.649]

80 However, if the regularization parameters are manually set incorrectly to over-sparsify the solution, the resulting estimates for the time delays may be quite inaccurate. [sent-157, score-0.524]

81 4 Discussion In summary, we propose using a Bayesian framework to automatically regularize nonnegative deconvolutions for estimating time delays in acoustic signals. [sent-158, score-0.965]

82 We present two methods for efficiently solving the resulting nonnegative quadratic programming problem. [sent-159, score-0.608]

83 We also derive an iterative algorithm from Expectation-Maximization to estimate the regularization parameters. [sent-160, score-0.376]

84 We show how these iterative updates can simultaneously estimate the time-delay structure in the signal, as well as the background noise level and reverberation level of the room. [sent-161, score-0.452]

85 Our results indicate that the algorithm is able to quickly converge to an optimal solution, even with bad initial estimates. [sent-162, score-0.097]

86 Preliminary tests with an acoustic robotic platform indicate that these algorithms can successfully be implemented on a real-time system. [sent-163, score-0.27]

88 Figure 4: Estimated time delay structure from αML with different regularizations: a) Bayesian regularization, b) manually set regularization (λσ² = 200). [sent-173, score-0.734]

89 Dotted lines indicate the true positions of the time delays. [sent-174, score-0.102]

90 We are currently working to extend the algorithm to the situation where the source signal needs to also be estimated. [sent-175, score-0.223]

91 In this case, priors for the source signal are used to regularize the source estimates. [sent-176, score-0.443]

92 These priors are similar to those used for blind source separation. [sent-177, score-0.156]

93 We are investigating algorithms that can simultaneously estimate the hyperparameters for these priors in addition to the other parameters within a consistent Bayesian framework. [sent-178, score-0.128]

94 Singer, “Discriminative binaural sound localization,” in Advances in Neural Information Processing Systems, S. [sent-181, score-0.149]

95 , “The generalized correlation method for estimation of time delay,” IEEE Transactions on ASSP, vol. [sent-193, score-0.11]

96 Saul, “Nonnegative deconvolution for time of arrival estimation,” in ICASSP, 2004. [sent-202, score-0.432]

97 Field, “Emergence of simple-cell receptive field properties by learning a sparse code for natural images,” Nature, vol. [sent-216, score-0.035]

98 Tipping, “Sparse Bayesian learning and the relevance vector machine,” Journal of Machine Learning Research, vol. [sent-225, score-0.032]

99 Lee, “Multiplicative updates for nonnegative quadratic programming in support vector machines,” in Advances in Neural Information Processing Systems, S. [sent-236, score-0.657]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('nonnegative', 0.427), ('deconvolution', 0.321), ('delay', 0.318), ('regularization', 0.23), ('qi', 0.201), ('acoustic', 0.161), ('delays', 0.16), ('reverberant', 0.144), ('bayesian', 0.132), ('aij', 0.127), ('lter', 0.122), ('signal', 0.121), ('aii', 0.115), ('qj', 0.115), ('iterative', 0.108), ('db', 0.107), ('quadratic', 0.102), ('source', 0.102), ('reverberation', 0.094), ('coef', 0.094), ('estimating', 0.09), ('ts', 0.089), ('auxiliary', 0.088), ('nonnegativity', 0.086), ('contaminated', 0.08), ('programming', 0.079), ('sound', 0.077), ('cients', 0.077), ('simplex', 0.076), ('bt', 0.073), ('ajj', 0.072), ('binaural', 0.072), ('suzanna', 0.072), ('expectations', 0.071), ('room', 0.066), ('multiplicative', 0.065), ('regularize', 0.064), ('ij', 0.063), ('echo', 0.063), ('time', 0.063), ('ln', 0.062), ('min', 0.058), ('bad', 0.058), ('noise', 0.056), ('priors', 0.054), ('upon', 0.054), ('exp', 0.052), ('white', 0.051), ('updates', 0.049), ('impulse', 0.048), ('arrival', 0.048), ('bi', 0.047), ('estimation', 0.047), ('rules', 0.047), ('localization', 0.046), ('alignment', 0.046), ('lee', 0.045), ('speech', 0.045), ('st', 0.043), ('kkt', 0.042), ('integrals', 0.042), ('eld', 0.042), ('components', 0.041), ('obermayer', 0.04), ('convergence', 0.04), ('determining', 0.039), ('covariance', 0.039), ('indicate', 0.039), ('successfully', 0.039), ('estimate', 0.038), ('transform', 0.038), ('level', 0.038), ('sparsity', 0.037), ('scenario', 0.037), ('guaranteed', 0.037), ('parameters', 0.036), ('becker', 0.036), ('sparse', 0.035), ('manually', 0.035), ('update', 0.034), ('unfortunately', 0.034), ('presence', 0.034), ('signals', 0.033), ('overcome', 0.033), ('lin', 0.033), ('paths', 0.033), ('optimization', 0.033), ('solutions', 0.033), ('constraints', 0.033), ('phase', 0.032), ('relevance', 0.032), ('optimally', 0.032), ('minimization', 0.032), ('pivot', 0.031), ('hagan', 0.031), ('orthant', 0.031), ('platform', 0.031), ('sharpness', 0.031), ('simulataneously', 0.031), ('sparsify', 0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 27 nips-2004-Bayesian Regularization and Nonnegative Deconvolution for Time Delay Estimation

Author: Yuanqing Lin, Daniel D. Lee

Abstract: Bayesian Regularization and Nonnegative Deconvolution (BRAND) is proposed for estimating time delays of acoustic signals in reverberant environments. Sparsity of the nonnegative filter coefficients is enforced using an L1 -norm regularization. A probabilistic generative model is used to simultaneously estimate the regularization parameters and filter coefficients from the signal data. Iterative update rules are derived under a Bayesian framework using the Expectation-Maximization procedure. The resulting time delay estimation algorithm is demonstrated on noisy acoustic data.

2 0.20569341 152 nips-2004-Real-Time Pitch Determination of One or More Voices by Nonnegative Matrix Factorization

Author: Fei Sha, Lawrence K. Saul

Abstract: An auditory “scene”, composed of overlapping acoustic sources, can be viewed as a complex object whose constituent parts are the individual sources. Pitch is known to be an important cue for auditory scene analysis. In this paper, with the goal of building agents that operate in human environments, we describe a real-time system to identify the presence of one or more voices and compute their pitch. The signal processing in the front end is based on instantaneous frequency estimation, a method for tracking the partials of voiced speech, while the pattern-matching in the back end is based on nonnegative matrix factorization, an unsupervised algorithm for learning the parts of complex objects. While supporting a framework to analyze complicated auditory scenes, our system maintains real-time operability and state-of-the-art performance in clean speech.

3 0.14729548 5 nips-2004-A Harmonic Excitation State-Space Approach to Blind Separation of Speech

Author: Rasmus K. Olsson, Lars K. Hansen

Abstract: We discuss an identification framework for noisy speech mixtures. A block-based generative model is formulated that explicitly incorporates the time-varying harmonic plus noise (H+N) model for a number of latent sources observed through noisy convolutive mixtures. All parameters including the pitches of the source signals, the amplitudes and phases of the sources, the mixing filters and the noise statistics are estimated by maximum likelihood, using an EM-algorithm. Exact averaging over the hidden sources is obtained using the Kalman smoother. We show that pitch estimation and source separation can be performed simultaneously. The pitch estimates are compared to laryngograph (EGG) measurements. Artificial and real room mixtures are used to demonstrate the viability of the approach. Intelligible speech signals are re-synthesized from the estimated H+N models.

4 0.1137949 42 nips-2004-Computing regularization paths for learning multiple kernels

Author: Francis R. Bach, Romain Thibaux, Michael I. Jordan

Abstract: The problem of learning a sparse conic combination of kernel functions or kernel matrices for classification or regression can be achieved via the regularization by a block 1-norm [1]. In this paper, we present an algorithm that computes the entire regularization path for these problems. The path is obtained by using numerical continuation techniques, and involves a running time complexity that is a constant times the complexity of solving the problem for one value of the regularization parameter. Working in the setting of kernel linear regression and kernel logistic regression, we show empirically that the effect of the block 1-norm regularization differs notably from the (non-block) 1-norm regularization commonly used for variable selection, and that the regularization path is of particular value in the block case. 1

5 0.10901728 132 nips-2004-Nonlinear Blind Source Separation by Integrating Independent Component Analysis and Slow Feature Analysis

Author: Tobias Blaschke, Laurenz Wiskott

Abstract: In contrast to the equivalence of linear blind source separation and linear independent component analysis it is not possible to recover the original source signal from some unknown nonlinear transformations of the sources using only the independence assumption. Integrating the objectives of statistical independence and temporal slowness removes this indeterminacy leading to a new method for nonlinear blind source separation. The principle of temporal slowness is adopted from slow feature analysis, an unsupervised method to extract slowly varying features from a given observed vectorial signal. The performance of the algorithm is demonstrated on nonlinearly mixed speech data. 1

6 0.094578147 54 nips-2004-Distributed Information Regularization on Graphs

7 0.093170002 4 nips-2004-A Generalized Bradley-Terry Model: From Group Competition to Individual Skill

8 0.087372914 70 nips-2004-Following Curved Regularized Optimization Solution Paths

9 0.086821921 97 nips-2004-Learning Efficient Auditory Codes Using Spikes Predicts Cochlear Filters

10 0.081841059 187 nips-2004-The Entire Regularization Path for the Support Vector Machine

11 0.075934805 96 nips-2004-Learning, Regularization and Ill-Posed Inverse Problems

12 0.073548004 50 nips-2004-Dependent Gaussian Processes

13 0.070175618 98 nips-2004-Learning Gaussian Process Kernels via Hierarchical Bayes

14 0.06998767 31 nips-2004-Blind One-microphone Speech Separation: A Spectral Learning Approach

15 0.069727413 102 nips-2004-Learning first-order Markov models for control

16 0.065435268 67 nips-2004-Exponentiated Gradient Algorithms for Large-margin Structured Classification

17 0.065306105 105 nips-2004-Log-concavity Results on Gaussian Process Methods for Supervised and Unsupervised Learning

18 0.064493299 191 nips-2004-The Variational Ising Classifier (VIC) Algorithm for Coherently Contaminated Data

19 0.063239701 161 nips-2004-Self-Tuning Spectral Clustering

20 0.062890328 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.211), (1, 0.003), (2, -0.018), (3, -0.038), (4, -0.175), (5, -0.167), (6, 0.152), (7, 0.018), (8, -0.02), (9, -0.011), (10, 0.164), (11, -0.0), (12, 0.076), (13, 0.031), (14, -0.082), (15, -0.009), (16, -0.003), (17, -0.106), (18, 0.062), (19, 0.013), (20, 0.105), (21, 0.006), (22, 0.028), (23, -0.059), (24, 0.093), (25, 0.009), (26, -0.07), (27, -0.051), (28, -0.064), (29, 0.031), (30, 0.058), (31, 0.008), (32, -0.024), (33, 0.023), (34, -0.053), (35, -0.017), (36, -0.045), (37, -0.132), (38, -0.022), (39, -0.07), (40, 0.159), (41, 0.047), (42, -0.082), (43, 0.109), (44, 0.223), (45, 0.071), (46, -0.018), (47, 0.083), (48, -0.036), (49, 0.04)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9674961 27 nips-2004-Bayesian Regularization and Nonnegative Deconvolution for Time Delay Estimation

Author: Yuanqing Lin, Daniel D. Lee

Abstract: Bayesian Regularization and Nonnegative Deconvolution (BRAND) is proposed for estimating time delays of acoustic signals in reverberant environments. Sparsity of the nonnegative filter coefficients is enforced using an L1 -norm regularization. A probabilistic generative model is used to simultaneously estimate the regularization parameters and filter coefficients from the signal data. Iterative update rules are derived under a Bayesian framework using the Expectation-Maximization procedure. The resulting time delay estimation algorithm is demonstrated on noisy acoustic data.

2 0.60055929 152 nips-2004-Real-Time Pitch Determination of One or More Voices by Nonnegative Matrix Factorization

Author: Fei Sha, Lawrence K. Saul

Abstract: An auditory “scene”, composed of overlapping acoustic sources, can be viewed as a complex object whose constituent parts are the individual sources. Pitch is known to be an important cue for auditory scene analysis. In this paper, with the goal of building agents that operate in human environments, we describe a real-time system to identify the presence of one or more voices and compute their pitch. The signal processing in the front end is based on instantaneous frequency estimation, a method for tracking the partials of voiced speech, while the pattern-matching in the back end is based on nonnegative matrix factorization, an unsupervised algorithm for learning the parts of complex objects. While supporting a framework to analyze complicated auditory scenes, our system maintains real-time operability and state-of-the-art performance in clean speech.

3 0.55692959 5 nips-2004-A Harmonic Excitation State-Space Approach to Blind Separation of Speech

Author: Rasmus K. Olsson, Lars K. Hansen

Abstract: We discuss an identification framework for noisy speech mixtures. A block-based generative model is formulated that explicitly incorporates the time-varying harmonic plus noise (H+N) model for a number of latent sources observed through noisy convolutive mixtures. All parameters including the pitches of the source signals, the amplitudes and phases of the sources, the mixing filters and the noise statistics are estimated by maximum likelihood, using an EM-algorithm. Exact averaging over the hidden sources is obtained using the Kalman smoother. We show that pitch estimation and source separation can be performed simultaneously. The pitch estimates are compared to laryngograph (EGG) measurements. Artificial and real room mixtures are used to demonstrate the viability of the approach. Intelligible speech signals are re-synthesized from the estimated H+N models.

4 0.50148451 96 nips-2004-Learning, Regularization and Ill-Posed Inverse Problems

Author: Lorenzo Rosasco, Andrea Caponnetto, Ernesto D. Vito, Francesca Odone, Umberto D. Giovannini

Abstract: Many works have shown that strong connections relate learning from examples to regularization techniques for ill-posed inverse problems. Nevertheless by now there was no formal evidence neither that learning from examples could be seen as an inverse problem nor that theoretical results in learning theory could be independently derived using tools from regularization theory. In this paper we provide a positive answer to both questions. Indeed, considering the square loss, we translate the learning problem in the language of regularization theory and show that consistency results and optimal regularization parameter choice can be derived by the discretization of the corresponding inverse problem. 1

5 0.4333919 31 nips-2004-Blind One-microphone Speech Separation: A Spectral Learning Approach

Author: Francis R. Bach, Michael I. Jordan

Abstract: We present an algorithm to perform blind, one-microphone speech separation. Our algorithm separates mixtures of speech without modeling individual speakers. Instead, we formulate the problem of speech separation as a problem in segmenting the spectrogram of the signal into two or more disjoint sets. We build feature sets for our segmenter using classical cues from speech psychophysics. We then combine these features into parameterized affinity matrices. We also take advantage of the fact that we can generate training examples for segmentation by artificially superposing separately-recorded signals. Thus the parameters of the affinity matrices can be tuned using recent work on learning spectral clustering [1]. This yields an adaptive, speech-specific segmentation algorithm that can successfully separate one-microphone speech mixtures. 1

6 0.43285006 190 nips-2004-The Rescorla-Wagner Algorithm and Maximum Likelihood Estimation of Causal Parameters

7 0.43097118 25 nips-2004-Assignment of Multiplicative Mixtures in Natural Images

8 0.42817619 70 nips-2004-Following Curved Regularized Optimization Solution Paths

9 0.42315131 120 nips-2004-Modeling Conversational Dynamics as a Mixed-Memory Markov Process

10 0.41963115 207 nips-2004-ℓ₀-norm Minimization for Basis Selection

11 0.39303491 54 nips-2004-Distributed Information Regularization on Graphs

12 0.37666729 42 nips-2004-Computing regularization paths for learning multiple kernels

13 0.36505631 97 nips-2004-Learning Efficient Auditory Codes Using Spikes Predicts Cochlear Filters

14 0.36407042 9 nips-2004-A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning

15 0.35637367 132 nips-2004-Nonlinear Blind Source Separation by Integrating Independent Component Analysis and Slow Feature Analysis

16 0.35264441 130 nips-2004-Newscast EM

17 0.34048367 104 nips-2004-Linear Multilayer Independent Component Analysis for Large Natural Scenes

18 0.33908233 138 nips-2004-Online Bounds for Bayesian Algorithms

19 0.33508483 191 nips-2004-The Variational Ising Classifier (VIC) Algorithm for Coherently Contaminated Data

20 0.32756323 196 nips-2004-Triangle Fixing Algorithms for the Metric Nearness Problem


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(13, 0.621), (15, 0.082), (17, 0.013), (26, 0.026), (31, 0.017), (33, 0.125), (39, 0.014), (50, 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.97757173 85 nips-2004-Instance-Based Relevance Feedback for Image Retrieval

Author: Giorgio Giacinto, Fabio Roli

Abstract: High retrieval precision in content-based image retrieval can be attained by adopting relevance feedback mechanisms. These mechanisms require that the user judges the quality of the results of the query by marking all the retrieved images as being either relevant or not. Then, the search engine exploits this information to adapt the search to better meet user’s needs. At present, the vast majority of proposed relevance feedback mechanisms are formulated in terms of search model that has to be optimized. Such an optimization involves the modification of some search parameters so that the nearest neighbor of the query vector contains the largest number of relevant images. In this paper, a different approach to relevance feedback is proposed. After the user provides the first feedback, following retrievals are not based on knn search, but on the computation of a relevance score for each image of the database. This score is computed as a function of two distances, namely the distance from the nearest non-relevant image and the distance from the nearest relevant one. Images are then ranked according to this score and the top k images are displayed. Reported results on three image data sets show that the proposed mechanism outperforms other state-of-the-art relevance feedback mechanisms.

same-paper 2 0.97361076 27 nips-2004-Bayesian Regularization and Nonnegative Deconvolution for Time Delay Estimation

Author: Yuanqing Lin, Daniel D. Lee

Abstract: Bayesian Regularization and Nonnegative Deconvolution (BRAND) is proposed for estimating time delays of acoustic signals in reverberant environments. Sparsity of the nonnegative filter coefficients is enforced using an L1 -norm regularization. A probabilistic generative model is used to simultaneously estimate the regularization parameters and filter coefficients from the signal data. Iterative update rules are derived under a Bayesian framework using the Expectation-Maximization procedure. The resulting time delay estimation algorithm is demonstrated on noisy acoustic data.

3 0.96382004 176 nips-2004-Sub-Microwatt Analog VLSI Support Vector Machine for Pattern Classification and Sequence Estimation

Author: Shantanu Chakrabartty, Gert Cauwenberghs

Abstract: An analog system-on-chip for kernel-based pattern classification and sequence estimation is presented. State transition probabilities conditioned on input data are generated by an integrated support vector machine. Dot product based kernels and support vector coefficients are implemented in analog programmable floating gate translinear circuits, and probabilities are propagated and normalized using sub-threshold current-mode circuits. A 14-input, 24-state, and 720-support vector forward decoding kernel machine is integrated on a 3mm×3mm chip in 0.5µm CMOS technology. Experiments with the processor trained for speaker verification and phoneme sequence estimation demonstrate real-time recognition accuracy at par with floating-point software, at sub-microwatt power. 1

4 0.93893099 138 nips-2004-Online Bounds for Bayesian Algorithms

Author: Sham M. Kakade, Andrew Y. Ng

Abstract: We present a competitive analysis of Bayesian learning algorithms in the online learning setting and show that many simple Bayesian algorithms (such as Gaussian linear regression and Bayesian logistic regression) perform favorably when compared, in retrospect, to the single best model in the model class. The analysis does not assume that the Bayesian algorithms’ modeling assumptions are “correct,” and our bounds hold even if the data is adversarially chosen. For Gaussian linear regression (using logloss), our error bounds are comparable to the best bounds in the online learning literature, and we also provide a lower bound showing that Gaussian linear regression is optimal in a certain worst case sense. We also give bounds for some widely used maximum a posteriori (MAP) estimation algorithms, including regularized logistic regression. 1

5 0.93240929 114 nips-2004-Maximum Likelihood Estimation of Intrinsic Dimension

Author: Elizaveta Levina, Peter J. Bickel

Abstract: We propose a new method for estimating intrinsic dimension of a dataset derived by applying the principle of maximum likelihood to the distances between close neighbors. We derive the estimator by a Poisson process approximation, assess its bias and variance theoretically and by simulations, and apply it to a number of simulated and real datasets. We also show it has the best overall performance compared with two other intrinsic dimension estimators. 1

6 0.88888985 36 nips-2004-Class-size Independent Generalization Analsysis of Some Discriminative Multi-Category Classification

7 0.84246629 5 nips-2004-A Harmonic Excitation State-Space Approach to Blind Separation of Speech

8 0.71441811 181 nips-2004-Synergies between Intrinsic and Synaptic Plasticity in Individual Model Neurons

9 0.65524787 135 nips-2004-On-Chip Compensation of Device-Mismatch Effects in Analog VLSI Neural Networks

10 0.64789146 28 nips-2004-Bayesian inference in spiking neurons

11 0.64147788 58 nips-2004-Edge of Chaos Computation in Mixed-Mode VLSI - A Hard Liquid

12 0.63704568 163 nips-2004-Semi-parametric Exponential Family PCA

13 0.63633126 22 nips-2004-An Investigation of Practical Approximate Nearest Neighbor Algorithms

14 0.63290417 131 nips-2004-Non-Local Manifold Tangent Learning

15 0.63257486 102 nips-2004-Learning first-order Markov models for control

16 0.63007265 116 nips-2004-Message Errors in Belief Propagation

17 0.63000166 204 nips-2004-Variational Minimax Estimation of Discrete Distributions under KL Loss

18 0.62605387 75 nips-2004-Heuristics for Ordering Cue Search in Decision Making

19 0.62353081 119 nips-2004-Mistake Bounds for Maximum Entropy Discrimination

20 0.61947584 70 nips-2004-Following Curved Regularized Optimization Solution Paths