Author: Yuxuan Wang, Deliang Wang
Abstract: While human listeners excel at selectively attending to a conversation at a cocktail party, machine performance is still far inferior by comparison. We show that the cocktail party problem, or the speech separation problem, can be effectively approached via structured prediction. To account for temporal dynamics in speech, we employ conditional random fields (CRFs) to classify speech dominance within each time-frequency unit of a sound mixture. To capture the complex, nonlinear relationship between input and output, both state and transition feature functions in CRFs are learned by deep neural networks. The formulation of the problem as classification allows us to directly optimize a measure that is well correlated with human speech intelligibility. The proposed system substantially outperforms existing ones in a variety of noises.
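To make "classifying speech dominance within each time-frequency unit" concrete, the sketch below builds the kind of binary target such a classifier could be trained on. It assumes an ideal-binary-mask style rule (a unit is speech-dominant when its local SNR exceeds a threshold); the abstract does not state the labeling rule, so the function name `ideal_binary_mask`, the threshold parameter `lc_db`, and the toy energies are all hypothetical. It does not implement the CRF or the deep neural networks.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Label each time-frequency unit 1 if speech dominates, else 0.

    speech_energy, noise_energy: nested lists of non-negative energies with
    the same shape (assumed layout: rows are frequency channels, columns are
    time frames).
    lc_db: local SNR threshold in dB (hypothetical parameter name).
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                # No speech energy at all: the unit cannot be speech-dominant.
                row.append(0)
            elif n <= 0.0:
                # Speech present and no noise: the unit is speech-dominant.
                row.append(1)
            else:
                snr_db = 10.0 * math.log10(s / n)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


# Toy premixed energies: 2 frequency channels x 2 time frames.
speech = [[4.0, 0.5], [1.0, 9.0]]
noise = [[1.0, 2.0], [1.0, 3.0]]
mask = ideal_binary_mask(speech, noise)
print(mask)
```

A label grid like this treats every unit independently; the structured-prediction view in the abstract instead models dependencies between neighboring units over time.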
[(0, 0.092), (21, 0.027), (38, 0.085), (41, 0.298), (42, 0.032), (44, 0.018), (54, 0.027), (55, 0.024), (74, 0.032), (76, 0.077), (80, 0.107), (87, 0.012), (92, 0.054)]
simIndex simValue paperId paperTitle
same-paper 1 0.75011718 72 nips-2012-Cocktail Party Processing via Structured Prediction
2 0.73185641 345 nips-2012-Topic-Partitioned Multinetwork Embeddings
Author: Peter Krafft, Juston Moore, Bruce Desmarais, Hanna M. Wallach
Abstract: We introduce a new Bayesian admixture model intended for exploratory analysis of communication networks, specifically the discovery and visualization of topic-specific subnetworks in email data sets. Our model produces principled visualizations of email networks, i.e., visualizations that have precise mathematical interpretations in terms of our model and its relationship to the observed data. We validate our modeling assumptions by demonstrating that our model achieves better link prediction performance than three state-of-the-art network models and exhibits topic coherence comparable to that of latent Dirichlet allocation. We showcase our model’s ability to discover and visualize topic-specific communication patterns using a new email data set: the New Hanover County email network. We provide an extensive analysis of these communication patterns, leading us to recommend our model for any exploratory analysis of email networks or other similarly structured communication data. Finally, we advocate for principled visualization as a primary objective in the development of new network models.
3 0.54452741 280 nips-2012-Proper losses for learning from partial labels
Author: Jesús Cid-Sueiro
Abstract: This paper discusses the problem of calibrating posterior class probabilities from partially labelled data. Each instance is assumed to be labelled as belonging to one of several candidate categories, at most one of them being true. We generalize the concept of proper loss to this scenario, establish a necessary and sufficient condition for a loss function to be proper, and show a direct procedure to construct a proper loss for partial labels from a conventional proper loss. The problem can be characterized by the mixing probability matrix relating the true class of the data and the observed labels. Full knowledge of this matrix is not required, and losses can be constructed that are proper for a wide set of mixing probability matrices.
4 0.51982194 191 nips-2012-Learning the Architecture of Sum-Product Networks Using Clustering on Variables
Author: Aaron Dennis, Dan Ventura
Abstract: The sum-product network (SPN) is a recently proposed deep model consisting of a network of sum and product nodes, and has been shown to be competitive with state-of-the-art deep models on certain difficult tasks such as image completion. Designing an SPN architecture that is suitable for the task at hand is an open question. We propose an algorithm for learning the SPN architecture from data. The idea is to cluster variables (as opposed to data instances) in order to identify variable subsets that strongly interact with one another. Nodes in the SPN are then allocated towards explaining these interactions. Experimental evidence shows that learning the SPN architecture significantly improves its performance compared to using a previously proposed static architecture.
5 0.51605755 192 nips-2012-Learning the Dependency Structure of Latent Factors
Author: Yunlong He, Yanjun Qi, Koray Kavukcuoglu, Haesun Park
Abstract: In this paper, we study latent factor models with dependency structure in the latent space. We propose a general learning framework which induces sparsity on the undirected graphical model imposed on the vector of latent factors. A novel latent factor model, SLFA, is then proposed as a matrix factorization problem with a special regularization term that encourages collaborative reconstruction. The main benefit (novelty) of the model is that we can simultaneously learn the lower-dimensional representation for data and model the pairwise relationships between latent factors explicitly. An on-line learning algorithm is devised to make the model feasible for large-scale learning problems. Experimental results on two synthetic and two real-world data sets demonstrate that pairwise relationships and latent factors learned by our model provide a more structured way of exploring high-dimensional data, and the learned representations achieve state-of-the-art classification performance.
6 0.50186574 342 nips-2012-The variational hierarchical EM algorithm for clustering hidden Markov models
7 0.50033814 233 nips-2012-Multiresolution Gaussian Processes
8 0.50007951 270 nips-2012-Phoneme Classification using Constrained Variational Gaussian Process Dynamical System
9 0.49942887 65 nips-2012-Cardinality Restricted Boltzmann Machines
10 0.49930447 104 nips-2012-Dual-Space Analysis of the Sparse Linear Model
11 0.49886811 77 nips-2012-Complex Inference in Neural Circuits with Probabilistic Population Codes and Topic Models
12 0.49792489 355 nips-2012-Truncation-free Online Variational Inference for Bayesian Nonparametric Models
13 0.4977102 200 nips-2012-Local Supervised Learning through Space Partitioning
14 0.49749586 172 nips-2012-Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs
15 0.49635652 197 nips-2012-Learning with Recursive Perceptual Representations
16 0.49498212 251 nips-2012-On Lifting the Gibbs Sampling Algorithm
17 0.49330455 229 nips-2012-Multimodal Learning with Deep Boltzmann Machines
18 0.49300578 124 nips-2012-Factorial LDA: Sparse Multi-Dimensional Text Models
19 0.49139935 316 nips-2012-Small-Variance Asymptotics for Exponential Family Dirichlet Process Mixture Models
20 0.4905929 19 nips-2012-A Spectral Algorithm for Latent Dirichlet Allocation