Author: Yair Weiss, Antonio Torralba, Rob Fergus
Abstract: Semantic hashing [1] seeks compact binary codes of data-points so that the Hamming distance between codewords correlates with semantic similarity. In this paper, we show that the problem of finding the best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP-hard. By relaxing the original problem, we obtain a spectral method whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian. By utilizing recent results on the convergence of graph Laplacian eigenvectors to the Laplace-Beltrami eigenfunctions of manifolds, we show how to efficiently calculate the code of a novel datapoint. Taken together, both learning the code and applying it to a novel point are extremely simple. Our experiments show that our codes outperform the state-of-the-art.
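The spectral relaxation described in the abstract can be illustrated with a minimal sketch: build a Gaussian affinity graph over the training points, form the graph Laplacian, and threshold its smallest nontrivial eigenvectors at zero to obtain binary codes. This is an assumption-laden toy version for in-sample points only; the paper's full method additionally uses Laplace-Beltrami eigenfunctions to encode novel datapoints out of sample, which this sketch does not implement.

```python
import numpy as np

def spectral_codes(X, n_bits, sigma=1.0):
    """Toy in-sample sketch of spectral binary codes: threshold the
    smallest nontrivial eigenvectors of the unnormalized graph Laplacian.
    Function name and sigma parameter are illustrative, not from the paper.
    """
    # Pairwise squared distances and Gaussian affinity matrix W
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    D = np.diag(W.sum(axis=1))
    L = D - W  # unnormalized graph Laplacian

    # Eigenvectors sorted by ascending eigenvalue; skip the trivial
    # (near-constant) eigenvector with eigenvalue ~0.
    _, vecs = np.linalg.eigh(L)
    codes = vecs[:, 1:n_bits + 1] > 0
    return codes.astype(np.uint8)

X = np.random.RandomState(0).randn(50, 5)
B = spectral_codes(X, n_bits=8)
print(B.shape)  # (50, 8)
```

Nearby points in the affinity graph tend to fall on the same side of each eigenvector's zero crossing, so their codes agree in most bits, which is exactly the Hamming-distance/similarity correlation the abstract asks for.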
[1] R. R. Salakhutdinov and G. E. Hinton. Learning a nonlinear embedding by preserving class neighbourhood structure. In AISTATS, 2007.
[2] A. Torralba, R. Fergus, and W. T. Freeman. Tiny images. Technical Report MIT-CSAIL-TR-2007-024, Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 2007.
[3] A. Torralba, R. Fergus, and Y. Weiss. Small codes and large databases for recognition. In CVPR, 2008.
[4] James Hays and Alexei A. Efros. Scene completion using millions of photographs. ACM Transactions on Graphics (SIGGRAPH 2007), 26(3), 2007.
[5] R. R. Salakhutdinov and G. E. Hinton. Semantic hashing. In SIGIR workshop on Information Retrieval and applications of Graphical Models, 2007.
[6] Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In NIPS, pages 585–591, 2001.
[7] Geoffrey E. Hinton and Sam T. Roweis. Stochastic neighbor embedding. In NIPS, pages 833–840, 2002.
[8] A. Andoni and P. Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In FOCS, pages 459–468, 2006.
[9] R. R. Salakhutdinov and G. E. Hinton. Learning a nonlinear embedding by preserving class neighbourhood structure. In AI and Statistics, 2007.
[10] G. Shakhnarovich, P. Viola, and T. Darrell. Fast pose estimation with parameter sensitive hashing. In ICCV, 2003.
[11] A. Y. Ng, M. I. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In Advances in Neural Information Processing Systems 14, 2001.
[12] J. Shi and J. Malik. Normalized cuts and image segmentation. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, pages 731–737, 1997.
[13] Yoshua Bengio, Olivier Delalleau, Nicolas Le Roux, Jean-François Paiement, Pascal Vincent, and Marie Ouimet. Learning eigenfunctions links spectral embedding and kernel PCA. Neural Computation, 16(10):2197–2219, 2004.
[14] Charless Fowlkes, Serge Belongie, Fan R. K. Chung, and Jitendra Malik. Spectral grouping using the Nyström method. IEEE Trans. Pattern Anal. Mach. Intell., 26(2):214–225, 2004.
[15] R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker. Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps. Proceedings of the National Academy of Sciences, 102(21):7426–7431, May 2005.
[16] M. Belkin and P. Niyogi. Towards a theoretical foundation for Laplacian-based manifold methods. Journal of Computer and System Sciences, 2007.
[17] Boaz Nadler, Stéphane Lafon, Ronald R. Coifman, and Ioannis G. Kevrekidis. Diffusion maps, spectral clustering and reaction coordinates of dynamical systems. Arxiv, 2008. http://arxiv.org/.