nips nips2011 nips2011-231 nips2011-231-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Dominique Tschopp, Suhas Diggavi, Payam Delgosha, Soheil Mohajer
Abstract: This paper addresses the problem of finding the nearest neighbor (or one of the R-nearest neighbors) of a query object q in a database of n objects, when we can only use a comparison oracle. The comparison oracle, given two reference objects and a query object, returns the reference object most similar to the query object. The main problem we study is how to search the database for the nearest neighbor (NN) of a query, while minimizing the questions. The difficulty of this problem depends on properties of the underlying database. We show the importance of a characterization: combinatorial disorder D which defines approximate triangle n inequalities on ranks. We present a lower bound of Ω(D log D + D2 ) average number of questions in the search phase for any randomized algorithm, which demonstrates the fundamental role of D for worst case behavior. We develop 3 a randomized scheme for NN retrieval in O(D3 log2 n + D log2 n log log nD ) 3 questions. The learning requires asking O(nD3 log2 n + D log2 n log log nD ) questions and O(n log2 n/ log(2D)) bits to store.
[1] N. Goyal, Y. Lifshits, and H. Schutze, “Disorder inequality: A combinatorial approach to nearest neighbor search,” in WSDM, 2008, pp. 25–32.
[2] S. Dasgupta and Y. Freund, “Random projection trees and low dimensional manifolds,” in STOC, 2008, pp. 537–546. ´
[3] D. Tschopp, “Routing and search on large scale networks,” Ph.D. dissertation, Ecole Polytechnique F´ d´ rale de Lausanne (EPFL), 2010. e e
[4] D. Tschopp, S. Diggavi, P. Delgosha, and S. Mohajer, “Randomized algorithms for comparison-based search: Supplementary material,” 2011, submitted to NIPS as supplementary material.
[5] K. Clarkson, “Nearest-neighbor searching and metric space dimensions,” in Nearest-Neighbor Methods for Learning and Vision: Theory and Practice, G. Shakhnarovich, T. Darrell, and P. Indyk, Eds. MIT Press, 2006, pp. 15–59.
[6] Y. Rubner, C. Tomasi, and L. J. Guibas, “The earth mover’s distance as a metric for image retrieval,” International Journal of Computer Vision, vol. 40, no. 2, pp. 99–121, 2000.
[7] E. Demidenko, “Kolmogorov-smirnov test for image comparison,” in Computational Science and Its Applications - ICCSA, 2004, pp. 933–939.
[8] M. Nachtegael, S. Schulte, V. De Witte, T. Mlange, and E. Kerre, “Image similarity, from fuzzy sets to color image applications,” in Advances in Visual Information Systems, 2007, pp. 26–37.
[9] S. Santini and R. Jain, “Similarity measures,” IEEE transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 9, pp. 871–883, 1999.
[10] G. Chechik, V. Sharma, U. Shalit, and S. Bengio, “Large scale online learning of image similarity through ranking,” Journal of Machine Learning Research, vol. 11, pp. 1109–1135, 2010.
[11] A. Frome, Y. Singer, F. Sha, and J. Malik, “Learning globally-consistent local distance functions for shape-based image retrieval and classification,” in ICCV, 2007, pp. 1–8.
[12] Y. Lifshits and S. Zhang, “Combinatorial algorithms for nearest neighbors, near-duplicates and smallworld design,” in SODA, 2009, pp. 318–326.
[13] R. Krauthgamer and J. R. Lee, “Navigating nets: simple algorithms for proximity search,” in SODA, 2004, pp. 798–807.
[14] D. R. Karger and M. Ruhl, “Finding nearest neighbors in growth-restricted metrics,” in STOC, 2002, pp. 741–750.
[15] T. Cormen, C. Leiserson, R. Rivest, and C. Stein, “Introduction to algorithms,” MIT Press and McGrawHill Book Company, vol. 7, pp. 1162–1171, 1976.
[16] R. Motwani and P. Raghavan, Randomized Algorithms. 9 Cambridge University Press, 1995.