
NIPS 2013: Sign Cauchy Projections and Chi-Square Kernel



Author: Ping Li, Gennady Samorodnitsky, John Hopcroft

Abstract: The method of stable random projections is useful for efficiently approximating the lα distance (0 < α ≤ 2) in high dimensions and is naturally suited to data streams. In this paper, we propose to use only the signs of the projected data, and we analyze the probability of collision (i.e., when the two signs differ). Interestingly, when α = 1 (i.e., Cauchy random projections), we show that the probability of collision can be accurately approximated as a function of the chi-square (χ2) similarity. In text and vision applications, the χ2 similarity is a popular measure when the features are generated from histograms (which are a typical example of data streams). Experiments confirm that the proposed method is promising for large-scale learning applications. The full paper is available at arXiv:1308.1009. There are many future research problems. For example, when α → 0, the collision probability is a function of the resemblance (of the binary-quantized data). This provides an effective mechanism for resemblance estimation in data streams.
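The abstract's pipeline can be illustrated in a short sketch: project two nonnegative histograms with shared standard Cauchy (α = 1 stable) random vectors, keep only the signs, and compare the empirical sign-collision fraction against an approximation based on the χ2 similarity ρχ2 = Σi 2uivi/(ui + vi). This is a minimal illustration, not the paper's reference implementation; the arccos-based formula below is one of the approximations the paper proposes, and the toy vectors `u`, `v` are made-up examples.

```python
import numpy as np

def sign_cauchy_collision(u, v, k=50000, seed=0):
    """Empirical collision probability of sign Cauchy projections.

    u, v: nonnegative feature vectors (e.g., normalized histograms).
    Projects both with the same k standard-Cauchy random vectors and
    returns the fraction of projections whose signs differ.
    """
    rng = np.random.default_rng(seed)
    R = rng.standard_cauchy(size=(k, len(u)))  # alpha = 1 stable entries
    return float(np.mean(np.sign(R @ u) != np.sign(R @ v)))

def chi2_similarity(u, v):
    """Chi-square similarity: rho = sum_i 2 u_i v_i / (u_i + v_i)."""
    num = 2.0 * u * v
    den = u + v
    mask = den > 0  # skip coordinates where both entries are zero
    return float(np.sum(num[mask] / den[mask]))

# Two toy histograms, normalized to sum to 1 (data-stream-style features).
u = np.array([0.3, 0.2, 0.1, 0.4])
v = np.array([0.25, 0.25, 0.25, 0.25])

rho = chi2_similarity(u, v)
approx = np.arccos(rho) / np.pi        # chi-square-based approximation
empirical = sign_cauchy_collision(u, v)
print(f"rho_chi2 = {rho:.4f}, approx P = {approx:.4f}, empirical P = {empirical:.4f}")
```

With enough projections the empirical collision fraction should track the χ2-based approximation closely; identical inputs give a collision probability of exactly zero, since the signs then agree on every projection.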


References

[1] http://googleresearch.blogspot.com/2010/04/lessons-learned-developing-practical.html.

[2] Bogdan Alexe, Thomas Deselaers, and Vittorio Ferrari. What is an object? In CVPR, pages 73–80, 2010.

[3] Leon Bottou. http://leon.bottou.org/projects/sgd.

[4] Olivier Chapelle, Patrick Haffner, and Vladimir N. Vapnik. Support vector machines for histogram-based image classification. IEEE Trans. Neural Networks, 10(5):1055–1064, 1999.

[5] Moses S. Charikar. Similarity estimation techniques from rounding algorithms. In STOC, 2002.

[6] Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871–1874, 2008.

[7] Yoav Freund, Sanjoy Dasgupta, Mayank Kabra, and Nakul Verma. Learning the structure of manifolds using random projections. In NIPS, Vancouver, BC, Canada, 2008.

[8] Jerome H. Friedman, F. Baskett, and L. Shustek. An algorithm for finding nearest neighbors. IEEE Transactions on Computers, 24:1000–1006, 1975.

[9] Michel X. Goemans and David P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM, 42(6):1115–1145, 1995.

[10] Matthias Hein and Olivier Bousquet. Hilbertian metrics and positive definite kernels on probability measures. In AISTATS, pages 136–143, Barbados, 2005.

[11] Piotr Indyk. Stable distributions, pseudorandom generators, embeddings, and data stream computation. Journal of the ACM, 53(3):307–323, 2006.

[12] Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In STOC, pages 604–613, Dallas, TX, 1998.

[13] Yu-Gang Jiang, Chong-Wah Ngo, and Jun Yang. Towards optimal bag-of-features for object categorization and semantic video retrieval. In CIVR, pages 494–501, Amsterdam, Netherlands, 2007.

[14] Thorsten Joachims. Training linear SVMs in linear time. In KDD, pages 217–226, Pittsburgh, PA, 2006.

[15] Fuxin Li, Guy Lebanon, and Cristian Sminchisescu. A linear approximation to the χ2 kernel with geometric convergence. Technical report, arXiv:1206.4074, 2013.

[16] Ping Li. Very sparse stable random projections for dimension reduction in lα (0 < α ≤ 2) norm. In KDD, San Jose, CA, 2007.

[17] Ping Li. Estimators and tail bounds for dimension reduction in lα (0 < α ≤ 2) using stable random projections. In SODA, pages 10 – 19, San Francisco, CA, 2008.

[18] Ping Li. Improving compressed counting. In UAI, Montreal, CA, 2009.

[19] Ping Li, Michael Mitzenmacher, and Anshumali Shrivastava. Coding for random projections. 2013.

[20] Ping Li, Art B Owen, and Cun-Hui Zhang. One permutation hashing. In NIPS, Lake Tahoe, NV, 2012.

[21] Ping Li, Cun-Hui Zhang, and Tong Zhang. Compressed counting meets compressed sensing. 2013.

[22] S. Muthukrishnan. Data streams: Algorithms and applications. Foundations and Trends in Theoretical Computer Science, 1(2):117–236, 2005.

[23] Noam Nisan. Pseudorandom generators for space-bounded computations. In STOC, 1990.

[24] Gennady Samorodnitsky and Murad S. Taqqu. Stable Non-Gaussian Random Processes. Chapman & Hall, New York, 1994.

[25] Shai Shalev-Shwartz, Yoram Singer, and Nathan Srebro. Pegasos: Primal estimated sub-gradient solver for SVM. In ICML, pages 807–814, Corvallis, Oregon, 2007.

[26] Andrea Vedaldi and Andrew Zisserman. Efficient additive kernels via explicit feature maps. IEEE Trans. Pattern Anal. Mach. Intell., 34(3):480–492, 2012.

[27] Sreekanth Vempati, Andrea Vedaldi, Andrew Zisserman, and C. V. Jawahar. Generalized RBF feature maps for efficient detection. In BMVC, pages 1–11, Aberystwyth, UK, 2010.

[28] Gang Wang, Derek Hoiem, and David A. Forsyth. Building text features for object image classification. In CVPR, pages 1367–1374, Miami, Florida, 2009.

[29] Jinjun Wang, Jianchao Yang, Kai Yu, Fengjun Lv, Thomas S. Huang, and Yihong Gong. Localityconstrained linear coding for image classification. In CVPR, pages 3360–3367, San Francisco, CA, 2010.

[30] Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, and Josh Attenberg. Feature hashing for large scale multitask learning. In ICML, pages 1113–1120, 2009.

[31] Haiquan (Chuck) Zhao, Nan Hua, Ashwin Lall, Ping Li, Jia Wang, and Jun Xu. Towards a universal sketch for origin-destination network measurements. In Network and Parallel Computing, pages 201–213, 2011.