Paper: nips2007-152 (NIPS 2007)
Source: pdf
Author: Kaihua Zhu, Hao Wang, Hongjie Bai, Jian Li, Zhihuan Qiu, Hang Cui, Edward Y. Chang
Abstract: Support Vector Machines (SVMs) suffer from a widely recognized scalability problem in both memory use and computational time. To improve scalability, we have developed a parallel SVM algorithm (PSVM), which reduces memory use by performing a row-based, approximate matrix factorization, and which loads only essential data onto each machine to perform parallel computation. Let n denote the number of training instances, p the reduced matrix dimension after factorization (p is significantly smaller than n), and m the number of machines. PSVM reduces the memory requirement from O(n²) to O(np/m), and improves computation time to O(np²/m). Empirical study shows PSVM to be effective. PSVM Open Source is available for download at http://code.google.com/p/psvm/.
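The factorization described in the abstract replaces the dense n×n kernel matrix with an approximate n×p factor (a row-based incomplete Cholesky factorization in the paper, so K ≈ HHᵀ), and each of the m machines holds only its share of the factor's rows. The sketch below is a rough illustration of the resulting memory arithmetic, not the released PSVM code; the contiguous row partitioning and 8-byte double-precision storage are assumptions made for illustration.

```python
# A minimal sketch (not the released PSVM code) of the memory arithmetic
# behind the O(n^2) -> O(np/m) reduction. The contiguous row partitioning
# and 8-byte double-precision storage are illustrative assumptions.

def partition_rows(n, m, machine_id):
    """Contiguous block of factor rows assigned to one of m machines."""
    rows_per_machine = (n + m - 1) // m    # ceil(n / m)
    start = machine_id * rows_per_machine
    end = min(start + rows_per_machine, n)
    return start, end

n, p, m = 100_000, 500, 50                 # instances, reduced dimension, machines

full_kernel_bytes = n * n * 8              # dense n x n kernel matrix: O(n^2)
start, end = partition_rows(n, m, machine_id=0)
per_machine_bytes = (end - start) * p * 8  # one machine's rows of the n x p factor: O(np/m)

print(f"dense kernel matrix: {full_kernel_bytes / 1e9:.1f} GB")
print(f"factor rows on one machine: {per_machine_bytes / 1e6:.1f} MB")
```

With these example sizes the dense kernel matrix would need 80 GB, while each machine stores only 8 MB of the factor, which is the gap the row-based distribution is designed to exploit.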
Bach, F. R., & Jordan, M. I. (2005). Predictive low-rank decomposition for kernel methods. Proceedings of the 22nd International Conference on Machine Learning.
Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge University Press.
Chang, C.-C., & Lin, C.-J. (2001). LIBSVM: A library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
Chu, C.-T., Kim, S. K., Lin, Y.-A., Yu, Y., Bradski, G., Ng, A. Y., & Olukotun, K. (2006). Map-Reduce for machine learning on multicore. NIPS.
Dean, J., & Ghemawat, S. (2004). MapReduce: Simplified data processing on large clusters. OSDI'04: Symposium on Operating System Design and Implementation.
Fine, S., & Scheinberg, K. (2001). Efficient SVM training using low-rank kernel representations. Journal of Machine Learning Research, 2, 243–264.
Ghemawat, S., Gobioff, H., & Leung, S.-T. (2003). The Google file system. 19th ACM Symposium on Operating Systems Principles.
Golub, G. H., & Van Loan, C. F. (1996). Matrix computations. Johns Hopkins University Press.
Graf, H. P., Cosatto, E., Bottou, L., Dourdanovic, I., & Vapnik, V. (2005). Parallel support vector machines: The Cascade SVM. In Advances in Neural Information Processing Systems 17, 521–528.
Joachims, T. (1998). Making large-scale SVM learning practical. Advances in Kernel Methods - Support Vector Learning.
Joachims, T. (2006). Training linear SVMs in linear time. ACM KDD, 217–226.
Lee, Y.-J., & Mangasarian, O. L. (2001). RSVM: Reduced support vector machines. First SIAM International Conference on Data Mining, Chicago.
Loosli, G., & Canu, S. (2006). Comments on the core vector machines: Fast SVM training on very large data sets (Technical Report).
Mehrotra, S. (1992). On the implementation of a primal-dual interior point method. SIAM Journal on Optimization, 2.
Platt, J. (1998). Sequential minimal optimization: A fast algorithm for training support vector machines (Technical Report MSR-TR-98-14). Microsoft Research.
Tsang, I. W., Kwok, J. T., & Cheung, P.-M. (2005). Core vector machines: Fast SVM training on very large data sets. Journal of Machine Learning Research, 6, 363–392.
Vapnik, V. (1995). The nature of statistical learning theory. New York: Springer.
Vishwanathan, S., Smola, A. J., & Murty, M. N. (2003). SimpleSVM. ICML.