jmlr jmlr2009 jmlr2009-90 jmlr2009-90-reference knowledge-graph by maker-knowledge-mining

90 jmlr-2009-Structure Spaces


Source: pdf

Author: Brijnesh J. Jain, Klaus Obermayer

Abstract: Finite structures such as point patterns, strings, trees, and graphs occur as ”natural” representations of structured data in different application areas of machine learning. We develop the theory of structure spaces and derive geometrical and analytical concepts such as the angle between structures and the derivative of functions on structures. In particular, we show that the gradient of a differentiable structural function is a well-defined structure pointing in the direction of steepest ascent. Exploiting the properties of structure spaces, it will turn out that a number of problems in structural pattern recognition such as central clustering or learning in structured output spaces can be formulated as optimization problems with cost functions that are locally Lipschitz. Hence, methods from nonsmooth analysis are applicable to optimize those cost functions. Keywords: graphs, graph matching, learning in structured domains, nonsmooth optimization


reference text

B. Bonev, F. Escolano, M.A. Lozano, P. Suau, M.A. Cazorla, and W. Aguilar. Constellations and the unsupervised learning of graph. Lecture Notes In Computer Science, 4538:340, 2007. A. Caprara and G. Lancia. Structural alignment of large–size proteins via lagrangian relaxation. In Proceedings of the 6th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2002), pages 100–108, 2002. A. Caprara, R. Carr, S. Istrail, G. Lancia, and B. Walenz. 1001 optimal pdb structure alignments: Integer programming methods for finding the maximum contact map overlap. Journal of Computational Biology, 11(1):27–52, 2004. F. H. Clarke. Optimization and Nonsmooth Analysis. SIAM, 1990. T. F. Cox and M. A. A. Cox. Multidimensional Scaling. CRC Press, 2000. L. Dehaspe, H. Toivonen, and R.D. King. Finding frequent substructures in chemical compounds. In In Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 30–36, 1998. 2710 S TRUCTURE S PACES P. Dosch and E. Valveny. Report on the second symbol recognition contest. In GREC 2005, pages 381–397, 2006. M.A. Eshera and K.S. Fu. An image understanding system using attributed symbolic representation and inexact graph-matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8:604–618, 1986. T. G¨ rtner. A survey of kernels for structured data. ACM SIGKDD Explorations Newsletter, 5(1): a 49–58, 2003. T. G¨ rtner. Kernels for Structured Data. PhD thesis, University of Bonn, Germany, 2005. a S. Gold and A. Rangarajan. A graduated assignment algorithm for graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(4):377–388, 1996. S. Gold, A. Rangarajan, and E. Mjolsness. Learning with pre-knowledge: Clustering with point and graph-matching distance measures. Neural Computation, 8:787–804, 1996. L. Goldfarb. A new approach to pattern recognition. Progress in Pattern Recognition, 2:241–402, 1985. D. Goldman, S. Istrail, and C.H. Papadimitriou. Algorithmic aspects of protein structure similarity. In 40th Annual Symposium on Foundations of Computer Science, pages 512–521, 1999. T. Graepel and K. Obermayer. A self-organizing map for proximity data. Neural Computation, 11: 139–155, 1999. T. Graepel, R. Herbrich, P. Bollmann-Sdorra, and K. Obermayer. Classification on pairwise proximity data. In Advances in Neural Information Processing, volume 11, pages 438–444,, 1999. S. G¨ nter and H. Bunke. Self-organizing map for clustering in the graph domain. Pattern Recogniu tion Letters, 23:401–417, 2002. D. Gusfield. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997. J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. In In Proceedings of the ACM SIGMOD International Conference on Management of Database, pages 1–12, 2000. M. Hein, O. Bousquet, and B. Sch¨ lkopf. Maximal margin classification for metric spaces. Journal o of Computer and System Sciences, 71(3):333–359, 2005. R. Herbrich, T. Graepel, P. Bollmann-Sdorra, and K. Obermayer. Learning a preference relation for information retrieval. In Workshop Text Categorization and Machine Learning, International Conference on Machine Learning 1998, page 8084, 1998. D. Hochbaum and D. Shmoys. A best possible heuristic for the k-center problem. Mathematics of Operations Research, 10(2):180–184, 1985. S. Hochreiter and K. Obermayer. Kernel Methods in Computational Biology, chapter Gene Selection for Microarray Data, pages 319–356. MIT Press, Cambridge, Massachusetts, 2004. 2711 JAIN AND O BERMAYER S. Hochreiter and K. Obermayer. Support vector machines for dyadic data. Neural Computation, 18:1472–1510, 2006. T. Hofmann and J. M. Buhmann. Pairwise data clustering by deterministic annealing. IEEE Transaction on Pattern Analysis and Machine Intelligence, 19(1):1–14, 1997. L. Holm and C. Sander. Protein structure comparison by alignment of distance matrices. Journal of Molecular Biology, 233(1):123–38, 1993. A. Inokuchi, T. Washio, and H. Motoda. An apriori-based algorithm for mining frequent substructures from graph data. In In Proceedings of the 4th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD’00), pages 13–23, 2000. B.J. Jain and M. Lappe. Joining softassign and dynamic programming for the contact map overlap problem. In Proceedings of the 1st International Conference on Bioinformatics Research and Development (BIRD 2007), 2007. B.J. Jain and K. Obermayer. On the sample mean of graphs. In Proceedings of the International Joint Conference on Neural Networks (IJCNN 2008), 2008. B.J. Jain and K. Obermayer. Multiple alignment of contact maps. In International Joint Conference on Neural Networks (IJCNN 2009), 2009. B.J. Jain and F. Wysotzki. Central clustering of attributed graphs. Machine Learning, 56(1):169– 207, 2004. X. Jiang, A. M¨ nger, and H. Bunke. On median graphs: Properties, algorithms, and applications. u IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10):1144–1151, 2001. H Kashima, K Tsuda, and A Inokuchi. Marginalized kernels between labeled graphs. In Proceedings of the Twentieth International Conference on Machine Learning (ICML-2003), 2003. J. Kazius, R. McGuire, and R. Bursi. Derivation and validation of toxicophores for mutagenicity prediction. Journal of Medicinal Chemistry, 48(1):312–320, 2005. M. Kuramochi and G. Karypis. Frequent subgraph discovery. In In Proceedings of IEEE International Conference on Data Mining ICDM’01, pages 313–320, 2001. V. Levenshtein. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics-Doklady, 10:707–710, 1966. Huma Lodhi, Craig Saunders, John Shawe-Taylor, Nello Cristianini, and Chris Watkins. Text classification using string kernels. Journal of Machine Learning Research, 2:419–444, 2002. M. A. Lozano and F. Escolano. Em algorithm for clustering an ensemble of graphs with comb matching. In A. Rangarajan, M. Figueiredo, and J. Zerubia, editors, Energy Minimization Methods in Computer Vision and Pattern Recognition, EMMCVPR 2003, LNCS 2683, pages 52–67. Springer, 2003. B. Luo, R. C. Wilson, and E. R. Hancock. Spectral embedding of graphs. Pattern Recognition, 36 (10):2213–2230, 2003. 2712 S TRUCTURE S PACES M.M. M¨ kel¨ and P. Neittaanm¨ ki. Nonsmooth Optimization: Analysis and Algorithms with Applia a a cations to Optimal Control. World Scientific, 1992. H. Quang Minh and T. Hofmann. Learning over compact metric spaces. In Proceedings of the 17th Annual Conference on Learning Theory (COLT 2004), 2004. R. Myers, R.C. Wilson, and E.R. Hancock. Bayesian graph edit distance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(6):628–635, 2000. M. Neuhaus and H. Bunke. A graph matching based approach to fingerprint classification using directional variance. In AVBPA 2005, pages 191–200, 2005. E. Pekalska, P. Paclik, and R.P.W. Duin. A generalized kernel approach to dissimilarity-based classification. Journal of Machine Learning Research, 2:175–211, 2001. N. Ratliff, J. A. Bagnell, and M. Zinkevich. Subgradient methods for maximum margin structured learning. In In Proceddings of the 23rd International Conference on Machine Learning (ICML’06), Workshop on Learning in Structured Output Spaces, 2006. J.W. Raymond and P. Willett. Maximum common subgraph isomorphism algorithms for the matching of chemical structure. Journal of Computer-Aided Molecular Design, 16:521–533, 2002. K. Riesen and H. Bunke. IAM graph database repository for graph based pattern recognition and machine learning. In SSPR 2008, pages 287–297, 2008. A. Robles-Kelly and E.R. Hancock. Graph edit distance from spectral seriation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(3):365– 378, 2005. A. Sanfeliu and K.S. Fu. A distance measure between attributed relational graphs for pattern recognition. IEEE Transactions on Systems, Man, and Cybernetics, 13:353–362, 1983. I. J. Schoenberg. On certain metric spaces arising from euclidean spaces by a change of metric and their imbedding in hilbert space. Annals of Mathematics, 38(4):787–793, 1937. L. G. Shapiro and R.M. Haralick. A metric for comparing relational descriptions. IEEE Transaction on Pattern Analysis and Machine Intelligence, 7:90–94, 1985. D. Shasha and K. Zhang. Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal of Computing, 18:245–262, 1989. N.Z. Shor. Minimization Methods for non-differentiable Functions. Springer, 1985. R. Sinkhorn. A relationship between arbitrary positive matrices and doubly stochastic matrices. Annals of Mathematical Statistics, 35(2):876 – 879, 1964. B. Taskar. Learning Structured Prediction Models: A Large Margin Approach. PhD thesis, Stanford University, 2004. S. Theodoridis and K. Koutroumbas. Pattern Recognition. Elsevier, 4 edition, 2009. 2713 JAIN AND O BERMAYER I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support vector machine learning for interdependent and structured output spaces. In In Proceedings of the 21st International Conference on Machine Learning (ICML’04), 2004. U. von Luxburg and O. Bousquet. Distance-based classification with Lipschitz functions. Journal of Machine Learning Research, 5:669–665, 2004. C. Watson and C. Wilson. NIST Special Database 4, Fingerprint Database. National Institute of Standards and Technology, 1992. W. Xie and N.V. Sahinidis. A reduction-based exact algorithm for the contact map overlap problem. Journal of Computational Biology, 14(5):637–654, 2007. X. Yan and J. Han. gSpan: Graph-based substructure pattern mining. In Second IEEE International Conference on Data Mining (ICDM’02), pages 721–724, 2002. K. Zhang. A constrained edit distance between unordered labeled trees. Algorithmica, 15:205–222, 1996. 2714