jmlr jmlr2013 jmlr2013-78 jmlr2013-78-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Dana Angluin, James Aspnes, Sarah Eisenstat, Aryeh Kontorovich
Abstract: PAC learning of unrestricted regular languages is long known to be a difficult problem. The class of shuffle ideals is a very restricted subclass of regular languages, where the shuffle ideal generated by a string u is the collection of all strings containing u as a subsequence. This fundamental language family is of theoretical interest in its own right and provides the building blocks for other important language families. Despite its apparent simplicity, the class of shuffle ideals appears quite difficult to learn. In particular, just as for unrestricted regular languages, the class is not properly PAC learnable in polynomial time if RP = NP, and PAC learning the class improperly in polynomial time would imply polynomial time algorithms for certain fundamental problems in cryptography. In the positive direction, we give an efficient algorithm for properly learning shuffle ideals in the statistical query (and therefore also PAC) model under the uniform distribution. Keywords: PAC learning, statistical queries, regular languages, deterministic finite automata, shuffle ideals, subsequences
Dana Angluin. On the complexity of minimum inference of regular sets. Information and Control, 39(3):337–350, 1978. ISSN 0019-9958. doi: 10.1016/S0019-9958(78)90683-6. Dana Angluin. Inference of reversible languages. Journal of the ACM, 29(3):741–765, July 1982. ISSN 0004-5411. doi: 10.1145/322326.322334. Dana Angluin. Learning regular sets from queries and counterexamples. Information and Computation, 75(2):87–106, November 1987. ISSN 0890-5401. doi: 10.1016/0890-5401(87)90052-6. Dana Angluin, James Aspnes, and Aryeh Kontorovich. On the learnability of shuffle ideals. In Proceedings of the 23rd International Conference on Algorithmic Learning Theory, ALT ’12, pages 111–123, Berlin, Heidelberg, 2012. Springer-Verlag. ISBN 978-3-642-34105-2. doi: 10.1007/978-3-642-34106-9 12. Nader H. Bshouty. Exact learning of formulas in parallel. Machine Learning, 26(1):25–41, January 1997. ISSN 0885-6125. doi: 10.1023/A:1007320031970. Nader H. Bshouty, Jeffrey C. Jackson, and Christino Tamon. Exploring learnability between exact and PAC. Journal of Computer and System Sciences, 70(4):471–484, June 2005. ISSN 00220000. doi: 10.1016/j.jcss.2004.10.002. Alexander Clark and Franck Thollard. PAC-learnability of probabilistic deterministic finite state automata. Journal of Machine Learning Research, 5:473–497, December 2004. ISSN 15324435. 1528 O N THE L EARNABILITY OF S HUFFLE I DEALS Corinna Cortes, Leonid (Aryeh) Kontorovich, and Mehryar Mohri. Learning languages with rational kernels. In Proceedings of the 20th Annual Conference on Learning Theory, COLT ’07, pages 349–364, Berlin, Heidelberg, 2007. Springer-Verlag. ISBN 978-3-540-72925-9. Colin de la Higuera. A bibliographical study of grammatical inference. Pattern Recognition, 38(9): 1332–1348, September 2005. ISSN 0031-3203. doi: 10.1016/j.patcog.2005.01.003. Samuel Eilenberg and Saunders Mac Lane. On the groups of H(Π, n), I. Annals of Mathematics. Second Series, 58:55–106, July 1953. ISSN 0003-486X. Funda Erg¨ n, S. Ravi Kumar, and Ronitt Rubinfeld. On learning bounded-width branching prou grams. In Proceedings of the Eighth Annual Conference on Computational Learning Theory, COLT ’95, pages 361–368, New York, NY, USA, 1995. ACM. ISBN 0-89791-723-5. doi: 10.1145/225298.225342. E. Mark Gold. Complexity of automaton identification from given data. Information and Control, 37(3):302–320, 1978. Yoshiyasu Ishigami and Sei’ichi Tani. VC-dimensions of finite automata and commutative finite automata with k letters and n states. Discrete Applied Mathematics, 74(3):229–240, May 1997. ISSN 0166-218X. doi: 10.1016/S0166-218X(96)00050-9. Michael Kearns. Efficient noise-tolerant learning from statistical queries. Journal of the ACM, 45 (6):983–1006, November 1998. ISSN 0004-5411. doi: 10.1145/293347.293351. Michael Kearns and Leslie Valiant. Cryptographic limitations on learning Boolean formulae and finite automata. Journal of the ACM, 41(1):67–95, January 1994. ISSN 0004-5411. doi: 10.1145/174644.174647. Michael J. Kearns and Umesh V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, Cambridge, MA, USA, 1994. ISBN 0-262-11193-4. Ondˇej Kl´ma and Libor Pol´ k. Hierarchies of piecewise testable languages. In Proceedings of the r ı a 12th International Conference on Developments in Language Theory, DLT ’08, pages 479–490, Berlin, Heidelberg, 2008. Springer-Verlag. ISBN 978-3-540-85779-2. doi: 10.1007/978-3-54085780-8 38. Leonid (Aryeh) Kontorovich and Boaz Nadler. Universal kernel-based learning with applications to regular languages. Journal of Machine Learning Research, 10:1095–1129, June 2009. ISSN 1532-4435. Leonid (Aryeh) Kontorovich, Dana Ron, and Yoram Singer. A Markov model for the acquisition of morphological structure. Technical Report CMU-CS-03-147, June 2003. Leonid (Aryeh) Kontorovich, Corinna Cortes, and Mehryar Mohri. Learning linearly separable languages. In Proceedings of the 17th International Conference on Algorithmic Learning Theory, ALT ’06, pages 288–303, Berlin, Heidelberg, 2006. Springer-Verlag. ISBN 3-540-46649-5, 9783-540-46649-9. doi: 10.1007/11894841 24. 1529 A NGLUIN , A SPNES , E ISENSTAT AND KONTOROVICH Leonid (Aryeh) Kontorovich, Corinna Cortes, and Mehryar Mohri. Kernel methods for learning languages. Theoretical Computer Science, 405(3):223–236, October 2008. ISSN 0304-3975. doi: 10.1016/j.tcs.2008.06.037. Kimmo Koskenniemi. Two-level model for morphological analysis. In Proceedings of the Eighth International Joint Conference on Artificial Intelligence - Volume 2, IJCAI ’83, pages 683–685, San Francisco, CA, USA, 1983. Morgan Kaufmann Publishers Inc. M. Lothaire. Combinatorics on Words, volume 17 of Encyclopedia of Mathematics and Its Applications. Addison-Wesley, 1983. Mehryar Mohri. On some applications of finite-state automata theory to natural language processing. Natural Language Engineering, 2(1):61–80, March 1996. ISSN 1351-3249. doi: 10.1017/S135132499600126X. Mehryar Mohri. Finite-state transducers in language and speech processing. Computational Linguistics, 23(2):269–311, June 1997. ISSN 0891-2017. Mehryar Mohri, Fernando Pereira, and Michael Riley. Weighted finite-state transducers in speech recognition. Computer Speech & Language, 16(1):69–88, 2002. ISSN 0885-2308. doi: 10.1006/csla.2001.0184. Mehryar Mohri, Pedro J. Moreno, and Eugene Weinstein. Efficient and robust music identification with weighted finite-state transducers. IEEE Transactions on Audio, Speech & Language Processing, 18(1):197–207, January 2010. ISSN 1063-6676. doi: 10.1109/TASL.2009.2023170. Jos´ Oncina and Pedro Garc´a. Identifying regular languages in polynomial time. In Advances e ı in Structural and Syntactic Pattern Recognition, volume 5 of Machine Perception and Artificial Intelligence, pages 99–108. World Scientific Publishing, 1992. Nick Palmer and Paul W. Goldberg. PAC-learnability of probabilistic deterministic finite state automata in terms of variation distance. Theoretical Compututer Science, 387(1):18–31, November 2007. ISSN 0304-3975. doi: 10.1016/j.tcs.2007.07.023. Rajesh Parekh and Vasant G. Honavar. Learning DFA from simple examples. Machine Learning, 44(1-2):9–35, July 2001. ISSN 0885-6125. doi: 10.1023/A:1010822518073. Gheorghe P˘ un, editor. Mathematical Aspects of Natural and Formal Languages. World Scientific a Publishing Co., Inc., River Edge, NJ, USA, 1994. ISBN 9-8102-1914-8. Leonard Pitt and Manfred K. Warmuth. Prediction-preserving reducibility. Journal of Computer and System Sciences, 41(3):430–467, December 1990. ISSN 0022-0000. doi: 10.1016/00220000(90)90028-J. Leonard Pitt and Manfred K. Warmuth. The minimum consistent DFA problem cannot be approximated within any polynomial. Journal of the ACM, 40(1):95–142, January 1993. ISSN 0004-5411. doi: 10.1145/138027.138042. 1530 O N THE L EARNABILITY OF S HUFFLE I DEALS Owen Rambow, Srinivas Bangalore, Tahir Butt, Alexis Nasr, and Richard Sproat. Creating a finitestate parser with application semantics. In Proceedings of the 19th International Conference on Computational Linguistics - Volume 2, COLING ’02, pages 1–5, Stroudsburg, PA, USA, 2002. Association for Computational Linguistics. doi: 10.3115/1071884.1071910. Dana Ron, Yoram Singer, and Naftali Tishby. On the learnability and usage of acyclic probabilistic finite automata. Journal of Computer and System Sciences, 56(2):133–152, 1998. ISSN 00220000. doi: 10.1006/jcss.1997.1555. Imre Simon. Piecewise testable events. In Proceedings of the 2nd GI Conference on Automata Theory and Formal Languages, pages 214–222, London, UK, 1975. Springer-Verlag. ISBN 3540-07407-4. Richard Sproat, William Gale, Chilin Shih, and Nancy Chang. A stochastic finite-state wordsegmentation algorithm for Chinese. Computational Linguistics, 22(3):377–404, September 1996. ISSN 0891-2017. Leslie G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, November 1984. ISSN 0001-0782. 1531