jmlr jmlr2013 jmlr2013-78 jmlr2013-78-reference knowledge-graph by maker-knowledge-mining

78 jmlr-2013-On the Learnability of Shuffle Ideals

Source: pdf

Author: Dana Angluin, James Aspnes, Sarah Eisenstat, Aryeh Kontorovich

Abstract: PAC learning of unrestricted regular languages is long known to be a difﬁcult problem. The class of shufﬂe ideals is a very restricted subclass of regular languages, where the shufﬂe ideal generated by a string u is the collection of all strings containing u as a subsequence. This fundamental language family is of theoretical interest in its own right and provides the building blocks for other important language families. Despite its apparent simplicity, the class of shufﬂe ideals appears quite difﬁcult to learn. In particular, just as for unrestricted regular languages, the class is not properly PAC learnable in polynomial time if RP = NP, and PAC learning the class improperly in polynomial time would imply polynomial time algorithms for certain fundamental problems in cryptography. In the positive direction, we give an efﬁcient algorithm for properly learning shufﬂe ideals in the statistical query (and therefore also PAC) model under the uniform distribution. Keywords: PAC learning, statistical queries, regular languages, deterministic ﬁnite automata, shufﬂe ideals, subsequences

reference text

Dana Angluin. On the complexity of minimum inference of regular sets. Information and Control, 39(3):337–350, 1978. ISSN 0019-9958. doi: 10.1016/S0019-9958(78)90683-6. Dana Angluin. Inference of reversible languages. Journal of the ACM, 29(3):741–765, July 1982. ISSN 0004-5411. doi: 10.1145/322326.322334. Dana Angluin. Learning regular sets from queries and counterexamples. Information and Computation, 75(2):87–106, November 1987. ISSN 0890-5401. doi: 10.1016/0890-5401(87)90052-6. Dana Angluin, James Aspnes, and Aryeh Kontorovich. On the learnability of shufﬂe ideals. In Proceedings of the 23rd International Conference on Algorithmic Learning Theory, ALT ’12, pages 111–123, Berlin, Heidelberg, 2012. Springer-Verlag. ISBN 978-3-642-34105-2. doi: 10.1007/978-3-642-34106-9 12. Nader H. Bshouty. Exact learning of formulas in parallel. Machine Learning, 26(1):25–41, January 1997. ISSN 0885-6125. doi: 10.1023/A:1007320031970. Nader H. Bshouty, Jeffrey C. Jackson, and Christino Tamon. Exploring learnability between exact and PAC. Journal of Computer and System Sciences, 70(4):471–484, June 2005. ISSN 00220000. doi: 10.1016/j.jcss.2004.10.002. Alexander Clark and Franck Thollard. PAC-learnability of probabilistic deterministic ﬁnite state automata. Journal of Machine Learning Research, 5:473–497, December 2004. ISSN 15324435. 1528 O N THE L EARNABILITY OF S HUFFLE I DEALS Corinna Cortes, Leonid (Aryeh) Kontorovich, and Mehryar Mohri. Learning languages with rational kernels. In Proceedings of the 20th Annual Conference on Learning Theory, COLT ’07, pages 349–364, Berlin, Heidelberg, 2007. Springer-Verlag. ISBN 978-3-540-72925-9. Colin de la Higuera. A bibliographical study of grammatical inference. Pattern Recognition, 38(9): 1332–1348, September 2005. ISSN 0031-3203. doi: 10.1016/j.patcog.2005.01.003. Samuel Eilenberg and Saunders Mac Lane. On the groups of H(Π, n), I. Annals of Mathematics. Second Series, 58:55–106, July 1953. ISSN 0003-486X. Funda Erg¨ n, S. Ravi Kumar, and Ronitt Rubinfeld. On learning bounded-width branching prou grams. In Proceedings of the Eighth Annual Conference on Computational Learning Theory, COLT ’95, pages 361–368, New York, NY, USA, 1995. ACM. ISBN 0-89791-723-5. doi: 10.1145/225298.225342. E. Mark Gold. Complexity of automaton identiﬁcation from given data. Information and Control, 37(3):302–320, 1978. Yoshiyasu Ishigami and Sei’ichi Tani. VC-dimensions of ﬁnite automata and commutative ﬁnite automata with k letters and n states. Discrete Applied Mathematics, 74(3):229–240, May 1997. ISSN 0166-218X. doi: 10.1016/S0166-218X(96)00050-9. Michael Kearns. Efﬁcient noise-tolerant learning from statistical queries. Journal of the ACM, 45 (6):983–1006, November 1998. ISSN 0004-5411. doi: 10.1145/293347.293351. Michael Kearns and Leslie Valiant. Cryptographic limitations on learning Boolean formulae and ﬁnite automata. Journal of the ACM, 41(1):67–95, January 1994. ISSN 0004-5411. doi: 10.1145/174644.174647. Michael J. Kearns and Umesh V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, Cambridge, MA, USA, 1994. ISBN 0-262-11193-4. Ondˇej Kl´ma and Libor Pol´ k. Hierarchies of piecewise testable languages. In Proceedings of the r ı a 12th International Conference on Developments in Language Theory, DLT ’08, pages 479–490, Berlin, Heidelberg, 2008. Springer-Verlag. ISBN 978-3-540-85779-2. doi: 10.1007/978-3-54085780-8 38. Leonid (Aryeh) Kontorovich and Boaz Nadler. Universal kernel-based learning with applications to regular languages. Journal of Machine Learning Research, 10:1095–1129, June 2009. ISSN 1532-4435. Leonid (Aryeh) Kontorovich, Dana Ron, and Yoram Singer. A Markov model for the acquisition of morphological structure. Technical Report CMU-CS-03-147, June 2003. Leonid (Aryeh) Kontorovich, Corinna Cortes, and Mehryar Mohri. Learning linearly separable languages. In Proceedings of the 17th International Conference on Algorithmic Learning Theory, ALT ’06, pages 288–303, Berlin, Heidelberg, 2006. Springer-Verlag. ISBN 3-540-46649-5, 9783-540-46649-9. doi: 10.1007/11894841 24. 1529 A NGLUIN , A SPNES , E ISENSTAT AND KONTOROVICH Leonid (Aryeh) Kontorovich, Corinna Cortes, and Mehryar Mohri. Kernel methods for learning languages. Theoretical Computer Science, 405(3):223–236, October 2008. ISSN 0304-3975. doi: 10.1016/j.tcs.2008.06.037. Kimmo Koskenniemi. Two-level model for morphological analysis. In Proceedings of the Eighth International Joint Conference on Artiﬁcial Intelligence - Volume 2, IJCAI ’83, pages 683–685, San Francisco, CA, USA, 1983. Morgan Kaufmann Publishers Inc. M. Lothaire. Combinatorics on Words, volume 17 of Encyclopedia of Mathematics and Its Applications. Addison-Wesley, 1983. Mehryar Mohri. On some applications of ﬁnite-state automata theory to natural language processing. Natural Language Engineering, 2(1):61–80, March 1996. ISSN 1351-3249. doi: 10.1017/S135132499600126X. Mehryar Mohri. Finite-state transducers in language and speech processing. Computational Linguistics, 23(2):269–311, June 1997. ISSN 0891-2017. Mehryar Mohri, Fernando Pereira, and Michael Riley. Weighted ﬁnite-state transducers in speech recognition. Computer Speech & Language, 16(1):69–88, 2002. ISSN 0885-2308. doi: 10.1006/csla.2001.0184. Mehryar Mohri, Pedro J. Moreno, and Eugene Weinstein. Efﬁcient and robust music identiﬁcation with weighted ﬁnite-state transducers. IEEE Transactions on Audio, Speech & Language Processing, 18(1):197–207, January 2010. ISSN 1063-6676. doi: 10.1109/TASL.2009.2023170. Jos´ Oncina and Pedro Garc´a. Identifying regular languages in polynomial time. In Advances e ı in Structural and Syntactic Pattern Recognition, volume 5 of Machine Perception and Artiﬁcial Intelligence, pages 99–108. World Scientiﬁc Publishing, 1992. Nick Palmer and Paul W. Goldberg. PAC-learnability of probabilistic deterministic ﬁnite state automata in terms of variation distance. Theoretical Compututer Science, 387(1):18–31, November 2007. ISSN 0304-3975. doi: 10.1016/j.tcs.2007.07.023. Rajesh Parekh and Vasant G. Honavar. Learning DFA from simple examples. Machine Learning, 44(1-2):9–35, July 2001. ISSN 0885-6125. doi: 10.1023/A:1010822518073. Gheorghe P˘ un, editor. Mathematical Aspects of Natural and Formal Languages. World Scientiﬁc a Publishing Co., Inc., River Edge, NJ, USA, 1994. ISBN 9-8102-1914-8. Leonard Pitt and Manfred K. Warmuth. Prediction-preserving reducibility. Journal of Computer and System Sciences, 41(3):430–467, December 1990. ISSN 0022-0000. doi: 10.1016/00220000(90)90028-J. Leonard Pitt and Manfred K. Warmuth. The minimum consistent DFA problem cannot be approximated within any polynomial. Journal of the ACM, 40(1):95–142, January 1993. ISSN 0004-5411. doi: 10.1145/138027.138042. 1530 O N THE L EARNABILITY OF S HUFFLE I DEALS Owen Rambow, Srinivas Bangalore, Tahir Butt, Alexis Nasr, and Richard Sproat. Creating a ﬁnitestate parser with application semantics. In Proceedings of the 19th International Conference on Computational Linguistics - Volume 2, COLING ’02, pages 1–5, Stroudsburg, PA, USA, 2002. Association for Computational Linguistics. doi: 10.3115/1071884.1071910. Dana Ron, Yoram Singer, and Naftali Tishby. On the learnability and usage of acyclic probabilistic ﬁnite automata. Journal of Computer and System Sciences, 56(2):133–152, 1998. ISSN 00220000. doi: 10.1006/jcss.1997.1555. Imre Simon. Piecewise testable events. In Proceedings of the 2nd GI Conference on Automata Theory and Formal Languages, pages 214–222, London, UK, 1975. Springer-Verlag. ISBN 3540-07407-4. Richard Sproat, William Gale, Chilin Shih, and Nancy Chang. A stochastic ﬁnite-state wordsegmentation algorithm for Chinese. Computational Linguistics, 22(3):377–404, September 1996. ISSN 0891-2017. Leslie G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, November 1984. ISSN 0001-0782. 1531