acl acl2010 acl2010-103 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jeffrey Heinz ; James Rogers
Abstract: Strictly Piecewise (SP) languages are a subclass of regular languages which encode certain kinds of long-distance dependencies that are found in natural languages. Like the classes in the Chomsky and Subregular hierarchies, there are many independently converging characterizations of the SP class (Rogers et al., to appear). Here we define SP distributions and show that they can be efficiently estimated from positive data.
Reference: text
sentIndex sentText sentNum sentScore
1 Strictly Piecewise (SP) languages are a subclass of regular languages which encode certain kinds of long-distance dependencies that are found in natural languages. [sent-2, score-0.279]
2 Like the classes in the Chomsky and Subregular hierarchies, there are many independently converging characterizations of the SP class (Rogers et al., to appear). [sent-3, score-0.089]
3 Here we define SP distributions and show that they can be efficiently estimated from positive data. [sent-5, score-0.138]
4 … which make distinctions on the basis of contiguous subsequences. [sent-14, score-0.061]
5 The Strictly Local languages are the formal-language theoretic foundation for n-gram models (Garcia et al., 1990), which are widely used in natural language processing (NLP) in part because such distributions can be estimated from positive data (i.e., …). [sent-15, score-0.112] [sent-16, score-0.138]
7 N-gram models describe probability distributions over all strings on the basis of the Markov assumption (Markov, 1913): that the probability of the next symbol only depends on the previous contiguous sequence of length n − 1. [sent-19, score-0.372]
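To make the Markov assumption concrete, here is a minimal sketch of an unsmoothed n-gram (Strictly n-Local) estimator. The function name, the '#' padding symbol, and the choice of plain ML without smoothing are illustrative assumptions, not the paper's code.

```python
from collections import defaultdict

def train_ngram(corpus, n):
    """Unsmoothed ML estimates of Pr(next symbol | previous n-1 symbols).

    Words are padded with '#' so string-final probabilities are modeled,
    mirroring the final-state probabilities F of a PDFA.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for word in corpus:
        padded = ("#",) * (n - 1) + tuple(word) + ("#",)
        for i in range(n - 1, len(padded)):
            context = padded[i - (n - 1):i]
            counts[context][padded[i]] += 1
    return {ctx: {sym: c / sum(nxt.values()) for sym, c in nxt.items()}
            for ctx, nxt in counts.items()}

model = train_ngram(["ab", "abb", "ba"], n=2)
print(model[("a",)])  # e.g. {'b': 0.666..., '#': 0.333...}
```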
8 From the perspective of formal language theory, these distributions are perhaps properly called Strictly k-Local distributions (SLk), where k = n. [sent-20, score-0.103]
9 This paper defines Strictly k-Piecewise (SPk) distributions and shows how they too can be efficiently estimated from positive data. [sent-22, score-0.138]
10 In contrast with the Markov assumption, our assumption is that the probability of the next symbol is conditioned on the previous set of discontiguous subsequences of length k − 1 in the string. [sent-23, score-0.366]
11 As a result, SP distributions are efficiently computable even though they condition the probability of the next symbol on the occurrences of earlier (possibly very distant) discontiguous subsequences. [sent-25, score-0.198]
12 Essentially, these SP distributions reflect a kind of long-term memory. [sent-26, score-0.103]
13 On the other hand, SP models have no short-term memory and are unable to make distinctions on the basis of contiguous subsequences. [sent-27, score-0.061]
14 Since SP languages are the analogue of the SL languages, which provide the formal-language theoretic foundation for the n-gram models widely used in NLP, it is expected that SP distributions and their estimation will also find wide application. [sent-32, score-0.215]
15 §3 provides relevant background on the subregular hierarchy. [sent-38, score-0.074]
16 Σ denotes a finite set of symbols and a string over Σ is a finite sequence of symbols drawn from that set. [sent-46, score-0.156]
17 Σk, Σ≤k, Σ≥k, and Σ∗ denote all strings over this alphabet of length k, of length less than or equal to k, of length greater than or equal to k, and of any finite length, respectively. [sent-47, score-0.229]
18 The prefixes of a string w are Pfx(w) = {v : ∃u ∈ Σ∗ such that vu = w}. [sent-50, score-0.065]
19 A Deterministic Finite-state Automaton (DFA) is a tuple M = ⟨Q, Σ, q0, δ, F⟩ where Q is the state set, Σ is the alphabet, q0 is the start state, δ is a deterministic transition function with domain Q × Σ and codomain Q, and F is the set of accepting states. [sent-57, score-0.089]
20 The extended transition function δ̂(q, w) is the (unique) state reachable from state q via the sequence w, if any, or δ̂(q, w)↑ otherwise. [sent-60, score-0.11]
21 The language recognized by a DFA M is L(M) =def {w ∈ Σ∗ | δ̂(q0, w)↓ and δ̂(q0, w) ∈ F}. A state q is useful iff there exist w, v ∈ Σ∗ such that δ̂(q0, w) = q and δ̂(q, v) ∈ F. [sent-61, score-0.13]
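A minimal sketch of these definitions, assuming a dict-based partial transition function so that an absent entry models δ̂(q, w)↑; the class and method names are mine, not the paper's.

```python
class DFA:
    """Partial-transition DFA: delta is a dict; absent entries model undefined transitions."""

    def __init__(self, start, delta, finals):
        self.start, self.delta, self.finals = start, delta, finals

    def run(self, w):
        """Extended transition delta-hat(q0, w): the state reached, or None if undefined."""
        q = self.start
        for sym in w:
            q = self.delta.get((q, sym))
            if q is None:
                return None
        return q

    def accepts(self, w):
        """w is in L(M) iff delta-hat(q0, w) is defined and lands in an accepting state."""
        return self.run(w) in self.finals

# Example: rejects any string containing the subsequence ab (over {a, b, c})
m = DFA(start=1,
        delta={(1, "a"): 2, (1, "b"): 1, (1, "c"): 1,
               (2, "a"): 2, (2, "c"): 2},  # (2, "b") undefined: the machine dies
        finals={1, 2})
print(m.accepts("bca"), m.accepts("acb"))  # True False
```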
22 Useless states are those that are not useful. Two strings w and v over Σ are distinguished by a DFA M iff δ̂(q0, w) ≠ δ̂(q0, v). [sent-65, score-0.368]
23 Two strings w and v are equivalent with respect to a language L iff wu ∈ L ⇐⇒ vu ∈ L for all u ∈ Σ∗. All DFAs which recognize L must distinguish strings which are inequivalent in this sense, but no DFA recognizing L necessarily distinguishes any strings which are equivalent. [sent-67, score-0.13]
24 Hence the number of equivalence classes of strings over Σ modulo Nerode equivalence with respect to L gives a (tight) lower bound on the number of states required to recognize L. [sent-68, score-0.233]
25 A DFA is minimal if the size of its state set is minimal among DFAs accepting the same language. [sent-69, score-0.093]
26 A Probabilistic Deterministic Finite-state Automaton (PDFA) is a tuple M = ⟨Q, Σ, q0, δ, F, T⟩ where Q is the state set, Σ is the alphabet, q0 is the start state, δ is a deterministic transition function, and F and T are the final-state and transition probabilities. [sent-79, score-0.106]
27 The probability a PDFA assigns to w is obtained by multiplying the transition probabilities with the final probability along w's path, if it exists. [Figure 1: A picture of a PDFA with states labeled A and B.] [sent-91, score-0.269]
28 Pr(w) = T(q1, w1) · T(q2, w2) ··· T(qk, wk) · F(qk+1) (2), where q1 = q0 and qi+1 = δ(qi, wi), if δ̂(q0, w)↓, and 0 otherwise. A probability distribution is regular deterministic iff there is a PDFA which generates it. [sent-95, score-0.232]
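A sketch of Equation 2. The example machine's two states echo Figure 1, but its probabilities are invented (the figure's labels did not survive extraction); each state's outgoing mass plus finality sums to one.

```python
def pdfa_prob(start, delta, T, F, w):
    """Equation 2: Pr(w) = T(q1,w1) * ... * T(qk,wk) * F(q_{k+1}), or 0 if the path is undefined."""
    q, p = start, 1.0
    for sym in w:
        if (q, sym) not in delta:
            return 0.0
        p *= T[(q, sym)]
        q = delta[(q, sym)]
    return p * F[q]

# Two states A and B; numbers made up for illustration
delta = {("A", "b"): "B", ("A", "c"): "A", ("B", "a"): "A", ("B", "b"): "B"}
T = {("A", "b"): 0.3, ("A", "c"): 0.4, ("B", "a"): 0.3, ("B", "b"): 0.2}
F = {"A": 0.3, "B": 0.5}
print(pdfa_prob("A", delta, T, F, "bba"))  # 0.3 * 0.2 * 0.3 * 0.3 = 0.0054
```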
29 The structural components of a PDFA M are its state set Q, its alphabet Σ, its transitions, and its initial state q0. [sent-96, score-0.055]
30 These distributions have |Q| · (|Σ| + 1) independent parameters (since for each state there are |Σ| possible transitions plus the possibility of finality). [sent-99, score-0.103]
31 We define the product of PDFAs in terms of co-emission probabilities (Vidal et al., …). [sent-100, score-0.142]
32 Statistically speaking, the co-emission product makes an independence assumption: the probability of σ being co-emitted from q1, …, qn is exactly what one expects if there is no interaction between the individual factors; that is, between the probabilities of σ being emitted from any qi. [sent-155, score-0.106] [sent-158, score-0.171]
34 Also note that the order of the product is irrelevant up to renaming of the states, and so we also speak of taking the product of a set of PDFAs (as opposed to an ordered vector). [sent-159, score-0.11]
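A sketch of a co-emission product. The exact Definition 1 did not survive extraction, so treat the per-state renormalization here (dividing CT and CF by their total mass so each product state is a proper distribution) as an assumption; the (start, delta, T, F) tuple representation is also mine.

```python
def coemission_product(machines):
    """Product of PDFAs via co-emission weights.

    Each machine is (start, delta, T, F) with a dict-based partial delta.
    Product states are tuples; at each state the co-emission weights
    CT(sigma, q) = prod_i T_i(q_i, sigma) and CF(q) = prod_i F_i(q_i)
    are renormalized so the state's outgoing mass sums to one.
    """
    start = tuple(m[0] for m in machines)
    alphabet = {sym for (_, d, _, _) in machines for (_, sym) in d}
    delta, T, F = {}, {}, {}
    seen, frontier = {start}, [start]
    while frontier:
        q = frontier.pop()
        cf = 1.0
        for qi, (_, _, _, Fi) in zip(q, machines):
            cf *= Fi[qi]
        ct = {}
        for sym in alphabet:
            w = 1.0
            for qi, (_, di, Ti, _) in zip(q, machines):
                if (qi, sym) not in di:
                    break
                w *= Ti[(qi, sym)]
            else:  # transition defined in every factor machine
                ct[sym] = w
        z = cf + sum(ct.values())  # assumes z > 0 for proper factor PDFAs
        F[q] = cf / z
        for sym, w in ct.items():
            T[(q, sym)] = w / z
            nxt = tuple(di[(qi, sym)] for qi, (_, di, _, _) in zip(q, machines))
            delta[(q, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return start, delta, T, F
```

The independence assumption discussed above is visible in the inner loops: CT and CF are plain products of the factor machines' probabilities, with no interaction terms.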
35 Estimating regular deterministic distributions is a well-studied problem (Vidal et al., …). [sent-160, score-0.209]
36 Let S be a finite sample of words drawn from a regular deterministic distribution D. [sent-164, score-0.147]
37 The optimization problem (3) is simple for deterministic automata with known structural components. [sent-173, score-0.122]
38 For all states q ∈ Q and symbols a ∈ Σ, the ML estimate of T(q, a) is obtained by dividing the number of times this transition is used in parsing the sample S by the number of times state q is encountered in the parsing of S. [sent-177, score-0.151] [sent-178, score-0.122] [sent-180, score-0.055]
[Figure 2: The automaton shows the counts obtained by parsing M with sample S = {ab, bba, ǫ, cab, acb, cc}.]
41 Similarly, the ML estimate of F(q) is obtained by calculating the relative frequency of state q being final with respect to state q being encountered in the parsing of S. [sent-181, score-0.11]
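A sketch of these two ML estimates for a PDFA with known structural components. It assumes every word in the sample is parsable by the machine; the single-state structure in the usage lines is purely illustrative and is not the machine of Figure 2.

```python
from collections import Counter

def ml_estimate(start, delta, sample):
    """ML estimates for a PDFA with known structure, from counts obtained by parsing."""
    visits, trans, finals = Counter(), Counter(), Counter()
    for word in sample:
        q = start
        for sym in word:
            visits[q] += 1
            trans[(q, sym)] += 1
            q = delta[(q, sym)]  # assumes every word parses
        visits[q] += 1           # the word ends here...
        finals[q] += 1           # ...so q is counted as final once
    T = {(q, a): c / visits[q] for (q, a), c in trans.items()}
    F = {q: finals[q] / visits[q] for q in visits}
    return T, F

# Single-state structure over {a, b, c}, purely for illustration
delta = {(0, s): 0 for s in "abc"}
T, F = ml_estimate(0, delta, ["ab", "bba", "", "cab", "acb", "cc"])
print(T[(0, "a")], F[0])  # 4/19 and 6/19
```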
42 Languages in the weakest of these classes are defined only in terms of the set of factors (SL) or subsequences (SP) which are licensed to occur in the string (equivalently the complement of that set with respect to Σ≤k, the forbidden factors or forbidden subsequences). [sent-194, score-0.365]
43 For example, the set containing the forbidden 2-factors {ab, ba} defines a Strictly 2-Local language which includes all strings except those with contiguous substrings {ab, ba}. [sent-195, score-0.167]
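The same example as a membership test: a string belongs to this Strictly 2-Local language iff none of its contiguous 2-factors is forbidden. The helper name is hypothetical.

```python
def in_sl2(w, forbidden):
    """True iff no forbidden 2-factor occurs as a contiguous substring of w."""
    return all(w[i:i + 2] not in forbidden for i in range(len(w) - 1))

print(in_sl2("aaab", {"ab", "ba"}), in_sl2("aaa", {"ab", "ba"}))  # False True
```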
44 Since n-gram models (Jurafsky and Martin, 2008) assign probabilities to symbols given the preceding contiguous substrings up to length n − 1, we say they describe Strictly n-Local distributions. [sent-197, score-0.192]
45 The Locally Testable (LT) and Piecewise Testable (PT) languages are exactly those that are definable by propositional formulae in which the atomic formulae are blocks of symbols interpreted as factors (LT) or subsequences (PT) of the string. [sent-199, score-0.583]
46 The languages that are testable in the strict sense (SL and SP) are exactly those that are definable by formulae of this sort restricted to conjunctions of negative literals. [sent-200, score-0.498]
47 Going the other way, the languages that are definable by First-Order formulae with adjacency (successor) but not precedence (less-than) are exactly the Locally Threshold Testable (LTT) languages. [sent-201, score-0.29]
48 The Star-Free languages are those that are First-Order definable with precedence alone (adjacency being FO definable from precedence). [sent-202, score-0.23]
49 Finally, by extending to Monadic Second-Order formulae (with either signature, since they are MSO definable from each other), one obtains the full class of Regular languages (McNaughton and Papert, 1971; Thomas, 1982; Rogers and Pullum, to appear; Rogers et al., to appear). [sent-203, score-0.256]
50 The relation between strings which is fundamental along the Piecewise branch is the subsequence relation, which is a partial order on Σ∗: w ⊑ v ⇐def⇒ w = ε, or w = σ1···σn and (∃w0, …, wn ∈ Σ∗) such that v = w0σ1w1···σnwn. [sent-205, score-0.065]
51 For w ∈ Σ∗, let Pk(w) =def {v ∈ Σk | v ⊑ w} and P≤k(w) =def {v ∈ Σ≤k | v ⊑ w}, the set of subsequences of length k, respectively length no greater than k, of w. [sent-210, score-0.36]
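A direct sketch of P≤k(w). It enumerates combinations, which is exponential in k but fine for the small k used throughout the paper; the helper name is mine.

```python
from itertools import combinations

def subseqs_upto(w, k):
    """P_{<=k}(w): all subsequences of w of length at most k (including the empty string)."""
    return {"".join(c) for n in range(k + 1) for c in combinations(w, n)}

print(sorted(subseqs_upto("abc", 2)))
# ['', 'a', 'ab', 'ac', 'b', 'bc', 'c']
```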
52 Similar to the Strictly Local languages, Strictly Piecewise languages are defined only in terms of the set of subsequences (up to some length k) which are licensed to occur in the string. [sent-213, score-0.383]
53 A language is SPk iff it is L(G) for some SPk grammar G. [sent-216, score-0.075]
54 Following Kontorovich et al. (2008), we call the set of strings that contain w as a subsequence the principal shuffle ideal² of w: SI(w) = {v ∈ Σ∗ | w ⊑ v}. [sent-220, score-0.291]
55 The shuffle ideal of a set of strings is defined as SI(S) = ∪w∈S SI(w). [sent-221, score-0.262]
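Membership in a principal shuffle ideal is just a subsequence test, decidable by a greedy left-to-right scan in O(|v|); the helper name is mine.

```python
def in_shuffle_ideal(w, v):
    """True iff v is in SI(w), i.e. w is a subsequence of v (w ⊑ v)."""
    it = iter(v)
    # 'sym in it' consumes the iterator, so symbols must match left to right
    return all(sym in it for sym in w)

print(in_shuffle_ideal("ab", "acb"), in_shuffle_ideal("ab", "ba"))  # True False
```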
56 Rogers et al. (to appear) establish that the SP languages have a variety of characteristic properties. [sent-222, score-0.112]
57 Theorem 1 The following are equivalent:³ (²Properly, SI(w) is the principal ideal generated by {w} wrt the inverse of ⊑.) [sent-223, score-0.119]
58 We only note that 5 implies 1 by DeMorgan’s theorem and the fact that every shuffle ideal is finitely generated (see also Lothaire (1997)). [sent-226, score-0.197]
59 L = Σ∗ − SI(X), X ⊆ Σ∗ (L is the complement of a shuffle ideal). [sent-234, score-0.205]
60 The DFA representation of the complement of a shuffle ideal is especially important. [sent-235, score-0.25]
61 Lemma 1 Let w ∈ Σk, w = σ1 ··· σk, and MSI(w) = ⟨Q, Σ, q0, δ, F⟩, where Q = {i | 1 ≤ i ≤ k}, q0 = 1, F = Q, and for all qi ∈ Q, σ ∈ Σ: δ(qi, σ) = qi+1 if σ = σi and i < k; δ(qi, σ)↑ if σ = σi and i = k; and δ(qi, σ) = qi otherwise. [sent-236, score-0.072]
62 Then MSI(w) is a minimal, trimmed DFA that recognizes the complement of SI(w), i.e., the set of strings that do not contain w as a subsequence. [sent-237, score-0.142]
63 The trimmed automata product of this set yields a DFA with the properties below (Rogers et al., to appear). [sent-243, score-0.215]
64 Lemma 2 Let M be a trimmed DFA recognizing a SPk language constructed as described above. [sent-245, score-0.089]
65 …subsequences up to length 1 of prefixes of the language. [sent-249, score-0.106]
66 It is identical (modulo relabeling of state names) to one obtained by the trimmed product of the DFA representations of the complements of the principal shuffle ideals of aa and bc, which are the prohibited subsequences. [sent-256, score-0.478]
67 States in the DFA in Figure 5 correspond to the subsequences up to length 1 of the prefixes of the language. [sent-257, score-0.336]
68 With this in mind, it follows that the DFA of Σ∗ = L(⟨Σ, Σk⟩) has states which correspond to the subsequences up to length k − 1 of the prefixes of Σ∗. [sent-258, score-0.45]
69 In fact, these DFAs reveal the differences between SP languages and PT languages: they are exactly those expressed in Lemma 2. [sent-260, score-0.112]
70 Within the state space defined by the subsequences up to length k − 1 of the prefixes of the language, if the conditions in Lemma 2 are violated, then the DFAs describe languages that are PT but not SP. [sent-261, score-0.503]
71 … PT2 languages are obtained by arbitrarily removing arcs, states, and the finality of states from the DFA in Figure 6, and SP2 ones are obtained by non-arbitrarily removing them in accordance with Lemma 2. [sent-263, score-0.26]
72 Names of the states reflect subsets of subsequences up to length 1 of prefixes of the language given by G = ⟨{a, …⟩. [sent-266, score-0.336]
73 5 SP Distributions In the same way that SL distributions (n-gram models) generalize SL languages, SP distributions generalize SP languages. [sent-268, score-0.206]
74 Recall that SP languages are characterizable by the intersection of the complements of principal shuffle ideals. [sent-269, score-0.372]
75 The following lemma follows directly from the finite-state representation of PTk distributions. [sent-279, score-0.073]
76 Pr(w) = (∏_{i=1..|w|} T(P≤k−1(σ1···σi−1), σi)) · F(P≤k−1(w)) (4). PTk distributions have 2^{|Σ≤k−1|} · (|Σ| + 1) parameters (since there are 2^{|Σ≤k−1|} states and |Σ| + 1 possible events, i.e., |Σ| possible transitions plus the possibility of finality). [sent-286, score-0.217]
77 Pr(w) = (∏_{i=1..|w|} Pr(σi | P≤k−1(σ1···σi−1))) · Pr(# | P≤k−1(w)) (5). Thus, the probability assigned to a word depends not on the observed contiguous sequences as in a Markov model, but on observed subsequences. [sent-295, score-0.112]
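A sketch of Equation 5, assuming the conditional probabilities are available as a lookup `cond` keyed by (frozenset of observed subsequences, next symbol or '#'); this interface is hypothetical, not the paper's parameterization.

```python
from itertools import combinations

def sp_prob(w, k, cond):
    """Equation 5 sketch: Pr(w) = prod_i Pr(w_i | P_{<=k-1}(prefix)) * Pr('#' | P_{<=k-1}(w))."""
    def p_upto(u, n):
        # P_{<=n}(u) as a hashable key
        return frozenset("".join(c) for m in range(n + 1) for c in combinations(u, m))
    p = 1.0
    for i, sym in enumerate(w):
        p *= cond[(p_upto(w[:i], k - 1), sym)]
    return p * cond[(p_upto(w, k - 1), "#")]
```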
78 Like SP languages, SP distributions can be defined in terms of the product of machines very similar to the complements of principal shuffle ideals. [sent-296, score-0.437]
79 Mw = ⟨Q, Σ, q0, δ, F, T⟩ is a w-subsequence-distinguishing PDFA (w-SD-PDFA) iff Q = Pfx(w), q0 = ǫ, and for all u ∈ Pfx(w) and each σ ∈ Σ: δ(u, σ) = uσ if uσ ∈ Pfx(w) and δ(u, σ) = u otherwise, with F and T satisfying Equation 1. [sent-298, score-0.15]
80 Figure 7 shows the structure of Ma, which is almost the same as the complement of the principal shuffle ideal in Figure 4. [sent-299, score-0.324]
81 The only difference is the additional self-loop labeled a on the rightmost state labeled a. [sent-300, score-0.055]
82 …and its states distinguish those strings which contain a (state a) from those that do not (state ǫ). [sent-302, score-0.114] [sent-304, score-0.065]
[Figure 7: The structure of PDFA Ma. It is the same (modulo state names) as the DFA in Figure 4 except for the self-loop labeled a on state a.] [sent-303, score-0.055]
85 In the same way that missing edges propagate down in DFA representations of SP languages (Lemma 2), the final and transitional probabilities must propagate down in PDFA representations of SPk distributions. [sent-306, score-0.329]
86 In other words, the final and transitional probabilities at states further along paths beginning at the start state must be determined by final and transitional probabilities at earlier states non-increasingly. [sent-307, score-0.465]
87 This is captured by defining SP distributions as a product of k-sets of SD-PDFAs (see Definition 5 below). [sent-308, score-0.158]
88 While the standard product based on co-emission probability could be used for this purpose, we adopt a modified version of it defined for k-sets of SD-PDFAs: the positive co-emission probability. [sent-309, score-0.175]
89 The automata product based on the positive co-emission probability not only ensures that the probabilities propagate as necessary, but also that such probabilities are made on the basis of observed subsequences, and not unobserved ones. [sent-310, score-0.381]
90 For each w ∈ Σ≤k−1, let Mw = ⟨Qw, Σ, q0w, δw, Fw, Tw⟩ be the w-subsequence-distinguishing PDFA. The positive co-emission probability that σ is simultaneously emitted from states qǫ, …, qw belonging to the state sets Qǫ, …, Qw is PCT(σ, ⟨qǫ, …, qw⟩) = ∏_{qu = u} Tu(qu, σ). [sent-316, score-0.2]
91 Similarly, the probability that a word simultaneously ends at the states qǫ ∈ Qǫ, …, qw ∈ Qw is PCF(⟨qǫ, …, qw⟩) = ∏_{qu = u} Fu(qu). [sent-330, score-0.254]
92 In other words, the positive co-emission probability is the product of the probabilities restricted to those assigned to the maximal states in each Mw. [sent-339, score-0.397]
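A sketch of PCT and PCF under my reading of this definition, reconstructed from fragments: only machines Mu whose current state is the maximal state u (i.e., the subsequence u has actually been observed) contribute factors, so unobserved subsequences never affect the product.

```python
def pct(sigma, states, machines):
    """Positive co-emission probability of sigma from a tuple of current states.

    machines: list of (u, T_u, F_u), one per u-SD-PDFA; states: the current state in each.
    Only machines sitting in their maximal state u contribute a factor.
    """
    p = 1.0
    for q, (u, T, _) in zip(states, machines):
        if q == u:
            p *= T[(q, sigma)]
    return p

def pcf(states, machines):
    """Positive co-emission final probability, restricted the same way."""
    p = 1.0
    for q, (u, _, F) in zip(states, machines):
        if q == u:
            p *= F[q]
    return p
```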
93 The positive co-emission product (⊗+) is defined just as with co-emission probabilities, substituting PCT and PCF for CT and CF, respectively, in Definition 1. [sent-345, score-0.127]
94 The definition of ⊗+ ensures that the probabilities propagate on the basis of observed subsequences, and not on the basis of unobserved ones. [sent-346, score-0.152]
95 ⊣ Definition 5 A distribution D is k-Strictly Piecewise (written D ∈ SPDk) ⇐def⇒ D can be described by a PDFA which is the positive co-emission product of a k-set of subsequence-distinguishing PDFAs. [sent-352, score-0.09]
96 Unlike PDFAs for PT distributions, which distinguish 2^{|Σ≤k−1|} states, the number of states in a k-set of SD-PDFAs is ∑_{w∈Σ≤k−1} |Pfx(w)|. [sent-354, score-0.114]
97 7 Conclusion SP distributions are the stochastic version of SP languages, which model long-distance dependencies. [sent-364, score-0.103]
98 Although SP distributions distinguish states, they do so with tractably many parameters and states because of an assumption that distinct subsequences do not interact. [sent-365, score-0.447]
99 As shown, these distributions are efficiently estimable from positive data. [sent-366, score-0.138]
100 Inference of k-testable languages in the strict sense and applications to syntactic pattern recognition. [sent-427, score-0.162]
wordName wordTfidf (topN-words)
[('dfa', 0.384), ('pdfa', 0.32), ('sp', 0.276), ('subsequences', 0.23), ('rogers', 0.207), ('testable', 0.192), ('piecewise', 0.162), ('qni', 0.152), ('shuffle', 0.152), ('spk', 0.148), ('hq', 0.133), ('qn', 0.118), ('states', 0.114), ('languages', 0.112), ('vidal', 0.108), ('distributions', 0.103), ('qw', 0.089), ('trimmed', 0.089), ('strictly', 0.087), ('definable', 0.084), ('higuera', 0.081), ('iff', 0.075), ('principal', 0.074), ('subregular', 0.074), ('lemma', 0.073), ('qi', 0.072), ('automata', 0.071), ('sl', 0.069), ('dfas', 0.067), ('pdfas', 0.067), ('pfx', 0.067), ('prd', 0.067), ('prefixes', 0.065), ('strings', 0.065), ('propagate', 0.063), ('contiguous', 0.061), ('formulae', 0.06), ('def', 0.059), ('heinz', 0.059), ('lothaire', 0.059), ('mcnaughton', 0.059), ('regular', 0.055), ('state', 0.055), ('product', 0.055), ('garc', 0.054), ('harmony', 0.054), ('modulo', 0.054), ('si', 0.054), ('complement', 0.053), ('probabilities', 0.053), ('deterministic', 0.051), ('probability', 0.051), ('acb', 0.051), ('chumash', 0.051), ('ptdk', 0.051), ('strict', 0.05), ('la', 0.049), ('characterizations', 0.048), ('let', 0.048), ('ti', 0.047), ('locally', 0.047), ('qu', 0.046), ('ideal', 0.045), ('pct', 0.044), ('msi', 0.044), ('discontiguous', 0.044), ('imre', 0.044), ('kontorovich', 0.044), ('papert', 0.044), ('phonotactic', 0.044), ('pullum', 0.044), ('jeffrey', 0.043), ('hierarchies', 0.042), ('length', 0.041), ('finite', 0.041), ('forbidden', 0.041), ('converging', 0.041), ('accepting', 0.038), ('transitional', 0.038), ('pr', 0.037), ('symbols', 0.037), ('definition', 0.036), ('jos', 0.036), ('colin', 0.036), ('simon', 0.036), ('phonological', 0.036), ('positive', 0.035), ('enrique', 0.034), ('precedence', 0.034), ('andmachine', 0.034), ('augmentative', 0.034), ('bakovi', 0.034), ('characterizable', 0.034), ('coemission', 0.034), ('coleman', 0.034), ('dthfea', 0.034), ('finality', 0.034), ('inese', 0.034), ('newell', 0.034), ('pcf', 0.034)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999911 103 acl-2010-Estimating Strictly Piecewise Distributions
Author: Jeffrey Heinz ; James Rogers
Abstract: Strictly Piecewise (SP) languages are a subclass of regular languages which encode certain kinds of long-distance dependencies that are found in natural languages. Like the classes in the Chomsky and Subregular hierarchies, there are many independently converging characterizations of the SP class (Rogers et al., to appear). Here we define SP distributions and show that they can be efficiently estimated from positive data.
2 0.38027453 217 acl-2010-String Extension Learning
Author: Jeffrey Heinz
Abstract: This paper provides a unified, learningtheoretic analysis of several learnable classes of languages discussed previously in the literature. The analysis shows that for these classes an incremental, globally consistent, locally conservative, set-driven learner always exists. Additionally, the analysis provides a recipe for constructing new learnable classes. Potential applications include learnable models for aspects of natural language and cognition.
3 0.069737151 89 acl-2010-Distributional Similarity vs. PU Learning for Entity Set Expansion
Author: Xiao-Li Li ; Lei Zhang ; Bing Liu ; See-Kiong Ng
Abstract: Distributional similarity is a classic technique for entity set expansion, where the system is given a set of seed entities of a particular class, and is asked to expand the set using a corpus to obtain more entities of the same class as represented by the seeds. This paper shows that a machine learning model called positive and unlabeled learning (PU learning) can model the set expansion problem better. Based on the test results of 10 corpora, we show that a PU learning technique outperformed distributional similarity significantly.
4 0.068425864 41 acl-2010-Automatic Selectional Preference Acquisition for Latin Verbs
Author: Barbara McGillivray
Abstract: We present a system that automatically induces Selectional Preferences (SPs) for Latin verbs from two treebanks by using Latin WordNet. Our method overcomes some of the problems connected with data sparseness and the small size of the input corpora. We also suggest a way to evaluate the acquired SPs on unseen events extracted from other Latin corpora.
5 0.068222545 234 acl-2010-The Use of Formal Language Models in the Typology of the Morphology of Amerindian Languages
Author: Andres Osvaldo Porta
Abstract: The aim of this work is to present some preliminary results of an investigation in course on the typology of the morphology of the native South American languages from the point of view of the formal language theory. With this object, we give two contrasting examples of descriptions of two Aboriginal languages finite verb forms morphology: Argentinean Quechua (quichua santiagueño) and Toba. The description of the morphology of the finite verb forms of Argentinean quechua, uses finite automata and finite transducers. In this case the construction is straightforward using two level morphology and then, describes in a very natural way the Argentinean Quechua morphology using a regular language. On the contrary, the Toba verbs morphology, with a system that simultaneously uses prefixes and suffixes, has not a natural description as regular language. Toba has a complex system of causative suffixes, whose successive applications determinate the use of prefixes belonging different person marking prefix sets. We adopt the solution of Creider et al. (1995) to naturally deal with this and other similar morphological processes which involve interactions between prefixes and suffixes and then we describe the toba morphology using linear context-free languages.
6 0.057524499 162 acl-2010-Learning Common Grammar from Multilingual Corpus
7 0.049649883 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans
8 0.049213208 195 acl-2010-Phylogenetic Grammar Induction
9 0.048454907 255 acl-2010-Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization
10 0.048151266 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification
11 0.04619367 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar
12 0.045356184 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing
13 0.044921763 95 acl-2010-Efficient Inference through Cascades of Weighted Tree Transducers
14 0.043858968 66 acl-2010-Compositional Matrix-Space Models of Language
15 0.042706635 116 acl-2010-Finding Cognate Groups Using Phylogenies
16 0.041839775 97 acl-2010-Efficient Path Counting Transducers for Minimum Bayes-Risk Decoding of Statistical Machine Translation Lattices
17 0.041531481 14 acl-2010-A Risk Minimization Framework for Extractive Speech Summarization
18 0.041465849 67 acl-2010-Computing Weakest Readings
19 0.041458957 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
20 0.037903465 191 acl-2010-PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names
topicId topicWeight
[(0, -0.12), (1, 0.01), (2, 0.014), (3, -0.024), (4, -0.01), (5, -0.07), (6, 0.06), (7, -0.013), (8, 0.1), (9, -0.031), (10, -0.068), (11, 0.027), (12, 0.068), (13, -0.045), (14, -0.04), (15, -0.043), (16, 0.006), (17, 0.059), (18, -0.012), (19, 0.108), (20, -0.087), (21, -0.107), (22, -0.197), (23, -0.221), (24, 0.161), (25, -0.011), (26, -0.191), (27, 0.156), (28, 0.148), (29, 0.174), (30, -0.129), (31, -0.018), (32, -0.063), (33, 0.151), (34, 0.172), (35, 0.053), (36, -0.055), (37, -0.013), (38, -0.042), (39, 0.045), (40, 0.234), (41, 0.046), (42, -0.227), (43, 0.053), (44, 0.047), (45, -0.059), (46, 0.093), (47, 0.039), (48, -0.061), (49, 0.211)]
simIndex simValue paperId paperTitle
same-paper 1 0.96520489 103 acl-2010-Estimating Strictly Piecewise Distributions
Author: Jeffrey Heinz ; James Rogers
Abstract: Strictly Piecewise (SP) languages are a subclass of regular languages which encode certain kinds of long-distance dependencies that are found in natural languages. Like the classes in the Chomsky and Subregular hierarchies, there are many independently converging characterizations of the SP class (Rogers et al., to appear). Here we define SP distributions and show that they can be efficiently estimated from positive data.
2 0.92603242 217 acl-2010-String Extension Learning
Author: Jeffrey Heinz
Abstract: This paper provides a unified, learningtheoretic analysis of several learnable classes of languages discussed previously in the literature. The analysis shows that for these classes an incremental, globally consistent, locally conservative, set-driven learner always exists. Additionally, the analysis provides a recipe for constructing new learnable classes. Potential applications include learnable models for aspects of natural language and cognition.
3 0.49401599 234 acl-2010-The Use of Formal Language Models in the Typology of the Morphology of Amerindian Languages
Author: Andres Osvaldo Porta
Abstract: The aim of this work is to present some preliminary results of an investigation in course on the typology of the morphology of the native South American languages from the point of view of the formal language theory. With this object, we give two contrasting examples of descriptions of two Aboriginal languages finite verb forms morphology: Argentinean Quechua (quichua santiagueño) and Toba. The description of the morphology of the finite verb forms of Argentinean quechua, uses finite automata and finite transducers. In this case the construction is straightforward using two level morphology and then, describes in a very natural way the Argentinean Quechua morphology using a regular language. On the contrary, the Toba verbs morphology, with a system that simultaneously uses prefixes and suffixes, has not a natural description as regular language. Toba has a complex system of causative suffixes, whose successive applications determinate the use of prefixes belonging different person marking prefix sets. We adopt the solution of Creider et al. (1995) to naturally deal with this and other similar morphological processes which involve interactions between prefixes and suffixes and then we describe the toba morphology using linear context-free languages.
4 0.43013445 186 acl-2010-Optimal Rank Reduction for Linear Context-Free Rewriting Systems with Fan-Out Two
Author: Benoit Sagot ; Giorgio Satta
Abstract: Linear Context-Free Rewriting Systems (LCFRSs) are a grammar formalism capable of modeling discontinuous phrases. Many parsing applications use LCFRSs where the fan-out (a measure of the discontinuity of phrases) does not exceed 2. We present an efficient algorithm for optimal reduction of the length of production right-hand side in LCFRSs with fan-out at most 2. This results in asymptotical running time improvement for known parsing algorithms for this class.
5 0.27654076 162 acl-2010-Learning Common Grammar from Multilingual Corpus
Author: Tomoharu Iwata ; Daichi Mochihashi ; Hiroshi Sawada
Abstract: We propose a corpus-based probabilistic framework to extract hidden common syntax across languages from non-parallel multilingual corpora in an unsupervised fashion. For this purpose, we assume a generative model for multilingual corpora, where each sentence is generated from a language dependent probabilistic context-free grammar (PCFG), and these PCFGs are generated from a prior grammar that is common across languages. We also develop a variational method for efficient inference. Experiments on a non-parallel multilingual corpus of eleven languages demonstrate the feasibility of the proposed method.
6 0.27481145 29 acl-2010-An Exact A* Method for Deciphering Letter-Substitution Ciphers
7 0.26817924 40 acl-2010-Automatic Sanskrit Segmentizer Using Finite State Transducers
8 0.26786691 255 acl-2010-Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization
9 0.25784269 61 acl-2010-Combining Data and Mathematical Models of Language Change
10 0.25060156 7 acl-2010-A Generalized-Zero-Preserving Method for Compact Encoding of Concept Lattices
11 0.23935373 95 acl-2010-Efficient Inference through Cascades of Weighted Tree Transducers
12 0.23918775 195 acl-2010-Phylogenetic Grammar Induction
13 0.23882641 111 acl-2010-Extracting Sequences from the Web
14 0.2321692 226 acl-2010-The Human Language Project: Building a Universal Corpus of the World's Languages
15 0.22954796 89 acl-2010-Distributional Similarity vs. PU Learning for Entity Set Expansion
16 0.22822498 41 acl-2010-Automatic Selectional Preference Acquisition for Latin Verbs
17 0.20889252 67 acl-2010-Computing Weakest Readings
18 0.20202971 128 acl-2010-Grammar Prototyping and Testing with the LinGO Grammar Matrix Customization System
19 0.20156911 66 acl-2010-Compositional Matrix-Space Models of Language
20 0.20014344 116 acl-2010-Finding Cognate Groups Using Phylogenies
topicId topicWeight
[(7, 0.019), (14, 0.021), (25, 0.039), (33, 0.115), (42, 0.011), (44, 0.012), (59, 0.048), (73, 0.026), (78, 0.031), (83, 0.044), (84, 0.476), (98, 0.075)]
simIndex simValue paperId paperTitle
same-paper 1 0.91993082 103 acl-2010-Estimating Strictly Piecewise Distributions
Author: Jeffrey Heinz ; James Rogers
Abstract: Strictly Piecewise (SP) languages are a subclass of regular languages which encode certain kinds of long-distance dependencies that are found in natural languages. Like the classes in the Chomsky and Subregular hierarchies, there are many independently converging characterizations of the SP class (Rogers et al., to appear). Here we define SP distributions and show that they can be efficiently estimated from positive data.
2 0.90029466 126 acl-2010-GernEdiT - The GermaNet Editing Tool
Author: Verena Henrich ; Erhard Hinrichs
Abstract: GernEdiT (short for: GermaNet Editing Tool) offers a graphical interface for the lexicographers and developers of GermaNet to access and modify the underlying GermaNet resource. GermaNet is a lexical-semantic wordnet that is modeled after the Princeton WordNet for English. The traditional lexicographic development of GermaNet was error prone and time-consuming, mainly due to a complex underlying data format and no opportunity of automatic consistency checks. GernEdiT replaces the earlier development by a more userfriendly tool, which facilitates automatic checking of internal consistency and correctness of the linguistic resource. This paper presents all these core functionalities of GernEdiT along with details about its usage and usability.
3 0.72827047 220 acl-2010-Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure
Author: Jeff Mitchell ; Mirella Lapata ; Vera Demberg ; Frank Keller
Abstract: The analysis of reading times can provide insights into the processes that underlie language comprehension, with longer reading times indicating greater cognitive load. There is evidence that the language processor is highly predictive, such that prior context allows upcoming linguistic material to be anticipated. Previous work has investigated the contributions of semantic and syntactic contexts in isolation, essentially treating them as independent factors. In this paper we analyze reading times in terms of a single predictive measure which integrates a model of semantic composition with an incremental parser and a language model.
4 0.70930177 216 acl-2010-Starting from Scratch in Semantic Role Labeling
Author: Michael Connor ; Yael Gertner ; Cynthia Fisher ; Dan Roth
Abstract: A fundamental step in sentence comprehension involves assigning semantic roles to sentence constituents. To accomplish this, the listener must parse the sentence, find constituents that are candidate arguments, and assign semantic roles to those constituents. Each step depends on prior lexical and syntactic knowledge. Where do children learning their first languages begin in solving this problem? In this paper we focus on the parsing and argument-identification steps that precede Semantic Role Labeling (SRL) training. We combine a simplified SRL with an unsupervised HMM part of speech tagger, and experiment with psycholinguistically motivated ways to label clusters resulting from the HMM so that they can be used to parse input for the SRL system. The results show that proposed shallow representations of sentence structure are robust to reductions in parsing accuracy, and that the contribution of alternative representations of sentence structure to successful semantic role labeling varies with the integrity of the parsing and argument-identification stages.
5 0.60915011 18 acl-2010-A Study of Information Retrieval Weighting Schemes for Sentiment Analysis
Author: Georgios Paltoglou ; Mike Thelwall
Abstract: Most sentiment analysis approaches use as baseline a support vector machines (SVM) classifier with binary unigram weights. In this paper, we explore whether more sophisticated feature weighting schemes from Information Retrieval can enhance classification accuracy. We show that variants of the classic tf.idf scheme adapted to sentiment analysis provide significant increases in accuracy, especially when using a sublinear function for term frequency weights and document frequency smoothing. The techniques are tested on a wide selection of data sets and produce the best accuracy to our knowledge.
6 0.59604883 217 acl-2010-String Extension Learning
7 0.56412953 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images
8 0.49872193 65 acl-2010-Complexity Metrics in an Incremental Right-Corner Parser
9 0.48549446 59 acl-2010-Cognitively Plausible Models of Human Language Processing
10 0.41693485 13 acl-2010-A Rational Model of Eye Movement Control in Reading
11 0.40022114 66 acl-2010-Compositional Matrix-Space Models of Language
12 0.38166773 67 acl-2010-Computing Weakest Readings
13 0.37363699 41 acl-2010-Automatic Selectional Preference Acquisition for Latin Verbs
14 0.3679291 234 acl-2010-The Use of Formal Language Models in the Typology of the Morphology of Amerindian Languages
15 0.36567634 162 acl-2010-Learning Common Grammar from Multilingual Corpus
16 0.36546829 191 acl-2010-PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names
17 0.36518914 175 acl-2010-Models of Metaphor in NLP
18 0.36320055 108 acl-2010-Expanding Verb Coverage in Cyc with VerbNet
19 0.36149153 14 acl-2010-A Risk Minimization Framework for Extractive Speech Summarization
20 0.35935897 131 acl-2010-Hierarchical A* Parsing with Bridge Outside Scores