acl acl2012 acl2012-209 knowledge-graph by maker-knowledge-mining

209 acl-2012-Unsupervised Semantic Role Induction with Global Role Ordering

Source: pdf

Author: Nikhil Garg ; James Henderson

Abstract: We propose a probabilistic generative model for unsupervised semantic role induction, which integrates local role assignment decisions and a global role ordering decision in a unified model. The role sequence is divided into intervals based on the notion of primary roles, and each interval generates a sequence of secondary roles and syntactic constituents using local features. The global role ordering consists of the sequence of primary roles only, thus making it a partial ordering.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Unsupervised Semantic Role Induction with Global Role Ordering Nikhil Garg University of Geneva Switzerland nikhi l garg@ unige . [sent-1, score-0.043]

2 ch Abstract We propose a probabilistic generative model for unsupervised semantic role induction, which integrates local role assignment decisions and a global role ordering decision in a unified model. [sent-3, score-1.306]

3 The role sequence is divided into intervals based on the notion of primary roles, and each interval generates a sequence of secondary roles and syntactic constituents using local features. [sent-4, score-1.232]

4 The global role ordering consists of the sequence of primary roles only, thus making it a partial ordering. [sent-5, score-1.045]

5 1 Introduction Unsupervised semantic role induction has gained significant interest recently (Lang and Lapata, 2011b) due to limited amounts of annotated corpora. [sent-6, score-0.391]

6 A Semantic Role Labeling (SRL) system should provide consistent argument labels across different syntactic realizations of the same verb (Palmer et al. [sent-7, score-0.115]

7 Sentence a is in active voice with sequence (A0, PREDICATE, A1) and sentence b is in passive voice with sequence (A1, PREDICATE, A0). [sent-11, score-0.458]

8 Additional global preferences, such as arguments A0 and A1 rarely repeat in a frame (as seen in the corpus), could also be useful in addition to local features. [sent-12, score-0.25]

9 James Henderson University of Geneva Switzerland j ame s henders on@ unige . [sent-13, score-0.043]

10 ch Supervised SRL systems have mostly used local classifiers that assign a role to each constituent independently of others, and only modeled limited correlations among roles in a sequence (Toutanova et al. [sent-15, score-0.801]

11 The correlations have been modeled via role sets (Gildea and Jurafsky, 2002), role repetition constraints (Punyakanok et al. [sent-17, score-0.591]

12 Lang and Lapata (2011a; 2011b) use the relative position (left/right) of the argument w. [sent-23, score-0.089]

13 Grenager and Manning (2006) use an ordering of the linking of semantic roles and syntactic relations. [sent-27, score-0.694]

14 However, as the space of possible linkings is large, language-specific knowledge is used to constrain this space. [sent-28, score-0.024]

15 (2008), we propose to use global role ordering preferences but in a generative model in contrast to their discriminative one. [sent-30, score-0.65]

16 Further, unlike Grenager and Manning (2006), we do not explicitly generate the linking of semantic roles and syntactic relations, thus keeping the parameter space tractable. [sent-31, score-0.468]

17 The main contribution of this work is an unsupervised model that uses global role ordering and repetition preferences without assuming any language-specific constraints. [sent-32, score-0.716]

18 Following Gildea and Jurafsky (2002), previous work has typically broken the SRL task into (i) argument identification, and (ii) argument classification (Màrquez et al. [sent-33, score-0.225]

19 Given the dependency parse tree of a sentence with correctly identified arguments, the aim is to assign a semantic role label to each argument. [sent-36, score-0.326]

20 We begin by introducing a few terms: Primary Role (PR) For every predicate, we assume the existence of K primary roles (PRs) denoted by P1, P2, . [sent-40, score-0.393]

21 These roles are not allowed to repeat in a frame and serve as “anchor points” in the global role ordering. [sent-44, score-0.788]

22 Intuitively, the model attempts to choose PRs such that they occur with high frequency, do not repeat, and their ordering influences the positioning of other roles. [sent-45, score-0.278]

23 Note that a PR may correspond to either a core role or a modifier role. [sent-46, score-0.344]

24 For ease of explication, we create 3 additional PRs: START denoting the start of the role sequence, END denoting its end, and PRED denoting the predicate. [sent-47, score-0.448]

25 Secondary Role (SR) The roles that are not PRs are called secondary roles (SRs). [sent-48, score-0.722]

26 Given N roles in total, there are (N − K) SRs, denoted by S1, S2, . [sent-49, score-0.323]

27 Unlike PRs, SRs are not constrained to occur only once in a frame and do not participate in the global role ordering. [sent-53, score-0.437]

28 Interval An interval is a sequence of SRs bounded by PRs, for instance (P2, S3, S5, PRED). [sent-54, score-0.279]

29 Ordering An ordering is the sequence of PRs observed in a frame. [sent-55, score-0.32]

30 For example, if the complete role [Figure 1 caption fragment: nodes represent visible and hidden variables resp.] [sent-56, score-0.258]

31 sequence is (START, P2, S1, S1, PRED, S3, END), the ordering is defined as (START, P2, PRED, END). [sent-57, score-0.32]
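The definitions of ordering and interval can be illustrated with a minimal sketch. It assumes a hypothetical naming convention (primary roles are START, PRED, END, or any label beginning with "P"; everything else is a secondary role) and a sequence that begins with START. The function name and this convention are illustrative only and do not come from the paper.

```python
from typing import List, Tuple

# Hypothetical convention for this sketch: START, PRED, END and any label
# starting with "P" are primary roles; all other labels are secondary roles.
SPECIAL_PRS = {"START", "PRED", "END"}


def is_primary(role: str) -> bool:
    return role in SPECIAL_PRS or role.startswith("P")


def split_role_sequence(roles: List[str]) -> Tuple[List[str], List[List[str]]]:
    """Return (ordering, intervals) for a complete role sequence.

    ordering  : the primary roles in the order they appear
    intervals : the secondary roles found between each pair of
                consecutive primary roles
    """
    ordering = [r for r in roles if is_primary(r)]
    intervals: List[List[str]] = []
    current: List[str] = []
    for role in roles[1:]:  # skip the leading START
        if is_primary(role):
            intervals.append(current)  # close the interval ending at this PR
            current = []
        else:
            current.append(role)
    return ordering, intervals


sequence = ["START", "P2", "S1", "S1", "PRED", "S3", "END"]
ordering, intervals = split_role_sequence(sequence)
print(ordering)   # ['START', 'P2', 'PRED', 'END']
print(intervals)  # [[], ['S1', 'S1'], ['S3']]
```

An ordering with four PRs yields three intervals, one per pair of consecutive PRs; an empty list marks an interval that contains no SRs.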

32 Features We have explored 1 frame level (global) feature (i) voice: active/passive, and 3 argument level (local) features (i) deprel: dependency relation of an argument to its head in the dependency parse tree, (ii) head: head word of the argument, and (iii) pos-head: Part-of-Speech tag of head. [sent-58, score-0.351]

33 Algorithm 1 describes the generative story of our model and Figure 1 illustrates it graphically. [sent-59, score-0.032]

34 Given a predicate and its voice, an ordering is selected from a multinomial. [sent-60, score-0.357]

35 This ordering gives us the sequence of PRs (PR1 , PR2, . [sent-61, score-0.32]

36 Each pair of consecutive PRs, PRi, PRi+1, in an ordering corresponds to an interval Ii. [sent-65, score-0.43]

37 For each such interval, we generate 0 or more SRs (SRi1 , SRi2, . [sent-66, score-0.026]

38 Generate an indicator variable: CONTINUE/STOP from a binomial distribution. [sent-70, score-0.094]

39 If CONTINUE, generate a SR from the multinomial corresponding to the interval. [sent-71, score-0.072]

40 Generate another indicator variable and continue the process till a STOP has been generated. [sent-72, score-0.098]

41 In addition to the interval, the indicator variable also depends on whether we are generating the first SR (adj = 0) or a subsequent one (adj = 1). [sent-73, score-0.067]

42 For each role, primary as well as secondary, we now generate the corresponding constituent by generating each of its features independently (F1, F2 , . [sent-74, score-0.165]
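The role-generation part of this story can be sketched as a small sampler. This is only an illustration under stated assumptions: the parameter tables (`orderings`, `stop_prob`, `sr_dist`) and their toy values are hypothetical, they stand for a single fixed predicate and voice, and the final step of generating constituent features for each role is omitted.

```python
import random
from typing import Dict, List, Tuple


def sample_role_sequence(
    orderings: Dict[Tuple[str, ...], float],
    stop_prob: Dict[Tuple[str, str, int], float],
    sr_dist: Dict[Tuple[str, str], Dict[str, float]],
    rng: random.Random,
) -> List[str]:
    """Sample one complete role sequence for a fixed predicate and voice."""
    # Step 1: select an ordering (a sequence of PRs) from a multinomial.
    candidates = list(orderings)
    ordering = rng.choices(candidates, weights=[orderings[o] for o in candidates])[0]

    roles = [ordering[0]]
    # Step 2: each pair of consecutive PRs defines an interval.
    for left, right in zip(ordering, ordering[1:]):
        adj = 0  # 0 while deciding about the first SR, 1 afterwards
        # Draw CONTINUE/STOP; keep generating SRs until STOP is drawn.
        while rng.random() >= stop_prob[(left, right, adj)]:
            dist = sr_dist[(left, right)]
            names = list(dist)
            roles.append(rng.choices(names, weights=[dist[n] for n in names])[0])
            adj = 1
        roles.append(right)
    return roles


# Toy, made-up parameters for illustration only.
toy_orderings = {
    ("START", "P1", "PRED", "P2", "END"): 0.7,
    ("START", "P2", "PRED", "P1", "END"): 0.3,
}
toy_stop: Dict[Tuple[str, str, int], float] = {}
toy_srs: Dict[Tuple[str, str], Dict[str, float]] = {}
for o in toy_orderings:
    for left, right in zip(o, o[1:]):
        toy_stop[(left, right, 0)] = 0.8  # most intervals generate no SR
        toy_stop[(left, right, 1)] = 0.9
        toy_srs[(left, right)] = {"S1": 0.5, "S2": 0.5}

sample = sample_role_sequence(toy_orderings, toy_stop, toy_srs, random.Random(0))
print(sample)
```

Because the PRs come from the sampled ordering and are copied once each, no PR can repeat within a sampled frame, while SRs are free to repeat inside any interval.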

43 Given a frame instance with predicate p and voice vc, Figure 2 gives (i) Eq. [sent-78, score-0.33]

44 1: the joint distribution of the ordering o, role sequence r, and constituent sequence f, and (ii) Eq. [sent-79, score-0.689]

45 Firstly, making the role ordering dependent only on PRs aligns with the observation by Pradhan et al. [sent-85, score-0.51]

46 (2008) that including the ordering information of only core roles helped improve the SRL performance as opposed to the complete role sequence. [sent-87, score-0.919]

47 Our assumption here is softer, however: we assume the existence of some roles that define the ordering, which may or may not correspond to core roles. [sent-88, score-0.661]

48 Secondly, generating the SRs independently of each other given the interval is based on the intuition that knowing the core roles informs us about the expected non-core roles that occur between them. [sent-89, score-1.013]

49 This intuition is supported by the statistics in the annotated data, where we found that if we consider the core roles as PRs, then most of the intervals tend to have only a few types of SRs and a given SR tends to occur only in a few types of intervals. [sent-90, score-0.584]

50 The concept of intervals is also related to the linguistic theory of topological fields (Diderichsen, 1966; Drach, 1937). [sent-91, score-0.122]

51 This simplifying assumption that given the PRs at the interval boundary, the SRs in that interval are independent of the other roles in the sequence, keeps the parameter space limited, which helps unsupervised learning. [sent-92, score-0.773]

52 Thirdly, not allowing some or all roles to repeat has been employed as a useful constraint in previous work (Punyakanok et al. [sent-93, score-0.377]

53 Lastly, conditioning the (STOP/CONTINUE) indicator variable on the adjacency value (adj) is inspired by the DMV model (Klein and Manning, 2004) for unsupervised dependency parsing. [sent-95, score-0.163]

54 We found in the annotated corpus that if we map core roles to PRs, then most of the time the intervals do not generate any SRs at all. [sent-96, score-0.557]

55 In the E-step, we calculate the expected counts of all the hidden variables in our model using the Inside-Outside algorithm (Baker, 1979). [sent-99, score-0.032]

56 In the M-step, we add the counts corresponding to the Bayesian priors to the expected counts and use the resulting counts to calculate the MAP estimate of the parameters. [sent-100, score-0.096]
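The M-step described above can be sketched for a single multinomial as follows. This assumes a symmetric prior expressed directly as a pseudo-count added to every outcome; the function name, the pseudo-count value, and the toy expected counts are hypothetical, and the paper's actual prior settings are not given in this summary.

```python
from typing import Dict


def m_step(expected_counts: Dict[str, float], prior_count: float) -> Dict[str, float]:
    """Re-estimate one multinomial from expected counts plus prior counts.

    expected_counts : E-step expected count for each outcome
    prior_count     : pseudo-count contributed by the prior for each outcome
                      (an assumption of this sketch: the same for all outcomes)
    """
    total = sum(expected_counts.values()) + prior_count * len(expected_counts)
    return {k: (c + prior_count) / total for k, c in expected_counts.items()}


# Toy expected counts for the SRs generated in one interval.
params = m_step({"S1": 3.0, "S2": 1.0, "S3": 0.0}, prior_count=0.5)
print(params)
```

With a positive pseudo-count, an outcome whose expected count is zero (S3 here) still keeps non-zero probability, so it is not ruled out in later EM iterations.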

57 , 2008), only consider verbal predicates, and run unsupervised training on the standard training set. [sent-102, score-0.063]

58 The evaluation measures are also the same: (i) Purity (PU) that measures how well an induced cluster corresponds to a single gold role, (ii) Collocation (CO) that measures how well a gold role corresponds to a single induced cluster, and (iii) F1 which is the harmonic mean of PU and CO. [sent-103, score-0.416]

59 Final scores are computed by weighting each predicate by the number of its argument instances. [sent-104, score-0.194]
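These measures can be sketched as below, assuming the common cluster-evaluation definitions: purity credits each induced cluster with its most frequent gold role, and collocation credits each gold role with its most frequent induced cluster, both computed per predicate. The function name and the toy data are hypothetical. Summing the per-predicate hits and dividing by the total number of arguments is equivalent to averaging per-predicate scores weighted by argument count.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def purity_collocation_f1(
    instances: List[Tuple[str, str, str]]
) -> Tuple[float, float, float]:
    """instances holds one (predicate, induced_cluster, gold_role) per argument."""
    by_pred: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for pred, cluster, gold in instances:
        by_pred[pred].append((cluster, gold))

    pu_hits = 0
    co_hits = 0
    for pairs in by_pred.values():
        joint = Counter(pairs)  # co-occurrence counts of (cluster, gold role)
        best_per_cluster: Dict[str, int] = defaultdict(int)
        best_per_gold: Dict[str, int] = defaultdict(int)
        for (cluster, gold), n in joint.items():
            best_per_cluster[cluster] = max(best_per_cluster[cluster], n)
            best_per_gold[gold] = max(best_per_gold[gold], n)
        pu_hits += sum(best_per_cluster.values())
        co_hits += sum(best_per_gold.values())

    total = len(instances)  # assumed non-empty in this sketch
    pu = pu_hits / total
    co = co_hits / total
    f1 = 2 * pu * co / (pu + co)
    return pu, co, f1


toy = [
    ("give", "c1", "A0"),
    ("give", "c1", "A0"),
    ("give", "c1", "A1"),
    ("give", "c2", "A1"),
    ("give", "c3", "A1"),
]
pu, co, f1 = purity_collocation_f1(toy)
print(pu, co, f1)
```

In the toy data the gold role A1 is split over three clusters, which lowers collocation while leaving purity comparatively high; F1 balances the two.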

60 1 Results Since the dataset has 21 semantic roles in total, we fix the total number of roles in our model to be 21. [sent-109, score-0.714]

61 Footnote 1: Removing the Bayesian priors completely resulted in the EM algorithm reaching a local maximum quite early, giving substantially lower performance. [sent-111, score-0.043]

62 This baseline maps each of the 20 most frequent deprels to its own role, and maps the rest to the 21st role. [sent-119, score-0.366]
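A minimal sketch of such a syntactic baseline follows. The function names, the integer role ids, and the toy dependency labels are assumptions made for illustration; the summary does not say how frequencies are collected or how ties are broken.

```python
from collections import Counter
from typing import Dict, Iterable


def build_deprel_baseline(deprels: Iterable[str], num_roles: int = 21) -> Dict[str, int]:
    """Give each of the (num_roles - 1) most frequent deprels its own role id."""
    counts = Counter(deprels)
    frequent = [d for d, _ in counts.most_common(num_roles - 1)]
    return {d: i for i, d in enumerate(frequent)}


def assign_role(deprel: str, mapping: Dict[str, int], num_roles: int = 21) -> int:
    """Any deprel outside the frequent set falls into the last role."""
    return mapping.get(deprel, num_roles - 1)


# Toy example with 3 roles instead of 21, using made-up labels.
observed = ["SBJ", "SBJ", "SBJ", "OBJ", "OBJ", "ADV"]
mapping = build_deprel_baseline(observed, num_roles=3)
print(mapping)                                    # {'SBJ': 0, 'OBJ': 1}
print(assign_role("ADV", mapping, num_roles=3))   # 2
```

The mapping depends only on the dependency relation, so this baseline has no notion of role ordering.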

63 By just using deprel as a feature, the proposed model outperforms the baseline by 0. [sent-120, score-0.082]

64 In this configuration, the only addition over the baseline is the ordering model. [sent-122, score-0.252]

65 Adding head as a feature leads to sparsity, which results in a substantial decrease in collocation (lines 1b and 1d). [sent-123, score-0.13]

66 To address sparsity, we induced a distributed hidden representation for each word via a neural network, capturing the semantic similarity between words. [sent-125, score-0.108]

67 the number of PRs (footnote 3) in the best performing configuration (Table 1, line 1c). [sent-136, score-0.033]

68 Footnote 3: Note that the system might not use all available PRs to label a given frame instance. [sent-140, score-0.079]

69 Table 2: Performance variation with the number of PRs (excluding START, END and PRED). With only this additional ordering information, the performance is the same as the baseline. [sent-145, score-0.252]

70 Adding just 1 PR leads to a big increase in both purity and collocation. [sent-146, score-0.101]

71 Increasing the number of PRs beyond 1 leads to a gradual increase in purity and decline in collocation, with the best F1 score at 2 PRs. [sent-147, score-0.101]

72 In the extreme case, where all the roles are PRs and there are no SRs, the model would just learn the complete sequence of roles, which would make the parameter space too large to be tractable. [sent-149, score-0.423]

73 For calculating purity, each induced cluster (or role) is mapped to a particular gold role that has the maximum instances in the cluster. [sent-150, score-0.377]

74 Analyzing the output of our model (line 1c in Table 1), we found that about 98% of the PRs and 40% of the SRs got mapped to the gold core roles (A0,A1, etc. [sent-151, score-0.46]

75 This suggests that the model is indeed following the intuition that (i) the ordering of core roles is important information for SRL systems, and (ii) the intervals bounded by core roles provide good context information for classification of other roles. [sent-153, score-1.252]

76 4 Conclusions We propose a unified generative model for unsupervised semantic role induction that incorporates global role correlations as well as local feature information. [sent-154, score-0.901]

77 The results indicate that a small number of ordered primary roles (PRs) is a good representation of global ordering constraints for SRL. [sent-155, score-0.719]

78 This representation keeps the parameter space small enough for unsupervised learning. [sent-156, score-0.094]

79 Corpus-based induction of syntactic structure: Models of dependency and constituency. [sent-197, score-0.091]

80 Semantic role labeling: an introduction to the special issue. [sent-221, score-0.258]

81 The proposition bank: An annotated corpus of semantic roles. [sent-228, score-0.068]

82 The conll-2008 shared task on joint parsing of syntactic and semantic dependencies. [sent-257, score-0.094]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('prs', 0.546), ('roles', 0.323), ('srs', 0.261), ('role', 0.258), ('ordering', 0.252), ('interval', 0.178), ('lang', 0.164), ('pred', 0.146), ('voice', 0.146), ('intervals', 0.122), ('srl', 0.109), ('predicate', 0.105), ('argument', 0.089), ('vc', 0.087), ('core', 0.086), ('arlalw', 0.082), ('deprel', 0.082), ('lapata', 0.081), ('frame', 0.079), ('secondary', 0.076), ('global', 0.074), ('purity', 0.073), ('primary', 0.07), ('sequence', 0.068), ('semantic', 0.068), ('adj', 0.067), ('induction', 0.065), ('unsupervised', 0.063), ('punyakanok', 0.061), ('pr', 0.055), ('collocation', 0.055), ('binomial', 0.055), ('iz', 0.055), ('seq', 0.055), ('gildea', 0.054), ('repeat', 0.054), ('sr', 0.053), ('pradhan', 0.052), ('toutanova', 0.051), ('ri', 0.049), ('start', 0.049), ('pri', 0.048), ('garg', 0.048), ('rat', 0.048), ('head', 0.047), ('arquez', 0.047), ('denoting', 0.047), ('multinomial', 0.046), ('unige', 0.043), ('local', 0.043), ('constituent', 0.043), ('grenager', 0.04), ('correlations', 0.04), ('induced', 0.04), ('ii', 0.039), ('indicator', 0.039), ('switzerland', 0.038), ('end', 0.035), ('repetition', 0.035), ('surdeanu', 0.035), ('thompson', 0.035), ('preferences', 0.034), ('adjacency', 0.033), ('bounded', 0.033), ('line', 0.033), ('counts', 0.032), ('car', 0.032), ('ia', 0.032), ('extreme', 0.032), ('geneva', 0.032), ('pu', 0.032), ('generative', 0.032), ('continue', 0.031), ('keeps', 0.031), ('passive', 0.03), ('sd', 0.029), ('variable', 0.028), ('cluster', 0.028), ('leads', 0.028), ('palmer', 0.027), ('intuition', 0.027), ('dirichlet', 0.027), ('labeling', 0.027), ('fi', 0.026), ('generate', 0.026), ('syntactic', 0.026), ('mapped', 0.026), ('da', 0.026), ('independently', 0.026), ('occur', 0.026), ('gold', 0.025), ('linking', 0.025), ('sparsity', 0.024), ('excluding', 0.024), ('informs', 0.024), ('dro', 0.024), ('fpt', 0.024), ('linkings', 0.024), ('zp', 0.024), ('drove', 0.024)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000006 209 acl-2012-Unsupervised Semantic Role Induction with Global Role Ordering

Author: Nikhil Garg ; James Henderson

2 0.32627741 64 acl-2012-Crosslingual Induction of Semantic Roles

Author: Ivan Titov ; Alexandre Klementiev

Abstract: We argue that multilingual parallel data provides a valuable source of indirect supervision for induction of shallow semantic representations. Specifically, we consider unsupervised induction of semantic roles from sentences annotated with automatically-predicted syntactic dependency representations and use a state-of-the-art generative Bayesian non-parametric model. At inference time, instead of only seeking the model which explains the monolingual data available for each language, we regularize the objective by introducing a soft constraint penalizing for disagreement in argument labeling on aligned sentences. We propose a simple approximate learning algorithm for our set-up which results in efficient inference. When applied to German-English parallel data, our method obtains a substantial improvement over a model trained without using the agreement signal, when both are tested on non-parallel sentences.

3 0.14432791 176 acl-2012-Sentence Compression with Semantic Role Constraints

Author: Katsumasa Yoshikawa ; Ryu Iida ; Tsutomu Hirao ; Manabu Okumura

Abstract: For sentence compression, we propose new semantic constraints to directly capture the relations between a predicate and its arguments, whereas the existing approaches have focused on relatively shallow linguistic properties, such as lexical and syntactic information. These constraints are based on semantic roles and superior to the constraints of syntactic dependencies. Our empirical evaluation on the Written News Compression Corpus (Clarke and Lapata, 2008) demonstrates that our system achieves results comparable to other state-of-the-art techniques.

4 0.14095636 147 acl-2012-Modeling the Translation of Predicate-Argument Structure for SMT

Author: Deyi Xiong ; Min Zhang ; Haizhou Li

Abstract: Predicate-argument structure contains rich semantic information of which statistical machine translation hasn’t taken full advantage. In this paper, we propose two discriminative, feature-based models to exploit predicate-argument structures for statistical machine translation: 1) a predicate translation model and 2) an argument reordering model. The predicate translation model explores lexical and semantic contexts surrounding a verbal predicate to select desirable translations for the predicate. The argument reordering model automatically predicts the moving direction of an argument relative to its predicate after translation using semantic features. The two models are integrated into a state-of-the-art phrase-based machine translation system and evaluated on Chinese-to-English translation tasks with large-scale training data. Experimental results demonstrate that the two models significantly improve translation accuracy.

5 0.091070406 172 acl-2012-Selective Sharing for Multilingual Dependency Parsing

Author: Tahira Naseem ; Regina Barzilay ; Amir Globerson

Abstract: We present a novel algorithm for multilingual dependency parsing that uses annotations from a diverse set of source languages to parse a new unannotated language. Our motivation is to broaden the advantages of multilingual learning to languages that exhibit significant differences from existing resource-rich languages. The algorithm learns which aspects of the source languages are relevant for the target language and ties model parameters accordingly. The model factorizes the process of generating a dependency tree into two steps: selection of syntactic dependents and their ordering. Being largely language-universal, the selection component is learned in a supervised fashion from all the training languages. In contrast, the ordering decisions are only influenced by languages with similar properties. We systematically model this cross-lingual sharing using typological features. In our experiments, the model consistently outperforms a state-of-the-art multilingual parser. The largest improvement is achieved on the non Indo-European languages yielding a gain of 14.4%.

6 0.074179389 33 acl-2012-Automatic Event Extraction with Structured Preference Modeling

7 0.07042031 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures

8 0.068683237 167 acl-2012-QuickView: NLP-based Tweet Search

9 0.060733221 90 acl-2012-Extracting Narrative Timelines as Temporal Dependency Structures

10 0.059061196 4 acl-2012-A Comparative Study of Target Dependency Structures for Statistical Machine Translation

11 0.052619249 191 acl-2012-Temporally Anchored Relation Extraction

12 0.050045229 48 acl-2012-Classifying French Verbs Using French and English Lexical Resources

13 0.048088882 117 acl-2012-Improving Word Representations via Global Context and Multiple Word Prototypes

14 0.047334518 158 acl-2012-PORT: a Precision-Order-Recall MT Evaluation Metric for Tuning

15 0.047323752 208 acl-2012-Unsupervised Relation Discovery with Sense Disambiguation

16 0.047058631 108 acl-2012-Hierarchical Chunk-to-String Translation

17 0.045905076 5 acl-2012-A Comparison of Chinese Parsers for Stanford Dependencies

18 0.042679086 57 acl-2012-Concept-to-text Generation via Discriminative Reranking

19 0.041791622 213 acl-2012-Utilizing Dependency Language Models for Graph-based Dependency Parsing Models

20 0.040754382 109 acl-2012-Higher-order Constituent Parsing and Parser Combination

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.151), (1, 0.03), (2, -0.077), (3, 0.02), (4, -0.012), (5, -0.045), (6, -0.02), (7, 0.079), (8, 0.043), (9, 0.064), (10, 0.015), (11, -0.079), (12, 0.118), (13, -0.086), (14, -0.37), (15, 0.074), (16, 0.054), (17, -0.098), (18, 0.031), (19, 0.159), (20, 0.01), (21, 0.054), (22, 0.006), (23, -0.077), (24, 0.02), (25, 0.01), (26, 0.078), (27, 0.052), (28, -0.047), (29, 0.005), (30, -0.034), (31, -0.04), (32, -0.012), (33, -0.005), (34, 0.063), (35, 0.097), (36, -0.161), (37, 0.044), (38, -0.076), (39, -0.067), (40, 0.115), (41, -0.09), (42, -0.126), (43, 0.013), (44, -0.047), (45, 0.011), (46, -0.086), (47, -0.185), (48, -0.167), (49, 0.008)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97149622 209 acl-2012-Unsupervised Semantic Role Induction with Global Role Ordering

Author: Nikhil Garg ; James Henderson

2 0.86491686 64 acl-2012-Crosslingual Induction of Semantic Roles

Author: Ivan Titov ; Alexandre Klementiev

3 0.5319038 147 acl-2012-Modeling the Translation of Predicate-Argument Structure for SMT

Author: Deyi Xiong ; Min Zhang ; Haizhou Li

4 0.49591795 176 acl-2012-Sentence Compression with Semantic Role Constraints

Author: Katsumasa Yoshikawa ; Ryu Iida ; Tsutomu Hirao ; Manabu Okumura

5 0.40418777 53 acl-2012-Combining Textual Entailment and Argumentation Theory for Supporting Online Debates Interactions

Author: Elena Cabrio ; Serena Villata

Abstract: Blogs and forums are widely adopted by online communities to debate about various issues. However, a user that wants to cut in on a debate may experience some difficulties in extracting the current accepted positions, and can be discouraged from interacting through these applications. In our paper, we combine textual entailment with argumentation theory to automatically extract the arguments from debates and to evaluate their acceptability.

6 0.37905526 172 acl-2012-Selective Sharing for Multilingual Dependency Parsing

7 0.31493947 33 acl-2012-Automatic Event Extraction with Structured Preference Modeling

8 0.30591148 11 acl-2012-A Feature-Rich Constituent Context Model for Grammar Induction

9 0.28675467 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures

10 0.27682206 129 acl-2012-Learning High-Level Planning from Text

11 0.25076652 167 acl-2012-QuickView: NLP-based Tweet Search

12 0.24811888 34 acl-2012-Automatically Learning Measures of Child Language Development

13 0.24442405 127 acl-2012-Large-Scale Syntactic Language Modeling with Treelets

14 0.23037712 117 acl-2012-Improving Word Representations via Global Context and Multiple Word Prototypes

15 0.22861962 4 acl-2012-A Comparative Study of Target Dependency Structures for Statistical Machine Translation

16 0.22405177 57 acl-2012-Concept-to-text Generation via Discriminative Reranking

17 0.21999374 218 acl-2012-You Had Me at Hello: How Phrasing Affects Memorability

18 0.21858364 189 acl-2012-Syntactic Annotations for the Google Books NGram Corpus

19 0.21778744 68 acl-2012-Decoding Running Key Ciphers

20 0.21746969 112 acl-2012-Humor as Circuits in Semantic Networks

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(26, 0.443), (28, 0.031), (30, 0.021), (37, 0.085), (39, 0.03), (74, 0.025), (82, 0.018), (85, 0.02), (90, 0.098), (92, 0.054), (94, 0.019), (99, 0.032)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.91504306 91 acl-2012-Extracting and modeling durations for habits and events from Twitter

Author: Jennifer Williams ; Graham Katz

Abstract: We seek to automatically estimate typical durations for events and habits described in Twitter tweets. A corpus of more than 14 million tweets containing temporal duration information was collected. These tweets were classified as to their habituality status using a bootstrapped decision tree. For each verb lemma, associated duration information was collected for episodic and habitual uses of the verb. Summary statistics for 483 verb lemmas and their typical habit and episode durations have been compiled and made available. This automatically generated duration information is broadly comparable to hand-annotation.

same-paper 2 0.90690953 209 acl-2012-Unsupervised Semantic Role Induction with Global Role Ordering

Author: Nikhil Garg ; James Henderson

3 0.88078624 43 acl-2012-Building Trainable Taggers in a Web-based, UIMA-Supported NLP Workbench

Author: Rafal Rak ; BalaKrishna Kolluru ; Sophia Ananiadou

Abstract: Argo is a web-based NLP and text mining workbench with a convenient graphical user interface for designing and executing processing workflows of various complexity. The workbench is intended for specialists and nontechnical audiences alike, and provides the ever expanding library of analytics compliant with the Unstructured Information Management Architecture, a widely adopted interoperability framework. We explore the flexibility of this framework by demonstrating workflows involving three processing components capable of performing self-contained machine learning-based tagging. The three components are responsible for the three distinct tasks of 1) generating observations or features, 2) training a statistical model based on the generated features, and 3) tagging unlabelled data with the model. The learning and tagging components are based on an implementation of conditional random fields (CRF); whereas the feature generation component is an analytic capable of extending basic token information to a comprehensive set of features. Users define the features of their choice directly from Argo’s graphical interface, without resorting to programming (a commonly used approach to feature engineering). The experimental results performed on two tagging tasks, chunking and named entity recognition, showed that a tagger with a generic set of features built in Argo is capable of competing with task-specific solutions.

4 0.82722157 134 acl-2012-Learning to Find Translations and Transliterations on the Web

Author: Joseph Z. Chang ; Jason S. Chang ; Roger Jyh-Shing Jang

Abstract: In this paper, we present a new method for learning to find translations and transliterations on the Web for a given term. The approach involves using a small set of terms and translations to obtain mixed-code snippets from a search engine, and automatically annotating the snippets with tags and features for training a conditional random field model. At runtime, the model is used to extract translation candidates for a given term. Preliminary experiments and evaluation show that our method cleanly combines various features, resulting in a system that outperforms previous work.

5 0.71565539 41 acl-2012-Bootstrapping a Unified Model of Lexical and Phonetic Acquisition

Author: Micha Elsner ; Sharon Goldwater ; Jacob Eisenstein

Abstract: During early language acquisition, infants must learn both a lexicon and a model of phonetics that explains how lexical items can vary in pronunciation; for instance, “the” might be realized as [Di] or [D@]. Previous models of acquisition have generally tackled these problems in isolation, yet behavioral evidence suggests infants acquire lexical and phonetic knowledge simultaneously. We present a Bayesian model that clusters together phonetic variants of the same lexical item while learning both a language model over lexical items and a log-linear model of pronunciation variability based on articulatory features. The model is trained on transcribed surface pronunciations, and learns by bootstrapping, without access to the true lexicon. We test the model using a corpus of child-directed speech with realistic phonetic variation and either gold standard or automatically induced word boundaries. In both cases modeling variability improves the accuracy of the learned lexicon over a system that assumes each lexical item has a unique pronunciation.

6 0.5654251 48 acl-2012-Classifying French Verbs Using French and English Lexical Resources

7 0.55443037 187 acl-2012-Subgroup Detection in Ideological Discussions

8 0.50818902 11 acl-2012-A Feature-Rich Constituent Context Model for Grammar Induction

9 0.49558139 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers

10 0.49409276 5 acl-2012-A Comparison of Chinese Parsers for Stanford Dependencies

11 0.48889771 83 acl-2012-Error Mining on Dependency Trees

12 0.4885869 147 acl-2012-Modeling the Translation of Predicate-Argument Structure for SMT

13 0.48345912 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures

14 0.47997671 211 acl-2012-Using Rejuvenation to Improve Particle Filtering for Bayesian Word Segmentation

15 0.47668985 64 acl-2012-Crosslingual Induction of Semantic Roles

16 0.4759472 130 acl-2012-Learning Syntactic Verb Frames using Graphical Models

17 0.47150862 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base

18 0.4713968 45 acl-2012-Capturing Paradigmatic and Syntagmatic Lexical Relations: Towards Accurate Chinese Part-of-Speech Tagging

19 0.46972471 148 acl-2012-Modified Distortion Matrices for Phrase-Based Statistical Machine Translation

20 0.46894646 84 acl-2012-Estimating Compact Yet Rich Tree Insertion Grammars