emnlp emnlp2013 emnlp2013-198 knowledge-graph by maker-knowledge-mining

198 emnlp-2013-Using Soft Constraints in Joint Inference for Clinical Concept Recognition


Source: pdf

Author: Prateek Jindal ; Dan Roth

Abstract: This paper introduces IQPs (Integer Quadratic Programs) as a way to model joint inference for the task of concept recognition in the clinical domain. IQPs make it possible to easily incorporate soft constraints in the optimization framework and still support exact global inference. We show that soft constraints give statistically significant performance improvements when compared to hard constraints.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract This paper introduces IQPs (Integer Quadratic Programs) as a way to model joint inference for the task of concept recognition in the clinical domain. [sent-3, score-0.637]

2 IQPs make it possible to easily incorporate soft constraints in the optimization framework and still support exact global inference. [sent-4, score-0.467]

3 We show that soft constraints give statistically significant performance improvements when compared to hard constraints. [sent-5, score-0.404]

4 1 Introduction In this paper, we study the problem of concept recognition in the clinical domain. [sent-6, score-0.504]

5 , 2012; Roberts and Harabagiu, 2011; Jindal and Roth, 2013) for concept recognition in the clinical domain (Uzuner et al. [sent-13, score-0.504]

6 These approaches are limited by the fact that they can model only local dependencies (most often, first-order models like linear chain CRFs are used to allow tractable inference). [sent-17, score-0.036]

7 Knowledge in this domain can be thought of as belonging to two categories: (1) Background Knowledge captured in medical ontologies like UMLS (Url1, 2013), MeSH and SNOMED CT and (2) Discourse Knowledge driven by the fact that the narratives adhere to a specific writing style. [sent-19, score-0.309]

8 While the former can be used by generating more expressive knowledge-rich features, the latter is more interesting from our current perspective, 1808 since it provides global constraints on what output structures are likely and what are not. [sent-20, score-0.258]

9 We exploit this structural knowledge in our global inference formulation. [sent-21, score-0.2]

10 Integer Linear Programming (ILP) based approaches have been used for global inference in many works (Roth and Yih, 2004; Punyakanok et al. [sent-22, score-0.2]

11 However, in most of these works, researchers have focused only on hard constraints while formulating the inference problem. [sent-28, score-0.455]

12 Formulating all the constraints as hard constraints is not always desirable because the constraints are not perfect in many cases. [sent-29, score-0.641]

13 In this paper, we propose Integer Quadratic Programs (IQPs) as a way of formulating the inference problem. [sent-30, score-0.196]

14 IQPs are a richer family of models than ILPs and enable us to easily incorporate soft constraints into the inference procedure. [sent-31, score-0.469]

15 Our experimental results show that soft constraints indeed give much better performance than hard constraints. [sent-32, score-0.404]

16 2 Identifying Medical Concepts Task Description Our input consists of clinical reports in free-text (unstructured) format. [sent-33, score-0.074]

17 The task is: (1) to identify the boundaries of medical concepts and (2) to assign types to such concepts. [sent-34, score-0.518]

18 We refer to these three types as TEST, TRE and PROB in the following discussion. [sent-36, score-0.036]

19 Our Approach In the first step, we identify the concept boundaries using a CRF (with BIO encoding). [sent-37, score-0.2]

20 Figure 1 (Examples 1 and 2): This figure motivates the global inference procedure we used. [sent-41, score-0.229]

21 After finding concept boundaries, we determine the probability distribution for each concept over 4 possible types (TEST, TRE, PROB or NULL). [sent-46, score-0.368]

22 Inference Procedure: The final assignment of types to concepts is determined by an inference procedure. [sent-49, score-0.367]

23 The basic principle behind our inference procedure is: “Types of concepts which appear close to one another are often closely related. [sent-50, score-0.36]

24 For some concepts, the type can be determined with more confidence. [sent-51, score-0.046]

25 And relations between concepts' types guide the inference procedure to determine the types of other concepts. [sent-52, score-0.432]

26 Figure 1 shows two sentences in which the concepts are shown in brackets and the correct (gold) types of the concepts are shown above them. [sent-54, score-0.432]

27 First, consider the first and second concepts in Figure 1a. [sent-55, score-0.198]

28 These concepts follow the pattern: [Concept1] gave positive evidence for [Concept2]. [sent-56, score-0.198]

29 In clinical narratives, such a pattern strongly suggests that Concept1 is of type TEST and Concept2 is of type PROB. [sent-57, score-0.43]

30 These concepts are separated by commas and, hence, form a list. [sent-61, score-0.198]

31 It is highly likely that such concepts should have the same type. [sent-62, score-0.198]

32 Each of the m concepts has to be assigned one of the following types: TEST, TRE, PROB or NULL. [sent-65, score-0.198]

33 To represent this as an inference problem, we define the indicator variables xi,j, where i takes values from 1 to m (corresponding to concepts) and j takes values from 1 to 4 (corresponding to the 4 possible types). [sent-66, score-0.133]

34 pi,j refers to the probability that the ith concept has type j. [sent-67, score-0.212]

35 Equation (2) enforces the constraint that each concept has a unique type. [sent-69, score-0.301]
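As a concrete illustration of this base formulation (the objective over pi,j plus the unique-type constraint of Equation (2)), here is a minimal sketch using the Gurobi Python API; Gurobi appears among this paper's topic words, but the function and variable names (base_model, probs) are our own illustrative assumptions, not the authors' code.

import gurobipy as gp
from gurobipy import GRB

def base_model(probs):
    """probs[i][j] = p_{i,j}: probability that concept i has type j,
    with j indexing the 4 types (TEST, TRE, PROB, NULL)."""
    n = len(probs)
    model = gp.Model("concept-typing")
    # Indicator variables: x[i, j] = 1 iff concept i is assigned type j.
    x = model.addVars(n, 4, vtype=GRB.BINARY, name="x")
    # Objective (1): maximize the total confidence of the chosen assignment.
    model.setObjective(
        gp.quicksum(probs[i][j] * x[i, j] for i in range(n) for j in range(4)),
        GRB.MAXIMIZE)
    # Equation (2) / Type-1 constraint: each concept gets exactly one type.
    model.addConstrs((x.sum(i, "*") == 1 for i in range(n)), name="unique")
    return model, x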

36 1 Constraints Used In this subsection, we will describe two additional types of constraints (Type-2 and Type-3) that were added to the optimization procedure described above. [sent-72, score-0.32]

37 Whereas Type-1 constraints described above were formulated as hard constraints, Type-2 and Type-3 constraints are formulated as soft constraints. [sent-73, score-0.595]

38 suggest that the 2 concepts appearing in them should have the same type. [sent-77, score-0.198]

39 Also, assume that the lth constraint says that the concepts Rl and Sl should have the same type. [sent-80, score-0.368]

40 To model this, we define a variable wl as follows: wl = Σ_{m=1}^{4} (xRl,m − xSl,m)² (4). Now, if the concepts Rl and Sl have the same type, then wl would be equal to 0; otherwise, wl would be equal to 2. [sent-81, score-0.414]

41 So, the lth constraint can be enforced by subtracting (ρ2 · wl/2) from the objective function given by Equation (1). [sent-82, score-0.221]

42 Thus, a penalty of ρ2 would be enforced iff this constraint is violated. [sent-83, score-0.232]
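In code, such a quadratic penalty can be expressed directly, since an IQP solver accepts quadratic objectives without manual linearization; a sketch continuing the hypothetical gurobipy model above:

import gurobipy as gp

def type2_penalty(x, pairs, rho2):
    """Quadratic penalty for Type-2 (same-type) constraints.
    For each pair (r, s) = (R_l, S_l), w_l = sum_j (x[r,j] - x[s,j])^2 is
    0 when the two concepts share a type and 2 otherwise, so subtracting
    (rho2 / 2) * w_l from the objective charges exactly rho2 per violation."""
    return gp.quicksum(
        (rho2 / 2.0) * (x[r, j] - x[s, j]) * (x[r, j] - x[s, j])
        for r, s in pairs for j in range(4))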

43 2 Type-3 Constraints Some short patterns suggest possible types for the concepts which appear in them. [sent-86, score-0.234]

44 Each such pattern, thus, enforces a constraint on the types of the corresponding concepts. [sent-87, score-0.171]

45 Also, assume that the kth constraint says that the concept A1,k should have the type B1,k and that the concept A2,k should have the type B2,k. [sent-90, score-0.612]

46 Equivalently, the kth constraint can be written as follows in boolean algebra notation: (xA1,k,B1,k = 1) ∧ (xA2,k,B2,k = 1). [sent-91, score-0.178]

47 For the kth constraint, we introduce one more variable zk ∈ {0, 1} which satisfies the following condition: zk = 1 ⇔ xA1,k,B1,k ∧ xA2,k,B2,k (5). Using boolean algebra, it is easy to show that Equation (5) can be reduced to a set of linear inequalities. [sent-92, score-0.374]
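The reduction of Equation (5) to linear inequalities is the standard linearization of a boolean AND; a sketch, with the same hypothetical model and naming as above:

from gurobipy import GRB

def add_type3_indicator(model, x, a1, b1, a2, b2, k):
    """Add z_k in {0, 1} with z_k = 1 <=> (x[a1,b1] = 1) AND (x[a2,b2] = 1),
    via the usual three linear inequalities for a conjunction."""
    z = model.addVar(vtype=GRB.BINARY, name="z%d" % k)
    model.addConstr(z <= x[a1, b1])                  # z = 1 forces the first literal
    model.addConstr(z <= x[a2, b2])                  # z = 1 forces the second literal
    model.addConstr(z >= x[a1, b1] + x[a2, b2] - 1)  # both literals force z = 1
    return z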

48 Thus, we can incorporate the kth constraint in the optimization problem by adding to it the constraint given by Equation (5) and by subtracting (ρ3 · (1 − zk)) from the objective function given by Equation (1). [sent-93, score-0.055]

49 Figure 2 (Final Optimization Problem, an IQP): maximize Σ_{i=1}^{m} Σ_{j=1}^{4} xi,j · pi,j − Σ_{l=1}^{n2} (ρ2/2) · Σ_{m=1}^{4} (xRl,m − xSl,m)² − Σ_{k=1}^{n3} ρ3 · (1 − zk) (6) [sent-94, score-0.127]

50 subject to: Σ_{j=1}^{4} xi,j = 1 ∀i (7); xi,j ∈ {0, 1} ∀i, j (8); zk = 1 ⇔ xA1,k,B1,k ∧ xA2,k,B2,k ∀k ∈ {1..n3} (9). [sent-97, score-0.213]

51 Thus, a penalty of ρ3 is imposed iff the kth constraint is not satisfied (zk = 0). [sent-98, score-0.252]

52 2 Final Optimization Problem - An IQP After incorporating all the constraints mentioned above, the final optimization problem (an IQP) is shown in Figure 2. [sent-100, score-0.255]
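Putting the pieces together, here is a hedged end-to-end sketch of the IQP of Figure 2 (Equations (6)-(9)), assembled from the hypothetical helpers above; the paper does not publish its implementation, so this is only one plausible encoding:

from gurobipy import GRB

def solve_iqp(probs, type2_pairs, type3_patterns, rho2, rho3):
    """type3_patterns: list of ((A1, B1), (A2, B2)) index pairs, one per
    Type-3 pattern. Returns the assigned type index for each concept."""
    model, x = base_model(probs)               # objective (1), constraints (7), (8)
    obj = model.getObjective() - type2_penalty(x, type2_pairs, rho2)
    for k, ((a1, b1), (a2, b2)) in enumerate(type3_patterns):
        z = add_type3_indicator(model, x, a1, b1, a2, b2, k)   # Equation (9)
        obj = obj - rho3 * (1 - z)             # penalty rho3 iff z_k = 0
    model.setObjective(obj, GRB.MAXIMIZE)      # Equation (6)
    model.optimize()
    # Read back the 0/1 solution: the type j with x[i, j] = 1 for each concept i.
    return [max(range(4), key=lambda j: x[i, j].X) for i in range(len(probs))]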

53 The datasets used for this shared task contained de-identified clinical reports from three medical institutions: Partners Healthcare (PH), Beth-Israel Deaconess Medical Center (BIDMC) and the University of Pittsburgh Medical Center (UPMC). [sent-107, score-0.662]

54 UPMC data was divided into 2 sections, namely discharge (UPMCD) and progress notes (UPMCP). [sent-108, score-0.08]

55 A total of 349 training reports and 477 test reports were made available to the participants. [sent-109, score-0.148]

56 As a result, we had only 170 clinical reports for training and 256 clinical reports for testing. [sent-111, score-0.824]

57 Table 3 shows the number of clinical reports made available by different institutions. [sent-112, score-0.412]

58 Table 2: Our final system, BKC, consistently performed the best among all 4 systems (B, BK, BC and BKC). [sent-125, score-0.035]

59 Neither B nor BK uses the inference procedure. [sent-132, score-0.133]

60 BKC uses all the features and also the inference procedure. [sent-133, score-0.133]

61 However, it sets ρ2 = ρ3 = 1, which effectively turns the Type-2 and Type-3 constraints into hard constraints by imposing a very high penalty. [sent-136, score-0.45]

62 1 Importance of Soft Constraints Figures 3a and 3b show the effect of varying the penalties (ρ2 and ρ3) for Type-2 and Type-3 constraints respectively. [sent-139, score-0.191]

63 These figures show the F1 score of BKC on the development set. [sent-140, score-0.041]

64 A penalty of 0 means that the constraint is not active. [sent-141, score-0.103]

65 As we increase the penalty, the constraint becomes stronger. [sent-142, score-0.103]

66 As the penalty becomes 1, the constraint becomes hard in the sense that final assignments must respect it. [sent-143, score-0.233]

67 Figure 3: These figures show the result of tuning the penalty parameters for soft constraints: (a) the penalty parameter for Type-2 constraints (ρ2) and (b) the penalty parameter for Type-3 constraints (ρ3). [sent-153, score-0.248]

68 Table 4: Soft constraints (BKC) consistently perform much better than hard constraints (BKC-HARD). [sent-156, score-0.485]

69 We observe from Figures 3a and 3b that for Type-2 and Type-3 constraints, the global maxima are attained at ρ2 = 0. [sent-158, score-0.067]
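Operationally, this tuning amounts to a small one-dimensional sweep per penalty on the development set; a sketch, where dev_f1 is a hypothetical function that runs the full BKC pipeline with the given penalties and returns its F1 score (the sweep order and the values held fixed are our assumptions, not the paper's stated protocol):

def tune_penalties(dev_f1):
    """Sweep rho2 and rho3 over [0, 1] on the development set, one at a
    time, mirroring the curves in Figures 3a and 3b."""
    grid = [i / 10.0 for i in range(11)]   # 0.0, 0.1, ..., 1.0
    best_rho2 = max(grid, key=lambda r: dev_f1(rho2=r, rho3=0.0))
    best_rho3 = max(grid, key=lambda r: dev_f1(rho2=best_rho2, rho3=r))
    return best_rho2, best_rho3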

70 Figure 4: This figure shows the effect of training data size (# clinical reports) on the performance of concept recognition. [sent-168, score-0.166]

71 We see from Table 2 that BKC consistently performed the best for individual as well as overall categories. [sent-174, score-0.035]

72 Thus, the constraints are helpful even in the absence of knowledge-based features. [sent-178, score-0.191]

73 We also notice from the figure that BKC consistently outperforms the state-of-the-art BK system as we vary the size of the training data, indicating the robustness of the joint inference procedure. [sent-185, score-0.168]

74 5 Discussion and Related Work In this paper, we chose to train a rather simple sequential model (using a CRF), and focused on incorporating global constraints only at inference time. [sent-187, score-0.391]

75 While it is possible to jointly train the model with the global constraints (as illustrated by Chang et al. [sent-188, score-0.258]

76 Roth and Yih (2004, 2007) suggested the use of integer programs to model joint inference in a fully supervised setting. [sent-192, score-0.278]

77 However, they used only hard constraints in their inference formulation. [sent-194, score-0.392]

78 Chang et al. (2012) extended the ILP formulation and used soft constraints within the Constrained Conditional Model formulation (Chang, 2011). [sent-196, score-0.336]

79 In this paper, we extended integer linear programming to a quadratic formulation, arguing that it simplifies the modeling step, and showed that it is possible to do exact inference efficiently. [sent-198, score-0.368]

80 Conclusion This paper presented a global inference strategy (using IQP) for concept recognition which allows us to model structural knowledge of the clinical domain as soft constraints in the optimization framework. [sent-199, score-1.104]

81 Our results showed that soft constraints are more effective than hard constraints. [sent-200, score-0.404]

82 In Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, pages 359–366. [sent-221, score-0.034]

83 In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pages 189–198. [sent-231, score-0.034]

84 In Association for Computational Linguistics, pages 280–287, Prague, Czech Republic, 6. [sent-248, score-0.034]

85 In Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task, pages 40–44, Portland, Oregon, USA. [sent-260, score-0.034]

86 In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 1–1 1. [sent-276, score-0.034]

87 Global inference for sentence compression: An integer linear programming approach. [sent-281, score-0.328]

88 Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. [sent-290, score-0.338]

89 Joint determination of anaphoricity and coreference resolution using integer programming. [sent-296, score-0.125]

90 In Proceedings of Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, pages 236–243. [sent-297, score-0.034]

91 A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries. [sent-313, score-0.471]

92 Hybrid methods for improving information access in clinical documents: Concept, assertion, and relation identification. [sent-372, score-0.338]

93 A knowledge discovery and reuse pipeline for information extraction in clinical notes. [sent-383, score-0.338]

94 The importance of syntactic parsing and inference in semantic role labeling. [sent-396, score-0.133]

95 A flexible framework for deriving assertions from electronic medical records. [sent-410, score-0.303]

96 A linear programming formulation for global inference in natural language tasks. [sent-416, score-0.301]

97 Global inference for entity and relation identification via a linear programming formulation. [sent-429, score-0.234]

98 Using machine learning for concept extraction on clinical documents from multiple data sources. [sent-436, score-0.504]

99 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. [sent-481, score-0.338]

100 Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries. [sent-490, score-0.418]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('clinical', 0.338), ('bkc', 0.293), ('medical', 0.25), ('prob', 0.232), ('concepts', 0.198), ('constraints', 0.191), ('concept', 0.166), ('iqps', 0.16), ('metamap', 0.16), ('snomed', 0.16), ('soft', 0.145), ('bk', 0.138), ('tre', 0.135), ('inference', 0.133), ('zk', 0.127), ('mesh', 0.116), ('roth', 0.114), ('constranits', 0.107), ('iqp', 0.107), ('penatly', 0.107), ('constraint', 0.103), ('accessed', 0.101), ('integer', 0.094), ('july', 0.088), ('punyakanok', 0.085), ('informatics', 0.083), ('discharge', 0.08), ('ilps', 0.08), ('upmc', 0.08), ('reports', 0.074), ('umls', 0.07), ('clarke', 0.069), ('crf', 0.068), ('mann', 0.068), ('bc', 0.068), ('hard', 0.068), ('global', 0.067), ('programming', 0.065), ('chang', 0.064), ('optimization', 0.064), ('uzuner', 0.063), ('formulating', 0.063), ('yih', 0.063), ('penalty', 0.062), ('narratives', 0.059), ('kth', 0.055), ('ct', 0.055), ('wl', 0.054), ('aronson', 0.053), ('assoc', 0.053), ('bramsen', 0.053), ('bruijn', 0.053), ('headword', 0.053), ('hhs', 0.053), ('marciniak', 0.053), ('med', 0.053), ('minard', 0.053), ('szie', 0.053), ('torii', 0.053), ('traninig', 0.053), ('tunnig', 0.053), ('assertions', 0.053), ('jindal', 0.053), ('programs', 0.051), ('mccallum', 0.051), ('subtracting', 0.046), ('memm', 0.046), ('algebra', 0.046), ('roberts', 0.046), ('type', 0.046), ('equation', 0.044), ('info', 0.042), ('american', 0.042), ('figures', 0.041), ('quadratic', 0.04), ('rl', 0.039), ('iarpa', 0.039), ('lth', 0.037), ('gideon', 0.037), ('gurobi', 0.037), ('http', 0.036), ('linear', 0.036), ('types', 0.036), ('ganchev', 0.035), ('enforced', 0.035), ('consistently', 0.035), ('boundaries', 0.034), ('pages', 0.034), ('illinois', 0.034), ('denis', 0.032), ('enforces', 0.032), ('iff', 0.032), ('coreference', 0.031), ('please', 0.03), ('says', 0.03), ('association', 0.03), ('procedure', 0.029), ('suppose', 0.029), ('resampling', 0.029), ('boolean', 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 198 emnlp-2013-Using Soft Constraints in Joint Inference for Clinical Concept Recognition

Author: Prateek Jindal ; Dan Roth

Abstract: This paper introduces IQPs (Integer Quadratic Programs) as a way to model joint inference for the task of concept recognition in the clinical domain. IQPs make it possible to easily incorporate soft constraints in the optimization framework and still support exact global inference. We show that soft constraints give statistically significant performance improvements when compared to hard constraints.

2 0.11935897 85 emnlp-2013-Fast Joint Compression and Summarization via Graph Cuts

Author: Xian Qian ; Yang Liu

Abstract: Extractive summarization typically uses sentences as summarization units. In contrast, joint compression and summarization can use smaller units such as words and phrases, resulting in summaries containing more information. The goal of compressive summarization is to find a subset of words that maximize the total score of concepts and cutting dependency arcs under the grammar constraints and summary length constraint. We propose an efficient decoding algorithm for fast compressive summarization using graph cuts. Our approach first relaxes the length constraint using Lagrangian relaxation. Then we propose to bound the relaxed objective function by the supermodular binary quadratic programming problem, which can be solved efficiently using graph max-flow/min-cut. Since finding the tightest lower bound suffers from local optimality, we use convex relaxation for initialization. Experimental results on TAC2008 dataset demonstrate our method achieves competitive ROUGE score and has good readability, while is much faster than the integer linear programming (ILP) method.

3 0.11809659 114 emnlp-2013-Joint Learning and Inference for Grammatical Error Correction

Author: Alla Rozovskaya ; Dan Roth

Abstract: State-of-the-art systems for grammatical error correction are based on a collection of independently-trained models for specific errors. Such models ignore linguistic interactions at the sentence level and thus do poorly on mistakes that involve grammatical dependencies among several words. In this paper, we identify linguistic structures with interacting grammatical properties and propose to address such dependencies via joint inference and joint learning. We show that it is possible to identify interactions well enough to facilitate a joint approach and, consequently, that joint methods correct incoherent predictions that independentlytrained classifiers tend to produce. Furthermore, because the joint learning model considers interacting phenomena during training, it is able to identify mistakes that require mak- ing multiple changes simultaneously and that standard approaches miss. Overall, our model significantly outperforms the Illinois system that placed first in the CoNLL-2013 shared task on grammatical error correction.

4 0.11437686 1 emnlp-2013-A Constrained Latent Variable Model for Coreference Resolution

Author: Kai-Wei Chang ; Rajhans Samdani ; Dan Roth

Abstract: Coreference resolution is a well known clustering task in Natural Language Processing. In this paper, we describe the Latent Left Linking model (L3M), a novel, principled, and linguistically motivated latent structured prediction approach to coreference resolution. We show that L3M admits efficient inference and can be augmented with knowledge-based constraints; we also present a fast stochastic gradient based learning. Experiments on ACE and Ontonotes data show that L3M and its constrained version, CL3M, are more accurate than several state-of-the-art approaches as well as some structured prediction models proposed in the literature.

5 0.098296545 53 emnlp-2013-Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization

Author: Kuzman Ganchev ; Dipanjan Das

Abstract: We present a framework for cross-lingual transfer of sequence information from a resource-rich source language to a resourceimpoverished target language that incorporates soft constraints via posterior regularization. To this end, we use automatically word aligned bitext between the source and target language pair, and learn a discriminative conditional random field model on the target side. Our posterior regularization constraints are derived from simple intuitions about the task at hand and from cross-lingual alignment information. We show improvements over strong baselines for two tasks: part-of-speech tagging and namedentity segmentation.

6 0.083098248 160 emnlp-2013-Relational Inference for Wikification

7 0.073619939 199 emnlp-2013-Using Topic Modeling to Improve Prediction of Neuroticism and Depression in College Students

8 0.072903536 42 emnlp-2013-Building Specialized Bilingual Lexicons Using Large Scale Background Knowledge

9 0.058385849 97 emnlp-2013-Identifying Web Search Query Reformulation using Concept based Matching

10 0.05797815 65 emnlp-2013-Document Summarization via Guided Sentence Compression

11 0.056466445 67 emnlp-2013-Easy Victories and Uphill Battles in Coreference Resolution

12 0.054853119 73 emnlp-2013-Error-Driven Analysis of Challenges in Coreference Resolution

13 0.052038081 149 emnlp-2013-Overcoming the Lack of Parallel Data in Sentence Compression

14 0.050323933 2 emnlp-2013-A Convex Alternative to IBM Model 2

15 0.049035136 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation

16 0.048846602 49 emnlp-2013-Combining Generative and Discriminative Model Scores for Distant Supervision

17 0.047884971 204 emnlp-2013-Word Level Language Identification in Online Multilingual Communication

18 0.047058173 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation

19 0.044594906 76 emnlp-2013-Exploiting Discourse Analysis for Article-Wide Temporal Classification

20 0.04445897 12 emnlp-2013-A Semantically Enhanced Approach to Determine Textual Similarity


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.167), (1, 0.048), (2, 0.058), (3, 0.016), (4, -0.037), (5, -0.011), (6, 0.111), (7, -0.016), (8, 0.029), (9, -0.009), (10, 0.005), (11, -0.031), (12, 0.021), (13, 0.074), (14, 0.023), (15, -0.041), (16, -0.06), (17, -0.024), (18, 0.011), (19, 0.063), (20, 0.019), (21, 0.114), (22, -0.009), (23, 0.023), (24, 0.065), (25, 0.011), (26, -0.09), (27, -0.16), (28, 0.009), (29, 0.028), (30, -0.116), (31, -0.051), (32, -0.102), (33, -0.01), (34, -0.068), (35, 0.11), (36, 0.144), (37, 0.11), (38, 0.128), (39, 0.106), (40, 0.096), (41, -0.051), (42, -0.07), (43, 0.024), (44, -0.124), (45, -0.052), (46, -0.162), (47, -0.141), (48, -0.05), (49, -0.064)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94152141 198 emnlp-2013-Using Soft Constraints in Joint Inference for Clinical Concept Recognition

Author: Prateek Jindal ; Dan Roth

Abstract: This paper introduces IQPs (Integer Quadratic Programs) as a way to model joint inference for the task of concept recognition in the clinical domain. IQPs make it possible to easily incorporate soft constraints in the optimization framework and still support exact global inference. We show that soft constraints give statistically significant performance improvements when compared to hard constraints.

2 0.68230814 114 emnlp-2013-Joint Learning and Inference for Grammatical Error Correction

Author: Alla Rozovskaya ; Dan Roth

Abstract: State-of-the-art systems for grammatical error correction are based on a collection of independently-trained models for specific errors. Such models ignore linguistic interactions at the sentence level and thus do poorly on mistakes that involve grammatical dependencies among several words. In this paper, we identify linguistic structures with interacting grammatical properties and propose to address such dependencies via joint inference and joint learning. We show that it is possible to identify interactions well enough to facilitate a joint approach and, consequently, that joint methods correct incoherent predictions that independentlytrained classifiers tend to produce. Furthermore, because the joint learning model considers interacting phenomena during training, it is able to identify mistakes that require mak- ing multiple changes simultaneously and that standard approaches miss. Overall, our model significantly outperforms the Illinois system that placed first in the CoNLL-2013 shared task on grammatical error correction.

3 0.47517467 199 emnlp-2013-Using Topic Modeling to Improve Prediction of Neuroticism and Depression in College Students

Author: Philip Resnik ; Anderson Garron ; Rebecca Resnik

Abstract: We investigate the value-add of topic modeling in text analysis for depression, and for neuroticism as a strongly associated personality measure. Using Pennebaker's Linguistic Inquiry and Word Count (LIWC) lexicon to provide baseline features, we show that straightforward topic modeling using Latent Dirichlet Allocation (LDA) yields interpretable, psychologically relevant "themes" that add value in prediction of clinical assessments.

4 0.45724395 53 emnlp-2013-Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization

Author: Kuzman Ganchev ; Dipanjan Das

Abstract: We present a framework for cross-lingual transfer of sequence information from a resource-rich source language to a resourceimpoverished target language that incorporates soft constraints via posterior regularization. To this end, we use automatically word aligned bitext between the source and target language pair, and learn a discriminative conditional random field model on the target side. Our posterior regularization constraints are derived from simple intuitions about the task at hand and from cross-lingual alignment information. We show improvements over strong baselines for two tasks: part-of-speech tagging and namedentity segmentation.

5 0.45493668 85 emnlp-2013-Fast Joint Compression and Summarization via Graph Cuts

Author: Xian Qian ; Yang Liu

Abstract: Extractive summarization typically uses sentences as summarization units. In contrast, joint compression and summarization can use smaller units such as words and phrases, resulting in summaries containing more information. The goal of compressive summarization is to find a subset of words that maximize the total score of concepts and cutting dependency arcs under the grammar constraints and summary length constraint. We propose an efficient decoding algorithm for fast compressive summarization using graph cuts. Our approach first relaxes the length constraint using Lagrangian relaxation. Then we propose to bound the relaxed objective function by the supermodular binary quadratic programming problem, which can be solved efficiently using graph max-flow/min-cut. Since finding the tightest lower bound suffers from local optimality, we use convex relaxation for initialization. Experimental results on TAC2008 dataset demonstrate our method achieves competitive ROUGE score and has good readability, while is much faster than the integer linear programming (ILP) method.

6 0.39583442 1 emnlp-2013-A Constrained Latent Variable Model for Coreference Resolution

7 0.3778199 86 emnlp-2013-Feature Noising for Log-Linear Structured Prediction

8 0.37666747 184 emnlp-2013-This Text Has the Scent of Starbucks: A Laplacian Structured Sparsity Model for Computational Branding Analytics

9 0.37245879 195 emnlp-2013-Unsupervised Spectral Learning of WCFG as Low-rank Matrix Completion

10 0.36819085 160 emnlp-2013-Relational Inference for Wikification

11 0.35094258 2 emnlp-2013-A Convex Alternative to IBM Model 2

12 0.34381208 34 emnlp-2013-Automatically Classifying Edit Categories in Wikipedia Revisions

13 0.32257617 188 emnlp-2013-Tree Kernel-based Negation and Speculation Scope Detection with Structured Syntactic Parse Features

14 0.3224867 46 emnlp-2013-Classifying Message Board Posts with an Extracted Lexicon of Patient Attributes

15 0.30684477 132 emnlp-2013-Mining Scientific Terms and their Definitions: A Study of the ACL Anthology

16 0.30650315 202 emnlp-2013-Where Not to Eat? Improving Public Policy by Predicting Hygiene Inspections Using Online Reviews

17 0.30067661 142 emnlp-2013-Open-Domain Fine-Grained Class Extraction from Web Search Queries

18 0.30030331 170 emnlp-2013-Sentiment Analysis: How to Derive Prior Polarities from SentiWordNet

19 0.29841265 42 emnlp-2013-Building Specialized Bilingual Lexicons Using Large Scale Background Knowledge

20 0.29185989 61 emnlp-2013-Detecting Promotional Content in Wikipedia


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.04), (16, 0.359), (18, 0.025), (22, 0.043), (30, 0.051), (50, 0.012), (51, 0.152), (66, 0.028), (71, 0.034), (75, 0.037), (77, 0.032), (95, 0.09), (96, 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.71546066 198 emnlp-2013-Using Soft Constraints in Joint Inference for Clinical Concept Recognition

Author: Prateek Jindal ; Dan Roth

Abstract: This paper introduces IQPs (Integer Quadratic Programs) as a way to model joint inference for the task of concept recognition in the clinical domain. IQPs make it possible to easily incorporate soft constraints in the optimization framework and still support exact global inference. We show that soft constraints give statistically significant performance improvements when compared to hard constraints.

2 0.69652379 184 emnlp-2013-This Text Has the Scent of Starbucks: A Laplacian Structured Sparsity Model for Computational Branding Analytics

Author: William Yang Wang ; Edward Lin ; John Kominek

Abstract: We propose a Laplacian structured sparsity model to study computational branding analytics. To do this, we collected customer reviews from Starbucks, Dunkin’ Donuts, and other coffee shops across 38 major cities in the Midwest and Northeastern regions of USA. We study the brand related language use through these reviews, with focuses on the brand satisfaction and gender factors. In particular, we perform three tasks: automatic brand identification from raw text, joint brand-satisfaction prediction, and joint brandgender-satisfaction prediction. This work extends previous studies in text classification by incorporating the dependency and interaction among local features in the form of structured sparsity in a log-linear model. Our quantitative evaluation shows that our approach which combines the advantages of graphical modeling and sparsity modeling techniques significantly outperforms various standard and stateof-the-art text classification algorithms. In addition, qualitative analysis of our model reveals important features of the language uses associated with the specific brands.

3 0.48214808 85 emnlp-2013-Fast Joint Compression and Summarization via Graph Cuts

Author: Xian Qian ; Yang Liu

Abstract: Extractive summarization typically uses sentences as summarization units. In contrast, joint compression and summarization can use smaller units such as words and phrases, resulting in summaries containing more information. The goal of compressive summarization is to find a subset of words that maximize the total score of concepts and cutting dependency arcs under the grammar constraints and summary length constraint. We propose an efficient decoding algorithm for fast compressive summarization using graph cuts. Our approach first relaxes the length constraint using Lagrangian relaxation. Then we propose to bound the relaxed objective function by the supermodular binary quadratic programming problem, which can be solved efficiently using graph max-flow/min-cut. Since finding the tightest lower bound suffers from local optimality, we use convex relaxation for initialization. Experimental results on TAC2008 dataset demonstrate our method achieves competitive ROUGE score and has good readability, while is much faster than the integer linear programming (ILP) method.

4 0.46882433 110 emnlp-2013-Joint Bootstrapping of Corpus Annotations and Entity Types

Author: Hrushikesh Mohapatra ; Siddhanth Jain ; Soumen Chakrabarti

Abstract: Web search can be enhanced in powerful ways if token spans in Web text are annotated with disambiguated entities from large catalogs like Freebase. Entity annotators need to be trained on sample mention snippets. Wikipedia entities and annotated pages offer high-quality labeled data for training and evaluation. Unfortunately, Wikipedia features only one-ninth the number of entities as Freebase, and these are a highly biased sample of well-connected, frequently mentioned “head” entities. To bring hope to “tail” entities, we broaden our goal to a second task: assigning types to entities in Freebase but not Wikipedia. The two tasks are synergistic: knowing the types of unfamiliar entities helps disambiguate mentions, and words in mention contexts help assign types to entities. We present TMI, a bipartite graphical model for joint type-mention inference. TMI attempts no schema integration or entity resolution, but exploits the above-mentioned synergy. In experiments involving 780,000 people in Wikipedia, 2.3 million people in Freebase, 700 million Web pages, and over 20 professional editors, TMI shows considerable annotation accuracy improvement (e.g., 70%) compared to baselines (e.g., 46%), especially for “tail” and emerging entities. We also compare with Google’s recent annotations of the same corpus with Freebase entities, and report considerable improvements within the people domain.

5 0.46814376 65 emnlp-2013-Document Summarization via Guided Sentence Compression

Author: Chen Li ; Fei Liu ; Fuliang Weng ; Yang Liu

Abstract: Joint compression and summarization has been used recently to generate high quality summaries. However, such word-based joint optimization is computationally expensive. In this paper we adopt the ‘sentence compression + sentence selection’ pipeline approach for compressive summarization, but propose to perform summary guided compression, rather than generic sentence-based compression. To create an annotated corpus, the human annotators were asked to compress sentences while explicitly given the important summary words in the sentences. Using this corpus, we train a supervised sentence compression model using a set of word-, syntax-, and documentlevel features. During summarization, we use multiple compressed sentences in the integer linear programming framework to select . salient summary sentences. Our results on the TAC 2008 and 2011 summarization data sets show that by incorporating the guided sentence compression model, our summarization system can yield significant performance gain as compared to the state-of-the-art.

6 0.4672989 1 emnlp-2013-A Constrained Latent Variable Model for Coreference Resolution

7 0.44769779 114 emnlp-2013-Joint Learning and Inference for Grammatical Error Correction

8 0.4397229 36 emnlp-2013-Automatically Determining a Proper Length for Multi-Document Summarization: A Bayesian Nonparametric Approach

9 0.43798691 48 emnlp-2013-Collective Personal Profile Summarization with Social Networks

10 0.43682435 179 emnlp-2013-Summarizing Complex Events: a Cross-Modal Solution of Storylines Extraction and Reconstruction

11 0.4345468 69 emnlp-2013-Efficient Collective Entity Linking with Stacking

12 0.43254244 152 emnlp-2013-Predicting the Presence of Discourse Connectives

13 0.43246815 56 emnlp-2013-Deep Learning for Chinese Word Segmentation and POS Tagging

14 0.43203923 53 emnlp-2013-Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization

15 0.43126068 160 emnlp-2013-Relational Inference for Wikification

16 0.4301509 132 emnlp-2013-Mining Scientific Terms and their Definitions: A Study of the ACL Anthology

17 0.42883658 193 emnlp-2013-Unsupervised Induction of Cross-Lingual Semantic Relations

18 0.42881218 194 emnlp-2013-Unsupervised Relation Extraction with General Domain Knowledge

19 0.4282214 51 emnlp-2013-Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction

20 0.4278881 181 emnlp-2013-The Effects of Syntactic Features in Automatic Prediction of Morphology