200 brendan oconnor ai-2013-09-13-Response on our movie personas paper


meta infos for this blog

Source: html

Introduction: Update (2013-09-17): See David Bamman’s great guest post on Language Log on our latent personas paper, and the big picture of interdisciplinary collaboration. I’ve been informed that an interesting critique of my, David Bamman’s and Noah Smith’s ACL paper on movie personas has appeared on the Language Log, a guest post by Hannah Alpert-Abrams and Dan Garrette. I posted the following as a comment on LL. Thanks everyone for the interesting comments. Scholarship is an ongoing conversation, and we hope our work might contribute to it. Responding to the concerns about our paper, we did not try to make a contribution to contemporary literary theory. Rather, we focus on developing a computational linguistic research method of analyzing characters in stories. We hope there is a place for both the development of new research methods, as well as actual new substantive findings. If you think about the tremendous possibilities for computer science and humanities collabor


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Update (2013-09-17): See David Bamman’s great guest post on Language Log on our latent personas paper, and the big picture of interdisciplinary collaboration. [sent-1, score-0.389]

2 I’ve been informed that an interesting critique of my, David Bamman’s and Noah Smith’s ACL paper on movie personas has appeared on the Language Log, a guest post by Hannah Alpert-Abrams and Dan Garrette. [sent-2, score-0.431]

3 Scholarship is an ongoing conversation, and we hope our work might contribute to it. [sent-5, score-0.331]

4 Responding to the concerns about our paper, we did not try to make a contribution to contemporary literary theory. [sent-6, score-0.164]

5 Rather, we focus on developing a computational linguistic research method of analyzing characters in stories. [sent-7, score-0.638]

6 We hope there is a place for both the development of new research methods, as well as actual new substantive findings. [sent-8, score-0.359]

7 If you think about the tremendous possibilities for computer science and humanities collaboration, there is far too much to do and we have to tackle pieces of the puzzle to move forward. [sent-9, score-0.334]

8 All the comments above show there is a wealth of interesting questions to further investigate. [sent-11, score-0.163]

9 We find that, in these multidisciplinary projects, it’s most useful to publish part of the work early and get scholarly feedback, instead of waiting for years before trying to write a “perfect” paper. [sent-13, score-0.656]

10 (And David’s co-teaching a cool digital humanities seminar with Christopher Warren in the English department this semester; I’m sure there will be great cross-fertilization of ideas coming out of there!) [sent-15, score-0.395]

11 For example, we’ve had useful feedback here already: besides comments from the computational linguistics community through the ACL paper, just in the discussion on LL there have been many interesting theories and references presented. [sent-16, score-0.76]

12 We’ve also been in conversation with other humanists, as we stated in our acknowledgments (noted by one commenter), though apparently not the same humanists that Alpert-Abrams and Garrette would rather we had talked to. [sent-17, score-0.369]

13 This is why it’s better to publish early and participate in the scholarly conversation. [sent-18, score-0.351]

14 For what it’s worth, some of these high-level debates on whether it’s appropriate to focus on progress in quantitative methods, versus directly on substantive findings, have been playing out for decades in the social sciences. [sent-19, score-0.432]

15 (I’m thinking specifically about economics and political science, both of which are far more quantitative today than they were just 50 years ago.) [sent-20, score-0.356]

16 And as several commenters have noted, and as we tried to note in our references, there’s certainly been plenty of computational work in literary/cultural analysis before. [sent-21, score-0.344]

17 But I do think the quantitative approach still tends to be seen as novel in the humanities, and as the original response notes, there have been some problematic proclamations in this area recently. [sent-22, score-0.336]

18 I just hope there’s room to try to advance things without being everyone’s punching bag for whether or not they liked the latest Steven Pinker essay. [sent-23, score-0.29]
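
The page does not say how the sentence scores above were computed. As a rough illustration only, one common way to rank sentences with tf-idf is to treat each sentence as its own small document and sum the tf-idf weights of its words. The sketch below assumes that scheme; the function names `tokenize` and `tfidf_sentence_scores` are hypothetical, and the numbers it produces are not expected to match the scores listed here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Deliberately simple, hypothetical tokenizer: lowercase runs of letters/apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def tfidf_sentence_scores(sentences):
    """Score each sentence by the sum of tf-idf weights of its distinct words.

    Assumptions (not taken from the source page): every sentence counts as one
    document, tf is the relative frequency of a word within the sentence, and
    idf is log(N / document_frequency).
    """
    token_lists = [tokenize(s) for s in sentences]
    n = len(token_lists)
    # Document frequency: in how many sentences does each word occur?
    df = Counter(word for tokens in token_lists for word in set(tokens))
    scores = []
    for tokens in token_lists:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        score = sum(
            (count / len(tokens)) * math.log(n / df[word])
            for word, count in tf.items()
        )
        scores.append(score)
    return scores


if __name__ == "__main__":
    demo = [
        "We hope our work might contribute to the conversation.",
        "We hope there is a place for new research methods.",
        "Quantitative approaches are still seen as novel in the humanities.",
    ]
    for sentence, score in zip(demo, tfidf_sentence_scores(demo)):
        print(f"{score:.3f}  {sentence}")
```

Under this scheme, sentences built from words that are rare across the post score higher than sentences made of words shared by many sentences, which is the usual intuition behind tf-idf-based extractive summaries.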


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('noah', 0.247), ('david', 0.225), ('humanities', 0.21), ('work', 0.174), ('computational', 0.17), ('noted', 0.169), ('analyzing', 0.169), ('quantitative', 0.169), ('hope', 0.157), ('guest', 0.142), ('linguistics', 0.142), ('scholarly', 0.142), ('digital', 0.123), ('conversation', 0.123), ('humanists', 0.123), ('feedback', 0.123), ('publish', 0.123), ('bamman', 0.123), ('personas', 0.123), ('research', 0.116), ('smith', 0.105), ('novel', 0.105), ('linguistic', 0.105), ('methods', 0.099), ('relations', 0.099), ('appropriate', 0.099), ('acl', 0.099), ('references', 0.094), ('paper', 0.093), ('comments', 0.09), ('substantive', 0.086), ('early', 0.086), ('conference', 0.078), ('log', 0.078), ('focus', 0.078), ('design', 0.073), ('interesting', 0.073), ('try', 0.071), ('useful', 0.068), ('years', 0.063), ('political', 0.062), ('far', 0.062), ('interdisciplinary', 0.062), ('tackle', 0.062), ('liked', 0.062), ('essay', 0.062), ('seminar', 0.062), ('picture', 0.062), ('ideologies', 0.062), ('problematic', 0.062)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000004 200 brendan oconnor ai-2013-09-13-Response on our movie personas paper

2 0.21709162 196 brendan oconnor ai-2013-05-08-Movie summary corpus and learning character personas

Introduction: Here is one of our exciting just-finished ACL papers. David and I designed an algorithm that learns different types of character personas (“Protagonist”, “Love Interest”, etc.) that are used in movies. To do this we collected a brand new dataset: 42,306 plot summaries of movies from Wikipedia, along with metadata like box office revenue and genre. We ran these through parsing and coreference analysis to also create a dataset of movie characters, linked with Freebase records of the actors who portray them. Did you see that NYT article on quantitative analysis of film scripts? This dataset could answer all sorts of things they assert in that article; for example, do movies with bowling scenes really make less money? We have released the data here. Our focus, though, is on narrative analysis. We investigate character personas: familiar character types that are repeated over and over in stories, like “Hero” or “Villain”; maybe grand mythical archetypes like “Trick

3 0.15394729 203 brendan oconnor ai-2014-02-19-What the ACL-2014 review scores mean

Introduction: I’ve had several people ask me what the numbers in ACL reviews mean — and I can’t find anywhere online where they’re described. (Can anyone point this out if it is somewhere?) So here’s the review form, below. They all go from 1 to 5, with 5 the best. I think the review emails to authors only include a subset of the below — for example, “Overall Recommendation” is not included? The CFP said that they have different types of review forms for different types of papers. I think this one is for a standard full paper. I guess what people really want to know is what scores tend to correspond to acceptances. I really have no idea and I get the impression this can change year to year. I have no involvement with the ACL conference besides being one of many, many reviewers. APPROPRIATENESS (1-5) Does the paper fit in ACL 2014? (Please answer this question in light of the desire to broaden the scope of the research areas represented at ACL.) 5: Certainly. 4: Probabl

4 0.10786226 94 brendan oconnor ai-2008-03-10-PHD Comics: Humanities vs. Social Sciences

Introduction: PHD Comics: Humanities vs. Social Sciences

5 0.10607165 150 brendan oconnor ai-2009-08-08-Haghighi and Klein (2009): Simple Coreference Resolution with Rich Syntactic and Semantic Features

Introduction: I haven’t done a paper review on this blog for a while, so here we go. Coreference resolution is an interesting NLP problem. (Examples.) It involves honest-to-goodness syntactic, semantic, and discourse phenomena, but still seems like a real cognitive task that humans have to solve when reading text [1]. I haven’t read the whole literature, but I’ve always been puzzled by the crop of papers on it I’ve seen in the last year or two. There’s a big focus on fancy graph/probabilistic/constrained optimization algorithms, but often these papers gloss over the linguistic features, the core information they actually make their decisions with [2]. I never understood why the latter isn’t the most important issue. Therefore, it was a joy to read Aria Haghighi and Dan Klein, EMNLP-2009, “Simple Coreference Resolution with Rich Syntactic and Semantic Features.” They describe a simple, essentially non-statistical system that outperforms previous unsupervised systems, and compa

6 0.093021892 53 brendan oconnor ai-2007-03-15-Feminists, anarchists, computational complexity, bounded rationality, nethack, and other things to do

7 0.092498757 84 brendan oconnor ai-2007-11-26-How did Freud become a respected humanist?!

8 0.092451416 129 brendan oconnor ai-2008-12-03-Statistics vs. Machine Learning, fight!

9 0.077370413 191 brendan oconnor ai-2013-02-23-Wasserman on Stats vs ML, and previous comparisons

10 0.075957119 104 brendan oconnor ai-2008-05-23-Sub-reddit for Systems Science and OR

11 0.074458025 70 brendan oconnor ai-2007-07-25-Cerealitivity

12 0.073691174 7 brendan oconnor ai-2005-06-25-looking for related blogs-links

13 0.071283795 198 brendan oconnor ai-2013-08-20-Some analysis of tweet shares and “predicting” election outcomes

14 0.070509806 138 brendan oconnor ai-2009-04-17-1 billion web page dataset from CMU

15 0.069287911 11 brendan oconnor ai-2005-07-01-Modelling environmentalism thinking

16 0.069286942 172 brendan oconnor ai-2011-06-26-Good linguistic semantics textbook?

17 0.069073297 44 brendan oconnor ai-2006-08-30-A big, fun list of links I’m reading

18 0.0687465 28 brendan oconnor ai-2005-11-20-science writing bad!

19 0.067427903 157 brendan oconnor ai-2009-12-31-List of probabilistic model mini-language toolkits

20 0.065312572 184 brendan oconnor ai-2012-07-04-The $60,000 cat: deep belief networks make less sense for language than vision
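
The page does not state which similarity measure produced the simValue column. The values in the tfidf list are at least consistent with a cosine-style measure over tf-idf vectors (the post's similarity to itself is approximately 1), so the following is a minimal sketch under that assumption. The helper names `cosine_similarity` and `rank_similar`, and the toy vectors, are hypothetical.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse vectors stored as {word: weight} dicts."""
    dot = sum(weight * vec_b.get(word, 0.0) for word, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        # An all-zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query_vec, corpus):
    """Return (doc_id, similarity) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in corpus.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy weights for illustration; only the first vector borrows values from the word list above.
    this_post = {"noah": 0.247, "david": 0.225, "humanities": 0.21}
    corpus = {
        "this post": this_post,
        "hypothetical post A": {"david": 0.3, "personas": 0.4},
        "hypothetical post B": {"reddit": 0.5},
    }
    for doc_id, sim in rank_similar(this_post, corpus):
        print(f"{sim:.4f}  {doc_id}")
```

Floating-point rounding in the norms can push a self-similarity slightly above or below 1, which would explain a value such as 1.0000004 if the page's pipeline used reduced-precision arithmetic.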


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, -0.284), (1, 0.018), (2, -0.069), (3, -0.011), (4, 0.077), (5, -0.013), (6, 0.073), (7, 0.024), (8, -0.115), (9, 0.029), (10, 0.078), (11, -0.018), (12, 0.171), (13, 0.144), (14, 0.086), (15, -0.033), (16, 0.125), (17, 0.053), (18, -0.06), (19, 0.106), (20, -0.014), (21, -0.019), (22, 0.019), (23, 0.094), (24, 0.11), (25, -0.136), (26, -0.024), (27, -0.17), (28, 0.056), (29, -0.179), (30, -0.128), (31, 0.041), (32, -0.061), (33, 0.132), (34, 0.066), (35, 0.022), (36, 0.025), (37, 0.169), (38, -0.02), (39, 0.138), (40, -0.024), (41, -0.042), (42, -0.049), (43, 0.108), (44, -0.025), (45, 0.063), (46, -0.067), (47, -0.029), (48, 0.101), (49, -0.02)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98494232 200 brendan oconnor ai-2013-09-13-Response on our movie personas paper

2 0.69105297 196 brendan oconnor ai-2013-05-08-Movie summary corpus and learning character personas

3 0.65256536 203 brendan oconnor ai-2014-02-19-What the ACL-2014 review scores mean

4 0.6216892 84 brendan oconnor ai-2007-11-26-How did Freud become a respected humanist?!

Introduction: Freud Is Widely Taught at Universities, Except in the Psychology Department : PSYCHOANALYSIS and its ideas about the unconscious mind have spread to every nook and cranny of the culture from Salinger to “South Park,” from Fellini to foreign policy. Yet if you want to learn about psychoanalysis at the nation’s top universities, one of the last places to look may be the psychology department. A new report by the American Psychoanalytic Association has found that while psychoanalysis — or what purports to be psychoanalysis — is alive and well in literature, film, history and just about every other subject in the humanities, psychology departments and textbooks treat it as “desiccated and dead,” a historical artifact instead of “an ongoing movement and a living, evolving process.” I’ve been wondering about this for a while, ever since I heard someone describe Freud as “one of the greatest humanists who ever lived.” I’m pretty sure he didn’t think of himself that way. If you’re a

5 0.49140048 94 brendan oconnor ai-2008-03-10-PHD Comics: Humanities vs. Social Sciences

Introduction: PHD Comics: Humanities vs. Social Sciences

6 0.37885952 184 brendan oconnor ai-2012-07-04-The $60,000 cat: deep belief networks make less sense for language than vision

7 0.34220374 48 brendan oconnor ai-2007-01-02-funny comic

8 0.34169552 101 brendan oconnor ai-2008-04-13-Are women discriminated against in graduate admissions? Simpson’s paradox via R in three easy steps!

9 0.32776645 150 brendan oconnor ai-2009-08-08-Haghighi and Klein (2009): Simple Coreference Resolution with Rich Syntactic and Semantic Features

10 0.32672572 125 brendan oconnor ai-2008-11-21-Netflix Prize

11 0.32186159 73 brendan oconnor ai-2007-08-05-Are ideas interesting, or are they true?

12 0.31611869 176 brendan oconnor ai-2011-10-05-Be careful with dictionary-based text analysis

13 0.3024556 104 brendan oconnor ai-2008-05-23-Sub-reddit for Systems Science and OR

14 0.30144134 53 brendan oconnor ai-2007-03-15-Feminists, anarchists, computational complexity, bounded rationality, nethack, and other things to do

15 0.30014855 139 brendan oconnor ai-2009-04-22-Performance comparison: key-value stores for language model counts

16 0.29576063 1 brendan oconnor ai-2004-11-20-gintis: theoretical unity in the social sciences

17 0.29428661 40 brendan oconnor ai-2006-06-28-Social network-ized economic markets

18 0.28237438 42 brendan oconnor ai-2006-07-25-Two Middle East politics visualizations

19 0.28196883 151 brendan oconnor ai-2009-08-12-Beautiful Data book chapter

20 0.28167257 140 brendan oconnor ai-2009-05-18-Announcing TweetMotif for summarizing twitter topics


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.398), (16, 0.026), (43, 0.038), (44, 0.09), (48, 0.043), (50, 0.016), (55, 0.034), (57, 0.017), (70, 0.098), (74, 0.148)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95375288 59 brendan oconnor ai-2007-04-08-Random search engine searcher

Introduction: It’s sweeping the internet — I wrote a little plugin for the firefox/internet explorer search box, so when you search it randomly picks one of several search engines. You get to see what’s out there (you mean there’s something besides Google?) in your daily searching. Search a Random Search Engine

2 0.93317121 40 brendan oconnor ai-2006-06-28-Social network-ized economic markets

Introduction: Extremely interesting: a generalization of Arrow-Debreu equilibrium in which interactions are restricted along a social network. Kakade et al 2005. (Found through NIPS 2004 (which looks like a great conference)). Also a longer and more detailed related version: Kakade et al 2004.

same-blog 3 0.87215137 200 brendan oconnor ai-2013-09-13-Response on our movie personas paper

4 0.41784447 44 brendan oconnor ai-2006-08-30-A big, fun list of links I’m reading

Introduction: Since blogging is hard, but reading is easy, lately I’ve taken to bookmarking interesting articles I’m reading, with the plan of blogging about them later. This follow-through has happened a few times, but not that often. In an amazing moment of thesis procrastination, today I sat down and figured out how to turn my del.icio.us bookmarks into a nice blogpost, with the plan that every week a post will appear with links I’ve recently read, or maybe I’ll use the script to generate a draft for myself that I’ll revise, or something. But for this first such link post, I put in a whole bunch of them beyond just the last week — why have just a few when you could have *all* of them? Future link posts will be shorter, I promise. Ariel Rubinstein: Freak-Freakonomics July 2006 posted 8/19 under economics sarcastic, critical review of levitt & dubner’s Freakonomics New Yorker review of Philip Tetlock’s book on political expert judgment posted 8/19 under judgment , psycholo

5 0.3988663 129 brendan oconnor ai-2008-12-03-Statistics vs. Machine Learning, fight!

Introduction: 10/1/09 update: well, it’s been nearly a year, and I should say not everything in this rant is totally true, and I certainly believe much less of it now. Current take: Statistics, not machine learning, is the real deal, but unfortunately suffers from bad marketing. On the other hand, to the extent that bad marketing includes misguided undergraduate curriculums, there’s plenty of room to improve for everyone. So it’s pretty clear by now that statistics and machine learning aren’t very different fields. I was recently pointed to a very amusing comparison by the excellent statistician (and machine learning expert) Robert Tibshirani. Reproduced here: Glossary Machine learning Statistics network, graphs model weights parameters learning fitting generalization test set performance supervised learning regression/classification unsupervised learning density estimation, clustering large grant = $1,000,000

6 0.39769334 188 brendan oconnor ai-2012-10-02-Powerset’s natural language search system

7 0.38804102 203 brendan oconnor ai-2014-02-19-What the ACL-2014 review scores mean

8 0.38510531 150 brendan oconnor ai-2009-08-08-Haghighi and Klein (2009): Simple Coreference Resolution with Rich Syntactic and Semantic Features

9 0.3732824 123 brendan oconnor ai-2008-11-12-Disease tracking with web queries and social messaging (Google, Twitter, Facebook…)

10 0.36564195 53 brendan oconnor ai-2007-03-15-Feminists, anarchists, computational complexity, bounded rationality, nethack, and other things to do

11 0.36394033 86 brendan oconnor ai-2007-12-20-Data-driven charity

12 0.36221656 26 brendan oconnor ai-2005-09-02-cognitive modelling is rational choice++

13 0.35773358 63 brendan oconnor ai-2007-06-10-Freak-Freakonomics (Ariel Rubinstein is the shit!)

14 0.35673258 138 brendan oconnor ai-2009-04-17-1 billion web page dataset from CMU

15 0.3561984 184 brendan oconnor ai-2012-07-04-The $60,000 cat: deep belief networks make less sense for language than vision

16 0.35097462 105 brendan oconnor ai-2008-06-05-Clinton-Obama support visualization

17 0.34922674 77 brendan oconnor ai-2007-09-15-Dollar auction

18 0.34918916 2 brendan oconnor ai-2004-11-24-addiction & 2 problems of economics

19 0.34614745 140 brendan oconnor ai-2009-05-18-Announcing TweetMotif for summarizing twitter topics

20 0.34565371 80 brendan oconnor ai-2007-10-31-neo institutional economic fun!