brendan_oconnor_ai brendan_oconnor_ai-2005 brendan_oconnor_ai-2005-21 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: From the same site, this is fun.
wordName wordTfidf (topN-words)
[('site', 0.834), ('fun', 0.551)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 21 brendan oconnor ai-2005-07-31-balkanized USA
Introduction: From the same site, this is fun.
2 0.17349276 116 brendan oconnor ai-2008-10-08-MyDebates.org, online polling, and potentially the coolest question corpus ever
Introduction: MySpace and the Commission on the Presidential Debates put together a neat site, mydebates.org , which presents the candidates’ positions through various mini-polls and such. It even has a cool data exploration tool for the poll results … for example, here are two support maps, one for respondents over 65 and one for 18-24 year olds. Anyway, the site also takes submissions of questions for tonight’s debate. Apparently six million questions were submitted, and moderator Tom Brokaw will of course use only 10 or so. This begs a question, how were they selected? There’s no Digg-like social filtering or anything. You could imagine automatic methods to help narrow down the pool: Topic clustering? Quality ranking on syntax and vocabulary? Eric Fish suggested the obvious: probably someone picked 1000 randomly and sent them to Brokaw. I’d love to see a corpus of 6 million questions on U.S. political subjects, directed at only two different people. Anyone know anyon
3 0.073456913 147 brendan oconnor ai-2009-07-22-FFT: Friedman + Fortran + Tricks
Introduction: …is a tongue-in-cheek phrase from Trevor Hastie’s very fun to read useR-2009 presentation , from the merry trio of Hastie, Friedman, and Tibshirani, who brought us, among other things, the excellent Elements of Statistical Learning textbook . It’s a joy to read sophisticated but well-presented work like this. This comes from a slide explaining the impressive speed results for their glmnet regression package. Substantively, I’m interested in their observation that coordinate descent works well for sparse data — if you’re optimizing one feature at a time, and that feature is used in only a small percentage of instances, there are some neat optimizations! But mostly, I had a fun time skimming the glmnet code . It’s written in 2008, but, yes, the core algorithm is written entirely in Fortran , complete with punchcard-style, fixed-width formatting! (This seems gratuitous to me — I thought the modern Fortran-90 had done away with such things?) I’ve felt clever enough making
4 0.072993703 117 brendan oconnor ai-2008-10-11-It is accurate to determine a blog’s bias by what it links to
Introduction: Here’s a great project from Andy Baio and Joshua Schachter : they assessed the political biases of different blogs based on which articles they tend to link to. Using these political bias scores, they made a cool little Firefox extension that colors the names of different sources on the news aggregator site Memeorandum , like so: How they computed these biases is pretty neat. Their data source was the Memeorandum site itself, which shows a particular news story, then a list of different news sites that have written articles about the topic. Scraping out that data, Joshua constructed the adjacency matrix of sites vs. articles they linked to and ran good ol’ SVD on it, an algorithm that can be used to summarize the very high-dimensional article linking information in just several numbers (“components” or “dimensions”) for each news site. Basically, the algorithm groups together sites that tend to link to the same articles. It’s not exactly clustering though; rather, it project
Introduction: This is off-topic for this blog but here goes. ConnectU , a small college social networking site, has been in the news due to their apparently weak lawsuit against Facebook , in which they claim Mark Zuckerberg stole their business plan and computer code back when they all were Harvard undergraduates. (Judges involved have noted the case’s flimsy evidence; some technology commentators — as well as everyone I know — have noted that the business idea wasn’t all that brilliant or original in the first place.) Zuckerberg, of course, went on to found Facebook and bring it to incredible success. I tried to use the ConnectU site recently, but got an error when searching for a funny name with an apostrophe, o’connor . It turns out this was symptomatic of a very grave security flaw in their code, an SQL injection vulnerability . While Facebook recently had a minor security-related glitch , ConnectU’s flaw is far more serious. A malicious attacker could use this to easily break in
6 0.051061302 160 brendan oconnor ai-2010-04-22-Updates: CMU, Facebook
7 0.038217925 47 brendan oconnor ai-2007-01-02-The Jungle Economy
8 0.035096228 189 brendan oconnor ai-2012-11-24-Graphs for SANCL-2012 web parsing results
9 0.028684659 28 brendan oconnor ai-2005-11-20-science writing bad!
10 0.027803471 10 brendan oconnor ai-2005-06-26-monkey economics (and brothels)
11 0.026575724 132 brendan oconnor ai-2009-01-07-Love it and hate it, R has come of age
12 0.025204064 138 brendan oconnor ai-2009-04-17-1 billion web page dataset from CMU
13 0.024651403 73 brendan oconnor ai-2007-08-05-Are ideas interesting, or are they true?
14 0.023995135 77 brendan oconnor ai-2007-09-15-Dollar auction
15 0.022881089 33 brendan oconnor ai-2006-04-24-The identity politics of satananic zombie alien man-beasts
16 0.022258854 61 brendan oconnor ai-2007-05-24-Rock Paper Scissors psychology
17 0.021724273 106 brendan oconnor ai-2008-06-17-Pairwise comparisons for relevance evaluation
18 0.021687454 140 brendan oconnor ai-2009-05-18-Announcing TweetMotif for summarizing twitter topics
19 0.019671908 169 brendan oconnor ai-2011-05-20-Log-normal and logistic-normal terminology
20 0.018216331 178 brendan oconnor ai-2011-11-13-Bayes update view of pointwise mutual information
topicId topicWeight
[(0, -0.036), (1, -0.036), (2, -0.013), (3, -0.026), (4, -0.034), (5, -0.064), (6, 0.028), (7, -0.023), (8, 0.029), (9, 0.064), (10, -0.041), (11, -0.144), (12, 0.06), (13, -0.05), (14, -0.008), (15, 0.021), (16, -0.005), (17, -0.007), (18, 0.021), (19, 0.075), (20, -0.038), (21, -0.059), (22, 0.006), (23, -0.031), (24, -0.161), (25, 0.077), (26, -0.098), (27, 0.008), (28, 0.07), (29, -0.055), (30, 0.112), (31, -0.164), (32, -0.046), (33, 0.014), (34, 0.183), (35, 0.054), (36, 0.085), (37, -0.091), (38, 0.138), (39, -0.009), (40, -0.084), (41, 0.068), (42, -0.109), (43, -0.052), (44, -0.107), (45, 0.017), (46, -0.046), (47, 0.083), (48, 0.114), (49, -0.145)]
simIndex simValue blogId blogTitle
same-blog 1 0.99926585 21 brendan oconnor ai-2005-07-31-balkanized USA
Introduction: From the same site, this is fun.
Introduction: MySpace and the Commission on the Presidential Debates put together a neat site, mydebates.org , which presents the candidates’ positions through various mini-polls and such. It even has a cool data exploration tool for the poll results … for example, here are two support maps, one for respondents over 65 and one for 18-24 year olds. Anyway, the site also takes submissions of questions for tonight’s debate. Apparently six million questions were submitted, and moderator Tom Brokaw will of course use only 10 or so. This begs a question, how were they selected? There’s no Digg-like social filtering or anything. You could imagine automatic methods to help narrow down the pool: Topic clustering? Quality ranking on syntax and vocabulary? Eric Fish suggested the obvious: probably someone picked 1000 randomly and sent them to Brokaw. I’d love to see a corpus of 6 million questions on U.S. political subjects, directed at only two different people. Anyone know anyon
3 0.49201092 147 brendan oconnor ai-2009-07-22-FFT: Friedman + Fortran + Tricks
Introduction: …is a tongue-in-cheek phrase from Trevor Hastie’s very fun to read useR-2009 presentation , from the merry trio of Hastie, Friedman, and Tibshirani, who brought us, among other things, the excellent Elements of Statistical Learning textbook . It’s a joy to read sophisticated but well-presented work like this. This comes from a slide explaining the impressive speed results for their glmnet regression package. Substantively, I’m interested in their observation that coordinate descent works well for sparse data — if you’re optimizing one feature at a time, and that feature is used in only a small percentage of instances, there are some neat optimizations! But mostly, I had a fun time skimming the glmnet code . It’s written in 2008, but, yes, the core algorithm is written entirely in Fortran , complete with punchcard-style, fixed-width formatting! (This seems gratuitous to me — I thought the modern Fortran-90 had done away with such things?) I’ve felt clever enough making
4 0.35198712 124 brendan oconnor ai-2008-11-17-Correlations – cotton picking vs. 2008 Presidential votes
Introduction: From the neat blog Strange Maps — a map of the U.S. South, overlaying where cotton was picked in 1860 versus Presidential voting in 2008. The claim is that the causal pathway is through high African-American populations.
5 0.34423262 117 brendan oconnor ai-2008-10-11-It is accurate to determine a blog’s bias by what it links to
Introduction: Here’s a great project from Andy Baio and Joshua Schachter : they assessed the political biases of different blogs based on which articles they tend to link to. Using these political bias scores, they made a cool little Firefox extension that colors the names of different sources on the news aggregator site Memeorandum , like so: How they computed these biases is pretty neat. Their data source was the Memeorandum site itself, which shows a particular news story, then a list of different news sites that have written articles about the topic. Scraping out that data, Joshua constructed the adjacency matrix of sites vs. articles they linked to and ran good ol’ SVD on it, an algorithm that can be used to summarize the very high-dimensional article linking information in just several numbers (“components” or “dimensions”) for each news site. Basically, the algorithm groups together sites that tend to link to the same articles. It’s not exactly clustering though; rather, it project
7 0.23717269 160 brendan oconnor ai-2010-04-22-Updates: CMU, Facebook
8 0.22190969 128 brendan oconnor ai-2008-11-28-Calculating running variance in Python and C++
9 0.20698544 169 brendan oconnor ai-2011-05-20-Log-normal and logistic-normal terminology
10 0.19654195 14 brendan oconnor ai-2005-07-04-City crisis simulation (e.g. terrorist attack)
11 0.18868884 132 brendan oconnor ai-2009-01-07-Love it and hate it, R has come of age
12 0.18663013 47 brendan oconnor ai-2007-01-02-The Jungle Economy
13 0.18248884 189 brendan oconnor ai-2012-11-24-Graphs for SANCL-2012 web parsing results
14 0.17970499 109 brendan oconnor ai-2008-07-04-Link: Today’s international organizations
15 0.17847967 111 brendan oconnor ai-2008-08-16-A better Obama vs McCain poll aggregation
16 0.17659719 61 brendan oconnor ai-2007-05-24-Rock Paper Scissors psychology
17 0.17315775 12 brendan oconnor ai-2005-07-02-$ echo {political,social,economic}{cognition,behavior,systems}
18 0.17291634 137 brendan oconnor ai-2009-04-15-Pirates killed by President
19 0.17245975 106 brendan oconnor ai-2008-06-17-Pairwise comparisons for relevance evaluation
20 0.17207612 10 brendan oconnor ai-2005-06-26-monkey economics (and brothels)
topicId topicWeight
[(13, 0.585)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 21 brendan oconnor ai-2005-07-31-balkanized USA
Introduction: From the same site, this is fun.
2 0.98459029 109 brendan oconnor ai-2008-07-04-Link: Today’s international organizations
Introduction: Fascinating — a review of the current international system, focusing on international organizations (that is, organizations of states). Who runs the world? | Wrestling for influence | Economist.com
3 0.87129968 24 brendan oconnor ai-2005-08-01-searchin’ for our friend, homo economicus
Introduction: I must have seen a zillion draft versions of this study floating around online, but here’s a terrific preprint: “Economic man” in cross-cultural perspective: Behavioral experiments in 15 small-scale societies (Henrich, Boyd, Bowles, Camerer, Fehr, Gintis, McElreath, Alvard, Barr, Ensminger, Henrich, Hill, Gil-White, Gurven, Marlowe, Patton, and Tracer 2005 (!)). So looks like we’re now pretty sure, culture affects cooperation, you can see it in social practices. It’s a really neat study. The writeup in this version is terrific, they talk about implications for culture-gene evolution and have great statistical analysis of cultural factors on ultimatum game performance.
4 0.062759899 1 brendan oconnor ai-2004-11-20-gintis: theoretical unity in the social sciences
Introduction: Herbert Gintis thinks it’s time to unify the behavioral sciences. Sociology, economics, political science, human biology, anthropology and others all study the same thing, but each is based on different incompatible models of individual human behavior. There seems to be evidence that new developments have the potential to offer a more unifying theory. Evolutionary biology should be the basis of understanding much of human behavior. Rational choice and game theoretic frameworks are finding greater acceptance beyond economics; in the meantime, other fields need to absorb sociology’s emphasis on socialization — that people do things or understand the world in a way taught by society. The human behavioral sciences are still rife with many smaller inconsistencies; for example, according to Gintis, only anthropologists look at the influence of culture across groups, but only sociologists look at culture within groups. Gintis’ ultimate goal is to have a common baseline from which each disci
5 0.042038679 81 brendan oconnor ai-2007-11-13-Authoritarian great power capitalism
Introduction: Before I forget — a while back I read a terrific Foreign Affairs article, The Return of Authoritarian Great Powers . The argument is, just a century or so ago, states based on authoritarian capitalism were very powerful in the world; e.g. imperial Japan and Germany. They got plenty of the economic benefits of capitalism but not so much the democratic effects people like to talk about today. (And there are interesting points that the failure of fascism in the second world war was contingent and not inherent to the ideology.) The author argues this looks like the future: Russia and China are becoming economically strong world powers but keeping solidly non-democratic ways of governance. The period of liberal democracy we live in, with all its overhyped speculation about the inevitable spread of democracy and free market capitalism — say, an “end of history” — might just be that, a moment caused by the vagaries of 20th century history. After I read the article last June, I actually
6 0.039228164 75 brendan oconnor ai-2007-08-13-It’s all in a name: "Kingdom of Norway" vs. "Democratic People’s Republic of Korea"
7 0.029028961 130 brendan oconnor ai-2008-12-18-Information cost and genocide
8 0.021480491 110 brendan oconnor ai-2008-08-15-East vs West cultural psychology!
9 0.020495431 3 brendan oconnor ai-2004-12-02-go science
10 0.0 2 brendan oconnor ai-2004-11-24-addiction & 2 problems of economics
11 0.0 4 brendan oconnor ai-2005-05-16-Online Deliberation 2005 conference blog & more is up!
13 0.0 6 brendan oconnor ai-2005-06-25-idea: Morals are heuristics for socially optimal behavior
14 0.0 7 brendan oconnor ai-2005-06-25-looking for related blogs-links
15 0.0 8 brendan oconnor ai-2005-06-25-more argumentation & AI-formal modelling links
16 0.0 9 brendan oconnor ai-2005-06-25-zombies!
17 0.0 10 brendan oconnor ai-2005-06-26-monkey economics (and brothels)
18 0.0 11 brendan oconnor ai-2005-07-01-Modelling environmentalism thinking
19 0.0 12 brendan oconnor ai-2005-07-02-$ echo {political,social,economic}{cognition,behavior,systems}
20 0.0 13 brendan oconnor ai-2005-07-03-Supreme Court justices’ agreement levels