418 andrew gelman stats-2010-11-17-ff


meta info for this blog

Source: html

Introduction: Can somebody please fix the pdf reader so that it can correctly render “ff” when I cut and paste? This comes up when I’m copying sections of articles on to the blog. Thank you. P.S. I googled “ff pdf” but no help there. P.P.S. It’s a problem with “fi” also. P.P.P.S. Yes, I know about ligatures. But, if you already knew about ligatures, and I already know about ligatures, then presumably the pdf people already know about ligatures too. So why can’t their clever program, which can already find individual f’s, also find the ff’s and separate them? I assume it’s not so simple but I don’t quite understand why not.
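The post asks why the ligatures cannot simply be split back into letters after pasting. A minimal sketch of one workaround follows, assuming the pasted text actually contains the Unicode ligature characters (for example U+FB00 for "ff" and U+FB01 for "fi"). In that case Unicode NFKC normalization expands them into plain letters. The helper name `expand_ligatures` is made up for this illustration. If the PDF maps its glyphs to the wrong characters or to nothing at all, this will not help, which may be part of why the problem is "not so simple".

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Expand compatibility characters, including the ff/fi/ffi ligatures,
    into their plain-letter equivalents via NFKC normalization.

    Hypothetical helper for illustration only. NFKC also rewrites other
    compatibility characters (superscript digits, full-width letters),
    so it is broader than a pure ligature fix.
    """
    return unicodedata.normalize("NFKC", text)


# Text as it might arrive from a PDF copy-and-paste, with ligature code points.
pasted = "di\ufb00erent coe\ufb03cients, \ufb01tted"
print(expand_ligatures(pasted))
```

Normalization is applied to the pasted text rather than inside the PDF reader, so it works with any viewer, but only when the ligature survives the copy as a real Unicode character.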


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Can somebody please fix the pdf reader so that it can correctly render “ff” when I cut and paste? [sent-1, score-1.043]

2 This comes up when I’m copying sections of articles on to the blog. [sent-2, score-0.352]

3 But, if you already knew about ligatures, and I already know about ligatures, then presumably the pdf people already know about ligatures too. [sent-16, score-1.954]

4 So why can’t their clever program, which can already find individual f’s, also find the ff’s and separate them? [sent-17, score-0.666]

5 I assume it’s not so simple but I don’t quite understand why not. [sent-18, score-0.223]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('ligatures', 0.571), ('ff', 0.49), ('pdf', 0.351), ('already', 0.22), ('fi', 0.173), ('render', 0.151), ('paste', 0.139), ('copying', 0.118), ('googled', 0.118), ('sections', 0.113), ('thank', 0.109), ('clever', 0.107), ('fix', 0.105), ('cut', 0.096), ('correctly', 0.095), ('presumably', 0.095), ('reader', 0.089), ('knew', 0.087), ('somebody', 0.085), ('find', 0.084), ('separate', 0.084), ('know', 0.083), ('program', 0.075), ('please', 0.071), ('articles', 0.064), ('individual', 0.064), ('assume', 0.062), ('yes', 0.057), ('comes', 0.057), ('help', 0.057), ('quite', 0.056), ('simple', 0.054), ('understand', 0.051), ('problem', 0.038), ('people', 0.024), ('also', 0.023)]
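This page does not say how the simValue scores in the lists that follow are computed. A common choice for comparing tfidf vectors is cosine similarity, and the sketch below assumes that choice purely for illustration. The function name and the weights for the second post are hypothetical. Only the first four weights of `this_post` are taken from the list above.

```python
import math
from typing import Mapping


def cosine_similarity(a: Mapping[str, float], b: Mapping[str, float]) -> float:
    """Cosine similarity between two sparse word -> weight vectors.

    Assumption: the page's simValue may be computed differently; this is
    only one standard way to compare tfidf vectors.
    """
    # Only words present in both vectors contribute to the dot product.
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# First four weights from the tfidf list for this post.
this_post = {"ligatures": 0.571, "ff": 0.49, "pdf": 0.351, "already": 0.22}
# Hypothetical weights for another post that also mentions "pdf".
other_post = {"pdf": 0.6, "continuous": 0.5, "density": 0.4}

print(cosine_similarity(this_post, this_post))
print(cosine_similarity(this_post, other_post))
```

Under this assumption a post compared with itself scores 1 up to floating-point rounding, which would be consistent with the same-blog entry in the tfidf list, and posts that share only a few weighted words score much lower.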

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 418 andrew gelman stats-2010-11-17-ff

2 0.12949611 341 andrew gelman stats-2010-10-14-Confusion about continuous probability densities

Introduction: I had the following email exchange with a reader of Bayesian Data Analysis. My correspondent wrote: Exercise 1(b) involves evaluating the normal pdf at a single point. But p(Y=y|mu,sigma) = 0 (and is not simply N(y|mu,sigma)), since the normal distribution is continuous. So it seems that part (b) of the exercise is inappropriate. The solution does actually evaluate the probability as the value of the pdf at the single point, which is wrong. The probabilities should all be 0, so the answer to (b) is undefined. I replied: The pdf is the probability density function, which for a continuous distribution is defined as the derivative of the cumulative density function. The notation in BDA is rigorous but we do not spell out all the details, so I can see how confusion is possible. My correspondent: I agree that the pdf is the derivative of the cdf. But to compute P(a .lt. Y .lt. b) for a continuous distribution (with support in the real line) requires integrating over t

3 0.10429926 2190 andrew gelman stats-2014-01-29-Stupid R Tricks: Random Scope

Introduction: Andrew and I have been discussing how we’re going to define functions in Stan for defining systems of differential equations; see our evolving ode design doc ; comments welcome, of course. About Scope I mentioned to Andrew I would prefer pure lexical, static scoping, as found in languages like C++ and Java. If you’re not familiar with the alternatives, there’s a nice overview in the Wikipedia article on scope . Let me call out a few passages that will help set the context. A fundamental distinction in scoping is what “context” means – whether name resolution depends on the location in the source code (lexical scope, static scope, which depends on the lexical context) or depends on the program state when the name is encountered (dynamic scope, which depends on the execution context or calling context). Lexical resolution can be determined at compile time, and is also known as early binding, while dynamic resolution can in general only be determined at run time, and thus

4 0.080426417 1286 andrew gelman stats-2012-04-28-Agreement Groups in US Senate and Dynamic Clustering

Introduction: Adrien Friggeri has a lovely visualization of US Senators movement between clusters: You have to click the image and play with it to appreciate it. The methodology isn’t yet published – but I can see how this could be very illuminating. The dynamic clustering aspect hasn’t been researched much – one of the notable pieces is the Blei and Lafferty dynamic topic model of Science . I did a static analysis of the US Senate back in 2005 with Wray Buntine and coauthors. Some additional visualizations and the source code are here . We did a dynamic analysis of US Supreme Court on this blog but there’s also a paper . My knowledge on this topic is out of date, however. Who has been doing good work in this area? I’ll organize the links. [added 4/29/12, via Edo Airoldi ]: Visualizing the Evolution of Community Structures in Dynamic Social Networks by Khairi Reda et al (2011) [ PDF ]. [added 4/29/12, via Allen Riddell ] Joint Analysis of Time-Evolving Binary Matrices an

5 0.059949256 787 andrew gelman stats-2011-07-05-Different goals, different looks: Infovis and the Chris Rock effect

Introduction: Seth writes: Here’s my candidate for bad graphic of the year: I [Seth] studied it and learned nothing. I have no idea how they assigned colors to locations. I already knew that there were more within-city calls than calls to individual distant locations — for example that there are more SF-SF calls than SF-LA calls. The researchers took a huge rich database and boiled it down to nothing (in terms of information value) — and I have a funny feeling they don’t realize how awful this is and what a waste. I send it to you because it isn’t obvious how to do better — at least not obvious to them. My reply: My first reaction is to agree–I don’t get anything out of this graph either! But let me step back. I think it’s best to understand this using the framework of my paper with Antony Unwin , by thinking of the goals that are satisfied by different sorts of graphs. What does this graph convey? It doesn’t tell us much about phone calls, but it does tell us that some peop

6 0.058714125 1993 andrew gelman stats-2013-08-22-Improvements to Kindle Version of BDA3

7 0.057959557 631 andrew gelman stats-2011-03-28-Explaining that plot.

8 0.057201788 192 andrew gelman stats-2010-08-08-Turning pages into data

9 0.057038531 178 andrew gelman stats-2010-08-03-(Partisan) visualization of health care legislation

10 0.055142127 1182 andrew gelman stats-2012-02-24-Untangling the Jeffreys-Lindley paradox

11 0.046654712 2090 andrew gelman stats-2013-11-05-How much do we trust a new claim that early childhood stimulation raised earnings by 42%?

12 0.043924004 199 andrew gelman stats-2010-08-11-Note to semi-spammers

13 0.043504599 128 andrew gelman stats-2010-07-05-The greatest works of statistics never published

14 0.043388531 2266 andrew gelman stats-2014-03-25-A statistical graphics course and statistical graphics advice

15 0.043356918 1580 andrew gelman stats-2012-11-16-Stantastic!

16 0.043079671 1498 andrew gelman stats-2012-09-16-Choices in graphing parallel time series

17 0.041315764 872 andrew gelman stats-2011-08-26-Blog on applied probability modeling

18 0.040881336 2318 andrew gelman stats-2014-05-04-Stan (& JAGS) Tutorial on Linear Mixed Models

19 0.040217087 848 andrew gelman stats-2011-08-11-That xkcd cartoon on multiple comparisons that all of you were sending me a couple months ago

20 0.038671046 583 andrew gelman stats-2011-02-21-An interesting assignment for statistical graphics


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.052), (1, -0.013), (2, -0.012), (3, 0.007), (4, 0.024), (5, -0.0), (6, 0.01), (7, -0.018), (8, 0.003), (9, -0.005), (10, 0.006), (11, -0.019), (12, -0.002), (13, -0.008), (14, 0.015), (15, 0.004), (16, 0.023), (17, 0.002), (18, -0.01), (19, 0.003), (20, 0.021), (21, 0.009), (22, 0.018), (23, 0.007), (24, 0.003), (25, -0.009), (26, 0.031), (27, 0.005), (28, -0.003), (29, 0.001), (30, 0.018), (31, 0.012), (32, 0.012), (33, 0.015), (34, -0.003), (35, -0.011), (36, -0.009), (37, 0.019), (38, 0.006), (39, 0.017), (40, -0.001), (41, -0.009), (42, 0.027), (43, -0.008), (44, -0.018), (45, 0.008), (46, 0.003), (47, 0.007), (48, -0.005), (49, 0.014)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95077133 418 andrew gelman stats-2010-11-17-ff

2 0.71585459 2190 andrew gelman stats-2014-01-29-Stupid R Tricks: Random Scope

Introduction: Andrew and I have been discussing how we’re going to define functions in Stan for defining systems of differential equations; see our evolving ode design doc ; comments welcome, of course. About Scope I mentioned to Andrew I would prefer pure lexical, static scoping, as found in languages like C++ and Java. If you’re not familiar with the alternatives, there’s a nice overview in the Wikipedia article on scope . Let me call out a few passages that will help set the context. A fundamental distinction in scoping is what “context” means – whether name resolution depends on the location in the source code (lexical scope, static scope, which depends on the lexical context) or depends on the program state when the name is encountered (dynamic scope, which depends on the execution context or calling context). Lexical resolution can be determined at compile time, and is also known as early binding, while dynamic resolution can in general only be determined at run time, and thus

3 0.65753412 266 andrew gelman stats-2010-09-09-The future of R

Introduction: Some thoughts from Christian , including this bit: We need to consider separately 1. R’s brilliant library 2. R’s not-so-brilliant language and/or interpreter. I don’t know that R’s library is so brilliant as all that–if necessary, I don’t think it would be hard to reprogram the important packages in a new language. I would say, though, that the problems with R are not just in the technical details of the language. I think the culture of R has some problems too. As I’ve written before, R functions used to be lean and mean, and now they’re full of exception-handling and calls to other packages. R functions are spaghetti-like messes of connections in which I keep expecting to run into syntax like “GOTO 120.” I learned about these problems a couple years ago when writing bayesglm(), which is a simple adaptation of glm(). But glm(), and its workhorse, glm.fit(), are a mess: They’re about 10 lines of functioning code, plus about 20 lines of necessary front-end, plus a cou

4 0.62840897 2303 andrew gelman stats-2014-04-23-Thinking of doing a list experiment? Here’s a list of reasons why you should think again

Introduction: Someone wrote in: We are about to conduct a voting list experiment. We came across your comment recommending that each item be removed from the list. Would greatly appreciate it if you take a few minutes to spell out your recommendation in a little more detail. In particular: (a) Why are you “uneasy” about list experiments? What would strengthen your confidence in list experiments? (b) What do you mean by “each item be removed”? As you know, there are several non-sensitive items and one sensitive item in a list experiment. Do you mean that the non-sensitive items should be removed one-by-one for the control group or are you suggesting a multiple arm design in which each arm of the experiment has one non-sensitive item removed. What would be achieved by this design? I replied: I’ve always been a bit skeptical about list experiments, partly because I worry that the absolute number of items on the list could itself affect the response. For example, someone might not want to che

5 0.62568319 1714 andrew gelman stats-2013-02-09-Partial least squares path analysis

Introduction: Wayne Folta writes: I [Folta] was looking for R packages to address a project I’m working on and stumbled onto a package called ‘plspm’. It seems to be a nice package, but the thing I wanted to pass on is the PDF that Gaston Sanchez, its author, wrote that describes PLS Path Analysis in general and shows how to use plspm in particular. It’s like a 200-page R vignette that’s really informative and fun to read. I’d recommend it to you and your readers: even if you don’t want to delve into PLS and plspm deeply, the first seven pages and the Appendix A provide a great read about a grad student, PLS Path Analysis, and the history of the field. It’s written at a more popular level than you might like. For example, he says at one point: “A moderating effect is the fancy term that some authors use to say that there is a nosy variable M influencing the effect between an independent variable X and a dependent variable Y.” You would obviously never write anything like that [yup --- AG]

6 0.62078279 1210 andrew gelman stats-2012-03-12-Plagiarists are in the habit of lying

7 0.61925983 818 andrew gelman stats-2011-07-23-Parallel JAGS RNGs

8 0.60817629 1296 andrew gelman stats-2012-05-03-Google Translate for code, and an R help-list bot

9 0.60393596 1807 andrew gelman stats-2013-04-17-Data problems, coding errors…what can be done?

10 0.60191447 1240 andrew gelman stats-2012-04-02-Blogads update

11 0.60074973 1716 andrew gelman stats-2013-02-09-iPython Notebook

12 0.59590238 59 andrew gelman stats-2010-05-30-Extended Binary Format Support for Mac OS X

13 0.59111845 272 andrew gelman stats-2010-09-13-Ross Ihaka to R: Drop Dead

14 0.58643597 2045 andrew gelman stats-2013-09-30-Using the aggregate of the outcome variable as a group-level predictor in a hierarchical model

15 0.58589226 2202 andrew gelman stats-2014-02-07-Outrage of the week

16 0.58541375 2304 andrew gelman stats-2014-04-24-An open site for researchers to post and share papers

17 0.58421868 1462 andrew gelman stats-2012-08-18-Standardizing regression inputs

18 0.58146572 841 andrew gelman stats-2011-08-06-Twitteo killed the bloggio star . . . Not!

19 0.57879215 1410 andrew gelman stats-2012-07-09-Experimental work on market-based or non-market-based incentives

20 0.57497764 2213 andrew gelman stats-2014-02-16-There’s no need for you to read this one


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(2, 0.022), (16, 0.126), (24, 0.105), (27, 0.068), (45, 0.041), (94, 0.289), (99, 0.174)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.87042195 418 andrew gelman stats-2010-11-17-ff

2 0.802719 1689 andrew gelman stats-2013-01-23-MLB Hall of Fame Voting Trajectories

Introduction: Kenny Shirley sends along this interactive data visualization : What I learned from this was that Jim Rice is in the Hall of Fame! I remember watching him play. Whenever he struck out with a man on first base, we were just so relieved that he hadn’t hit into a double play.

3 0.72553861 2253 andrew gelman stats-2014-03-17-On deck this week: Revisitings

Introduction: Just for fun I thought I’d run a week’s worth of old posts, just some things I came across when searching for various things. Of course I could just post the links right here but instead I’ll repost with my comments on how things have changed in the intervening years. Mon : In the best alternative histories, the real world is what’s ultimately real (from 2005) Tues : Comments on an anti-Bayesian (from 2006) Wed : How Americans vote (from 2012) Thurs : The candy weighing demonstration, or, the unwisdom of crowds (from 2008) Fri : Random matrices in the news (from 2010) Sat : Picking pennies in front of a steamroller: A parable comes to life (from 2011) Sun : Greg Mankiw’s utility function (from 2010)

4 0.68808854 615 andrew gelman stats-2011-03-16-Chess vs. checkers

Introduction: Mark Palko writes : Chess derives most of its complexity through differentiated pieces; with checkers the complexity comes from the interaction between pieces. The result is a series of elegant graph problems where the viable paths change with each move of your opponent. To draw an analogy with chess, imagine if moving your knight could allow your opponent’s bishop to move like a rook. Add to that the potential for traps and manipulation that come with forced capture and you have one of the most remarkable games of all time. . . . It’s not unusual to hear masters of both chess and checkers (draughts) to admit that they prefer the latter. So why does chess get all the respect? Why do you never see a criminal mastermind or a Bond villain playing in a checkers tournament? Part of the problem is that we learn the game as children so we tend to think of it as a children’s game. We focus on how simple the rules are and miss how much complexity and subtlety you can get out of those ru

5 0.66892362 1211 andrew gelman stats-2012-03-13-A personal bit of spam, just for me!

Introduction: Hi Andrew, I came across your site while searching for blogs and posts around American obesity and wanted to reach out to get your readership’s feedback on an infographic my team built which focuses on the obesity of America and where we could end up at the going rate. If you’re interested, let’s connect. Have a great weekend! Thanks. *** I have to say, that’s pretty pitiful, to wish someone a “great weekend” on a Tuesday! This guy’s gotta ratchet up his sophistication a few notches if he ever wants to get a job as a spammer for a major software company , for example.

6 0.6507417 1943 andrew gelman stats-2013-07-18-Data to use for in-class sampling exercises?

7 0.64877051 582 andrew gelman stats-2011-02-20-Statisticians vs. everybody else

8 0.62773103 2190 andrew gelman stats-2014-01-29-Stupid R Tricks: Random Scope

9 0.62723684 1760 andrew gelman stats-2013-03-12-Misunderstanding the p-value

10 0.62473571 1987 andrew gelman stats-2013-08-18-A lot of statistical methods have this flavor, that they are a solution to a mathematical problem that has been posed without a careful enough sense of whether the problem is worth solving in the first place

11 0.62430704 1523 andrew gelman stats-2012-10-06-Comparing people from two surveys, one of which is a simple random sample and one of which is not

12 0.62053859 411 andrew gelman stats-2010-11-13-Ethical concerns in medical trials

13 0.61899227 377 andrew gelman stats-2010-10-28-The incoming moderate Republican congressmembers

14 0.61817896 177 andrew gelman stats-2010-08-02-Reintegrating rebels into civilian life: Quasi-experimental evidence from Burundi

15 0.61778402 2 andrew gelman stats-2010-04-23-Modeling heterogenous treatment effects

16 0.61679208 503 andrew gelman stats-2011-01-04-Clarity on my email policy

17 0.6167196 1510 andrew gelman stats-2012-09-25-Incoherence of Bayesian data analysis

18 0.61296368 1871 andrew gelman stats-2013-05-27-Annals of spam

19 0.61295605 960 andrew gelman stats-2011-10-15-The bias-variance tradeoff

20 0.61064124 609 andrew gelman stats-2011-03-13-Coauthorship norms