
2139 andrew gelman stats-2013-12-19-Happy birthday


meta infos for this blog

Source: html

Introduction: (Click for bigger image.) The above is Aki’s decomposition of the birthdays data (the number of babies born each day in the United States, from 1968 through 1988) using a Gaussian process model, as described in more detail in our book .


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 ) The above is Aki’s decomposition of the birthdays data (the number of babies born each day in the United States, from 1968 through 1988) using a Gaussian process model, as described in more detail in our book . [sent-2, score-2.458]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('birthdays', 0.411), ('decomposition', 0.359), ('aki', 0.304), ('babies', 0.288), ('born', 0.273), ('gaussian', 0.265), ('bigger', 0.253), ('click', 0.232), ('detail', 0.225), ('united', 0.214), ('described', 0.202), ('states', 0.172), ('process', 0.165), ('day', 0.143), ('number', 0.128), ('book', 0.114), ('using', 0.091), ('model', 0.086), ('data', 0.059)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 2139 andrew gelman stats-2013-12-19-Happy birthday


2 0.26975209 2067 andrew gelman stats-2013-10-18-EP and ABC

Introduction: Expectation propagation and approximate Bayesian computation. Here are X’s comments on a paper, “Expectation-Propagation for Likelihood-Free Inference,” by Simon Barthelme and Nicolas Chopin. The paper is not new but the topic is still hot. Also there’s this paper by Maurizio Filippone and Mark Girolami on computation for Gaussian process models. I wonder how this connects to GPstuff, which I think is what Aki did to fit the birthdays model: This stuff is where it’s at.

3 0.18358009 1384 andrew gelman stats-2012-06-19-Slick time series decomposition of the birthdays data

Introduction: Aki updates: Here is my plot using the full time series data to make the model. Data analysis could be made in many different ways, but my hammer is Gaussian process, and so I modeled the data with a Gaussian process with six components:
1) slowly changing trend
2) 7 day periodical component capturing day of week effect
3) 365.25 day periodical component capturing day of year effect
4) component to take into account the special days and interaction with weekends
5) small time scale correlating noise
6) independent Gaussian noise
- Day of the week effect has been increasing in 80’s
- Day of year effect has changed only a little during years
- 22nd to 31st December is strange time
I [Aki] will make the code available this week, but we have to first make new release of our GPstuff toolbox, as I used our development code to do this. I have no idea what’s going on with 29 Feb; I wouldn’t see why births would be less likely on that day. Also, the above graphs are g
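
The additive structure described in this post can be illustrated with a small covariance-building sketch. This is not Aki's GPstuff code: the kernel forms (squared-exponential and standard periodic), the function names, and every hyperparameter value below are assumptions chosen only for illustration, and component 4 (special days and their interaction with weekends) is left out.

```python
import numpy as np

def sq_exp(t, length, var):
    """Squared-exponential kernel: smooth variation on the scale `length` (days)."""
    d = t[:, None] - t[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def periodic(t, period, length, var):
    """Standard periodic kernel: exactly repeating structure with the given period."""
    d = t[:, None] - t[None, :]
    return var * np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / length ** 2)

def birthday_cov(t):
    """Sum of component kernels; all hyperparameters are made-up placeholders."""
    n = len(t)
    k_trend = sq_exp(t, length=1000.0, var=1.0)                  # 1) slowly changing trend
    k_week = periodic(t, period=7.0, length=1.0, var=0.5)        # 2) day-of-week effect
    k_year = periodic(t, period=365.25, length=1.0, var=0.5)     # 3) day-of-year effect
    k_short = sq_exp(t, length=3.0, var=0.1)                     # 5) short-scale correlated noise
    k_noise = 0.05 * np.eye(n)                                   # 6) independent Gaussian noise
    return k_trend + k_week + k_year + k_short + k_noise

t = np.arange(60, dtype=float)   # 60 consecutive days
K = birthday_cov(t)              # 60 x 60 prior covariance of the summed process
```

Because a sum of independent Gaussian processes is itself a Gaussian process whose covariance is the sum of the component covariances, each component can later be estimated separately by conditioning on the data with only that component's kernel in the cross-covariance.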

4 0.14638229 1357 andrew gelman stats-2012-06-01-Halloween-Valentine’s update

Introduction: A few months ago we reported on a claim that more babies are born on Valentine’s Day and fewer on Halloween. At the time, I wrote that I’d like to see a graph with all 366 days of the year. It would be easy enough to make. That way we could put the Valentine’s and Halloween data in the context of other possible patterns. Joshua Gans sent along the following from an unpublished appendix to his paper. It’s not the graph I was asking for but it does supply additional information beyond those two holidays. Click to enlarge: I don’t know what all those digits are doing (do you really need to know that an estimate is “-70.856” if its standard error is “10.640”? I’d think that “-71 +/- 10” would be just fine), but I suppose the careful reader can ignore the numbers and simply read the signs and the stars. In any case, it’s good to see more data.

5 0.14475816 1379 andrew gelman stats-2012-06-14-Cool-ass signal processing using Gaussian processes (birthdays again)

Introduction: Aki writes: Here’s my version of the birthday frequency graph. I used Gaussian process with two slowly varying components and periodic component with decay, so that periodic form can change in time. I used Student’s t-distribution as observation model to allow exceptional dates to be outliers. I guess that periodic component due to week effect is still in the data because there is data only from twenty years. Naturally it would be better to model the whole timeseries, but it was easier to just use the csv by Mulligan. All I can say is . . . wow. Bayes wins again. Maybe Aki can supply the R or Matlab code? P.S. And let’s not forget how great the simple and clear time series plots are, compared to various fancy visualizations that people might try. P.P.S. More here.

6 0.13984649 1856 andrew gelman stats-2013-05-14-GPstuff: Bayesian Modeling with Gaussian Processes

7 0.1260256 2144 andrew gelman stats-2013-12-23-I hate this stuff

8 0.09812516 1376 andrew gelman stats-2012-06-12-Simple graph WIN: the example of birthday frequencies

9 0.096945077 1167 andrew gelman stats-2012-02-14-Extra babies on Valentine’s Day, fewer on Halloween?

10 0.093387708 636 andrew gelman stats-2011-03-29-The Conservative States of America

11 0.091485277 1948 andrew gelman stats-2013-07-21-Bayes related

12 0.087740205 1298 andrew gelman stats-2012-05-03-News from the sister blog!

13 0.087615542 1509 andrew gelman stats-2012-09-24-Analyzing photon counts

14 0.080976605 1642 andrew gelman stats-2012-12-28-New book by Stef van Buuren on missing-data imputation looks really good!

15 0.07870055 54 andrew gelman stats-2010-05-27-Hype about conditional probability puzzles

16 0.077117823 2146 andrew gelman stats-2013-12-24-NYT version of birthday graph

17 0.076189436 1739 andrew gelman stats-2013-02-26-An AI can build and try out statistical models using an open-ended generative grammar

18 0.075645797 1237 andrew gelman stats-2012-03-30-Statisticians: When We Teach, We Don’t Practice What We Preach

19 0.069442123 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

20 0.06941852 1649 andrew gelman stats-2013-01-02-Back when 50 miles was a long way


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.069), (1, 0.031), (2, 0.015), (3, 0.044), (4, 0.04), (5, 0.009), (6, -0.012), (7, -0.02), (8, 0.037), (9, 0.021), (10, 0.035), (11, 0.009), (12, -0.027), (13, 0.004), (14, 0.041), (15, 0.026), (16, 0.053), (17, -0.006), (18, 0.019), (19, -0.044), (20, 0.019), (21, 0.019), (22, -0.039), (23, -0.046), (24, -0.006), (25, 0.015), (26, -0.039), (27, 0.045), (28, 0.058), (29, 0.018), (30, -0.036), (31, -0.095), (32, -0.036), (33, -0.003), (34, 0.0), (35, 0.033), (36, -0.026), (37, -0.062), (38, 0.028), (39, 0.013), (40, -0.038), (41, -0.026), (42, 0.001), (43, 0.015), (44, 0.045), (45, 0.065), (46, 0.037), (47, -0.019), (48, -0.013), (49, -0.027)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94323105 2139 andrew gelman stats-2013-12-19-Happy birthday


2 0.69674337 1379 andrew gelman stats-2012-06-14-Cool-ass signal processing using Gaussian processes (birthdays again)

Introduction: Aki writes: Here’s my version of the birthday frequency graph. I used Gaussian process with two slowly varying components and periodic component with decay, so that periodic form can change in time. I used Student’s t-distribution as observation model to allow exceptional dates to be outliers. I guess that periodic component due to week effect is still in the data because there is data only from twenty years. Naturally it would be better to model the whole timeseries, but it was easier to just use the csv by Mulligan. All I can say is . . . wow. Bayes wins again. Maybe Aki can supply the R or Matlab code? P.S. And let’s not forget how great the simple and clear time series plots are, compared to various fancy visualizations that people might try. P.P.S. More here.

3 0.64495665 19 andrew gelman stats-2010-05-06-OK, so this is how I ended up working with three different guys named Matt

Introduction: Really we need the data on babies born 30 years ago, but this is still pretty stunning: Argentina: Matías, #3; Mateo, #13 Australia/New South Wales: Matthew, #21 Australia/Victoria: Matthew, #21 Austria: Matthias, #19 Belgium: Mathis, #9; Matteo, #22; Mathias, #23; Mathéo, #35; Mats, #89; Mathieu, #90; Matthias, #97 Brazil: Matheus, #4 Canada/Alberta: Matthew, #8 Canada/British Columbia: Matthew, #6 Canada/Ontario: Matthew, #2 Canada/Quebec: Mathis, #11; Mathieu, #35; Mathias, #47; Matthew, #76; Mathys, #78; Matis, #84 Canada/Saskatchewan: Matthew, #10 Chile: Matias, #4 Czech Republic: Matej, #7; Matyas, #17; Matous, #25 Denmark: Mathias, #11, Mads, #12 England: Matthew, #24 Finland: Matias, #4 France: Mathis, #3 Georgia: Mate, #8 Germany: Matthis, #87 Hungary: Máté, #2; Matyas, #53 Iceland: Matthias, #32 Ireland: Matthew, #17 Italy: Matteo, #4; Mattia, #7 Lithuania: Matas, #1 Netherlands: Thijs, #13 New Zealand: Matthew, #21 Northern Ireland: Matthew,

4 0.59867966 1542 andrew gelman stats-2012-10-20-A statistical model for underdispersion

Introduction: We have lots of models for overdispersed count data but we rarely see underdispersed data. But now I know what example I’ll be giving when this next comes up in class. From a book review by Theo Tait: A number of shark species go in for oophagy, or uterine cannibalism. Sand tiger foetuses ‘eat each other in utero, acting out the harshest form of sibling rivalry imaginable’. Only two babies emerge, one from each of the mother shark’s uteruses: the survivors have eaten everything else. ‘A female sand tiger gives birth to a baby that’s already a metre long and an experienced killer,’ explains Demian Chapman, an expert on the subject. That’s what I call underdispersion. E(y)=2, var(y)=0. Take that, M. Poisson!
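
The E(y)=2, var(y)=0 remark can be checked numerically with the variance-to-mean ratio (the dispersion index), which is 1 for a Poisson distribution and below 1 for underdispersed counts. A minimal sketch, assuming made-up data: the litter counts are simply a constant 2, as the quote describes, and the Poisson sample is simulated for comparison. The function name is hypothetical.

```python
import numpy as np

def dispersion_index(y):
    """Variance-to-mean ratio: about 1 for Poisson, < 1 underdispersed, > 1 overdispersed."""
    y = np.asarray(y, dtype=float)
    return y.var() / y.mean()

# Every sand tiger litter has exactly two survivors (illustrative constant data).
litters = np.full(1000, 2)

# Simulated Poisson counts with the same mean, for comparison.
rng = np.random.default_rng(0)
poisson = rng.poisson(lam=2.0, size=100_000)

d_litters = dispersion_index(litters)   # exactly 0: no variation at all
d_poisson = dispersion_index(poisson)   # close to 1, up to sampling noise
```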

5 0.59702086 215 andrew gelman stats-2010-08-18-DataMarket

Introduction: It seems that every day brings a better system for exploring and sharing data on the Internet. From Iceland comes DataMarket . DataMarket is very good at visualizing individual datasets – with interaction and animation, although the “market” aspect hasn’t yet been developed, and all access is free. Here’s an example of visualizing rankings of countries competing in WorldCup: And here’s a lovely example of visualizing population pyramids : In the future, the visualizations will also include state of the art models for predicting and imputing missing data, and understanding the underlying mechanisms. Other posts: InfoChimps , Future of Data Analysis

6 0.59315008 1112 andrew gelman stats-2012-01-11-A blog full of examples for your statistics class

7 0.58810842 2135 andrew gelman stats-2013-12-15-The UN Plot to Force Bayesianism on Unsuspecting Americans (penalized B-Spline edition)

8 0.56415451 1384 andrew gelman stats-2012-06-19-Slick time series decomposition of the birthdays data

9 0.55787748 2162 andrew gelman stats-2014-01-08-Belief aggregation

10 0.55780685 154 andrew gelman stats-2010-07-18-Predictive checks for hierarchical models

11 0.55410355 194 andrew gelman stats-2010-08-09-Data Visualization

12 0.55315709 1649 andrew gelman stats-2013-01-02-Back when 50 miles was a long way

13 0.54321206 595 andrew gelman stats-2011-02-28-What Zombies see in Scatterplots

14 0.53320032 1286 andrew gelman stats-2012-04-28-Agreement Groups in US Senate and Dynamic Clustering

15 0.52605778 536 andrew gelman stats-2011-01-24-Trends in partisanship by state

16 0.52253002 271 andrew gelman stats-2010-09-12-GLM – exposure

17 0.5173797 20 andrew gelman stats-2010-05-07-Bayesian hierarchical model for the prediction of soccer results

18 0.51712543 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

19 0.51574707 1401 andrew gelman stats-2012-06-30-David Hogg on statistics

20 0.51377177 41 andrew gelman stats-2010-05-19-Updated R code and data for ARM


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(16, 0.211), (24, 0.138), (41, 0.177), (79, 0.282)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.84121078 2139 andrew gelman stats-2013-12-19-Happy birthday


2 0.61355543 1379 andrew gelman stats-2012-06-14-Cool-ass signal processing using Gaussian processes (birthdays again)

Introduction: Aki writes: Here’s my version of the birthday frequency graph. I used Gaussian process with two slowly varying components and periodic component with decay, so that periodic form can change in time. I used Student’s t-distribution as observation model to allow exceptional dates to be outliers. I guess that periodic component due to week effect is still in the data because there is data only from twenty years. Naturally it would be better to model the whole timeseries, but it was easier to just use the csv by Mulligan. All I can say is . . . wow. Bayes wins again. Maybe Aki can supply the R or Matlab code? P.S. And let’s not forget how great the simple and clear time series plots are, compared to various fancy visualizations that people might try. P.P.S. More here.

3 0.58588141 1697 andrew gelman stats-2013-01-29-Where 36% of all boys end up nowadays

Introduction: My Take a Number feature appears in today’s Times. And here are the graphs that I wish they’d had space to include! Original story here .

4 0.57115734 398 andrew gelman stats-2010-11-06-Quote of the day

Introduction: “A statistical model is usually taken to be summarized by a likelihood, or a likelihood and a prior distribution, but we go an extra step by noting that the parameters of a model are typically batched, and we take this batching as an essential part of the model.”

5 0.5508644 469 andrew gelman stats-2010-12-16-2500 people living in a park in Chicago?

Introduction: Frank Hansen writes: Columbus Park is on Chicago’s west side, in the Austin neighborhood. The park is a big green area which includes a golf course. Here is the google satellite view. Here is the nytimes page. Go to Chicago, and zoom over to the census tract 2521, which is just north of the horizontal gray line (Eisenhower Expressway, aka I290) and just east of Oak Park. The park is labeled on the nytimes map. The census data have around 50 dots (they say 50 people per dot) in the park which has no residential buildings. Congressional district is Danny Davis, IL7. Here’s a map of the district. So, how do we explain the map showing ~50 dots worth of people living in the park. What’s up with the algorithm to place the dots? I dunno. I leave this one to you, the readers.

6 0.511648 572 andrew gelman stats-2011-02-14-Desecration of valuable real estate

7 0.50464576 1745 andrew gelman stats-2013-03-02-Classification error

8 0.50002611 1279 andrew gelman stats-2012-04-24-ESPN is looking to hire a research analyst

9 0.49962232 1026 andrew gelman stats-2011-11-25-Bayes wikipedia update

10 0.48667282 845 andrew gelman stats-2011-08-08-How adoption speed affects the abandonment of cultural tastes

11 0.48656216 1366 andrew gelman stats-2012-06-05-How do segregation measures change when you change the level of aggregation?

12 0.46229178 442 andrew gelman stats-2010-12-01-bayesglm in Stata?

13 0.45868471 1014 andrew gelman stats-2011-11-16-Visualizations of NYPD stop-and-frisk data

14 0.45636094 1538 andrew gelman stats-2012-10-17-Rust

15 0.45573404 1115 andrew gelman stats-2012-01-12-Where are the larger-than-life athletes?

16 0.44012266 177 andrew gelman stats-2010-08-02-Reintegrating rebels into civilian life: Quasi-experimental evidence from Burundi

17 0.440101 528 andrew gelman stats-2011-01-21-Elevator shame is a two-way street

18 0.43914914 2 andrew gelman stats-2010-04-23-Modeling heterogenous treatment effects

19 0.43595028 1118 andrew gelman stats-2012-01-14-A model rejection letter

20 0.43257728 1126 andrew gelman stats-2012-01-18-Bob on Stan