346 andrew gelman stats-2010-10-16-Mandelbrot and Akaike: from taxonomy to smooth runways (pioneering work in fractals and self-similarity)


meta info for this blog

Source: html

Introduction: Mandelbrot on taxonomy (from 1955; the first publication about fractals that I know of): Searching for Mandelbrot on the blog led me to Akaike, who also recently passed away and also did interesting early work on self-similar stochastic processes. For example, this wonderful opening of his 1962 paper, “On a limiting process which asymptotically produces f^{-2} spectral density”: In the recent papers in which the results of the spectral analyses of roughnesses of runways or roadways are reported, the power spectral densities of approximately the form f^{-2} (f: frequency) are often treated. This fact directed the present author to the investigation of the limiting process which will provide the f^{-2} form under fairly general assumptions. In this paper a very simple model is given which explains a way how the f^{-2} form is obtained asymptotically. Our fundamental model is that the stochastic process, which might be considered to represent the roughness of the runway


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Mandelbrot on taxonomy (from 1955; the first publication about fractals that I know of): Searching for Mandelbrot on the blog led me to Akaike, who also recently passed away and also did interesting early work on self-similar stochastic processes. [sent-1, score-0.675]

2 This fact directed the present author to the investigation of the limiting process which will provide the f^{-2} form under fairly general assumptions. [sent-3, score-0.898]

3 In this paper a very simple model is given which explains a way how the f^{-2} form is obtained asymptotically. [sent-4, score-0.534]

4 Our fundamental model is that the stochastic process, which might be considered to represent the roughness of the runway, is obtained by alternative repetitions of roughening and smoothing. [sent-5, score-0.946]

5 We can easily get the limiting form of the spectrum for this model. [sent-6, score-0.543]

6 Further, by taking into account the physical meaning of roughening and smoothing we can formulate the conditions under which this general result assures that the f^{-2} form will eventually take place. [sent-7, score-1.041]

7 I’ve placed this in the Multilevel Modeling category because fractals are a form of multilevel model, although not always recognized as such: fractals are hi-tech physical science models, whereas multilevel modeling is associated with low-grade fields such as education and social science. [sent-10, score-1.495]

8 The connection is clear, though, when you consider Mandelbrot’s taxonomy model or the connection between Akaike’s dynamic model of roughness to complex models of student, teacher, and classroom effects. [sent-11, score-1.085]

9 Unfortunately, at the time I didn’t recognize the general importance of multilevel models, so all I could really do in the conversation was to express my awe and appreciation of his work. [sent-13, score-0.506]
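
To make the f^{-2} spectral form quoted in the sentences above concrete, here is a minimal Python sketch. It does not implement Akaike's roughening-and-smoothing model, whose details the excerpt does not give. As a stand-in it uses a plain Gaussian random walk, a standard example of a process whose power spectrum falls off roughly as f^{-2} at low frequencies. The function names, path length, number of paths, and seed are all illustrative assumptions.

```python
import cmath
import math
import random


def random_walk(n, rng):
    """Cumulative sum of n independent standard normal steps."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def periodogram(x, max_k):
    """Raw periodogram at Fourier frequencies k/n for k = 1..max_k (naive DFT)."""
    n = len(x)
    power = []
    for k in range(1, max_k + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(coeff) ** 2 / n)
    return power


def loglog_slope(freqs, power):
    """Least-squares slope of log(power) against log(frequency)."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in power]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Illustrative settings (assumptions, not taken from the source).
rng = random.Random(0)
n, max_k, n_paths = 256, 32, 20

# Average the periodogram over several independent paths to reduce noise.
mean_power = [0.0] * max_k
for _ in range(n_paths):
    p = periodogram(random_walk(n, rng), max_k)
    mean_power = [m + v / n_paths for m, v in zip(mean_power, p)]

freqs = [k / n for k in range(1, max_k + 1)]
slope = loglog_slope(freqs, mean_power)

# A slope near -2 on the log-log scale corresponds to an f^{-2} spectrum;
# the exact value depends on the seed and settings.
print(f"estimated log-log slope: {slope:.2f}")
```

Only the lowest frequencies are used in the fit because the random-walk spectrum flattens relative to f^{-2} as the frequency approaches the Nyquist limit.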


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('mandelbrot', 0.33), ('spectral', 0.302), ('fractals', 0.263), ('akaike', 0.252), ('form', 0.238), ('roughening', 0.234), ('roughness', 0.234), ('limiting', 0.229), ('taxonomy', 0.201), ('multilevel', 0.177), ('stochastic', 0.144), ('obtained', 0.133), ('process', 0.121), ('met', 0.118), ('physical', 0.11), ('connection', 0.108), ('model', 0.105), ('awe', 0.101), ('repetitions', 0.096), ('assures', 0.096), ('appreciation', 0.086), ('directed', 0.086), ('asymptotically', 0.082), ('general', 0.082), ('formulate', 0.081), ('models', 0.08), ('investigation', 0.08), ('smoothing', 0.08), ('spectrum', 0.076), ('densities', 0.076), ('opening', 0.075), ('classroom', 0.075), ('modeling', 0.074), ('placed', 0.07), ('dynamic', 0.069), ('produces', 0.069), ('passed', 0.067), ('frequency', 0.067), ('searching', 0.066), ('density', 0.064), ('recognized', 0.064), ('wonderful', 0.062), ('meaning', 0.062), ('fairly', 0.062), ('teacher', 0.062), ('conversation', 0.06), ('category', 0.059), ('approximately', 0.058), ('eventually', 0.058), ('explains', 0.058)]
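
The page does not say how the similarity values in the list below are computed. A common choice for comparing tf-idf vectors is cosine similarity, and the following Python sketch assumes that choice. The weights for this post are copied from the list above; the second post's weights and the function name are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention: an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


# A few of the top tf-idf weights listed above for this post.
this_post = {"mandelbrot": 0.33, "spectral": 0.302, "fractals": 0.263, "akaike": 0.252}

# Hypothetical weights for another post; these numbers are invented for illustration.
other_post = {"mandelbrot": 0.4, "fractals": 0.3, "wolfram": 0.5}

similarity = cosine_similarity(this_post, other_post)
print(round(similarity, 3))
```

Under this assumption a post compared with itself scores 1.0, which is consistent with the same-blog entry in the list, and two posts with no shared terms score 0.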

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 346 andrew gelman stats-2010-10-16-Mandelbrot and Akaike: from taxonomy to smooth runways (pioneering work in fractals and self-similarity)

2 0.21387029 1178 andrew gelman stats-2012-02-21-How many data points do you really have?

Introduction: Chris Harrison writes: I have just come across your paper in the 2009 American Scientist. Another problem that I frequently come across is when people do power spectral analyses of signals. If one has 1200 points (fairly modest in this day and age) then there are 600 power spectral estimates. People will then determine the 95% confidence limits and pick out any spectral estimate that sticks up above this, claiming that it is significant. But there will be on average 30 estimates that stick up too high or too low. So in general there will be 15 spectral estimates which are higher than the 95% confidence limit which could happen just by chance. I suppose that this means that you have to set a much higher confidence limit, which would depend on the number of data in your signal. I would also like your opinion about a paper in the Proceedings of the National Academy of Science, “The causality analysis of climate change and large-scale human crisis” by David D. Zhang, Harry F. L

3 0.20046599 1784 andrew gelman stats-2013-04-01-Wolfram on Mandelbrot

Introduction: The most perfect pairing of author and subject since Nicholson Baker and John Updike. Here’s Wolfram on the great researcher of fractals: In his way, Mandelbrot paid me some great compliments. When I was in my 20s, and he in his 60s, he would ask about my scientific work: “How can so many people take someone so young so seriously?” In 2002, my book “A New Kind of Science”—in which I argued that many phenomena across science are the complex results of relatively simple, program-like rules—appeared. Mandelbrot seemed to see it as a direct threat, once declaring that “Wolfram’s ‘science’ is not new except when it is clearly wrong; it deserves to be completely disregarded.” In private, though, several mutual friends told me, he fretted that in the long view of history it would overwhelm his work. In retrospect, I don’t think Mandelbrot had much to worry about on this account. The link from the above review came from Peter Woit, who also points to a review by Brian Hayes wit

4 0.15477738 493 andrew gelman stats-2010-12-31-Obituaries in 2010

Introduction: David Blackwell. Julian Besag. Arnold Zellner. Benoit Mandelbrot and Hirotugu Akaike (late). Alfred Kahn.

5 0.14031374 1592 andrew gelman stats-2012-11-27-Art-math

Introduction: This seems like the sort of thing I would like: Drawing from My Mind’s Eye: Dorothea Rockburne in Conversation with David Cohen Introduced by Nina Samuel Thursday, November 29 6 pm BGC, 38 West 86th Street Benoît Mandelbrot, unusual among mathematicians of the twentieth century, harnessed the power of visual images to express his theories and to pursue new lines of thought. In this conversation artist Dorothea Rockburne will share memories of studying with mathematician Max Dehn at Black Mountain College, of meeting Mandelbrot, and discuss her recent work. Exhibition curator Nina Samuel will discuss the related exhibition “The Islands of Benoît Mandelbrot: Fractals, Chaos, and the Materiality of Thinking,” on view in the BGC Focus Gallery through January 27, 2013. David Cohen is editor and publisher of artcritical.com as well as founder and moderator of The Review Panel. Dorothea Rockburne is a distinguished artist whose work has been inspired by her lifelong st

6 0.12106899 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations

7 0.10912101 1739 andrew gelman stats-2013-02-26-An AI can build and try out statistical models using an open-ended generative grammar

8 0.10808106 1377 andrew gelman stats-2012-06-13-A question about AIC

9 0.10428904 2133 andrew gelman stats-2013-12-13-Flexibility is good

10 0.09681116 1934 andrew gelman stats-2013-07-11-Yes, worry about generalizing from data to population. But multilevel modeling is the solution, not the problem

11 0.093150459 1737 andrew gelman stats-2013-02-25-Correlation of 1 . . . too good to be true?

12 0.092119917 2033 andrew gelman stats-2013-09-23-More on Bayesian methods and multilevel modeling

13 0.088281102 1392 andrew gelman stats-2012-06-26-Occam

14 0.081999063 1248 andrew gelman stats-2012-04-06-17 groups, 6 group-level predictors: What to do?

15 0.081408121 1972 andrew gelman stats-2013-08-07-When you’re planning on fitting a model, build up to it by fitting simpler models first. Then, once you have a model you like, check the hell out of it

16 0.081079461 780 andrew gelman stats-2011-06-27-Bridges between deterministic and probabilistic models for binary data

17 0.080086738 24 andrew gelman stats-2010-05-09-Special journal issue on statistical methods for the social sciences

18 0.078194581 1999 andrew gelman stats-2013-08-27-Bayesian model averaging or fitting a larger model

19 0.077785403 772 andrew gelman stats-2011-06-17-Graphical tools for understanding multilevel models

20 0.075827099 383 andrew gelman stats-2010-10-31-Analyzing the entire population rather than a sample


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.134), (1, 0.06), (2, -0.007), (3, -0.007), (4, 0.019), (5, 0.024), (6, -0.013), (7, -0.037), (8, 0.057), (9, 0.074), (10, 0.056), (11, 0.037), (12, -0.048), (13, -0.011), (14, -0.028), (15, -0.026), (16, 0.011), (17, -0.017), (18, -0.021), (19, -0.022), (20, 0.01), (21, -0.023), (22, -0.018), (23, -0.012), (24, -0.02), (25, -0.053), (26, -0.029), (27, 0.018), (28, -0.001), (29, -0.015), (30, -0.052), (31, 0.005), (32, -0.017), (33, -0.008), (34, 0.002), (35, 0.009), (36, 0.011), (37, -0.016), (38, 0.054), (39, -0.006), (40, -0.038), (41, -0.006), (42, 0.027), (43, -0.061), (44, 0.001), (45, 0.005), (46, 0.022), (47, -0.018), (48, -0.073), (49, -0.028)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96408421 346 andrew gelman stats-2010-10-16-Mandelbrot and Akaike: from taxonomy to smooth runways (pioneering work in fractals and self-similarity)

2 0.76857597 1468 andrew gelman stats-2012-08-24-Multilevel modeling and instrumental variables

Introduction: Terence Teo writes: I was wondering if multilevel models can be used as an alternative to 2SLS or IV models to deal with (i) endogeneity and (ii) selection problems. More concretely, I am trying to assess the impact of investment treaties on foreign investment. Aside from the fact that foreign investment is correlated over time, it may be the case that countries that already receive sufficient amounts of foreign investment need not sign treaties, and countries that sign treaties are those that need foreign investment in the first place. Countries thus “select” into treatment; treaty signing is non-random. As such, I argue that to properly estimate the impact of treaties on investment, we must model the determinants of treaty signing. I [Teo] am currently modeling this as two separate models: (1) regress predictors on likelihood of treaty signing, (2) regress treaty (with interactions, etc) on investment (I’ve thought of using propensity score matching for this part of the model)

3 0.76038408 964 andrew gelman stats-2011-10-19-An interweaving-transformation strategy for boosting MCMC efficiency

Introduction: Yaming Yu and Xiao-Li Meng write in with a cool new idea for improving the efficiency of Gibbs and Metropolis in multilevel models: For a broad class of multilevel models, there exist two well-known competing parameterizations, the centered parameterization (CP) and the non-centered parameterization (NCP), for effective MCMC implementation. Much literature has been devoted to the questions of when to use which and how to compromise between them via partial CP/NCP. This article introduces an alternative strategy for boosting MCMC efficiency via simply interweaving—but not alternating—the two parameterizations. This strategy has the surprising property that failure of both the CP and NCP chains to converge geometrically does not prevent the interweaving algorithm from doing so. It achieves this seemingly magical property by taking advantage of the discordance of the two parameterizations, namely, the sufficiency of CP and the ancillarity of NCP, to substantially reduce the Markovian

4 0.7505722 1395 andrew gelman stats-2012-06-27-Cross-validation (What is it good for?)

Introduction: I think cross-validation is a good way to estimate a model’s forecasting error but I don’t think it’s always such a great tool for comparing models. I mean, sure, if the differences are dramatic, ok. But you can easily have a few candidate models, and one model makes a lot more sense than the others (even from a purely predictive sense, I’m not talking about causality here). The difference between the models doesn’t show up in a xval measure of total error but in the patterns of the predictions. For a simple example, imagine using a linear model with positive slope to model a function that is constrained to be increasing. If the constraint isn’t in the model, the predicted/imputed series will sometimes be nonmonotonic. The effect on the prediction error can be so tiny as to be undetectable (or it might even increase avg prediction error to include the constraint); nonetheless, the predictions will be clearly nonsensical. That’s an extreme example but I think the general point h

5 0.74490494 77 andrew gelman stats-2010-06-09-Sof[t]

Introduction: Joe Fruehwald writes: I’m working with linguistic data, specifically binomial hits and misses of a certain variable for certain words (specifically whether or not the “t” sound was pronounced at the end of words like “soft”). Word frequency follows a power law, with most words appearing just once, and with some words being hyperfrequent. I’m not interested in specific word effects, but I am interested in the effect of word frequency. A logistic model fit is going to be heavily influenced by the effect of the hyperfrequent words which constitute only one type. To control for the item effect, I would fit a multilevel model with a random intercept by word, but like I said, most of the words appear only once. Is there a principled approach to this problem? My response: It’s ok to fit a multilevel model even if most groups only have one observation each. You’ll want to throw in some word-level predictors too. Think of the multilevel model not as a substitute for the usual thoug

6 0.73369938 1934 andrew gelman stats-2013-07-11-Yes, worry about generalizing from data to population. But multilevel modeling is the solution, not the problem

7 0.72582871 2033 andrew gelman stats-2013-09-23-More on Bayesian methods and multilevel modeling

8 0.72135496 823 andrew gelman stats-2011-07-26-Including interactions or not

9 0.72035956 295 andrew gelman stats-2010-09-25-Clusters with very small numbers of observations

10 0.72024685 772 andrew gelman stats-2011-06-17-Graphical tools for understanding multilevel models

11 0.71904999 948 andrew gelman stats-2011-10-10-Combining data from many sources

12 0.70504189 1392 andrew gelman stats-2012-06-26-Occam

13 0.70439702 24 andrew gelman stats-2010-05-09-Special journal issue on statistical methods for the social sciences

14 0.69556773 2007 andrew gelman stats-2013-09-03-Popper and Jaynes

15 0.69320309 1248 andrew gelman stats-2012-04-06-17 groups, 6 group-level predictors: What to do?

16 0.69294524 1234 andrew gelman stats-2012-03-28-The Supreme Court’s Many Median Justices

17 0.67522937 1739 andrew gelman stats-2013-02-26-An AI can build and try out statistical models using an open-ended generative grammar

18 0.67437142 1066 andrew gelman stats-2011-12-17-Ripley on model selection, and some links on exploratory model analysis

19 0.66892034 1459 andrew gelman stats-2012-08-15-How I think about mixture models

20 0.66877556 1162 andrew gelman stats-2012-02-11-Adding an error model to a deterministic model


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(5, 0.015), (15, 0.015), (16, 0.055), (24, 0.166), (30, 0.067), (31, 0.014), (33, 0.119), (42, 0.019), (53, 0.016), (56, 0.017), (57, 0.017), (59, 0.024), (77, 0.078), (78, 0.02), (95, 0.012), (99, 0.247)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94556957 346 andrew gelman stats-2010-10-16-Mandelbrot and Akaike: from taxonomy to smooth runways (pioneering work in fractals and self-similarity)

2 0.90989339 1849 andrew gelman stats-2013-05-09-Same old same old

Introduction: In an email I sent to a colleague who’s writing about lasso and Bayesian regression for R users: The one thing you might want to add, to fit with your pragmatic perspective, is to point out that these different methods are optimal under different assumptions about the data. However, these assumptions are never true (even in the rare cases where you have a believable prior, it won’t really follow the functional form assumed by bayesglm; even in the rare cases where you have a real loss function, it won’t really follow the mathematical form assumed by lasso etc), but these methods can still be useful and be given the interpretation of regularized estimates. Another thing that someone might naively think is that regularization is fine but “unbiased” is somehow the most honest. In practice, if you stick to “unbiased” methods such as least squares, you’ll restrict the number of variables you can include in your model. So in reality you suffer from omitted-variable bias. So th

3 0.89776552 401 andrew gelman stats-2010-11-08-Silly old chi-square!

Introduction: Brian Mulford writes: I [Mulford] ran across this blog post and found myself questioning the relevance of the test used. I’d think Chi-Square would be inappropriate for trying to measure significance of choice in the manner presented here; irrespective of the cute hamster. Since this is a common test for marketers and website developers – I’d be interested in which techniques you might suggest? For tests of this nature, I typically measure a variety of variables (image placement, size, type, page speed, “page feel” as expressed in a factor, etc) and use LOGIT, Cluster and possibly a simple Bayesian model to determine which variables were most significant (chosen). Pearson Chi-squared may be used to express relationships between variables and outcome but I’ve typically not used it to simply judge a 0/1 choice as statistically significant or not. My reply: I like the decision-theoretic way that the blogger (Jason Cohen, according to the webpage) starts: If you wait too

4 0.8953222 1438 andrew gelman stats-2012-07-31-What is a Bayesian?

Introduction: Deborah Mayo recommended that I consider coming up with a new name for the statistical methods that I used, given that the term “Bayesian” has all sorts of associations that I dislike (as discussed, for example, in section 1 of this article). I replied that I agree on Bayesian, I never liked the term and always wanted something better, but I couldn’t think of any convenient alternative. Also, I was finding that Bayesians (even the Bayesians I disagreed with) were reading my research articles, while non-Bayesians were simply ignoring them. So I thought it was best to identify with, and communicate with, those people who were willing to engage with me. More formally, I’m happy defining “Bayesian” as “using inference from the posterior distribution, p(theta|y)”. This says nothing about where the probability distributions come from (thus, no requirement to be “subjective” or “objective”) and it says nothing about the models (thus, no requirement to use the discrete models that hav

5 0.89305478 1604 andrew gelman stats-2012-12-04-An epithet I can live with

Introduction: Here. Indeed, I’d much rather be a legend than a myth. I just want to clarify one thing. Walter Hickey writes: [Antony Unwin and Andrew Gelman] collaborated on this presentation where they take a hard look at what’s wrong with the recent trends of data visualization and infographics. The takeaway is that while there have been great leaps in visualization technology, some of the visualizations that have garnered the highest praises have actually been lacking in a number of key areas. Specifically, the pair does a takedown of the top visualizations of 2008 as decided by the popular statistics blog Flowing Data. This is a fair summary, but I want to emphasize that, although our dislike of some award-winning visualizations is central to our argument, it is only the first part of our story. As Antony and I worked more on our paper, and especially after seeing the discussions by Robert Kosara, Stephen Few, Hadley Wickham, and Paul Murrell (all to appear in Journal of Computati

6 0.89058226 562 andrew gelman stats-2011-02-06-Statistician cracks Toronto lottery

7 0.88786602 1195 andrew gelman stats-2012-03-04-Multiple comparisons dispute in the tabloids

8 0.88729352 1429 andrew gelman stats-2012-07-26-Our broken scholarly publishing system

9 0.88570982 1831 andrew gelman stats-2013-04-29-The Great Race

10 0.88500547 207 andrew gelman stats-2010-08-14-Pourquoi Google search est devenu plus raisonnable?

11 0.88352758 1684 andrew gelman stats-2013-01-20-Ugly ugly ugly

12 0.88331586 1247 andrew gelman stats-2012-04-05-More philosophy of Bayes

13 0.88072157 752 andrew gelman stats-2011-06-08-Traffic Prediction

14 0.88042063 687 andrew gelman stats-2011-04-29-Zero is zero

15 0.87859118 878 andrew gelman stats-2011-08-29-Infovis, infographics, and data visualization: Where I’m coming from, and where I’d like to go

16 0.8785156 262 andrew gelman stats-2010-09-08-Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

17 0.87817407 1792 andrew gelman stats-2013-04-07-X on JLP

18 0.87792051 1976 andrew gelman stats-2013-08-10-The birthday problem

19 0.87773716 2015 andrew gelman stats-2013-09-10-The ethics of lying, cheating, and stealing with data: A case study

20 0.87697643 1784 andrew gelman stats-2013-04-01-Wolfram on Mandelbrot