34 nathan marz storm-2012-09-19-Storm's 1st birthday

meta infos for this blog

Source: html

Introduction: Storm was open-sourced exactly one year ago today. It's been an action-packed year for Storm, to say the least. Here's some of the exciting stuff that's happened over the past year: 27 companies have publicized that they're using Storm in production. I know of at least a few more companies using it that haven't published anything yet. O'Reilly published a book on Storm. The Storm mailing list has over 1300 members, with over 500 messages per month. The @stormprocessor account has over 1200 followers. More than 4000 people have starred the project on GitHub. There's a regular Storm meetup in the Bay Area with over 230 members. I've also seen lots of Storm-focused meetups happen all over the world over the past year. 29 people all over the world have contributed to the codebase. We released Trident, a high-level abstraction for realtime computation, that is a major leap forward in what's possible in realtime. Libraries have been released integrating Stor

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Storm was open-sourced exactly one year ago today. [sent-1, score-0.248]

2 It's been an action-packed year for Storm, to say the least. [sent-2, score-0.178]

3 Here's some of the exciting stuff that's happened over the past year: 27 companies have publicized that they're using Storm in production. [sent-3, score-0.586]

4 I know of at least a few more companies using it that haven't published anything yet. [sent-4, score-0.16]

5 The Storm mailing list has over 1300 members, with over 500 messages per month. [sent-6, score-0.371]

6 I've also seen lots of Storm-focused meetups happen all over the world over the past year. [sent-10, score-0.463]

7 29 people all over the world have contributed to the codebase. We released Trident, a high-level abstraction for realtime computation, that is a major leap forward in what's possible in realtime. [sent-11, score-0.507]

8 Libraries have been released integrating Storm with Kestrel, Kafka, JMS, Cassandra, Memcached, and many more systems. [sent-12, score-0.249]

9 For many, Storm is becoming the system of choice for connecting these systems together. [sent-13, score-0.29]

10 Storm's performance has been increased by over 10x. [sent-14, score-0.078]

11 I've benchmarked it at 1M messages per second per node on an internal Twitter cluster. [sent-15, score-0.656]

12 What I overwhelmingly hear from people is that they like Storm because it's simple to understand, flexible, and extremely robust in production. [sent-16, score-0.158]

13 These have always been some of the core design goals of Storm, so I'm glad that we were able to succeed on these points. [sent-17, score-0.36]

14 We've got lots of exciting stuff planned over the next year. [sent-18, score-0.513]

15 We have a new metrics system in development which will let you get deep insight into what's happening throughout your topology in realtime. [sent-19, score-0.515]

16 And we have big plans for improving Trident and integrating it with more datastores and input sources. [sent-20, score-0.515]

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('storm', 0.634), ('integrating', 0.249), ('exciting', 0.202), ('per', 0.178), ('year', 0.178), ('stuff', 0.129), ('messages', 0.129), ('lots', 0.118), ('past', 0.109), ('glad', 0.101), ('insight', 0.101), ('datastores', 0.101), ('flexible', 0.101), ('node', 0.101), ('bay', 0.101), ('birthday', 0.101), ('goals', 0.101), ('choice', 0.101), ('topology', 0.101), ('connecting', 0.101), ('contributed', 0.101), ('metrics', 0.101), ('plans', 0.101), ('world', 0.088), ('succeed', 0.088), ('members', 0.088), ('hear', 0.088), ('forward', 0.088), ('becoming', 0.088), ('abstraction', 0.088), ('companies', 0.082), ('increased', 0.078), ('published', 0.078), ('cassandra', 0.078), ('major', 0.078), ('happen', 0.078), ('throughout', 0.078), ('seen', 0.07), ('area', 0.07), ('internal', 0.07), ('ago', 0.07), ('core', 0.07), ('deep', 0.07), ('robust', 0.07), ('happened', 0.064), ('happening', 0.064), ('improving', 0.064), ('list', 0.064), ('level', 0.064), ('got', 0.064)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 34 nathan marz storm-2012-09-19-Storm's 1st birthday

2 0.21880129 33 nathan marz storm-2012-02-06-Suffering-oriented programming

Introduction: Someone asked me an interesting question the other day: "How did you justify taking such a huge risk on building Storm while working on a startup?" (Storm is a realtime computation system). I can see how from an outsider's perspective investing in such a massive project seems extremely risky for a startup. From my perspective, though, building Storm wasn't risky at all. It was challenging, but not risky. I follow a style of development that greatly reduces the risk of big projects like Storm. I call this style "suffering-oriented programming." Suffering-oriented programming can be summarized like so: don't build technology unless you feel the pain of not having it. It applies to the big, architectural decisions as well as the smaller everyday programming decisions. Suffering-oriented programming greatly reduces risk by ensuring that you're always working on something important, and it ensures that you are well-versed in a problem space before attempting a large investment. I ha

3 0.21497843 39 nathan marz storm-2014-02-12-Interview with "Programmer Magazine"

Introduction: I was recently interviewed for "Programmer Magazine", a Chinese magazine. The interview was published in Chinese, but a lot of people told me they'd like to see the English version of the interview. Due to the Google translation being, ahem, a little iffy, I decided to just publish the original English version on my blog. Hope you enjoy! What drew you to programming and what was the first interesting program you wrote? I started programming when I was 10 years old on my TI-82 graphing calculator. Initially I started programming because I wanted to make games on my calculator – and also because I was bored in math class :D. The first interesting game I made on my calculator was an archery game where you'd shoot arrows at moving targets. You'd get points for hitting more targets or completing all the targets faster. A couple years later I graduated to programming the TI-89 which was a huge upgrade in power. I remember how the TI-82 only let you have 26 variables (for the character

4 0.1132773 37 nathan marz storm-2013-04-02-Principles of Software Engineering, Part 1

Introduction: This is the first in a series of posts on the principles of software engineering. There's far more to software engineering than just "making computers do stuff" – while that phrase is accurate, it does not come close to describing what's involved in making robust, reliable software. I will use my experience building large scale systems to inform a first principles approach to defining what it is we do – or should be doing – as software engineers. I'm not interested in tired debates like dynamic vs. static languages – instead, I intend to explore the really core aspects of software engineering. The first order of business is to define what software engineering even is in the first place. Software engineering is the construction of software that produces some desired output for some range of inputs. The inputs to software are more than just method parameters: they include the hardware on which it's running, the rate at which it receives data, and anything else that influences the oper

5 0.079760842 31 nathan marz storm-2011-10-13-How to beat the CAP theorem

Introduction: The CAP theorem states a database cannot guarantee consistency, availability, and partition-tolerance at the same time. But you can't sacrifice partition-tolerance (see here and here), so you must make a tradeoff between availability and consistency. Managing this tradeoff is a central focus of the NoSQL movement. Consistency means that after you do a successful write, future reads will always take that write into account. Availability means that you can always read and write to the system. During a partition, you can only have one of these properties. Systems that choose consistency over availability have to deal with some awkward issues. What do you do when the database isn't available? You can try buffering writes for later, but you risk losing those writes if you lose the machine with the buffer. Also, buffering writes can be a form of inconsistency because a client thinks a write has succeeded but the write isn't in the database yet. Alternatively, you can return errors ba

6 0.077627808 15 nathan marz storm-2010-05-07-Cascalog Presentation at Bay Area Clojure User Group

7 0.074391335 35 nathan marz storm-2013-03-16-Leaving Twitter

8 0.054712888 40 nathan marz storm-2014-02-24-The inexplicable rise of open floor plans in tech companies

9 0.052941035 38 nathan marz storm-2013-04-12-Break into Silicon Valley with a blog

10 0.048082635 30 nathan marz storm-2011-03-29-My talks at POSSCON

11 0.043567132 19 nathan marz storm-2010-07-12-My experience as the first employee of a Y Combinator startup

12 0.043534331 22 nathan marz storm-2010-10-05-How to get a job at a kick-ass startup (for programmers)

13 0.03521803 41 nathan marz storm-2014-05-10-Why we in tech must support Lawrence Lessig

14 0.029393036 36 nathan marz storm-2013-04-01-My new startup

15 0.028186621 2 nathan marz storm-2010-01-03-Tips for Optimizing Cascading Flows

16 0.027899798 7 nathan marz storm-2010-03-04-Introducing "Nanny" - a really simple dependency management tool

17 0.027820073 1 nathan marz storm-2009-12-28-The mathematics behind Hadoop-based systems

18 0.022973605 9 nathan marz storm-2010-03-10-Thrift + Graphs = Strong, flexible schemas on Hadoop

19 0.022592383 25 nathan marz storm-2010-12-06-You Are a Product

20 0.022402804 20 nathan marz storm-2010-07-30-You should blog even if you have no readers
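
The page does not state how the simValue column is computed. One common choice, consistent with the same-blog entry scoring exactly 1.0, is cosine similarity between tf-idf vectors. The following is a minimal sketch under that assumption; the function names (`tfidf_vectors`, `cosine`, `rank_similar`), the whitespace tokenizer, and the smoothed idf formula are all illustrative and not taken from the tool that generated this listing.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {word: weight} dict per document (L2-normalised tf-idf)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each word appears in.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed idf (an assumption): words found in every document
        # still keep a small positive weight.
        vec = {
            word: count * (math.log((1 + n_docs) / (1 + df[word])) + 1)
            for word, count in tf.items()
        }
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({word: v / norm for word, v in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two L2-normalised sparse vectors."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


def rank_similar(docs, query_index):
    """Return (similarity, doc_index) pairs, most similar first."""
    vecs = tfidf_vectors(docs)
    scores = [(cosine(vecs[query_index], v), i) for i, v in enumerate(vecs)]
    return sorted(scores, reverse=True)
```

Calling `rank_similar(posts, i)` on a list of post texts would yield a ranking shaped like the list above, with post `i` itself at the top and posts that share no vocabulary with it at the bottom.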

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.27), (1, 0.095), (2, -0.036), (3, 0.46), (4, -0.072), (5, 0.009), (6, 0.136), (7, -0.261), (8, 0.219), (9, 0.042), (10, 0.059), (11, 0.032), (12, -0.287), (13, 0.058), (14, 0.019), (15, 0.142), (16, 0.008), (17, 0.023), (18, 0.073), (19, -0.125), (20, -0.183), (21, -0.077), (22, -0.019), (23, -0.01), (24, -0.222), (25, 0.089), (26, 0.036), (27, -0.076), (28, -0.006), (29, -0.142), (30, -0.192), (31, 0.159), (32, 0.153), (33, -0.173), (34, 0.19), (35, 0.021), (36, -0.379), (37, 0.09), (38, 0.056), (39, 0.004), (40, 0.002)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99670857 34 nathan marz storm-2012-09-19-Storm's 1st birthday

2 0.22482814 33 nathan marz storm-2012-02-06-Suffering-oriented programming

3 0.22118677 39 nathan marz storm-2014-02-12-Interview with "Programmer Magazine"

4 0.1412852 37 nathan marz storm-2013-04-02-Principles of Software Engineering, Part 1

5 0.10163998 31 nathan marz storm-2011-10-13-How to beat the CAP theorem

6 0.091985613 40 nathan marz storm-2014-02-24-The inexplicable rise of open floor plans in tech companies

7 0.086191744 38 nathan marz storm-2013-04-12-Break into Silicon Valley with a blog

8 0.078998625 41 nathan marz storm-2014-05-10-Why we in tech must support Lawrence Lessig

9 0.078062959 35 nathan marz storm-2013-03-16-Leaving Twitter

10 0.068171576 22 nathan marz storm-2010-10-05-How to get a job at a kick-ass startup (for programmers)

11 0.062023621 15 nathan marz storm-2010-05-07-Cascalog Presentation at Bay Area Clojure User Group

12 0.060413025 19 nathan marz storm-2010-07-12-My experience as the first employee of a Y Combinator startup

13 0.059240948 30 nathan marz storm-2011-03-29-My talks at POSSCON

14 0.054887217 36 nathan marz storm-2013-04-01-My new startup

15 0.052372336 2 nathan marz storm-2010-01-03-Tips for Optimizing Cascading Flows

16 0.051959205 3 nathan marz storm-2010-01-13-Mimi Silbert: the greatest hacker in the world

17 0.047546979 1 nathan marz storm-2009-12-28-The mathematics behind Hadoop-based systems

18 0.045844492 9 nathan marz storm-2010-03-10-Thrift + Graphs = Strong, flexible schemas on Hadoop

19 0.043773171 7 nathan marz storm-2010-03-04-Introducing "Nanny" - a really simple dependency management tool

20 0.042262193 25 nathan marz storm-2010-12-06-You Are a Product

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(10, 0.784), (41, 0.032), (68, 0.028), (76, 0.013), (88, 0.022)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99605972 34 nathan marz storm-2012-09-19-Storm's 1st birthday

2 0.99490356 28 nathan marz storm-2011-01-11-Cascalog workshop

Introduction: I'll be teaching a Cascalog workshop on February 19th at BackType HQ in Union Square. You can sign up at http://cascalog.eventbrite.com. Early bird tickets are available until January 31st. I'm very excited to be teaching this workshop. Cascalog's tight integration with Clojure opens up a world of techniques that no other data processing tool is able to do. Even though I created Cascalog, I've been discovering many of these techniques as I've made use of Cascalog for more and more varied tasks. Along the way, I've tweaked Cascalog so that making use of these techniques would be cleaner and more idiomatic. At this point, after nine months of iteration, Cascalog is a joy to use for even the most complex tasks. I'm excited to impart this knowledge upon others in this workshop.

3 0.24322821 7 nathan marz storm-2010-03-04-Introducing "Nanny" - a really simple dependency management tool

Introduction: Dependency management in software projects is a pretty simple problem when you think about it. A tool to manage dependencies just needs to do three things: provide a mechanism to specify the direct dependencies to a project; download the transitive closure of dependencies to a project; and publish packages that can be used as a dependency to other projects. Some languages have good dependency management systems - for example, rubygems. Others, like Java, have tools like Maven which I would call a complex solution to a simple problem. You shouldn't need to buy a book to understand the solution to such a simple problem. Plus, these dependency management systems are all language specific. I've seen companies do crazy things to manage their dependencies. One company, to manage their jar files, would put all the jars that any project might need in a special "jars" project. You would then need to set up a JARS_HOME environment variable and be sure to update the jars project if you n

4 0.1962778 9 nathan marz storm-2010-03-10-Thrift + Graphs = Strong, flexible schemas on Hadoop

Introduction: There are a lot of misconceptions about what Hadoop is useful for and what kind of data you can put in it. A lot of people think that Hadoop is meant for unstructured data like log files. While Hadoop is great for log files, it's also fantastic for strongly typed, structured data. In this post I'll discuss how you can use a tool like Thrift to store strongly typed data in Hadoop while retaining the flexibility to evolve your schema. We'll look at graph-based schemas and see why they are an ideal fit for many Hadoop-based applications. OK, so what kind of "structured" data can you put in Hadoop? Anything! At BackType we put data about news, conversations, and people into Hadoop as structured objects. You can easily push structured information about social graphs, financial information, or anything you want into Hadoop. That sounds all well and good, but why not just use JSON as the data format? JSON doesn't give you a real schema and doesn't protect against data i

5 0.19462086 40 nathan marz storm-2014-02-24-The inexplicable rise of open floor plans in tech companies

Introduction: Update: I originally quoted the average price of office space as $36 / square foot / month, where in reality it's per year. So I was accidentally weakening my own argument! The post has been updated to reflect the right number. The "open floor plan" has really taken over tech companies in San Francisco. Offices are organized as huge open spaces with row after row of tables. Employees sit next to each other and each have their own piece of desk space. Now, I don't want to comment on the effectiveness of open floor plans for fields other than my own. But for software development, this is the single best way to sabotage the productivity of your entire engineering team. The problem: Programming is a very brain-intensive task. You have to hold all sorts of disparate information in your head at once and synthesize it into extremely precise code. It requires intense amounts of focus. Distractions and interruptions are death to the productivity of a programmer. And an open-floor plan e

6 0.18708591 39 nathan marz storm-2014-02-12-Interview with "Programmer Magazine"

7 0.17328961 1 nathan marz storm-2009-12-28-The mathematics behind Hadoop-based systems

8 0.15917988 18 nathan marz storm-2010-06-16-Your company has a knowledge debt problem

9 0.14060614 37 nathan marz storm-2013-04-02-Principles of Software Engineering, Part 1

10 0.13958062 33 nathan marz storm-2012-02-06-Suffering-oriented programming

11 0.13493489 41 nathan marz storm-2014-05-10-Why we in tech must support Lawrence Lessig

12 0.13418223 31 nathan marz storm-2011-10-13-How to beat the CAP theorem

13 0.12979259 36 nathan marz storm-2013-04-01-My new startup

14 0.12559094 8 nathan marz storm-2010-03-08-Follow-up to "The mathematics behind Hadoop-based systems"

15 0.12419441 19 nathan marz storm-2010-07-12-My experience as the first employee of a Y Combinator startup

16 0.1077708 24 nathan marz storm-2010-11-03-The time I hacked my high school

17 0.10761005 25 nathan marz storm-2010-12-06-You Are a Product

18 0.10126086 3 nathan marz storm-2010-01-13-Mimi Silbert: the greatest hacker in the world

19 0.10072879 38 nathan marz storm-2013-04-12-Break into Silicon Valley with a blog

20 0.096843228 23 nathan marz storm-2010-10-27-Fastest Viable Product: Investing in Speed at a Startup