37 nathan marz storm-2013-04-02-Principles of Software Engineering, Part 1


Meta information for this blog post

Source: html

Introduction: This is the first in a series of posts on the principles of software engineering. There's far more to software engineering than just "making computers do stuff" – while that phrase is accurate, it does not come close to describing what's involved in making robust, reliable software. I will use my experience building large scale systems to inform a first principles approach to defining what it is we do – or should be doing – as software engineers. I'm not interested in tired debates like dynamic vs. static languages – instead, I intend to explore the really core aspects of software engineering. The first order of business is to define what software engineering even is in the first place. Software engineering is the construction of software that produces some desired output for some range of inputs. The inputs to software are more than just method parameters: they include the hardware on which it's running, the rate at which it receives data, and anything else that influences the oper


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 The inputs to software are more than just method parameters: they include the hardware on which it's running, the rate at which it receives data, and anything else that influences the operation of the software. [sent-8, score-0.701]

2 As this story illustrates, there's a lot of uncertainty in software engineering. [sent-40, score-0.718]

3 The most salient feature of software engineering is the degree to which uncertainty permeates every aspect of the construction of software, from designing it to implementing it to operating it in a production environment. [sent-45, score-0.998]

4 Learning from other fields of engineering: It's useful to look at other forms of engineering to learn more about software engineering. [sent-46, score-0.681]

5 Each failure lets the engineers understand the input ranges to the rocket a little better and better engineer the rocket to handle a greater and greater part of the input space. [sent-64, score-0.877]

6 Making software robust is an iterative process: you build and test it as best you can, but inevitably in production you'll discover new areas of the input space that lead to failure. [sent-68, score-0.735]

7 Over time, the uncertainty in the input space goes down, and software gets "hardened". [sent-70, score-0.852]

8 There's always going to be some part of the input space for which software fails – as an engineer you have to balance the probabilities and cost tradeoffs to determine where to draw that line. [sent-72, score-0.647]

9 For all of your dependencies, you had better understand the input ranges for which the dependencies operate within spec, and design your software accordingly. [sent-73, score-0.778]

10 Sources of uncertainty in software: There are many sources of uncertainty in software. [sent-74, score-0.999]

11 Because of this fact of software development, all software must be viewed as probabilistic. [sent-77, score-0.874]

12 Another source of uncertainty is the fact that humans are involved in running software in production. [sent-80, score-0.769]

13 Another source of uncertainty is what functionality your software should even have – very rarely are the specs fully understood and fleshed out from the get go. [sent-83, score-0.788]

14 Finally, another big source of uncertainty is not understanding the range of inputs your software will see in production. [sent-89, score-0.847]

15 This is by no means an exhaustive overview of sources of uncertainty in software, but it's clear that uncertainty permeates all of the software engineering process. [sent-91, score-1.121]

16 Engineering for uncertainty: You can do a much better job building robust software by being cognizant of the uncertain nature of software. [sent-92, score-0.791]

17 Minimize dependencies: One technique for making software more robust is to minimize what your software depends on – the fewer moving parts, the better. [sent-95, score-1.084]

18 As software hardens more and more, unexpected events will get more and more infrequent and reproducing those events will become harder and harder. [sent-122, score-0.682]

19 I consider the monitoring aspects of software just as important as the functionality of the software itself. [sent-125, score-1.014]

20 Conclusion: Software engineering is a constant battle against uncertainty – uncertainty about your specs, uncertainty about your implementation, uncertainty about your dependencies, and uncertainty about your inputs. [sent-133, score-1.246]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('software', 0.437), ('zookeeper', 0.324), ('uncertainty', 0.281), ('reporterror', 0.151), ('application', 0.147), ('failure', 0.137), ('dependencies', 0.137), ('method', 0.135), ('input', 0.134), ('inputs', 0.129), ('rocket', 0.129), ('worker', 0.129), ('failures', 0.123), ('engineering', 0.122), ('errors', 0.122), ('storm', 0.11), ('traffic', 0.108), ('components', 0.106), ('unexpected', 0.105), ('cascading', 0.098), ('location', 0.091), ('production', 0.091), ('workers', 0.091), ('bridge', 0.088), ('probability', 0.088), ('deterministic', 0.086), ('watches', 0.086), ('engineer', 0.076), ('robust', 0.073), ('information', 0.07), ('functionality', 0.07), ('events', 0.07), ('ranges', 0.07), ('monitoring', 0.07), ('correctly', 0.07), ('handle', 0.068), ('feature', 0.067), ('pull', 0.065), ('reported', 0.065), ('error', 0.061), ('correct', 0.061), ('minimizing', 0.061), ('bugs', 0.061), ('code', 0.053), ('push', 0.053), ('watch', 0.053), ('functional', 0.053), ('hit', 0.053), ('respect', 0.053), ('running', 0.051)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000008 37 nathan marz storm-2013-04-02-Principles of Software Engineering, Part 1

2 0.17339042 39 nathan marz storm-2014-02-12-Interview with "Programmer Magazine"

Introduction: I was recently interviewed for "Programmer Magazine", a Chinese magazine. The interview was published in Chinese, but a lot of people told me they'd like to see the English version of the interview. Due to the Google translation being, ahem, a little iffy, I decided to just publish the original English version on my blog. Hope you enjoy! What drew you to programming and what was the first interesting program you wrote? I started programming when I was 10 years old on my TI-82 graphing calculator. Initially I started programming because I wanted to make games on my calculator – and also because I was bored in math class :D. The first interesting game I made on my calculator was an archery game where you'd shoot arrows at moving targets. You'd get points for hitting more targets or completing all the targets faster. A couple years later I graduated to programming the TI-89 which was a huge upgrade in power. I remember how the TI-82 only let you have 26 variables (for the character

3 0.14453708 29 nathan marz storm-2011-01-19-Inglourious Software Patents

Introduction: Most articles arguing for the abolishment of software patents focus on how so many software patents don't meet the "non-obvious and non-trivial" guidelines for patents. The problem with this approach is that the same argument could be used to advocate for reform in how software patents are evaluated rather than the abolishment of software patents altogether. Software patents should be abolished though, and I'm going to show this with an economic analysis. We'll see that even non-obvious and non-trivial software patents should never be granted as they can only cause economic loss. Why do patents exist in the first place? The patent system exists to provide an incentive for innovation where that incentive would not have existed otherwise. Imagine you're an individual living in the 19th century. Let's say the patent system does not exist and you have an idea to make a radically better kind of sewing machine. If you invested the time to develop your idea into a working invention,

4 0.11653561 7 nathan marz storm-2010-03-04-Introducing "Nanny" - a really simple dependency management tool

Introduction: Dependency management in software projects is a pretty simple problem when you think about it. A tool to manage dependencies just needs to do three things: (1) provide a mechanism to specify the direct dependencies of a project; (2) download the transitive closure of dependencies of a project; (3) publish packages that can be used as a dependency by other projects. Some languages have good dependency management systems - for example, rubygems. Others, like Java, have tools like Maven which I would call a complex solution to a simple problem. You shouldn't need to buy a book to understand the solution to such a simple problem. Plus, these dependency management systems are all language specific. I've seen companies do crazy things to manage their dependencies. One company, to manage their jar files, would put all the jars that any project might need in a special "jars" project. You would then need to set up a JARS_HOME environment variable and be sure to update the jars project if you n

5 0.1132773 34 nathan marz storm-2012-09-19-Storm's 1st birthday

Introduction: Storm was open-sourced exactly one year ago today. It's been an action-packed year for Storm, to say the least. Here's some of the exciting stuff that's happened over the past year: 27 companies have publicized that they're using Storm in production. I know of at least a few more companies using it that haven't published anything yet. O'Reilly published a book on Storm. The Storm mailing list has over 1300 members, with over 500 messages per month. The @stormprocessor account has over 1200 followers. More than 4000 people have starred the project on GitHub. There's a regular Storm meetup in the Bay Area with over 230 members. I've also seen lots of Storm-focused meetups happen all over the world over the past year. 29 people all over the world have contributed to the codebase. We released Trident, a high level abstraction for realtime computation, that is a major leap forward in what's possible in realtime. Libraries have been released integrating Stor

6 0.11195478 33 nathan marz storm-2012-02-06-Suffering-oriented programming

7 0.10929836 21 nathan marz storm-2010-08-20-5 Tips for Thinking Under Uncertainty

8 0.078183465 31 nathan marz storm-2011-10-13-How to beat the CAP theorem

9 0.060158972 1 nathan marz storm-2009-12-28-The mathematics behind Hadoop-based systems

10 0.060097814 18 nathan marz storm-2010-06-16-Your company has a knowledge debt problem

11 0.057137556 16 nathan marz storm-2010-05-08-News Feed in 38 lines of code using Cascalog

12 0.055897012 19 nathan marz storm-2010-07-12-My experience as the first employee of a Y Combinator startup

13 0.055630166 13 nathan marz storm-2010-04-14-Introducing Cascalog: a Clojure-based query language for Hadoop

14 0.054611579 2 nathan marz storm-2010-01-03-Tips for Optimizing Cascading Flows

15 0.05157226 22 nathan marz storm-2010-10-05-How to get a job at a kick-ass startup (for programmers)

16 0.049836859 38 nathan marz storm-2013-04-12-Break into Silicon Valley with a blog

17 0.049537025 9 nathan marz storm-2010-03-10-Thrift + Graphs = Strong, flexible schemas on Hadoop

18 0.047302205 20 nathan marz storm-2010-07-30-You should blog even if you have no readers

19 0.045940153 41 nathan marz storm-2014-05-10-Why we in tech must support Lawrence Lessig

20 0.044480026 23 nathan marz storm-2010-10-27-Fastest Viable Product: Investing in Speed at a Startup
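
Note on simValue: this page does not say how the similarity scores above are computed. One common choice for comparing tf-idf vectors is cosine similarity, and the sketch below assumes that choice. The first vector uses the top three weights from the tfidf list for this post; the second vector (`other_post`) and the function name `cosine_similarity` are made up purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight} dicts."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Top three weights taken from the tfidf list for this post.
this_post = {"software": 0.437, "zookeeper": 0.324, "uncertainty": 0.281}

# Hypothetical weights for another post; not taken from this page.
other_post = {"software": 0.30, "patents": 0.80}

print(cosine_similarity(this_post, this_post))
print(cosine_similarity(this_post, other_post))
```

Under this assumption, a post compared with itself scores 1 up to floating-point rounding, which would be consistent with the "same-blog" row, and posts that share few heavily weighted terms score close to 0.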


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.306), (1, 0.037), (2, 0.024), (3, 0.25), (4, -0.254), (5, 0.14), (6, 0.185), (7, -0.132), (8, -0.395), (9, 0.229), (10, -0.073), (11, -0.008), (12, 0.167), (13, 0.042), (14, 0.013), (15, 0.016), (16, 0.044), (17, 0.023), (18, 0.035), (19, -0.004), (20, -0.063), (21, 0.071), (22, -0.0), (23, -0.064), (24, -0.137), (25, 0.091), (26, -0.013), (27, 0.15), (28, 0.06), (29, -0.085), (30, 0.216), (31, 0.455), (32, -0.094), (33, 0.13), (34, -0.261), (35, -0.142), (36, 0.094), (37, -0.016), (38, 0.091), (39, 0.018), (40, -0.006)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99058366 37 nathan marz storm-2013-04-02-Principles of Software Engineering, Part 1

2 0.27177316 29 nathan marz storm-2011-01-19-Inglourious Software Patents

3 0.20868023 39 nathan marz storm-2014-02-12-Interview with "Programmer Magazine"

4 0.13823937 33 nathan marz storm-2012-02-06-Suffering-oriented programming

Introduction: Someone asked me an interesting question the other day: "How did you justify taking such a huge risk on building Storm while working on a startup?" (Storm is a realtime computation system). I can see how from an outsider's perspective investing in such a massive project seems extremely risky for a startup. From my perspective, though, building Storm wasn't risky at all. It was challenging, but not risky. I follow a style of development that greatly reduces the risk of big projects like Storm. I call this style "suffering-oriented programming." Suffering-oriented programming can be summarized like so: don't build technology unless you feel the pain of not having it. It applies to the big, architectural decisions as well as the smaller everyday programming decisions. Suffering-oriented programming greatly reduces risk by ensuring that you're always working on something important, and it ensures that you are well-versed in a problem space before attempting a large investment. I ha

5 0.1367234 7 nathan marz storm-2010-03-04-Introducing "Nanny" - a really simple dependency management tool

6 0.13098775 21 nathan marz storm-2010-08-20-5 Tips for Thinking Under Uncertainty

7 0.12912877 34 nathan marz storm-2012-09-19-Storm's 1st birthday

8 0.10928837 31 nathan marz storm-2011-10-13-How to beat the CAP theorem

9 0.0913091 19 nathan marz storm-2010-07-12-My experience as the first employee of a Y Combinator startup

10 0.090102583 2 nathan marz storm-2010-01-03-Tips for Optimizing Cascading Flows

11 0.090009034 18 nathan marz storm-2010-06-16-Your company has a knowledge debt problem

12 0.087828301 16 nathan marz storm-2010-05-08-News Feed in 38 lines of code using Cascalog

13 0.086680137 41 nathan marz storm-2014-05-10-Why we in tech must support Lawrence Lessig

14 0.08573205 9 nathan marz storm-2010-03-10-Thrift + Graphs = Strong, flexible schemas on Hadoop

15 0.085184693 1 nathan marz storm-2009-12-28-The mathematics behind Hadoop-based systems

16 0.079022996 22 nathan marz storm-2010-10-05-How to get a job at a kick-ass startup (for programmers)

17 0.076132566 3 nathan marz storm-2010-01-13-Mimi Silbert: the greatest hacker in the world

18 0.074968509 23 nathan marz storm-2010-10-27-Fastest Viable Product: Investing in Speed at a Startup

19 0.073645622 40 nathan marz storm-2014-02-24-The inexplicable rise of open floor plans in tech companies

20 0.069923826 38 nathan marz storm-2013-04-12-Break into Silicon Valley with a blog


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(0, 0.023), (5, 0.022), (10, 0.036), (21, 0.52), (26, 0.023), (33, 0.016), (41, 0.04), (48, 0.017), (59, 0.053), (61, 0.012), (68, 0.046), (76, 0.017), (82, 0.035), (88, 0.04), (99, 0.023)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98037171 37 nathan marz storm-2013-04-02-Principles of Software Engineering, Part 1

2 0.22122718 39 nathan marz storm-2014-02-12-Interview with "Programmer Magazine"

3 0.20217407 9 nathan marz storm-2010-03-10-Thrift + Graphs = Strong, flexible schemas on Hadoop

Introduction: There are a lot of misconceptions about what Hadoop is useful for and what kind of data you can put in it. A lot of people think that Hadoop is meant for unstructured data like log files. While Hadoop is great for log files, it's also fantastic for strongly typed, structured data. In this post I'll discuss how you can use a tool like Thrift to store strongly typed data in Hadoop while retaining the flexibility to evolve your schema. We'll look at graph-based schemas and see why they are an ideal fit for many Hadoop-based applications. OK, so what kind of "structured" data can you put in Hadoop? Anything! At BackType we put data about news, conversations, and people into Hadoop as structured objects. You can easily push structured information about social graphs, financial information, or anything you want into Hadoop. That sounds all well and good, but why not just use JSON as the data format? JSON doesn't give you a real schema and doesn't protect against data i

4 0.17567511 31 nathan marz storm-2011-10-13-How to beat the CAP theorem

Introduction: The CAP theorem states a database cannot guarantee consistency, availability, and partition-tolerance at the same time. But you can't sacrifice partition-tolerance (see here and here), so you must make a tradeoff between availability and consistency. Managing this tradeoff is a central focus of the NoSQL movement. Consistency means that after you do a successful write, future reads will always take that write into account. Availability means that you can always read and write to the system. During a partition, you can only have one of these properties. Systems that choose consistency over availability have to deal with some awkward issues. What do you do when the database isn't available? You can try buffering writes for later, but you risk losing those writes if you lose the machine with the buffer. Also, buffering writes can be a form of inconsistency because a client thinks a write has succeeded but the write isn't in the database yet. Alternatively, you can return errors ba

5 0.16089089 18 nathan marz storm-2010-06-16-Your company has a knowledge debt problem

Introduction: When your company lacks experience in tools and techniques that can make it more productive, your company has knowledge debt. Companies tend to operate in ways that exacerbate their knowledge debt problem. Consider this fairly typical job ad: Initech is seeking an experienced Software Engineer to join the engineering team. Responsibilities * Design core, back-end software components * Analyze and improve efficiency, scalability, and stability of various system resources Requirements * M.S. Computer Science or related field preferred * 2+ years of Java experience * Expert in relational data modeling and query optimization using MySQL I would posit a guess that this company uses Java for the majority of its work and uses MySQL on the back-end. Naturally, the company wants to recruit people who share that skill set and can "jump right in" and contribute. This mindset is fundamentally flawed. A company should be hiring for problem solving skills

6 0.14734286 19 nathan marz storm-2010-07-12-My experience as the first employee of a Y Combinator startup

7 0.14480036 33 nathan marz storm-2012-02-06-Suffering-oriented programming

8 0.1400663 7 nathan marz storm-2010-03-04-Introducing "Nanny" - a really simple dependency management tool

9 0.13374402 13 nathan marz storm-2010-04-14-Introducing Cascalog: a Clojure-based query language for Hadoop

10 0.13342606 29 nathan marz storm-2011-01-19-Inglourious Software Patents

11 0.11511033 23 nathan marz storm-2010-10-27-Fastest Viable Product: Investing in Speed at a Startup

12 0.1131241 38 nathan marz storm-2013-04-12-Break into Silicon Valley with a blog

13 0.11095551 22 nathan marz storm-2010-10-05-How to get a job at a kick-ass startup (for programmers)

14 0.10995395 12 nathan marz storm-2010-04-10-Fun with equality in Clojure

15 0.10972136 41 nathan marz storm-2014-05-10-Why we in tech must support Lawrence Lessig

16 0.10875528 40 nathan marz storm-2014-02-24-The inexplicable rise of open floor plans in tech companies

17 0.10462409 8 nathan marz storm-2010-03-08-Follow-up to "The mathematics behind Hadoop-based systems"

18 0.10249478 25 nathan marz storm-2010-12-06-You Are a Product

19 0.097615726 21 nathan marz storm-2010-08-20-5 Tips for Thinking Under Uncertainty

20 0.09656816 3 nathan marz storm-2010-01-13-Mimi Silbert: the greatest hacker in the world