Introduction: I'm not sure what I was expecting the stack GOV.UK used at launch to look like. Maybe some messenger owls and lots of cobwebs? Not at all. So much so that I thought any organization looking for ideas for its own stack could learn something from the considered choices of others. The diversity of technologies used was surprising. They use "at least five different programming languages, three separate database types, two versions of an operating system." Some may think of this as a weakness, but they consider it a strength: The reason we operate such a diverse ecosystem is that we are focused on solving real problems. Our first task is to understand the problem or need we are solving and then to choose the best tool for the job. If we restrict ourselves to moulding the need to the tools we already have, then we risk not solving the initial problem in the best way possible for the user. By restricting software diversity or enforcing rigid organisational standards on a project,
sentIndex sentText sentNum sentScore
1 I'm not sure what I was expecting the stack GOV.UK used at launch to look like. [sent-1, score-0.078]
2 So much not so I thought any organization looking at their own stack for ideas could learn something from the considered choices of others. [sent-5, score-0.086]
3 The diversity of technologies used was surprising. [sent-6, score-0.128]
4 They use "at least five different programming languages, three separate database types, two versions of an operating system." [sent-7, score-0.137]
5 Some may think of this as a weakness, but they think it a strength: The reason we operate such a diverse ecosystem is that we are focused on solving real problems. [sent-8, score-0.139]
6 Our first task is to understand the problem or need we are solving and then to choose the best tool for the job. [sent-9, score-0.139]
7 If we restrict ourselves to moulding the need to the tools we already have, then we risk not solving the initial problem in the best way possible for the user. [sent-10, score-0.229]
8 By restricting software diversity or enforcing rigid organisational standards on a project, there is a possibility of descending into a cargo cult, where we simply repeat the same patterns and mistakes in everything we make. [sent-11, score-0.5]
9 This "use the best tool no matter what" policy is outlined in a blog post, Benefits of diversity. [sent-12, score-0.223]
10 The only choice that wouldn't be found in a modern startup is the use of Skyscape as their cloud provider. [sent-13, score-0.216]
11 I'm assuming this has to do with legal issues around data sovereignty, as this is a government site, but otherwise it's all straight out of standard modern web practice: monitoring, dashboards, continuous release, polyglot persistence, distributed source code control, etc. [sent-14, score-0.498]
12 The core of the servers: We’re making use of Infrastructure As A Service from Skyscape. We use Akamai as our Content Delivery Network. Our servers are running Ubuntu GNU/Linux 10. [sent-18, score-0.137]
13 Servers are managed with Puppet, using PuppetDB. Web serving is handled by nginx, proxying to Unicorn for our Ruby applications. [sent-21, score-0.736]
14 One of the team wrote Unicorn Herder to make Unicorn play nicely with upstart. [sent-23, score-0.086]
15 js was used to build a side-by-side browser for reviewing the redirections. Applications: The majority of our applications are written in Ruby, based on either Ruby on Rails or Sinatra. [sent-25, score-0.361]
16 A few components are written in Scala and built on top of Play 2.0. [sent-26, score-0.089]
17 We’re running Mapit from MySociety, which is built on top of Django. Databases and other storage: We use MongoDB for most systems, with a few apps also making use of MySQL. [sent-27, score-0.234]
18 They’re very much our playground and you can find them written in a mixture of Ruby, Clojure, Node. [sent-32, score-0.177]
wordName wordTfidf (topN-words)
[('unicorn', 0.263), ('mapit', 0.243), ('ruby', 0.182), ('government', 0.152), ('solving', 0.139), ('use', 0.137), ('frontend', 0.13), ('diversity', 0.128), ('herder', 0.11), ('postgresqlis', 0.11), ('sovereignty', 0.11), ('nginx', 0.104), ('proxying', 0.104), ('restricting', 0.104), ('accessibility', 0.099), ('cult', 0.099), ('descending', 0.099), ('apps', 0.097), ('preparation', 0.095), ('validating', 0.095), ('outlined', 0.095), ('font', 0.095), ('jekyll', 0.095), ('deserves', 0.092), ('enforcing', 0.092), ('us', 0.09), ('reviewing', 0.09), ('restrict', 0.09), ('tracker', 0.09), ('written', 0.089), ('mixture', 0.088), ('messenger', 0.088), ('track', 0.087), ('play', 0.086), ('stack', 0.086), ('playback', 0.084), ('using', 0.083), ('elasticsearch', 0.082), ('internal', 0.082), ('legal', 0.08), ('jquery', 0.08), ('modern', 0.079), ('expecting', 0.078), ('clojure', 0.078), ('rigid', 0.077), ('occasionally', 0.077), ('polyglot', 0.077), ('strength', 0.076), ('alerting', 0.075), ('php', 0.075)]
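The page does not say how the word weights above were produced. As a rough illustration only, the sketch below computes a top-N list of (word, weight) pairs using a textbook TF-IDF weighting with L2 normalisation. The helper name `top_tfidf`, the whitespace tokenizer, the exact formula, and the three sample documents are all assumptions made for this example, not the pipeline that generated the list.

```python
import math
from collections import Counter


def top_tfidf(docs, doc_index, top_n=5):
    """Return the top_n (word, weight) pairs for docs[doc_index].

    Hypothetical helper: weight = term count * log(N / document frequency),
    then L2-normalised so weights are comparable across documents.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    tf = Counter(tokenized[doc_index])
    raw = {w: count * math.log(n_docs / df[w]) for w, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in raw.values())) or 1.0
    # Words that occur in every document get weight 0 and are dropped.
    scored = [(w, round(v / norm, 3)) for w, v in raw.items() if v > 0]
    # Sort by descending weight, then alphabetically for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:top_n]


# Made-up sample documents, loosely echoing terms from the post.
docs = [
    "nginx proxying to unicorn for ruby applications",
    "puppet manages servers",
    "ruby on rails applications",
]
print(top_tfidf(docs, 0, top_n=3))
```

Words that appear in only one document (such as "nginx" here) score higher than words shared across documents (such as "ruby"), which is consistent with rare terms like "unicorn" and "mapit" leading the list above.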
simIndex simValue blogId blogTitle
same-blog 1 0.99999982 1469 high scalability-2013-06-03-GOV.UK - Not Your Father's Stack
2 0.14646097 808 high scalability-2010-04-12-Poppen.de Architecture
Introduction: This is a guest post by Alvaro Videla describing their architecture for Poppen.de, a popular German dating site. This site is very much NSFW, so be careful before clicking on the link. What I found most interesting is how they manage to successfully blend a little of the old with a little of the new, using technologies like Nginx, MySQL, CouchDB, Erlang, Memcached, RabbitMQ, PHP, Graphite, Red5, and Tsung. What is Poppen.de? Poppen.de (NSFW) is the top dating website in Germany, and while it may be a small site compared to giants like Flickr or Facebook, we believe it's a nice architecture to learn from if you are starting to get some scaling problems. The Stats 2.000.000 users 20.000 concurrent users 300.000 private messages per day 250.000 logins per day We have a team of eleven developers, two designers and two sysadmins for this project. Business Model The site works with a freemium model, where users can do for free things like: Search
3 0.14266649 639 high scalability-2009-06-27-Scaling Twitter: Making Twitter 10000 Percent Faster
Introduction: Update 6: Some interesting changes from Twitter's Evan Weaver : everything in RAM now, database is a backup; peaks at 300 tweets/second; every tweet followed by average 126 people; vector cache of tweet IDs; row cache; fragment cache; page cache; keep separate caches; GC makes Ruby optimization resistant so went with Scala; Thrift and HTTP are used internally; 100s internal requests for every external request; rewrote MQ but kept interface the same; 3 queues are used to load balance requests; extensive A/B testing for backwards capability; switched to C memcached client for speed; optimize critical path; faster to get the cached results from the network memory than recompute them locally. Update 5: Twitter on Scala . A Conversation with Steve Jenson, Alex Payne, and Robey Pointer by Bill Venners. A fascinating discussion of why Twitter moved to the Java JVM for their server infrastructure (long lived processes) and why they moved to Scala to program against it (high level langu
Introduction: Who's Hiring? Apple is hiring for multiple positions. Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly. Sr. Software Engineer . The Identity Management Services team at Apple is in search of a motivated Senior Software Engineer who is self-driven and has a proven track record in design and development of complex, highly available and scalable systems. Please apply here . SQE and Operations Manager, iOS Systems . The iOS Systems team is looking for an experienced hands-on manager to lead the Quality Engineering, Build and Release Engineering team. Please apply here . Senior Engineer: Emerging Technology . Apple’s Emerging Technology group is looking for a senior engineer passionate about exploring emerging technologies to create paradigm shifting cloud based solutions. Please apply here . Senior Storage Engineer . Software Engineering Operations (
Introduction: Who's Hiring? Apple is hiring a Senior Engineer in their Mobile Services team. We seek an accomplished server-side engineer capable of delivering an extraordinary portfolio of features and services based on emerging technologies to our internal customers. Please apply here . Apple is hiring a Software Engineer in their Messaging Services team. We build the cloud systems that power some of the busiest applications in the world, including iMessage, FaceTime and Apple Push Notifications. You'll have the opportunity to explore a wide range of technologies, developing the server software that is driving the future of messaging and mobile services. Please apply here . Apple is hiring an Enterprise Software Engineer. Apple's Emerging Technology Services group provides a Java based SOA platform for various applications to interact with each other. The platform is designed to handle millions of messages a day with very low latency. We have an immediate opening for a
6 0.13085762 556 high scalability-2009-04-05-At Some Point the Cost of Servers Outweighs the Cost of Programmers
7 0.13076822 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
9 0.12700632 417 high scalability-2008-10-15-Outside.in Scales Up with Engine Yard and moving from PHP to Ruby on Rails
11 0.12621966 33 high scalability-2007-07-26-ThemBid Architecture
topicId topicWeight
[(0, 0.238), (1, 0.018), (2, -0.019), (3, -0.05), (4, 0.034), (5, -0.068), (6, 0.008), (7, 0.005), (8, 0.053), (9, 0.037), (10, -0.018), (11, 0.017), (12, 0.018), (13, -0.049), (14, -0.027), (15, -0.092), (16, 0.028), (17, -0.039), (18, -0.001), (19, -0.062), (20, 0.009), (21, -0.049), (22, -0.015), (23, 0.002), (24, -0.029), (25, 0.052), (26, -0.025), (27, -0.018), (28, -0.072), (29, -0.005), (30, -0.06), (31, 0.033), (32, -0.063), (33, 0.001), (34, -0.039), (35, -0.067), (36, -0.024), (37, -0.009), (38, -0.046), (39, 0.054), (40, -0.029), (41, -0.075), (42, 0.017), (43, 0.013), (44, -0.007), (45, -0.051), (46, -0.023), (47, -0.026), (48, 0.016), (49, -0.017)]
simIndex simValue blogId blogTitle
same-blog 1 0.95769399 1469 high scalability-2013-06-03-GOV.UK - Not Your Father's Stack
2 0.76236719 970 high scalability-2011-01-06-BankSimple Mini-Architecture - Using a Next Generation Toolchain
Introduction: I know people are always interested in what others are using to build their systems. Alex Payne , CTO of the new startup BankSimple , gives us a quick hit on their toolchain choices in this Quora thread . BankSimple positions itself as a customer-focused alternative to online banking. You may remember Alex from the early days of Twitter . Alex was always helpful to me on Twitter's programmer support list, so I really wish them well. Alex is also a bit of an outside the box thinker, which is reflected in some of their choices: The JVM acts as a convergence platform for these languages: Scala - ideal for writing performance-sensitive components that need the safety and expressiveness of the language's advanced type system. Clojure - rapidly prototype in a more dynamic language while still offering the benefits of functional programming. JRuby - makes available a bunch of great libraries and frameworks for doing frontend web development, like Rails and Pa
3 0.76028472 255 high scalability-2008-02-21-Product: Capistrano - Automate Remote Tasks Via SSH
Introduction: Update: Deployment with Capistrano by Charles Max Wood. Nice simple step-by-step for using Capistrano for deployment. From their website: Simply put, Capistrano is a tool for automating tasks on one or more remote servers. It executes commands in parallel on all targeted machines, and provides a mechanism for rolling back changes across multiple machines. It is ideal for anyone doing any kind of system administration, either professionally or incidentally. * Great for automating tasks via SSH on remote servers, like software installation, application deployment, configuration management, ad hoc server monitoring, and more. * Ideal for system administrators, whether professional or incidental. * Easy to customize. Its configuration files use the Ruby programming language syntax, but you don't need to know Ruby to do most things with Capistrano. * Easy to extend. Capistrano is written in the Ruby programming language, and may be extended easily by writing additional Ruby mod
4 0.72644329 429 high scalability-2008-10-25-Product: Puppet the Automated Administration System
Introduction: Update: Digg on their choice and use of Puppet . They chose puppet over cfengine, and bcfg2 because they liked Puppet's resource abstraction layer (RAL), the ability to implement configuration management incrementally, support for bundles, and the overall design philosophy. Puppet implements a declarative (what not how) configuration language for automating common administration tasks. It's the system every large site writes for themselves and it's already made for you! Ilike was able to "easily" scale from 0 to hundreds of servers using Puppet. I can't believe I've never seen this before. It looks really cool. What is Puppet and how can it help you scale your website operations? From the Puppet website: Puppet has been developed to help the sysadmin community move to building and sharing mature tools that avoid the duplication of everyone solving the same problem. It does so in two ways: * It provides a powerful framework to simplify the majority of the technical tasks t
5 0.72441918 307 high scalability-2008-04-21-Using Google AppEngine for a Little Micro-Scalability
Introduction: Over the years I've accumulated quite a rag tag collection of personal systems scattered wide across a galaxy of different servers. For the past month I've been on a quest to rationalize this conglomeration by moving everything to a managed service of one kind or another. The goal: lift a load of worry from my mind. I like to do my own stuff myself so I learn something and have control. Control always comes with headaches and it was time for a little aspirin. As part of the process GAE came in handy as a host for a few Twitter related scripts I couldn't manage to run anywhere else. I recoded my simple little scripts into Python/GAE and learned a lot in the process. In the move I exported HighScalability from a VPS and imported it into a shared hosting service. I could never quite configure Apache and MySQL well enough that they wouldn't spike memory periodically and crash the VPS. And since it did not automatically restart after a memory crash, that was unacceptable. I also wrote a scrip
7 0.71834636 1068 high scalability-2011-06-27-TripAdvisor Architecture - 40M Visitors, 200M Dynamic Page Views, 30TB Data
8 0.71761441 556 high scalability-2009-04-05-At Some Point the Cost of Servers Outweighs the Cost of Programmers
9 0.71293008 1124 high scalability-2011-09-26-17 Techniques Used to Scale Turntable.fm and Labmeeting to Millions of Users
10 0.71118575 218 high scalability-2008-01-17-Moving old to new. Do not be afraid of the re-write -- but take some help
11 0.71026945 807 high scalability-2010-04-09-Vagrant - Build and Deploy Virtualized Development Environments Using Ruby
13 0.70154417 461 high scalability-2008-12-05-Sprinkle - Provisioning Tool to Build Remote Servers
14 0.69396728 808 high scalability-2010-04-12-Poppen.de Architecture
15 0.68602729 33 high scalability-2007-07-26-ThemBid Architecture
16 0.68186092 993 high scalability-2011-02-22-Is Node.js Becoming a Part of the Stack? SimpleGeo Says Yes.
17 0.67748773 1427 high scalability-2013-03-20-Dart - Is it the Future of the Web?
18 0.67556643 1520 high scalability-2013-09-20-Stuff The Internet Says On Scalability For September 20, 2013
19 0.67474538 1423 high scalability-2013-03-13-Iron.io Moved From Ruby to Go: 28 Servers Cut and Colossal Clusterf**ks Prevented
20 0.67358613 1645 high scalability-2014-05-09-Stuff The Internet Says On Scalability For May 9th, 2014
topicId topicWeight
[(1, 0.198), (2, 0.214), (10, 0.046), (30, 0.042), (56, 0.023), (61, 0.048), (63, 0.011), (64, 0.139), (73, 0.016), (77, 0.042), (79, 0.11), (85, 0.022), (94, 0.03)]
simIndex simValue blogId blogTitle
same-blog 1 0.94354594 1469 high scalability-2013-06-03-GOV.UK - Not Your Father's Stack
2 0.94239628 1476 high scalability-2013-06-14-Stuff The Internet Says On Scalability For June 14, 2013
Introduction: Hey, it's HighScalability time: ( Steve Gibson on Security Now with a plausible analysis of the tech behind PRISM) 27 billion: WhatsApp messages per day Quotable Quotes: Richard Feinman : If Bill Gates walks into a bar, on average, everybody in the bar is a millionaire. @giltene : Financial Programmers get paid by the CPU cycle. Web developers get paid by the developer cycle. @johndmitchell : “It’s the I/O, stupid.” @PatrickMcFadin : More people registering at #cassandra13 No worries. Adding more nodes at the reg desk. Google does it with science. Here's a list of Excellent Papers for 2012 from Googlers and friends. Most relevant for HS readers is a wildly inspiring Spanner: Google's Globally-Distributed Database . But you'll also see the influence of extracting knowledge from data to do subtle and interesting things. On that theme is Improving Photo Search: A Step Across the Semantic Gap . Google is doing
3 0.93828952 914 high scalability-2010-10-04-Paper: An Analysis of Linux Scalability to Many Cores
Introduction: An Analysis of Linux Scalability to Many Cores , by a number of MIT researchers, is a refreshingly practical paper on what it takes to scale Linux and common applications like Exim, memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce to run on 48 core systems. A very timely paper given moderately massive multicore systems are reportedly the near future of computing. This paper must have taken a lot of work. They both tracked down bottlenecks in a number of applications and the Linux kernel and they also tried to fix them. Modestly speaking the authors said they made "modest" changes to the kernel and applications, but there's nothing modest about what they did. It's excellent work. After the next bit, which is the abstract, there is a list of the problems they found and how they fixed them. The abstract: This paper analyzes the scalability of seven system applications (Exim, memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce) running on Linux on a 48-core
4 0.92556041 328 high scalability-2008-05-27-Scalable virus scanning for web-applications
Introduction: Hi, We're looking for a highly scalable way of scanning documents being uploaded and downloaded from our web application. I believe services like gmail and hotmail are using bespoke solutions from companies like Trend, but are there some quality "off the shelf" products out there that can easily be scaled out and have a "loose" API (HTTP based) for application integration? Once again, thanks for any input.
5 0.92488754 1076 high scalability-2011-07-08-Stuff The Internet Says On Scalability For July 8, 2011
Introduction: Submitted for your scaling pleasure: Facebook confirms 750 million users, sharing 4 billion items daily; Yahoo: 42,000 Hadoop nodes storing 180-200 petabytes; Formspring hits 25 million users. Zynga's Cadir Lee: It’s not the amount of hardware that matters. It’s the architecture of the application. You have to work at making your app architecture so that it takes advantage of Amazon. You have to have complete fluidity with the storage tier, the web tier. We are running our own data centers. We are looking more at doing our own data centers with more of a private cloud. Love the sense-making described by Hunch’s Infographic on their Taste Graph. 500 million people, 200 million items, 30 billion edges. 48 processors. 1 TB RAM. Is MongoDB the New MySQL? Stephen O'Grady thinks so using a worse is better argument: wide adoption by applications, enterprise inroads, simple feature set, and the number of complainers. Who plays PostgreSQL in this movie? Java is the
6 0.91748273 165 high scalability-2007-11-26-Scale to China
7 0.90606225 1011 high scalability-2011-03-25-Did the Microsoft Stack Kill MySpace?
8 0.90290457 511 high scalability-2009-02-12-MySpace Architecture
9 0.90264612 233 high scalability-2008-01-30-How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data
10 0.90247387 987 high scalability-2011-02-10-Dispelling the New SSL Myth
11 0.90244961 1537 high scalability-2013-10-25-Stuff The Internet Says On Scalability For October 25th, 2013
12 0.90244454 1386 high scalability-2013-01-14-MongoDB and GridFS for Inter and Intra Datacenter Data Replication
13 0.90142012 1090 high scalability-2011-08-01-Peecho Architecture - scalability on a shoestring
14 0.90135849 126 high scalability-2007-10-20-Should you build your next website using 3tera's grid OS?
16 0.90119702 1096 high scalability-2011-08-10-LevelDB - Fast and Lightweight Key-Value Database From the Authors of MapReduce and BigTable
17 0.90107906 904 high scalability-2010-09-21-Playfish's Social Gaming Architecture - 50 Million Monthly Users and Growing
18 0.90094882 881 high scalability-2010-08-16-Scaling an AWS infrastructure - Tools and Patterns
19 0.90016121 467 high scalability-2008-12-16-[ANN] New Open Source Cache System