high_scalability high_scalability-2013 high_scalability-2013-1531 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: It’s hardly news to anyone that product development and testing involve a lot of boring routine work, which can lead to human error. To avoid the complications this causes, we use AIDA. AIDA (Automated Interactive Deploy Assistant) is a utility that automatically performs many of the processes in Git, TeamCity and JIRA. In this post, we focus on how AIDA allowed us to automate multiple workflows and set up continuous integration. We’ll start by looking at the version control system (VCS) we use here at Badoo, specifically how Git is used to automate the creation of release branches and their subsequent merging. Then we’ll discuss AIDA’s major contributions to JIRA and TeamCity integration. Git flow The Badoo Team uses Git as a version control system. Our model ensures each task is developed and tested in a separate branch. The branch name consists of the ticket number in JIRA and a description of the problem. BFG-9000_All_developers_should
sentIndex sentText sentNum sentScore
1 The branch name consists of the ticket number in JIRA and a description of the problem. [sent-9, score-0.627]
2 BFG-9000_All_developers_should_be_given_a_years_holiday_(paid) A release is built and tested in its own branch, which is then merged with the branches for completed issues. [sent-10, score-0.719]
3 We deploy code to production servers twice a day, so two release branches are created daily. [sent-11, score-0.705]
4 Names of release branches are simple: build_{name of the component}_{release date}_{time} This structure means the team immediately knows the date and time of release from the branch name. [sent-12, score-1.48]
5 The hooks that prevent changes being made to a release branch use the same time-stamp. [sent-13, score-0.948]
6 For example, developers are prevented from adding a task to a release branch during the final two hours before it is deployed to production servers. [sent-14, score-1.275]
7 If a task in the release contains an error we remove its branch from the release branch with Git rebase. [sent-35, score-2.003]
8 If we removed a task from the release branch with Git revert, after the release was merged into the master, the developer of the problematic task would have to revert the commit in order to get his or her changes back. [sent-38, score-1.949]
9 AIDA tracks changes in the master branch, and once the previous release branch is merged into the master, a new release branch is created. [sent-41, score-2.056]
10 Automatic generation of a new release - Every minute, JIRA tasks that have been resolved and tested are merged into a release branch (with the exception of tasks specifically marked in JIRA flow). [sent-42, score-1.617]
11 Release automatically kept up to date with master - Since the master branch is a copy of the production code, and developers add hot fixes to it via the special tool Deploy Dashboard, the master branch needs to be continuously merged into the release branch. [sent-44, score-2.357]
12 If the developer adds a change to the task branch after it has been merged into the release branch, AIDA will catch this and report it. [sent-47, score-0.858]
13 Applying a patch to the master branch and release branch takes place in semi-automatic mode. [sent-49, score-1.583]
14 Then the release engineer checks and applies it to the master branch in the central repository. [sent-53, score-1.038]
15 If the ticket tester creates a Shot (code deploy into a single production environment), the task status is automatically changed to ‘In Shot’. [sent-61, score-0.71]
16 The ticket is reopened automatically when the task is rolled back from the release branch. [sent-62, score-0.764]
17 If changes to the task branch happen after the task has been resolved, the issue is returned to review mode. [sent-63, score-1.113]
18 When a task branch is pushed to the central repository for the first time, the branch name is registered in the corresponding JIRA ticket. [sent-64, score-1.38]
19 Continuous integration Earlier, we wanted to get rid of routine activities related to the assembly and automatic deployment to a test environment, but were stuck with manually assigning new names to the branches of each release in the project’s CI-server. [sent-69, score-0.806]
20 If the tests don't pass, the release version is marked as bad and is rolled back to the previous (good) version of the release. [sent-80, score-0.619]
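The naming rules and the time-based hook described in sentences 1 and 4 to 6 above can be sketched in Python. This is only an illustration under stated assumptions, not AIDA's actual code: the source gives the release pattern build_{name of the component}_{release date}_{time} but not the date and time encodings, so the YYYYMMDD and HHMM formats, the component name "web", and all function names below are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed encodings (not stated in the source): date as YYYYMMDD, time as HHMM.
RELEASE_RE = re.compile(
    r"^build_(?P<component>.+)_(?P<date>\d{8})_(?P<time>\d{4})$"
)
# Task branch: JIRA ticket number, underscore, free-form description.
TASK_RE = re.compile(r"^(?P<ticket>[A-Z][A-Z0-9]*-\d+)_(?P<description>.+)$")

# The text says tasks cannot be added two hours before deploy.
FREEZE_WINDOW = timedelta(hours=2)


def parse_release_branch(name):
    """Return (component, deploy time) encoded in a release branch name."""
    match = RELEASE_RE.match(name)
    if match is None:
        raise ValueError(f"not a release branch: {name!r}")
    deploy_at = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M")
    return match["component"], deploy_at


def parse_task_branch(name):
    """Return (JIRA ticket, description) encoded in a task branch name."""
    match = TASK_RE.match(name)
    if match is None:
        raise ValueError(f"not a task branch: {name!r}")
    return match["ticket"], match["description"]


def can_add_task(release_branch, now):
    """True while the release branch is still outside its freeze window."""
    _, deploy_at = parse_release_branch(release_branch)
    return now < deploy_at - FREEZE_WINDOW
```

A server-side Git hook could call something like can_add_task with the current time and reject the push when it returns False. Because the deploy time is read from the branch name itself, the check needs no separate schedule.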
wordName wordTfidf (topN-words)
[('branch', 0.545), ('aida', 0.42), ('release', 0.349), ('task', 0.215), ('jira', 0.202), ('branches', 0.19), ('teamcity', 0.161), ('tests', 0.119), ('merged', 0.108), ('master', 0.106), ('automatic', 0.104), ('git', 0.1), ('production', 0.084), ('automatically', 0.083), ('deploy', 0.082), ('ticket', 0.082), ('routine', 0.073), ('tested', 0.072), ('fixes', 0.07), ('revert', 0.07), ('checked', 0.067), ('status', 0.065), ('badoo', 0.065), ('reviewer', 0.065), ('tester', 0.065), ('environment', 0.063), ('spreadsheet', 0.059), ('tasks', 0.055), ('report', 0.054), ('changes', 0.054), ('special', 0.048), ('review', 0.048), ('test', 0.048), ('hot', 0.048), ('date', 0.047), ('developer', 0.044), ('marked', 0.042), ('patches', 0.042), ('resolved', 0.042), ('assembly', 0.042), ('shot', 0.04), ('central', 0.038), ('patch', 0.038), ('repository', 0.037), ('version', 0.037), ('conflict', 0.037), ('issue', 0.036), ('workflow', 0.036), ('rolled', 0.035), ('creates', 0.034)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 1531 high scalability-2013-10-13-AIDA: Badoo’s journey into Continuous Integration
2 0.11370042 173 high scalability-2007-12-05-Easier Production Releases
Introduction: I’ve been a part of some late night release procedures and they’re never fun. You’ve got QA, Dev, IT and a handful of managers sitting in their jammies in a group IM (or worse, a conference call) from 2:00 AM until way too early in the morning. Everyone’s grumpy and sleepy, causing the release to be more difficult and take longer. Sometimes the dreaded “rollback!” is yelled. All this because you’re running a high profile website that needs to be accessible 24/7, and 2:00 AM - 5:00 AM downtime is better than daytime downtime. If you're a site that doesn't have 10s of thousands to drop on a real http load balancer, use this strategy to release software during business hours with no downtime using apache's mod_proxy_balancer....
3 0.11026911 378 high scalability-2008-09-03-Some Facebook Secrets to Better Operations
Introduction: Kim Nash in an interview with Jonathan Heiliger , Facebook VP of technical operations, provides some juicy details on how Facebook handles operations. Operations is one of those departments everyone runs differently as it is usually an ontogeny recapitulates phylogeny situation. With 2,000 databases, 25 terabytes of cache, 90 million active users, and 10,000 servers you know Facebook has some serious operational issues. What are some of Facebook's secrets to better operations? Frequent Releases . A major release once a week and a minor releases every few days. Create a Cyber Liability Group . At one time operations was distributed amongst several groups. A permanent operations group was created to isolate problems and revert problem software components back to previously known good states. The ability of a separate team to handle rollbacks speaks to a great deal of standardization and advanced tool building. Distribute Team Across Time Zones . Split the operations team ac
Introduction: At a Cloud Computing Meetup , Siddharth "Sid" Anand of Netflix, backed by a merry band of Netflixians, gave an interesting talk: Keeping Movies Running Amid Thunderstorms . While the talk gave a good overview of their move to the cloud, issues with capacity planning, thundering herds , latency problems, and simian armageddon , I found myself most taken with how they handle software deployment in the cloud . I've worked on half a dozen or more build and deployment systems, some small, some quite large, but never for a large organization like Netflix in the cloud. The cloud has this amazing capability that has never existed before that enables a novel approach to fault-tolerant software deployments: the ability to spin up huge numbers of instances to completely run a new release while running the old release at the same time . The process goes something like: A canary machine is launched first with the new software load running real traffic to sanity test the load in a p
Introduction: This a guest post by Rajkumar Iyer , a Member of Technical Staff at Aerospike. About a year ago, Aerospike embarked upon a quest to increase in-memory database performance - 1 Million TPS on a single inexpensive commodity server. NoSQL has the reputation of speed, and we saw great benefit from improving latency and throughput of cacheless architectures. At that time, we took a version of Aerospike delivering about 200K TPS, improved a few things - performance went to 500k TPS - and published the Aerospike 2.0 Community Edition. We then used kernel tuning techniques and published the recipe for how we achieved 1 M TPS on $5k of hardware. This year we continued the quest. Our goal was to achieve 1 Million database transactions per second per server; more than doubling previous performance. This compares to Cassandra’s boast of 1M TPS on over 300 servers in Google Compute Engine - at a cost of $2 million dollars per year. We achieved this without kernel tuning. This article d
6 0.099933796 539 high scalability-2009-03-16-Books: Web 2.0 Architectures and Cloud Application Architectures
7 0.09834294 397 high scalability-2008-09-28-Product: Happy = Hadoop + Python
8 0.082127891 1628 high scalability-2014-04-08-Microservices - Not a free lunch!
9 0.079539597 1440 high scalability-2013-04-15-Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years
10 0.07845819 796 high scalability-2010-03-16-Justin.tv's Live Video Broadcasting Architecture
11 0.078284636 1209 high scalability-2012-03-14-The Azure Outage: Time Is a SPOF, Leap Day Doubly So
12 0.078163885 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture
14 0.071775444 1158 high scalability-2011-12-16-Stuff The Internet Says On Scalability For December 16, 2011
15 0.071445435 1387 high scalability-2013-01-15-More Numbers Every Awesome Programmer Must Know
16 0.069980815 255 high scalability-2008-02-21-Product: Capistrano - Automate Remote Tasks Via SSH
19 0.066192798 1068 high scalability-2011-06-27-TripAdvisor Architecture - 40M Visitors, 200M Dynamic Page Views, 30TB Data
20 0.065467812 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT
topicId topicWeight
[(0, 0.107), (1, 0.016), (2, -0.003), (3, -0.013), (4, 0.035), (5, -0.022), (6, 0.041), (7, -0.005), (8, -0.016), (9, -0.032), (10, -0.007), (11, 0.033), (12, 0.033), (13, -0.057), (14, 0.017), (15, -0.026), (16, 0.002), (17, -0.009), (18, -0.011), (19, 0.018), (20, 0.008), (21, -0.026), (22, 0.025), (23, 0.011), (24, -0.007), (25, 0.034), (26, -0.018), (27, 0.011), (28, -0.002), (29, 0.016), (30, -0.04), (31, 0.006), (32, -0.045), (33, 0.036), (34, 0.022), (35, -0.003), (36, -0.018), (37, -0.009), (38, 0.031), (39, 0.031), (40, -0.058), (41, -0.001), (42, 0.007), (43, 0.007), (44, -0.021), (45, 0.01), (46, -0.001), (47, 0.004), (48, 0.027), (49, 0.045)]
simIndex simValue blogId blogTitle
same-blog 1 0.9762544 1531 high scalability-2013-10-13-AIDA: Badoo’s journey into Continuous Integration
Introduction: This is guest post by Michael DeHaan (@laserllama), a software developer and architect, on Ansible , a simple deployment, model-driven configuration management, and command execution framework. I owe High Scalability a great deal of credit for the idea behind my latest software project. I was reading about how an older tool I helped create, Func, was used at Tumblr , and it kicked some ideas into gear. This article is about what happened from that idea. My observation, which the article reinforced, was that many shops end up using a configuration management tool (Puppet, Chef, cfengine), a separate deployment tool (Capistrano, Fabric) and yet another separate ad-hoc task execution tool (Func, pssh, etc) because one class of tool historically hasn't been good at all three jobs. My other observation (not from the article) was that the whole "infrastructure as code" movement, while revolutionary, and definitely great for many, was probably secretly grating on a good number of
3 0.75568688 295 high scalability-2008-04-02-Product: Supervisor - Monitor and Control Your Processes
Introduction: It's a sad fact of life, but processes die. I know, it's horrible. You start them, send them out into process space, and hope for the best. Yet sometimes, despite your best coding, they core dump, seg fault, or some other calamity befalls them. Unlike our messy biological world so cruelly ruled by entropy, in the digital world processes can be given another chance. They can be restarted. A greater destiny awaits. And hopefully this time the random lottery of unforeseen killing factors will be avoided and a long productive life will be had by all. This is fun code to write because it's a lot more complicated than you might think. And restarting processes is a highly effective high availability strategy. Most faults are transient, caused by an unexpected series of events. Rather than taking drastic action, like taking a node out of production or failing over, transients can be effectively masked by simply restarting failed processes. Though complexity makes it a fun problem, it's also
4 0.72603041 429 high scalability-2008-10-25-Product: Puppet the Automated Administration System
Introduction: Update: Digg on their choice and use of Puppet . They chose puppet over cfengine, and bcfg2 because they liked Puppet's resource abstraction layer (RAL), the ability to implement configuration management incrementally, support for bundles, and the overall design philosophy. Puppet implements a declarative (what not how) configuration language for automating common administration tasks. It's the system every large site writes for themselves and it's already made for you! Ilike was able to "easily" scale from 0 to hundreds of servers using Puppet. I can't believe I've never seen this before. It looks really cool. What is Puppet and how can it help you scale your website operations? From the Puppet website: Puppet has been developed to help the sysadmin community move to building and sharing mature tools that avoid the duplication of everyone solving the same problem. It does so in two ways: * It provides a powerful framework to simplify the majority of the technical tasks t
5 0.71988171 461 high scalability-2008-12-05-Sprinkle - Provisioning Tool to Build Remote Servers
Introduction: At 37 Signals Joshua Sierles describes how 37 Signals uses Sprinkle to configure their servers within EC2. Sprinkle defines a domain specific meta-language for describing and processing the installation of software . You can find an interesting discussion of Sprinkle's creation story by the creator himself, Marcus Crafter, in Sprinkle Some Powder! . Marcus divides provisioning tools into two categories: Task Based - the tool issues a list of commands to run on the remote system, either remotely via a network connection or smart client. Policy/state Based - the tool determines what needs to be run on the remote system by examining its current and final state. Sprinkle combines both models together in a chocolate-in-my-peanut-butter approach using normal Ruby code as the DSL (domain specific language) to declaratively describe remote system configurations. 37 Signals likes the use of Ruby as the DSL because it makes learning a separate syntax unnecessary. I've successfu
6 0.71982831 807 high scalability-2010-04-09-Vagrant - Build and Deploy Virtualized Development Environments Using Ruby
7 0.70804101 385 high scalability-2008-09-16-Product: Func - Fedora Unified Network Controller
8 0.70168525 1408 high scalability-2013-02-19-Puppet monitoring: how to monitor the success or failure of Puppet runs
9 0.69871444 255 high scalability-2008-02-21-Product: Capistrano - Automate Remote Tasks Via SSH
10 0.68679255 208 high scalability-2008-01-11-FTP Sanity: Redundancy, archiving, consolidation.
11 0.6857149 245 high scalability-2008-02-12-Product: rPath - Creating and Managing Virtual Appliances
12 0.68387514 1209 high scalability-2012-03-14-The Azure Outage: Time Is a SPOF, Leap Day Doubly So
13 0.67237324 228 high scalability-2008-01-28-Product: ISPMan Centralized ISP Management System
15 0.66549087 1068 high scalability-2011-06-27-TripAdvisor Architecture - 40M Visitors, 200M Dynamic Page Views, 30TB Data
16 0.66442496 1269 high scalability-2012-06-20-iDoneThis - Scaling an Email-based App from Scratch
17 0.65843266 1335 high scalability-2012-10-08-How UltraDNS Handles Hundreds of Thousands of Zones and Tens of Millions of Records
18 0.65734106 1124 high scalability-2011-09-26-17 Techniques Used to Scale Turntable.fm and Labmeeting to Millions of Users
19 0.65115607 1155 high scalability-2011-12-12-Netflix: Developing, Deploying, and Supporting Software According to the Way of the Cloud
20 0.65076959 263 high scalability-2008-02-27-Product: System Imager - Automate Deployment and Installs
topicId topicWeight
[(1, 0.137), (2, 0.163), (10, 0.066), (20, 0.014), (30, 0.019), (47, 0.011), (61, 0.047), (73, 0.013), (77, 0.273), (79, 0.076), (85, 0.038), (94, 0.027)]
simIndex simValue blogId blogTitle
Introduction: James Hamilton in Counting Servers is Hard has an awesome breakdown of what one million plus servers really means in terms of resource usage. The summary from his calculations are eye popping: Facilities: 15 to 30 large datacenters Capital expense: $4.25 Billion Total power: 300MW Power Consumption: 2.6TWh annually The power consumption is about the same as used by Nicaragua and the capital cost is about a third of what Americans spent on video games in 2012. Now that's web scale.
2 0.92136997 1195 high scalability-2012-02-17-Stuff The Internet Says On Scalability For February 17, 2012
Introduction: HighScalability Tested, Cyborg Approved: Google's DNS : 70 billion requests a day; Superexponentia l: the rate of tech progress; Akka : 48 cores / 20 million messages a second; 1 minute : intervals for health; Santa Tracker : 1.6 Million Requests per Second; 70x : MySQL cluster performance improvement Quotable Quotes @joeweinman : Lightbody at #ccevent : "Rule #1: architect with price structure in mind Techdirt : Nothing Scales Like Stupidity @brainvat : The IRS of Spain has a columnar database with 100,000 columns and over a trillion rows Zynga is now 20% Amazon and 80% their own cloud , reversing their previous approach of launching in Amazon for the spiky growth phase of a product and then folding it back in when growth rates stabilized. Follow the money... Cost comparisons are like benchmarks, the only sure thing is that nothing is sure, but we usually still learning something anyway. Depending the load profile, bandwidth cost, C
Introduction: Successful software design is all about trade-offs. In the typical (if there is such a thing) distributed system, recognizing the importance of trade-offs within the design of your architecture is integral to the success of your system. Despite this reality, I see time and time again, developers choosing a particular solution based on an ill-placed belief in their solution as a “silver bullet”, or a solution that conquers all, despite the inevitable occurrence of changing requirements. Regardless of the reasons behind this phenomenon, I’d like to outline a few of the methods I use to ensure that I’m making good scalable decisions without losing sight of the trade-offs that accompany them. I’d also like to compile (pun intended) the issues at hand, by formulating a simple theorem that we can use to describe this oft occurring situation.
same-blog 4 0.91908509 1531 high scalability-2013-10-13-AIDA: Badoo’s journey into Continuous Integration
5 0.9072668 1116 high scalability-2011-09-15-Paper: It's Time for Low Latency - Inventing the 1 Microsecond Datacenter
Introduction: In It's Time for Low Latency Stephen Rumble et al. explore the idea that it's time to rearchitect our stack to live in the modern era of low-latency datacenter instead of high-latency WANs. The implications for program architectures will be revolutionary . Luiz André Barroso , Distinguished Engineer at Google, sees ultra low latency as a way to make computer resources, to be as much as possible, fungible, that is they are interchangeable and location independent, effectively turning a datacenter into single computer. Abstract from the paper: The operating systems community has ignored network latency for too long. In the past, speed-of-light delays in wide area networks and unoptimized network hardware have made sub-100µs round-trip times impossible. However, in the next few years datacenters will be deployed with low-latency Ethernet. Without the burden of propagation delays in the datacenter campus and network delays in the Ethernet devices, it will be up to us to finish
6 0.90353823 753 high scalability-2009-12-21-Hot Holiday Scalability Links for 2009
7 0.90048468 525 high scalability-2009-03-05-Product: Amazon Simple Storage Service
8 0.89769679 258 high scalability-2008-02-24-Yandex Architecture
9 0.89090282 959 high scalability-2010-12-17-Stuff the Internet Says on Scalability For December 17th, 2010
10 0.8888877 766 high scalability-2010-01-26-Product: HyperGraphDB - A Graph Database
11 0.88875347 439 high scalability-2008-11-10-Scalability Perspectives #1: Nicholas Carr – The Big Switch
12 0.88180435 211 high scalability-2008-01-13-Google Reveals New MapReduce Stats
13 0.87150329 1377 high scalability-2012-12-26-Ask HS: What will programming and architecture look like in 2020?
14 0.84066141 1158 high scalability-2011-12-16-Stuff The Internet Says On Scalability For December 16, 2011
15 0.83718395 1188 high scalability-2012-02-06-The Design of 99designs - A Clean Tens of Millions Pageviews Architecture
16 0.81995553 612 high scalability-2009-05-31-Parallel Programming for real-world
17 0.81760705 212 high scalability-2008-01-14-OpenSpaces.org community site launched - framework for building scale-out applications
18 0.81477129 977 high scalability-2011-01-21-PaaS shouldn’t be built in Silos
19 0.81034195 1059 high scalability-2011-06-14-A TripAdvisor Short
20 0.80904478 1567 high scalability-2013-12-20-Stuff The Internet Says On Scalability For December 20th, 2013