high_scalability high_scalability-2008 high_scalability-2008-378 knowledge-graph by maker-knowledge-mining

378 high scalability-2008-09-03-Some Facebook Secrets to Better Operations


meta infos for this blog

Source: html

Introduction: Kim Nash, in an interview with Jonathan Heiliger, Facebook VP of technical operations, provides some juicy details on how Facebook handles operations. Operations is one of those departments everyone runs differently, as it is usually an ontogeny-recapitulates-phylogeny situation. With 2,000 databases, 25 terabytes of cache, 90 million active users, and 10,000 servers, you know Facebook has some serious operational issues. What are some of Facebook's secrets to better operations? Frequent Releases. A major release once a week and a minor release every few days. Create a Cyber Liability Group. At one time operations was distributed amongst several groups. A permanent operations group was created to isolate problems and revert problem software components back to previously known good states. The ability of a separate team to handle rollbacks speaks to a great deal of standardization and advanced tool building. Distribute Team Across Time Zones. Split the operations team ac


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 A major release once a week and a minor release every few days. [sent-6, score-0.334]

2 At one time operations was distributed amongst several groups. [sent-8, score-0.189]

3 A permanent operations group was created to isolate problems and revert problem software components back to previously known good states. [sent-9, score-0.614]

4 The ability of a separate team to handle rollbacks speaks to a great deal of standardization and advanced tool building. [sent-10, score-0.295]

5 Split the operations team across different time zones so no one has to work the graveyard shift. [sent-12, score-0.324]

6 Facebook has 20 people in their team located in Palo Alto, California and London, England. [sent-13, score-0.135]

7 Fear of failure often shuts down the organizational brain and makes it hide behind excessive rules and regulations. [sent-15, score-0.192]

8 When a problem is detected in a release the changes can either be rolled forward or backward. [sent-24, score-0.422]

9 Rolling forward is fixing problems in the new release rather than rolling back. [sent-26, score-0.597]

10 Roll forward ends up being covered in the press, so prefer roll backs over roll forwards. [sent-28, score-0.566]

11 Use the slow rollout to fix problems that can only be found under real user conditions. [sent-31, score-0.168]

12 This approach gives operations and development a lot of confidence in changes. [sent-32, score-0.189]

13 Design reviews, PR strategy, which servers to buy, etc. are often open for informal debate among employees. [sent-34, score-0.195]

14 There's a discussion tool for discussing the idea and a rating system for rating the idea. [sent-36, score-0.328]

15 Large company meetings, monthly presentations and weekly Q&As with the management team are transcribed live. [sent-39, score-0.135]

16 Getting software moved into production is often harder than the original coding and testing. [sent-41, score-0.272]

17 Emphasizing frequent releases and gutsy release policies makes it actually seem like someone is supporting developers instead of treating them like their software carries the plague. [sent-49, score-0.808]

18 Data centers are often treated like quarantine stations and developers are treated like asymptomatic carriers of some unknown virulent disease. [sent-50, score-0.562]

19 To set up or not to set up a separate operations group? [sent-53, score-0.351]

20 Amazon says "not to be" and has developers support their own software. [sent-55, score-0.189]
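
Sentences 8 through 12 above describe a release policy: roll a change out slowly under real user conditions, and when a problem is detected, prefer rolling back to a known good state over rolling forward. The following is a minimal sketch of that policy, not Facebook's actual tooling. The `Rollout` class, the stage fractions, the version labels, and the `healthy` callback are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Rollout:
    """Hypothetical staged rollout: widen exposure step by step, roll back on failure."""

    known_good: str   # version to revert to (assumed label)
    candidate: str    # version being released (assumed label)
    # Share of real users exposed at each stage; these fractions are illustrative only.
    stages: List[float] = field(default_factory=lambda: [0.01, 0.10, 0.50, 1.00])

    def run(self, healthy: Callable[[str, float], bool]) -> str:
        """Return the version left in production after the rollout finishes."""
        for fraction in self.stages:
            # Expose the candidate to a larger share of real users at each stage.
            if not healthy(self.candidate, fraction):
                # Prefer reverting to the known good state over fixing forward.
                return self.known_good
        return self.candidate
```

A usage example, with a health check that starts failing once half of the users see the new release:

```python
rollout = Rollout(known_good="r41", candidate="r42")
print(rollout.run(lambda version, fraction: fraction < 0.5))  # prints "r41"
```

Keeping the rollback decision inside the loop means a problem found at a small stage never reaches the larger ones, which is one way to read the point about slow rollouts building confidence in changes.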


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('facebook', 0.227), ('release', 0.203), ('operations', 0.189), ('rolling', 0.17), ('roll', 0.169), ('rating', 0.164), ('forward', 0.144), ('team', 0.135), ('treated', 0.134), ('releases', 0.131), ('frequent', 0.118), ('servicesamazon', 0.105), ('heiliger', 0.105), ('kim', 0.105), ('liability', 0.105), ('operationson', 0.105), ('secretly', 0.105), ('often', 0.104), ('developers', 0.104), ('previously', 0.103), ('pr', 0.099), ('articleshighscalability', 0.099), ('aninterview', 0.099), ('cyber', 0.094), ('carries', 0.091), ('departments', 0.091), ('informal', 0.091), ('rollout', 0.088), ('seperate', 0.088), ('shuts', 0.088), ('carriers', 0.086), ('software', 0.085), ('revert', 0.084), ('backs', 0.084), ('production', 0.083), ('openness', 0.082), ('setup', 0.081), ('attitude', 0.08), ('speaks', 0.08), ('standardization', 0.08), ('problems', 0.08), ('screw', 0.079), ('bias', 0.079), ('alto', 0.077), ('treating', 0.076), ('meetings', 0.075), ('detected', 0.075), ('remotely', 0.075), ('somehow', 0.075), ('permanent', 0.073)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 378 high scalability-2008-09-03-Some Facebook Secrets to Better Operations

2 0.14810583 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT

Introduction: It remains that, from the same principles, I now demonstrate the frame of the System of the World. -- Isaac Newton The practice of IT reminds me a lot of the practice of science before Isaac Newton. Aristotelianism was dead, but there was nothing to replace it. Then Newton came along, created a scientific revolution with his System of the World. And everything changed. That was New System of the World number one. New System of the World number two was written about by the incomparable Neal Stephenson in his incredible Baroque Cycle series. It explores the singular creation of a new way of organizing society grounded in new modes of thought in business, religion, politics, and science. Our modern world emerged Enlightened as it could from this roiling cauldron of forces. In IT we may have had a Leonardo da Vinci or even a Galileo, but we’ve never had our Newton. Maybe we don't need a towering genius to make everything clear? For years startups, like the frenetically inventive

3 0.14526957 1155 high scalability-2011-12-12-Netflix: Developing, Deploying, and Supporting Software According to the Way of the Cloud

Introduction: At a Cloud Computing Meetup, Siddharth "Sid" Anand of Netflix, backed by a merry band of Netflixians, gave an interesting talk: Keeping Movies Running Amid Thunderstorms. While the talk gave a good overview of their move to the cloud, issues with capacity planning, thundering herds, latency problems, and simian armageddon, I found myself most taken with how they handle software deployment in the cloud. I've worked on half a dozen or more build and deployment systems, some small, some quite large, but never for a large organization like Netflix in the cloud. The cloud has this amazing capability that has never existed before that enables a novel approach to fault-tolerant software deployments: the ability to spin up huge numbers of instances to completely run a new release while running the old release at the same time. The process goes something like: A canary machine is launched first with the new software load running real traffic to sanity test the load in a p

4 0.14146399 870 high scalability-2010-08-02-7 Scaling Strategies Facebook Used to Grow to 500 Million Users

Introduction: Robert Johnson, a director of engineering at Facebook, celebrated Facebook's monumental achievement of reaching 500 million users by sharing the scaling principles that helped reach that milestone. In case you weren't suitably impressed by the 500 million user number, Robert ratchets up the numbers game with these impressive figures: 1 million users per engineer; 500 million active users; 100 billion hits per day; 50 billion photos; 2 trillion objects cached, with hundreds of millions of requests per second; 130TB of logs every day. How did Facebook get to this point? People Matter Most. It's people who build and run systems. The best tools for scaling are engineering and operations teams that can handle anything. Scale Horizontally. Handling exponentially growing traffic requires spreading load arbitrarily across many machines. Using different databases for tables like accounts and profiles only doubles capacity. This approach hurts efficiency, but

5 0.14123861 720 high scalability-2009-10-12-High Performance at Massive Scale – Lessons learned at Facebook

Introduction: Jeff Rothschild, Vice President of Technology at Facebook, gave a great presentation at UC San Diego on our favorite subject: "High Performance at Massive Scale – Lessons learned at Facebook". The abstract for the talk is: Facebook has grown into one of the largest sites on the Internet today serving over 200 billion pages per month. The nature of social data makes engineering a site for this level of scale a particularly challenging proposition. In this presentation, I will discuss the aspects of social data that present challenges for scalability and will describe the core architectural components and design principles that Facebook has used to address these challenges. In addition, I will discuss emerging technologies that offer new opportunities for building cost-effective high performance web architectures. There's a lot that's interesting about this talk that we'll get into later, but I thought you might want a head start on learning how Facebook handles 30K+ machines,

6 0.14006427 845 high scalability-2010-06-22-Exploring the software behind Facebook, the world’s largest site

7 0.14000858 1123 high scalability-2011-09-23-The Real News is Not that Facebook Serves Up 1 Trillion Pages a Month…

8 0.13145448 721 high scalability-2009-10-13-Why are Facebook, Digg, and Twitter so hard to scale?

9 0.12478611 1508 high scalability-2013-08-28-Sean Hull's 20 Biggest Bottlenecks that Reduce and Slow Down Scalability

10 0.12219384 1628 high scalability-2014-04-08-Microservices - Not a free lunch!

11 0.12143026 624 high scalability-2009-06-10-Hive - A Petabyte Scale Data Warehouse using Hadoop

12 0.12008412 840 high scalability-2010-06-10-The Four Meta Secrets of Scaling at Facebook

13 0.11978074 1008 high scalability-2011-03-22-Facebook's New Realtime Analytics System: HBase to Process 20 Billion Events Per Day

14 0.11735332 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App

15 0.11168487 96 high scalability-2007-09-18-Amazon Architecture

16 0.11128336 399 high scalability-2008-10-01-Joyent - Cloud Computing Built on Accelerators

17 0.11085094 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?

18 0.11066601 1011 high scalability-2011-03-25-Did the Microsoft Stack Kill MySpace?

19 0.11026911 1531 high scalability-2013-10-13-AIDA: Badoo’s journey into Continuous Integration

20 0.10881037 259 high scalability-2008-02-25-Any Suggestions for the Architecture Template?
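
The page does not say how the simValue column above is computed. The scores are consistent with cosine similarity between tf-idf vectors (the same-blog entry scores 1.0000001, which looks like a self-comparison with floating-point rounding), so the sketch below assumes that. The function names `tfidf_vectors` and `cosine`, the tokenized toy documents, and the particular tf and idf formulas are assumptions for illustration, not the tool's actual implementation.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Build one sparse tf-idf vector per tokenized document (assumed weighting)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_counts = Counter(doc)
        vectors.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

A usage example on three toy posts, where the first two share vocabulary and the third does not:

```python
docs = [
    ["facebook", "release", "rollback"],
    ["facebook", "release", "cache"],
    ["paxos", "consensus", "google"],
]
vectors = tfidf_vectors(docs)
print(cosine(vectors[0], vectors[1]) > cosine(vectors[0], vectors[2]))  # prints True
```

Terms that appear in every document get an idf of zero under this weighting, so they contribute nothing to the similarity; that is one reason ranked lists like the one above tend to be driven by distinctive words such as those in the topN-words list.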


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.207), (1, 0.069), (2, 0.019), (3, 0.005), (4, 0.073), (5, -0.07), (6, -0.038), (7, 0.001), (8, 0.008), (9, -0.014), (10, -0.019), (11, 0.089), (12, 0.062), (13, -0.009), (14, -0.026), (15, 0.049), (16, 0.07), (17, -0.021), (18, -0.02), (19, 0.089), (20, 0.098), (21, 0.042), (22, 0.12), (23, -0.022), (24, -0.036), (25, 0.004), (26, 0.031), (27, 0.013), (28, 0.016), (29, 0.031), (30, -0.149), (31, -0.008), (32, 0.012), (33, 0.008), (34, 0.006), (35, 0.007), (36, 0.034), (37, 0.017), (38, 0.036), (39, 0.036), (40, -0.095), (41, 0.01), (42, 0.035), (43, 0.019), (44, -0.023), (45, -0.001), (46, -0.011), (47, 0.003), (48, 0.053), (49, 0.016)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98561764 378 high scalability-2008-09-03-Some Facebook Secrets to Better Operations

2 0.82413948 870 high scalability-2010-08-02-7 Scaling Strategies Facebook Used to Grow to 500 Million Users

Introduction: Robert Johnson, a director of engineering at Facebook, celebrated Facebook's monumental achievement of reaching 500 million users by sharing the scaling principles that helped reach that milestone. In case you weren't suitably impressed by the 500 million user number, Robert ratchets up the numbers game with these impressive figures: 1 million users per engineer; 500 million active users; 100 billion hits per day; 50 billion photos; 2 trillion objects cached, with hundreds of millions of requests per second; 130TB of logs every day. How did Facebook get to this point? People Matter Most. It's people who build and run systems. The best tools for scaling are engineering and operations teams that can handle anything. Scale Horizontally. Handling exponentially growing traffic requires spreading load arbitrarily across many machines. Using different databases for tables like accounts and profiles only doubles capacity. This approach hurts efficiency, but

3 0.79019594 1619 high scalability-2014-03-26-Oculus Causes a Rift, but the Facebook Deal Will Avoid a Scaling Crisis for Virtual Reality

Introduction: Facebook has been teasing us. While many of their recent acquisitions have been surprising, shocking is the only word adequately describing Facebook's 5 day whirlwind acquisition of Oculus, immersive virtual reality visionaries, for a now paltry sounding $2 billion. The backlash is a pandemic, jumping across social networks with the speed only a meme powered by the directly unaffected can generate. For more than 30 years VR has been the dream burning in the heart of every science fiction fan. Now that this future might finally be here, Facebook’s ownage makes it seem like a wonderful and hopeful timeline has been choked off, killing the Metaverse before it even had a chance to begin. For the many who voted for an open future with their Kickstarter dollars, there’s a deep and personal sense of betrayal, despite Facebook’s promise to leave Oculus alone. The intensity of the reaction is because Oculus matters to people. It's new, it's different, it create

4 0.7852183 1444 high scalability-2013-04-23-Facebook Secrets of Web Performance

Introduction: This is a repost of part 1 of an interview I did for the Boundary blog. Boundary: What is Facebook’s secret sauce for managing what’s got to be the biggest Big Data project, if you will, on the Web? Hoff: From several presentations we’ve learned what Facebook insiders like Aditya Agarwal and Robert Johnson, both former Directors of Engineering, consider their secret sauce: Scaling Takes Iteration. Solutions often work in the beginning, but you’ll have to modify them as you go. PHP, for example, is simple to use at first, but is not a good choice when you have tens of thousands of web servers. Scaling Takes Iteration. You can say that again. Don’t Over-Design. Just use what you need as you scale your system out. Figure out where you need to iterate on a solution, optimize something, or completely build a part of the stack yourself. Choose the Right Tool for the Job. Realize that any choice comes with overhead. If you really need to use P

5 0.73395997 1011 high scalability-2011-03-25-Did the Microsoft Stack Kill MySpace?

Introduction: Robert Scoble wrote a fascinating case study, MySpace’s death spiral: insiders say it’s due to bets on Los Angeles and Microsoft, where he reports MySpace insiders blame the Microsoft stack for why they lost the great social network race to Facebook. Does anyone know if this is true? What's the real story? I was wondering because it doesn't seem to track with the MySpace Architecture post that I did in 2009, where they seem happy with their choices and had stats to back up their improvements. Why this matters is it's a fascinating model for startups to learn from. What does it really take to succeed? Is it the people or the stack? Is it the organization or the technology? Is it the process or the competition? Is it the quality of the site or the love of the users? So much to consider and learn from. Some conjectures from the article: Myspace didn't have programming talent capable of scaling the site to compete with Facebook. Choosing the Microsoft stack made it difficul

6 0.72358978 840 high scalability-2010-06-10-The Four Meta Secrets of Scaling at Facebook

7 0.71430576 1014 high scalability-2011-03-31-8 Lessons We Can Learn from the MySpace Incident - Balance, Vision, Fearlessness

8 0.70582145 1223 high scalability-2012-04-06-Stuff The Internet Says On Scalability For April 6, 2012

9 0.70102978 264 high scalability-2008-03-03-Read This Site and Ace Your Next Interview!

10 0.69862372 966 high scalability-2010-12-31-Facebook in 20 Minutes: 2.7M Photos, 10.2M Comments, 4.6M Messages

11 0.69033301 845 high scalability-2010-06-22-Exploring the software behind Facebook, the world’s largest site

12 0.68723637 1602 high scalability-2014-02-26-The WhatsApp Architecture Facebook Bought For $19 Billion

13 0.68542111 129 high scalability-2007-10-23-Hire Facebook, Ning, and Salesforce to Scale for You

14 0.68439901 1323 high scalability-2012-09-15-4 Reasons Facebook Dumped HTML5 and Went Native

15 0.677347 720 high scalability-2009-10-12-High Performance at Massive Scale – Lessons learned at Facebook

16 0.67556995 464 high scalability-2008-12-13-Strategy: Facebook Tweaks to Handle 6 Time as Many Memcached Requests

17 0.66714549 1209 high scalability-2012-03-14-The Azure Outage: Time Is a SPOF, Leap Day Doubly So

18 0.66454226 1531 high scalability-2013-10-13-AIDA: Badoo’s journey into Continuous Integration

19 0.65803051 1081 high scalability-2011-07-18-Building your own Facebook Realtime Analytics System

20 0.6571762 1123 high scalability-2011-09-23-The Real News is Not that Facebook Serves Up 1 Trillion Pages a Month…


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.142), (2, 0.232), (10, 0.048), (26, 0.016), (30, 0.027), (40, 0.05), (57, 0.021), (61, 0.058), (70, 0.138), (77, 0.032), (79, 0.093), (85, 0.017), (94, 0.06)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.93444884 1256 high scalability-2012-06-04-OpenFlow-SDN is Not a Silver Bullet for Network Scalability

Introduction: Ivan Pepelnjak (CCIE#1354 Emeritus) is Chief Technology Advisor at NIL Data Communications, author of numerous webinars and advanced networking books, and a prolific blogger. He’s focusing on data center and cloud networking, network virtualization, and scalable application design. OpenFlow is an interesting emerging networking technology appearing seemingly out of nowhere with much hype and fanfare in March 2011. More than a year later, there are two commercial products based on OpenFlow (NEC’s Programmable Flow and Nicira’s Network Virtualization Platform) and probably less than a dozen production-grade implementations (including Google’s G-Scale network and Indiana University’s campus network). Is this an expected result for an emerging technology or another case of overhyped technology hitting limits imposed by reality? OpenFlow-based solutions have to overcome numerous problems every emerging technology is facing, in OpenFlow’s case ranging from compatibili

same-blog 2 0.93292981 378 high scalability-2008-09-03-Some Facebook Secrets to Better Operations

3 0.92789233 295 high scalability-2008-04-02-Product: Supervisor - Monitor and Control Your Processes

Introduction: It's a sad fact of life, but processes die. I know, it's horrible. You start them, send them out into process space, and hope for the best. Yet sometimes, despite your best coding, they core dump, seg fault, or some other calamity befalls them. Unlike our messy biological world so cruelly ruled by entropy, in the digital world processes can be given another chance. They can be restarted. A greater destiny awaits. And hopefully this time the random lottery of unforeseen killing factors will be avoided and a long productive life will be had by all. This is fun code to write because it's a lot more complicated than you might think. And restarting processes is a highly effective high availability strategy. Most faults are transient, caused by an unexpected series of events. Rather than taking drastic action, like taking a node out of production or failing over, transients can be effectively masked by simply restarting failed processes. Though complexity makes it a fun problem, it's also

4 0.91954547 1127 high scalability-2011-09-28-Pursue robust indefinite scalability with the Movable Feast Machine

Introduction: And now for something completely different, brought to you by David Ackley and Daniel Cannon in their playfully thought provoking paper: Pursue robust indefinite scalability, wherein they try to take a fresh look at neural networks, starting from scratch. What is this strange thing called indefinite scalability? They sound like words that don't really go together: Indefinite scalability is the property that the design can support open-ended computational growth without substantial re-engineering, in as strict a sense as can be managed. By comparison, many computer, algorithm, and network designs -- even those that address scalability -- are only finitely scalable because their scalability occurs within some finite space. For example, an in-core sorting algorithm for a 32 bit machine can only scale to billions of numbers before address space is exhausted and then that algorithm must be re-engineered. Our idea is to expose indefinitely scalable computational power to program

5 0.91628689 1368 high scalability-2012-12-07-Stuff The Internet Says On Scalability For December 7, 2012

Introduction: It's HighScalability Time: Quotable Quotes: Built to win: 4Gb/s, 10k requests per second, 2,000 nodes, 3 datacenters, 180TB and 8.5 billion requests. Design, deploy, dismantle in 583 days to elect the President. @CarlosTheSailor: In modern terms, feudalism was a sort of scalability solution for the tribal system - @angel_m, starting from the beginning @randybias: "Software-defined" is the new "cloud." Sprinkle it on your products along with an API and you *are* the future. How can you resist a story about Lady Gaga and BigData? BigData magic helps convert her more than 31 million Twitter followers and over 51 million Facebook followers into sales by creating more intimate communities of little monsters. While Twitter, Google, Apple, and Facebook are all concentrating on eviscerating the middleman, Lady Gaga wants to cut them all out of the action too. Reap and sow. Reap and sow. Multi-Armed Bandit testing sounds so much cooler t

6 0.90451401 803 high scalability-2010-04-05-Intercloud: How Will We Scale Across Multiple Clouds?

7 0.90360594 235 high scalability-2008-02-02-The case against ORM Frameworks in High Scalability Architectures

8 0.90231657 667 high scalability-2009-07-31-NSFW: Hilarious Fault-Tolerance Cartoon

9 0.89986593 1445 high scalability-2013-04-24-Strategy: Using Lots of RAM Often Cheaper than Using a Hadoop Cluster

10 0.89892185 1602 high scalability-2014-02-26-The WhatsApp Architecture Facebook Bought For $19 Billion

11 0.89578414 514 high scalability-2009-02-18-Numbers Everyone Should Know

12 0.89567757 1171 high scalability-2012-01-09-The Etsy Saga: From Silos to Happy to Billions of Pageviews a Month

13 0.89556098 1537 high scalability-2013-10-25-Stuff The Internet Says On Scalability For October 25th, 2013

14 0.8955332 1588 high scalability-2014-01-31-Stuff The Internet Says On Scalability For January 31st, 2014

15 0.89521283 1407 high scalability-2013-02-15-Stuff The Internet Says On Scalability For February 15, 2013

16 0.89460981 1297 high scalability-2012-08-03-Stuff The Internet Says On Scalability For August 3, 2012

17 0.89319277 1333 high scalability-2012-10-04-LinkedIn Moved from Rails to Node: 27 Servers Cut and Up to 20x Faster

18 0.89271653 1231 high scalability-2012-04-20-Stuff The Internet Says On Scalability For April 20, 2012

19 0.89269716 798 high scalability-2010-03-22-7 Secrets to Successfully Scaling with Scalr (on Amazon) by Sebastian Stadil

20 0.89221585 357 high scalability-2008-07-26-Google's Paxos Made Live – An Engineering Perspective