1352 high scalability-2012-10-31-Gone Fishin': LiveJournal Architecture


meta infos for this blog

Source: html

Introduction: This was the first architecture profile on HighScalability. IMHO LiveJournal was really the start of the openness on how to build stuff at scale, setting the whole industry off with an excellent role model. They wrote about their architecture, they open sourced their tools, they showed that success wasn't based on keeping secrets, and they set forth principles still followed by our rather amazing industry. No other industry is so open and cooperative, with their eyes cast so far forward, intent on building cool stuff. When all around seems dark it would be good to keep this little bit of light in mind... A fascinating and detailed story of how LiveJournal evolved their system to scale. LiveJournal was an early player in the free blog service race and faced issues from quickly adding a large number of users. Blog posts come fast and furious which causes a lot of writes and writes are particularly hard to scale. Understanding how LiveJournal faced their scaling problems will help any


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 IMHO LiveJournal was really the start of the openness on how to build stuff at scale, setting the whole industry off with an excellent role model. [sent-2, score-0.202]

2 They wrote about their architecture, they open sourced their tools, they showed that success wasn't based on keeping secrets, and they set forth principles still followed by our rather amazing industry. [sent-3, score-0.322]

3 No other industry is so open and cooperative, with their eyes cast so far forward, intent on building cool stuff. [sent-4, score-0.431]

4 When all around seems dark it would be good to keep this little bit of light in mind. [sent-5, score-0.153]

5 A fascinating and detailed story of how LiveJournal evolved their system to scale. [sent-8, score-0.075]

6 LiveJournal was an early player in the free blog service race and faced issues from quickly adding a large number of users. [sent-9, score-0.513]

7 Blog posts come fast and furious which causes a lot of writes and writes are particularly hard to scale. [sent-10, score-0.367]

8 Understanding how LiveJournal faced their scaling problems will help any aspiring website builder. [sent-11, score-0.359]

9 Spread out writes and reads for more parallelism. [sent-19, score-0.121]

10 Shard storage approach, using DRBD, for maximal throughput. [sent-21, score-0.112]

11 MogileFS, a distributed file system, for parallelism. [sent-26, score-0.179]

12 TheSchwartz and Gearman for distributed job queuing to do more work in parallel. [sent-27, score-0.219]

13 LiveJournal has provided incredible value to the community through their efforts. [sent-30, score-0.072]

14 Sites can evolve from small 1, 2 machine setups to larger systems as they learn about their users and what their system really needs to do. [sent-31, score-0.178]

15 Remove choke points by caching, load balancing, sharding, clustering file systems, and making use of more disk spindles. [sent-33, score-0.29]

16 You can't just keep adding more and more read slaves and expect to scale. [sent-35, score-0.341]

17 Low-level issues like which OS event notification mechanism to use, file system and disk interactions, threading and event models, and connection types, matter at scale. [sent-36, score-0.484]

18 Large sites eventually turn to a distributed queuing and scheduling mechanism to distribute large work loads across a grid. [sent-37, score-0.339]
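
Several of the summary sentences above (spreading out writes, sharding storage, not relying on ever more read slaves) come down to routing each user's data to one of several database clusters. The following is a minimal Python sketch of that routing idea only. The `ShardDirectory` class, the cluster names, and the least-loaded assignment rule are illustrative assumptions and are not taken from LiveJournal's code.

```python
from collections import Counter


class ShardDirectory:
    """Maps each user to one database cluster (hypothetical sketch)."""

    def __init__(self, clusters):
        self.clusters = list(clusters)
        # Stands in for a small global lookup table: user id -> cluster name.
        self.user_to_cluster = {}
        # Number of users currently assigned to each cluster.
        self.load = Counter({name: 0 for name in self.clusters})

    def cluster_for(self, user_id):
        """Return the user's cluster, assigning new users to the least-loaded one."""
        if user_id not in self.user_to_cluster:
            target = min(self.clusters, key=lambda name: self.load[name])
            self.user_to_cluster[user_id] = target
            self.load[target] += 1
        return self.user_to_cluster[user_id]


directory = ShardDirectory(["cluster_a", "cluster_b", "cluster_c"])
placements = {uid: directory.cluster_for(uid) for uid in range(9)}
```

Because the mapping is stored rather than computed from a hash, a user could in principle be moved to another cluster by updating one directory entry. The sketch leaves out the data migration that such a move would also require.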


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('livejournal', 0.507), ('faced', 0.153), ('slaves', 0.141), ('queuing', 0.138), ('aspiring', 0.133), ('storytimegoogle', 0.133), ('versionplatformlinuxmysqlperlmemcachedmogilefsapachewhat', 0.133), ('videotokyo', 0.133), ('furious', 0.125), ('learneddo', 0.125), ('cast', 0.125), ('adding', 0.124), ('writes', 0.121), ('mechanism', 0.12), ('imho', 0.115), ('intent', 0.115), ('maximal', 0.112), ('connection', 0.108), ('setups', 0.106), ('drbd', 0.103), ('openness', 0.103), ('choke', 0.101), ('cooperative', 0.099), ('industry', 0.099), ('file', 0.098), ('gearman', 0.095), ('scenes', 0.093), ('eyes', 0.092), ('points', 0.091), ('sourced', 0.09), ('allocate', 0.087), ('blog', 0.086), ('forth', 0.085), ('kills', 0.083), ('distributed', 0.081), ('interactions', 0.081), ('notification', 0.08), ('race', 0.078), ('threading', 0.078), ('dark', 0.077), ('secrets', 0.077), ('keep', 0.076), ('evolved', 0.075), ('showed', 0.075), ('hashing', 0.074), ('scaling', 0.073), ('incredible', 0.072), ('evolve', 0.072), ('followed', 0.072), ('player', 0.072)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 1352 high scalability-2012-10-31-Gone Fishin': LiveJournal Architecture

2 0.89987427 3 high scalability-2007-07-09-LiveJournal Architecture

Introduction: A fascinating and detailed story of how LiveJournal evolved their system to scale. LiveJournal was an early player in the free blog service race and faced issues from quickly adding a large number of users. Blog posts come fast and furious which causes a lot of writes and writes are particularly hard to scale. Understanding how LiveJournal faced their scaling problems will help any aspiring website builder. Site: http://www.livejournal.com/ Information Sources LiveJournal - Behind The Scenes Scaling Storytime Google Video Tokyo Video 2005 version Platform Linux MySql Perl Memcached MogileFS Apache What's Inside? Scaling from 1, 2, and 4 hosts to cluster of servers. Avoid single points of failure. Using MySQL replication only takes you so far. Becoming IO bound kills scaling. Spread out writes and reads for more parallelism. You can't keep adding read slaves and scale. Shard storage approach, using DRBD, for maxim

3 0.17219992 192 high scalability-2007-12-25-IBMer Says LAMP Can't Scale

Introduction: A very entertaining and somewhat educational article on IBM Poopheads say LAMP Users Need to "grow up". The physical three-tier architecture turns out to be the root of all evil, and shared-nothing architectures bring simplicity and light. In the comments Simon Willison makes an insightful point on why fine-grained caching works for personalized pages and proxies don't: Great post, but I have to disagree with you on the finely grained caching part. If you look at big LAMP deployments such as Flickr, LiveJournal and Facebook the common technology component that enables them to scale is memcached - a tool for finely grained caching. That's not to say that they aren't doing shared-nothing, it's just that memcached is critical for helping the database layer scale. LiveJournal serves around 50% of its page views "permission controlled" (friends only) so an HTTP proxy on the front end isn't the right solution - but memcached reduces their database hits by 90%.
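
The comment quoted above credits fine-grained caching with absorbing most database reads. A rough Python sketch of the cache-aside pattern behind that claim follows. Plain dictionaries stand in for memcached and MySQL, and the `CachedStore` class and the key names are hypothetical, chosen only for illustration.

```python
class CachedStore:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, database):
        self.database = database   # dict standing in for the database
        self.cache = {}            # dict standing in for memcached
        self.db_hits = 0           # counts reads that reached the database

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.db_hits += 1          # cache miss: one database read
        value = self.database[key]
        self.cache[key] = value
        return value

    def put(self, key, value):
        self.database[key] = value
        self.cache.pop(key, None)  # invalidate so the next read refetches


store = CachedStore({"user:1:profile": "alice", "user:2:profile": "bob"})
for _ in range(10):
    store.get("user:1:profile")
```

In this toy run only the first of the ten reads reaches the database. Caching small per-user objects rather than whole pages is what lets permission-controlled pages benefit, since each page can still be assembled per viewer.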

4 0.11439276 5 high scalability-2007-07-10-mixi.jp Architecture

Introduction: Mixi is a fast-growing social networking site in Japan. They provide services like: diary, community, message, review, and photo album. Having a lot in common with LiveJournal, they also developed many of the same approaches. Their write-up on how they scaled their system is easily one of the best out there. Site: http://mixi.jp Information Sources mixi.jp - scaling out with open source Platform Linux Apache MySQL Perl Memcached Squid Shard What's Inside? They grew to approximately 4 million users in two years and add over 15,000 new users/day. Ranks 35th on Alexa and 3rd in Japan. More than 100 MySQL servers Add more than 10 servers/month Use non-persistent connections. Diary traffic is 85% read and 15% write. Message traffic is 75% read and 25% write. Ran into replication performance problems so they had to split the database. Considered splitting vertically by user or splitting horizontally by table type. The ende

5 0.1131743 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?

Introduction: For everything given something seems to be taken. Caching is a great scalability solution, but caching also comes with problems. Sharding is a great scalability solution, but as Foursquare recently revealed in a post-mortem about their 17 hours of downtime, sharding also has problems. MongoDB, the database Foursquare uses, also contributed their post-mortem of what went wrong too. Now that everyone has shared and resharded, what can we learn to help us skip these mistakes and quickly move on to a different set of mistakes? First, like for Facebook, huge props to Foursquare and MongoDB for being upfront and honest about their problems. This helps everyone get better and is a sign we work in a pretty cool industry. Second, overall, the fault didn't flow from evil hearts or gross negligence. As usual the cause was more mundane: a key system, that could be a little more robust, combined with a very popular application built by a small group of people, under immense pressure

6 0.110108 1440 high scalability-2013-04-15-Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years

7 0.10513434 491 high scalability-2009-01-13-Product: Gearman - Open Source Message Queuing System

8 0.10444979 672 high scalability-2009-08-06-An Unorthodox Approach to Database Design : The Coming of the Shard

9 0.099363066 274 high scalability-2008-03-12-YouTube Architecture

10 0.094201684 178 high scalability-2007-12-10-1 Master, N Slaves

11 0.09338811 927 high scalability-2010-10-26-Marrying memcached and NoSQL

12 0.093313888 769 high scalability-2010-02-02-Scale out your identity management

13 0.090897031 406 high scalability-2008-10-08-Strategy: Flickr - Do the Essential Work Up-front and Queue the Rest

14 0.090523183 96 high scalability-2007-09-18-Amazon Architecture

15 0.089632653 23 high scalability-2007-07-24-Major Websites Down: Or Why You Want to Run in Two or More Data Centers.

16 0.089205913 554 high scalability-2009-04-04-Digg Architecture

17 0.088288248 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?

18 0.087433681 1240 high scalability-2012-05-07-Startups are Creating a New System of the World for IT

19 0.086635247 1588 high scalability-2014-01-31-Stuff The Internet Says On Scalability For January 31st, 2014

20 0.08557564 829 high scalability-2010-05-20-Strategy: Scale Writes to 734 Million Records Per Day Using Time Partitioning


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.173), (1, 0.079), (2, -0.038), (3, -0.046), (4, 0.016), (5, 0.048), (6, -0.017), (7, -0.039), (8, -0.031), (9, 0.008), (10, -0.01), (11, 0.014), (12, 0.021), (13, 0.029), (14, 0.031), (15, 0.003), (16, 0.008), (17, -0.008), (18, -0.018), (19, 0.075), (20, 0.04), (21, 0.055), (22, -0.071), (23, 0.018), (24, -0.082), (25, 0.037), (26, 0.144), (27, 0.092), (28, -0.077), (29, 0.011), (30, 0.097), (31, -0.087), (32, 0.124), (33, -0.005), (34, 0.03), (35, -0.106), (36, -0.01), (37, -0.092), (38, 0.074), (39, -0.184), (40, -0.006), (41, -0.008), (42, 0.054), (43, -0.029), (44, 0.054), (45, -0.05), (46, 0.02), (47, -0.065), (48, 0.045), (49, 0.021)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.96408868 3 high scalability-2007-07-09-LiveJournal Architecture

same-blog 2 0.96265048 1352 high scalability-2012-10-31-Gone Fishin': LiveJournal Architecture

3 0.58841115 257 high scalability-2008-02-22-Kevin's Great Adventures in SSDland

Introduction: Update: Final Thoughts on SSD and MySQL AKA Battleship Spinn3r. Tips on how to make your database 10x faster using solid state drives. Potential exists for 100x speedup. Solid-state drives (SSDs) are the holy grail of storage. The promise of RAM speeds and hard-disk-like persistence has for years driven us crazy with power user lust, but they've stayed tantalizingly just out of reach. Always too expensive, too small, and oddly too slow. Has that changed? Can you now miraculously have your cake and eat it too? Can you now have it both ways? Is balancing work with family life now as easy as tripping over a terabyte drive? In a pioneering series of blog articles Kevin Burton conducts original research on next generation SSD drives in real world configurations. For an experience report on his great adventure you can turn to: Could SSD Mean a Rise in MyISAM Usage?, Serverbeach, MySQL and Mtron SSDs, Prediction: SSD Blades in 2008, Zeus IOPS - Another High

4 0.56653053 243 high scalability-2008-02-07-clusteradmin.blogspot.com - blog about building and administering clusters

Introduction: A blog about cluster administration. Written by a System Administrator working at HPC (High Performance Computing) data-center, mostly dealing with PC clusters (100s of servers), SMP machines and distributed installations. The blog concentrates on software/configuration/installation management systems, load balancers, monitoring and other cluster-related solutions.

5 0.5637669 566 high scalability-2009-04-13-High Performance Web Pages – Real World Examples: Netflix Case Study

Introduction: This read will provide you with information about how Netflix deals with high load on their movie rental website. It was written by Bill Scott in the fall of 2008. Read or download the PDF file here

6 0.55010796 898 high scalability-2010-09-09-6 Scalability Lessons

7 0.54224497 769 high scalability-2010-02-02-Scale out your identity management

8 0.53823948 710 high scalability-2009-09-20-PaxosLease: Diskless Paxos for Leases

9 0.53797251 345 high scalability-2008-06-11-Pyshards aspires to build sharding toolkit for Python

10 0.53478271 528 high scalability-2009-03-06-Product: Lightcloud - Key-Value Database

11 0.53444308 595 high scalability-2009-05-08-Publish-subscribe model does not scale?

12 0.53109121 747 high scalability-2009-11-26-What I'm Thankful For on Thanksgiving

13 0.52751034 88 high scalability-2007-09-10-Blog: Scalable Web Architectures by Royans Tharakan

14 0.52663869 157 high scalability-2007-11-16-Product: lbpool - Load Balancing JDBC Pool

15 0.51952529 1325 high scalability-2012-09-19-The 4 Building Blocks of Architecting Systems for Scale

16 0.51870483 690 high scalability-2009-08-31-Scaling MySQL on Amazon Web Services

17 0.51036304 1588 high scalability-2014-01-31-Stuff The Internet Says On Scalability For January 31st, 2014

18 0.50785112 275 high scalability-2008-03-14-Problem: Mobbing the Least Used Resource Error

19 0.49606851 1611 high scalability-2014-03-12-Paper: Scalable Eventually Consistent Counters over Unreliable Networks

20 0.49555841 916 high scalability-2010-10-07-Hot Scalability Links For Oct 8, 2010


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.146), (2, 0.24), (4, 0.011), (10, 0.057), (12, 0.224), (40, 0.024), (59, 0.012), (61, 0.021), (79, 0.102), (94, 0.062)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.93060631 278 high scalability-2008-03-16-Product: GlusterFS

Introduction: Adapted from their website: GlusterFS is a clustered file system capable of scaling to several petabytes. It aggregates various storage bricks over Infiniband RDMA or TCP/IP interconnect into one large parallel network file system. Storage bricks can be made of any commodity hardware (such as x86-64 servers with SATA-II RAID and Infiniband HBA). Cluster file systems are still not mature for the enterprise market. They are too complex to deploy and maintain, though they are extremely scalable and cheap. They can be built entirely out of commodity OS and hardware. GlusterFS hopes to solve this problem. GlusterFS achieved 35 GBps read throughput. The GlusterFS Aggregated I/O Benchmark was performed on a 64-brick clustered storage system over a 10 Gbps Infiniband interconnect. A cluster of 220 clients pounded the storage system with multiple dd (disk-dump) instances, each reading / writing a 1 GB file with 1MB block size. GlusterFS was configured with unify translator and round-robin scheduler

2 0.92365921 82 high scalability-2007-09-06-Why doesn't anyone use j2ee?

Introduction: From a reader: "Was reading through your very interesting/useful site. Most of the architectures are non-J2EE. Does that mean that there aren't enough websites that are scalable (with a YouTube-like userbase) built with J2EE tech? Would like to know if there are any and their architecture as well." eBay uses Java, but in a very pragmatic way. They use servlets, an application server, the JDK, and they do the rest themselves. They skip JSP, entity beans, and JMS. When you need to scale, putting all your eggs in one basket is a risky strategy. Why use JSP when you can do better? Why use entity beans when you can do better? Use servlets because they are a very effective way of handling HTTP requests. Use Java because it is fast, runs everywhere, and has a boatload of libraries you can use to build your custom system. Probably the major reason J2EE is absentee is simply LAMP. LAMP is just so incredibly functional for most 2-tier shared nothing site

3 0.90597415 3 high scalability-2007-07-09-LiveJournal Architecture

same-blog 4 0.8980304 1352 high scalability-2012-10-31-Gone Fishin': LiveJournal Architecture

5 0.87827951 161 high scalability-2007-11-20-Product: SmartFrog a Distributed Configuration and Deployment Framework

Introduction: From Wikipedia: SmartFrog is an open-source software framework, written in Java, that manages the configuration, deployment and coordination of a software system broken into components. These components may be distributed across several network hosts. The configuration of components is described using a domain-specific language, whose syntax resembles that of Java. It is a prototype-based object-oriented language, and may thus be compared to Self. The framework is used internally in a variety of HP products. Also, it is being used by HP Labs partners like CERN. Related Articles Distributed Testing with SmartFrog Puppet the Automated Administration System

6 0.84766477 285 high scalability-2008-03-19-Serving JavaScript Fast

7 0.83695185 886 high scalability-2010-08-24-21 Quality Screencasts on Scaling Rails

8 0.80320764 1143 high scalability-2011-11-16-Google+ Infrastructure Update - the JavaScript Story

9 0.80053222 1124 high scalability-2011-09-26-17 Techniques Used to Scale Turntable.fm and Labmeeting to Millions of Users

10 0.79813397 415 high scalability-2008-10-15-Need help with your Hadoop deployment? This company may help!

11 0.79616791 378 high scalability-2008-09-03-Some Facebook Secrets to Better Operations

12 0.79575956 729 high scalability-2009-10-28-And the winner is: MySQL or Memcached or Tokyo Tyrant?

13 0.79554081 1297 high scalability-2012-08-03-Stuff The Internet Says On Scalability For August 3, 2012

14 0.79529279 1368 high scalability-2012-12-07-Stuff The Internet Says On Scalability For December 7, 2012

15 0.79524159 1231 high scalability-2012-04-20-Stuff The Internet Says On Scalability For April 20, 2012

16 0.79506475 514 high scalability-2009-02-18-Numbers Everyone Should Know

17 0.79456335 636 high scalability-2009-06-23-Learn How to Exploit Multiple Cores for Better Performance and Scalability

18 0.7943666 1245 high scalability-2012-05-14-DynamoDB Talk Notes and the SSD Hot S3 Cold Pattern

19 0.79422712 936 high scalability-2010-11-09-Facebook Uses Non-Stored Procedures to Update Social Graphs

20 0.79362541 1276 high scalability-2012-07-04-Top Features of a Scalable Database