Introduction: I am in the design phase of getting a website up and running that will have scalability as a main concern. I am looking for opinions on architecture and the like for this endeavor. The site has a few unique characteristics that make scalability difficult. Users will all have a pretty large amount of data that other users will be able to search. The site will be entirely based around search. The catch is that other users will always be searching with a stipulation of 'n' miles from me. I imagine that fact will kill the possibility of query caching for most searches. I have extensive experience with PHP and MySQL, some experience with ASP.NET/C#, and some experience with Perl, but I can learn anything fast. The site will start out on a single server but I want to be 100% certain that I architect the code and databases such that scaling will be simple. What language should I code the site in? What DB would you use: Postgres, MySQL, MSSQL, BerkeleyDB? Should we shard the databa
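The "'n' miles from me" stipulation in the question can be sketched as a two-step filter: a cheap latitude/longitude bounding box (the kind of range condition an ordinary database index can serve) followed by an exact great-circle distance check. This is only a minimal sketch under stated assumptions, not a recommendation from the original post: the function names, the `(user_id, lat, lon)` tuple layout, and the in-memory list standing in for a database table are all hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def bounding_box(lat, lon, radius_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the search circle.

    Assumption: the circle does not touch a pole or cross the 180th meridian;
    a real implementation would need to handle both cases.
    """
    r = radius_miles / EARTH_RADIUS_MILES  # angular radius in radians
    dlat = math.degrees(r)
    dlon = math.degrees(math.asin(math.sin(r) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def users_within(users, lat, lon, radius_miles):
    """users: iterable of (user_id, lat, lon) tuples (hypothetical layout)."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, radius_miles)
    # Step 1: coarse box filter; in a database this would be an indexed range scan.
    candidates = [
        u for u in users
        if min_lat <= u[1] <= max_lat and min_lon <= u[2] <= max_lon
    ]
    # Step 2: exact distance check on the much smaller candidate set.
    return [u[0] for u in candidates
            if haversine_miles(lat, lon, u[1], u[2]) <= radius_miles]
```

For example, `users_within(users, 40.7128, -74.0060, 100)` would return the ids of all users in `users` within 100 miles of that point. One possible design consequence, offered as a suggestion rather than something the post states: if the search origin is snapped to a coarse grid cell before the box is computed, searches from nearby users share the same box and their candidate sets become cacheable, leaving only the exact distance check to run per user.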
sentIndex sentText sentNum sentScore
1 I am in the design phase of getting a website up and running that will have scalability as a main concern. [sent-1, score-0.628]
2 I am looking for opinions on architecture and the like for this endeavor. [sent-2, score-0.295]
3 The site has a few unique characteristics that make scalability difficult. [sent-3, score-0.616]
4 Users will all have a pretty large amount of data that other users will be able to search. [sent-4, score-0.414]
5 The catch is that other users will be searching always with a stipulation of 'n' miles from me. [sent-6, score-0.773]
6 I imagine that fact will kill the possibility of query caching for most searches. [sent-7, score-0.742]
7 I have extensive experience with PHP and MYSQL, some experience with ASP. [sent-8, score-0.57]
8 NET/C#, some experience with perl but can learn anything fast. [sent-9, score-0.525]
9 The site will start out on a single server but I want to be 100% certain that I architect the code and databases such that scaling will be simple. [sent-10, score-0.754]
10 What does everyone think for possible architectures on this? [sent-16, score-0.278]
wordName wordTfidf (topN-words)
[('mssql', 0.292), ('berkelydb', 0.279), ('site', 0.259), ('opinions', 0.221), ('experience', 0.211), ('catch', 0.2), ('miles', 0.194), ('possibility', 0.194), ('postgres', 0.184), ('kill', 0.161), ('characteristics', 0.16), ('entirely', 0.16), ('users', 0.158), ('searching', 0.155), ('phase', 0.149), ('extensive', 0.148), ('perl', 0.143), ('mysql', 0.136), ('location', 0.133), ('shard', 0.133), ('architect', 0.125), ('db', 0.123), ('imagine', 0.117), ('certain', 0.113), ('fact', 0.109), ('code', 0.106), ('main', 0.105), ('architectures', 0.105), ('php', 0.105), ('everyone', 0.1), ('unique', 0.1), ('anything', 0.099), ('language', 0.098), ('pretty', 0.098), ('scalability', 0.097), ('amount', 0.091), ('query', 0.089), ('website', 0.08), ('start', 0.078), ('getting', 0.077), ('looking', 0.074), ('possible', 0.073), ('databases', 0.073), ('learn', 0.072), ('caching', 0.072), ('around', 0.067), ('able', 0.067), ('always', 0.066), ('design', 0.062), ('running', 0.058)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 276 high scalability-2008-03-15-New Website Design Considerations
2 0.1479072 304 high scalability-2008-04-19-How to build a real-time analytics system?
Introduction: Hello everybody! I am a developer of a website with a lot of traffic. Right now we are managing the whole website using perl + postgresql + fastcgi + memcached + mogileFS + lighttpd + roundrobin DNS distributed over 5 servers and I must say it works like a charm, load is stable and everything works very fast and we are recording about 8 million pageviews per day. The only problem is with postgres database since we have it installed only on one server and if this server goes down, the whole "cluster" goes down. That's why we have a master2slave replication so we still have a backup database except that when the master goes down, all inserts/updates are disabled so the whole website is just read only. But this is not a problem since this configuration is working for us and we don't have any problems with it. Right now we are planning to build our own analytics service that would be customized for our needs. We tried various different software packages but were not satisfi
3 0.13599908 136 high scalability-2007-10-28-Scaling Early Stage Startups
Introduction: Mark Maunder of No VC Required --who advocates not taking VC money lest you be turned into a frog instead of the prince (or princess) you were dreaming of--has an excellent slide deck on how to scale an early stage startup. His blog also has some good SEO tips and a very spooky widget showing the geographical location of his readers. Perfect for Halloween! What are Mark's otherworldly scaling strategies for startups? Site: http://novcrequired.com/ Information Sources Slides from Seattle Tech Startup Talk. Scaling Early Stage Startups blog post by Mark Maunder. The Platform Linux An ISAM type data store. Perl Httperf is used for benchmarking. Websitepulse.com is used for perf monitoring. The Architecture Performance matters because being slow could cost you 20% of your revenue. The UIE guys disagree saying this ain't necessarily so. They explain their reasoning in Usability Tools Podcast: The Truth About Page Download Time. The idea i
4 0.12392465 1 high scalability-2007-07-06-Start Here
Introduction: This page is here to help you get started using High Scalability. Here are a few useful topics to get you going... Why does the High Scalability site exist? Good things to read. Participate by adding your own links to interesting sites and articles. Participate by signing up for the RSS feed. Consider the many benefits of registering as a user. How do I get notification of content and comment changes? Contact High Scalability. About. Why does the High Scalability site exist? To help you build successful scalable websites. This site tries to bring together all the lore, art, science, practice, and experience of building scalable websites into one place so you can learn how to build your website with confidence. When it becomes clear you must grow your website or die, most people have no idea where to start. It's not a skill you learn in school or pick up from a magazine article on a plane flight home. No, building scalable systems is a body o
5 0.11982281 466 high scalability-2008-12-16-Facebook is Hiring
Introduction: I thought with the job situation these days that people might be interested in some open jobs at Facebook. Here's what's available: Facebook is hiring! We are looking for a Systems Engineer/Architect and Site Reliability Engineer. I have attached the job descriptions below. If you are interested, please contact Michelle Bostock mbostock-at-facebook.com. Thanks and Happy Holidays! Systems Architect Palo Alto, CA Description Facebook is seeking a seasoned Systems Architect to join the Operations team. The position is full-time and is based in our main office in downtown Palo Alto and will report to the Manager of Systems Operations. Responsibilities * Analyze application flow and infrastructure design to improve performance and scalability of the site * Collaborate on design of services infrastructure from servers to networking * Monitor, analyze, and make recommendations as appropriate to improve site stability and availability * Evaluate hardware and softwar
7 0.11888805 834 high scalability-2010-06-01-Web Speed Can Push You Off of Google Search Rankings! What Can You Do?
8 0.11654647 808 high scalability-2010-04-12-Poppen.de Architecture
9 0.11624453 152 high scalability-2007-11-13-Flickr Architecture
10 0.1154722 578 high scalability-2009-04-23-Which Key value pair database to be used
11 0.11514954 232 high scalability-2008-01-29-When things aren't scalable
12 0.11358528 73 high scalability-2007-08-23-Postgresql on high availability websites?
13 0.11344521 1440 high scalability-2013-04-15-Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years
14 0.11170222 106 high scalability-2007-10-02-Secrets to Fotolog's Scaling Success
15 0.11031376 389 high scalability-2008-09-23-How to Scale with Ruby on Rails
16 0.10839621 611 high scalability-2009-05-31-Need help on Site loading & database optimization - URGENT
20 0.10667334 17 high scalability-2007-07-16-Paper: Guide to Cost-effective Database Scale-Out using MySQL
topicId topicWeight
[(0, 0.179), (1, 0.033), (2, -0.087), (3, -0.124), (4, 0.076), (5, -0.023), (6, -0.098), (7, -0.056), (8, 0.037), (9, 0.034), (10, -0.06), (11, -0.005), (12, -0.009), (13, 0.078), (14, 0.079), (15, -0.072), (16, 0.006), (17, -0.044), (18, 0.027), (19, 0.043), (20, 0.037), (21, -0.022), (22, -0.108), (23, 0.054), (24, -0.029), (25, 0.005), (26, -0.004), (27, -0.076), (28, 0.067), (29, -0.043), (30, 0.069), (31, 0.051), (32, -0.031), (33, 0.008), (34, 0.04), (35, 0.075), (36, -0.01), (37, -0.074), (38, -0.08), (39, 0.062), (40, -0.001), (41, -0.045), (42, 0.035), (43, 0.011), (44, 0.028), (45, -0.052), (46, -0.034), (47, 0.028), (48, -0.032), (49, 0.025)]
simIndex simValue blogId blogTitle
same-blog 1 0.98557514 276 high scalability-2008-03-15-New Website Design Considerations
2 0.82829994 611 high scalability-2009-05-31-Need help on Site loading & database optimization - URGENT
Introduction: Hi Friends, I need some help in making site access fast. On average my site gets 2500 hits per day, and on 16th May it had 60,000 hits. On that day the site was loading very slowly and was even timing out. I also checked the running processes using the "top" command, which indicated that mysql was taking too much load. There are around 166 tables (including the PHPBB forum) in my database. All content on the site is displayed by fetching it from the database. I have also added indexes to the respective tables where required. Plain PHP/HTML coding is used. Technology: PHP -- 5.2 MYSQL -- 5.0 Apache -- 2.0 Linux Following is all the server details of my site: CPU : Single Socket Dual Core AMD Opteron 1212HE Memory: 2GB DDR RAM Hard Drive: 250GB SATA Ethernet: 100Mb Primary Ethernet Card (/var/log) # uname -a Linux 2.6.9-67.0.15.ELsmp #1 SMP Tue Apr 22 13:50:33 EDT 2008 i686 athlon i386 GNU/Linux kernel version: 2.6.9-67.0.15.ELsmp (/var/log) # free -m total used
3 0.73214507 91 high scalability-2007-09-13-Design Preparations for Scaling
Introduction: Hi there, what do you think is crucial in the code designing of a scalable site? How does one prepare for webfarms and clusters (e.g. in PHP)? Thanks, Stephan
4 0.72580308 632 high scalability-2009-06-15-starting small with growth in mind
Introduction: Hello all, I'm working on a web site that might totally flop or it might explode to be the next facebook/flickr/digg/etc. Since I really don't know how popular the site will be I don't want to spend a ton of money on the hardware/hosting right away but I want to be able to scale it easily if it does grow rapidly. With this in mind, what would be the best approach to launch the site? Thanks, Dan
5 0.71618754 66 high scalability-2007-08-16-What tech is used to build your favorite site?
Introduction: Find out with Builtwith.com. It scans a site and guesses how the site is built. I ran it on this site and it said: Apache, Windows, PHP, Adsense, RSS, CSS, Javascript, and UTF-8 encoding. Correct, yet I think it should have guessed Drupal was the CMS and it should have been able to determine which AJAX library is used. Though it's kind of cool to see which sites use PHP and other technologies.
6 0.70781219 206 high scalability-2008-01-10-MONO ASP.NET. Will it make the web???
7 0.705742 235 high scalability-2008-02-02-The case against ORM Frameworks in High Scalability Architectures
8 0.69317776 711 high scalability-2009-09-22-How Ravelry Scales to 10 Million Requests Using Rails
9 0.68907744 1 high scalability-2007-07-06-Start Here
10 0.67165375 965 high scalability-2010-12-29-Pinboard.in Architecture - Pay to Play to Keep a System Small
11 0.66649979 232 high scalability-2008-01-29-When things aren't scalable
12 0.6643784 54 high scalability-2007-08-02-Multilanguage Website
13 0.65884811 1453 high scalability-2013-05-07-Not Invented Here: A Comical Series on Scalability
14 0.65543866 1349 high scalability-2012-10-29-Gone Fishin': Welcome to High Scalability
15 0.65311605 302 high scalability-2008-04-10-Mysql scalability and failover...
16 0.64729029 1288 high scalability-2012-07-23-Ask HighScalability: How Do I Build My MegaUpload + Itunes + YouTube Startup?
17 0.64712507 61 high scalability-2007-08-07-What qps should we design for in making a MySpace like site?
18 0.6455189 222 high scalability-2008-01-25-Application Database and DAL Architecture
19 0.64534217 167 high scalability-2007-11-27-Starting a website from scratch - what technologies should I use?
20 0.63968265 321 high scalability-2008-05-17-WebSphere Commerce High Availability and Performance Configurations
topicId topicWeight
[(1, 0.233), (2, 0.233), (47, 0.267), (61, 0.118), (79, 0.031)]
simIndex simValue blogId blogTitle
1 0.98388004 57 high scalability-2007-08-03-Scaling IMAP and POP3
Introduction: Just thought I'd drop a brief suggestion to anyone building a large mail system. Our solution for scaling mail pickup was to develop a sharded architecture whereby accounts are spread across a cluster of servers, each with imap/pop3 capability. Then we use a cluster of reverse proxies (Perdition) speaking to the backend imap/pop3 servers. The benefit of this approach is you can simply use round-robin or HA load balancing on the perdition servers that end users connect to (e.g. admins can easily move accounts around on the backend storage servers without affecting end users). Perdition manages routing users to the appropriate backend servers and has MySQL support. What we also liked about this approach was that it had no dependency on a distributed or networked filesystem, so less chance of corruption or data consistency issues. When an individual server reaches capacity, we just offload users to a less used server. If any server goes offline, it only affects the fraction of users
2 0.97523856 81 high scalability-2007-09-06-Scaling IMAP and POP3
Introduction: Another scalability strategy brought to you by Erik Osterman: Just thought I'd drop a brief suggestion to anyone building a large mail system. Our solution for scaling mail pickup was to develop a sharded architecture whereby accounts are spread across a cluster of servers, each with imap/pop3 capability. Then we use a cluster of reverse proxies (Perdition) speaking to the backend imap/pop3 servers. The benefit of this approach is you can simply use round-robin or HA load balancing on the perdition servers that end users connect to (e.g. admins can easily move accounts around on the backend storage servers without affecting end users). Perdition manages routing users to the appropriate backend servers and has MySQL support. What we also liked about this approach was that it had no dependency on a distributed or networked file system, so less chance of corruption or data consistency issues. When an individual server reaches capacity, we just offload users to a less u
3 0.9666099 756 high scalability-2009-12-30-Terrastore - Scalable, elastic, consistent document store.
Introduction: Terrastore is a new-born document store which provides advanced scalability and elasticity features without sacrificing consistency. Here are a few highlights: Ubiquitous: based on the universally supported HTTP protocol. Distributed: nodes can run and live everywhere on your network. Elastic: you can add and remove nodes dynamically to/from your running cluster with no downtime and no changes at all to your configuration. Scalable at the data layer: documents are partitioned and distributed among your nodes, with automatic and transparent re-balancing when nodes join and leave. Scalable at the computational layer: query and update operations are distributed to the nodes that actually hold the queried/updated data, minimizing network traffic and spreading computational load. Consistent: providing per-document consistency, you're guaranteed to always get the latest value of a single document, with read committed isolation for concurrent modifications. Schemales
4 0.95796239 94 high scalability-2007-09-17-Blog: Adding Simplicity by Dan Pritchett
Introduction: Dan has genuine insight into building software and large scale scalable systems in particular. You'll always learn something interesting reading his blog. A Quick Hit of What's Inside Inverting the Reliability Stack , In Support of Non-Stop Software , Chaotic Perspectives , Latency Exists, Cope! , A Real eBay Architect Analyzes Part 3 , Avoiding Two Phase Commit, Redux Site: http://www.addsimplicity.com/
5 0.95318413 163 high scalability-2007-11-21-n-phase commit for FS writes, reads stay local
Introduction: I am trying to find a Linux FS that will allow me to replicate all writes synchronously to n nodes in a web server cluster, while keeping all reads local. It should not require specialized hardware.
6 0.93760246 760 high scalability-2010-01-13-10 Hot Scalability Links for January 13, 2010
7 0.90849477 852 high scalability-2010-07-07-Strategy: Recompute Instead of Remember Big Data
8 0.9038375 144 high scalability-2007-11-07-What CDN would you recommend?
9 0.90229511 708 high scalability-2009-09-17-Infinispan narrows the gap between open source and commercial data caches
10 0.9017303 676 high scalability-2009-08-08-Yahoo!'s PNUTS Database: Too Hot, Too Cold or Just Right?
11 0.89934897 550 high scalability-2009-03-30-Ebay history and architecture
12 0.89540142 1062 high scalability-2011-06-15-101 Questions to Ask When Considering a NoSQL Database
same-blog 13 0.8913582 276 high scalability-2008-03-15-New Website Design Considerations
14 0.87731117 1166 high scalability-2011-12-30-Stuff The Internet Says On Scalability For December 30, 2011
15 0.86512166 1054 high scalability-2011-06-06-NoSQL Pain? Learn How to Read-write Scale Without a Complete Re-write
16 0.84450758 1530 high scalability-2013-10-11-Stuff The Internet Says On Scalability For October 11th, 2013
17 0.84376466 1440 high scalability-2013-04-15-Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years
18 0.83892542 986 high scalability-2011-02-10-Database Isolation Levels And Their Effects on Performance and Scalability
19 0.82443637 1326 high scalability-2012-09-20-How Vimeo Saves 50% on EC2 by Playing a Smarter Game
20 0.82147986 683 high scalability-2009-08-18-Hardware Architecture Example (geographical level mapping of servers)