high_scalability high_scalability-2007 high_scalability-2007-81 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Another scalability strategy brought to you by Erik Osterman: Just thought I'd drop a brief suggestion to anyone building a large mail system. Our solution for scaling mail pickup was to develop a sharded architecture whereby accounts are spread across a cluster of servers, each with imap/pop3 capability. Then we use a cluster of reverse proxies (Perdition) speaking to the backend imap/pop3 servers. The benefit of this approach is that you can simply use round-robin or HA load balancing on the Perdition servers that end users connect to (e.g. admins can easily move accounts around on the backend storage servers without affecting end users). Perdition manages routing users to the appropriate backend servers and has MySQL support. What we also liked about this approach was that it had no dependency on a distributed or networked file system, so less chance of corruption or data consistency issues. When an individual server reaches capacity, we just offload users to a less u
sentIndex sentText sentNum sentScore
1 Another scalability strategy brought to you by Erik Osterman: Just thought I'd drop a brief suggestion to anyone building a large mail system. [sent-1, score-0.852]
2 Our solution for scaling mail pickup was to develop a sharded architecture whereby accounts are spread across a cluster of servers, each with imap/pop3 capability. [sent-2, score-1.075]
3 Then we use a cluster of reverse proxies (Perdition) speaking to the backend imap/pop3 servers. [sent-3, score-0.769]
4 The benefit of this approach is that you can simply use round-robin or HA load balancing on the Perdition servers that end users connect to (e. [sent-4, score-1.31]
5 admins can easily move accounts around on the backend storage servers without affecting end users). [sent-6, score-0.868]
6 Perdition manages routing users to the appropriate backend servers and has MySQL support. [sent-7, score-0.749]
7 What we also liked about this approach was that it had no dependency on a distributed or networked file system, so less chance of corruption or data consistency issues. [sent-8, score-0.803]
8 When an individual server reaches capacity, we just offload users to a less-used server. [sent-9, score-0.584]
9 If any server goes offline, it only affects the fraction of users assigned to that server. [sent-10, score-0.622]
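The routing idea in the sentences above can be sketched in a few lines. This is only an illustrative sketch, not Perdition's actual code or schema: the `account_map` table, the function names, and the `mail1`/`mail2` host names are all hypothetical, and an in-memory SQLite database stands in for the MySQL lookup table so the example runs on its own. It shows a proxy-side lookup, moving an account between backends, and the fraction of users affected when one backend goes offline.

```python
import sqlite3


def make_directory(assignments):
    """Build a username -> backend map (hypothetical stand-in for the MySQL table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account_map ("
        "username TEXT PRIMARY KEY, backend TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO account_map VALUES (?, ?)", assignments)
    conn.commit()
    return conn


def lookup_backend(conn, username):
    """Return the backend host for a user, or None if the user is unknown.

    Any proxy node can answer this, which is why the proxies themselves
    can sit behind plain round-robin load balancing.
    """
    row = conn.execute(
        "SELECT backend FROM account_map WHERE username = ?", (username,)
    ).fetchone()
    return row[0] if row else None


def move_account(conn, username, new_backend):
    """Repoint a user at another backend; returns True if a row was updated.

    Copying the mailbox data itself is out of scope for this sketch.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE account_map SET backend = ? WHERE username = ?",
            (new_backend, username),
        )
    return cur.rowcount == 1


def affected_fraction(conn, backend):
    """Fraction of all users that lose access if `backend` goes offline."""
    total = conn.execute("SELECT COUNT(*) FROM account_map").fetchone()[0]
    on_backend = conn.execute(
        "SELECT COUNT(*) FROM account_map WHERE backend = ?", (backend,)
    ).fetchone()[0]
    return on_backend / total if total else 0.0
```

Because clients only ever talk to the proxy tier, changing a row in the map is enough to redirect a user's next connection, which is the property the post relies on when it talks about moving accounts without affecting end users.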
wordName wordTfidf (topN-words)
[('perdition', 0.54), ('mail', 0.231), ('backend', 0.197), ('accounts', 0.188), ('whereby', 0.172), ('osterman', 0.165), ('users', 0.163), ('affects', 0.16), ('pickup', 0.156), ('erik', 0.143), ('suggestion', 0.138), ('servers', 0.137), ('admins', 0.136), ('corruption', 0.134), ('proxies', 0.126), ('affecting', 0.125), ('networked', 0.123), ('reaches', 0.121), ('speaking', 0.121), ('brief', 0.116), ('ha', 0.115), ('dependency', 0.113), ('fraction', 0.113), ('assigned', 0.109), ('offline', 0.105), ('liked', 0.105), ('reverse', 0.096), ('less', 0.094), ('cluster', 0.092), ('brought', 0.092), ('sharded', 0.092), ('appropriate', 0.091), ('approach', 0.09), ('drop', 0.088), ('manages', 0.087), ('connect', 0.085), ('end', 0.085), ('benefit', 0.084), ('chance', 0.081), ('spread', 0.077), ('server', 0.077), ('routing', 0.074), ('anyone', 0.068), ('individual', 0.067), ('develop', 0.067), ('balancing', 0.064), ('consistency', 0.063), ('load', 0.062), ('thought', 0.06), ('strategy', 0.059)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 81 high scalability-2007-09-06-Scaling IMAP and POP3
Introduction: Another scalability strategy brought to you by Erik Osterman: Just thought I'd drop a brief suggestion to anyone building a large mail system. Our solution for scaling mail pickup was to develop a sharded architecture whereby accounts are spread across a cluster of servers, each with imap/pop3 capability. Then we use a cluster of reverse proxies (Perdition) speaking to the backend imap/pop3 servers. The benefit of this approach is that you can simply use round-robin or HA load balancing on the Perdition servers that end users connect to (e.g. admins can easily move accounts around on the backend storage servers without affecting end users). Perdition manages routing users to the appropriate backend servers and has MySQL support. What we also liked about this approach was that it had no dependency on a distributed or networked file system, so less chance of corruption or data consistency issues. When an individual server reaches capacity, we just offload users to a less u
2 0.95607495 57 high scalability-2007-08-03-Scaling IMAP and POP3
Introduction: Just thought I'd drop a brief suggestion to anyone building a large mail system. Our solution for scaling mail pickup was to develop a sharded architecture whereby accounts are spread across a cluster of servers, each with imap/pop3 capability. Then we use a cluster of reverse proxies (Perdition) speaking to the backend imap/pop3 servers. The benefit of this approach is that you can simply use round-robin or HA load balancing on the Perdition servers that end users connect to (e.g. admins can easily move accounts around on the backend storage servers without affecting end users). Perdition manages routing users to the appropriate backend servers and has MySQL support. What we also liked about this approach was that it had no dependency on a distributed or networked filesystem, so less chance of corruption or data consistency issues. When an individual server reaches capacity, we just offload users to a less-used server. If any server goes offline, it only affects the fraction of users
3 0.22845024 80 high scalability-2007-09-06-Product: Perdition Mail Retrieval Proxy
Introduction: Perdition is a fully featured POP3 and IMAP4 proxy server. It is able to handle both SSL and non-SSL connections and redirect users to a real server based on a database lookup. Perdition supports modular database access. ODBC, MySQL, PostgreSQL, GDBM, POSIX Regular Expression and NIS modules ship with the distribution. The API for modules is open, allowing arbitrary modules to be written to access any data store. Perdition has many uses, including creating large mail systems where an end user's mailbox may be stored on one of several hosts, integrating different mail systems together, migrating between different email infrastructures, and bridging plain-text, SSL and TLS services. It can also be used as part of a firewall. The use of Perdition to scale mail services beyond a single box is discussed in high capacity email.
4 0.13832207 1501 high scalability-2013-08-13-In Memoriam: Lavabit Architecture - Creating a Scalable Email Service
Introduction: With Lavabit shutting down under murky circumstances, it seems fitting to repost an old (2009), yet still very good post by Ladar Levison on Lavabit's architecture. I don't know how much of this information is still current, but it should give you a general idea what Lavabit was all about. Getting to Know You What is the name of your system and where can we find out more about it? Note: these links are no longer valid... Lavabit http://lavabit.com http://lavabit.com/network.html http://lavabit.com/about.html What is your system for? Lavabit is a mid-sized email service provider. We currently have about 140,000 registered users with more than 260,000 email addresses. While most of our accounts belong to individual users, we also provide corporate email services to approximately 70 companies. Why did you decide to build this system? We built the system to compete against the other large free email providers, with an emphasis on serving the privacy c
5 0.11199915 233 high scalability-2008-01-30-How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data
Introduction: How do you query hundreds of gigabytes of new data each day streaming in from over 600 hyperactive servers? If you think this sounds like the perfect battle ground for a head-to-head skirmish in the great MapReduce Versus Database War, you would be correct. Bill Boebel, CTO of Mailtrust (Rackspace's mail division), has generously provided a fascinating account of how they evolved their log processing system from an early amoeba'ic text file stored on each machine approach, to a Neandertholic relational database solution that just couldn't compete, and finally to a Homo sapien'ic Hadoop based solution that works wisely for them and has virtually unlimited scalability potential. Rackspace faced a now familiar problem. Lots and lots of data streaming in. Where do you store all that data? How do you do anything useful with it? In the first version of their system logs were stored in flat text files and had to be manually searched by engineers logging into each individual machine. T
6 0.10161509 511 high scalability-2009-02-12-MySpace Architecture
7 0.099386156 830 high scalability-2010-05-25-Strategy: Rule of 3 Admins to Save Your Sanity
8 0.095154174 59 high scalability-2007-08-04-Try Squid as a Reverse Proxy
9 0.080948517 1010 high scalability-2011-03-24-Strategy: Disk Backup for Speed, Tape Backup to Save Your Bacon, Just Ask Google
10 0.079945534 297 high scalability-2008-04-05-Skype Plans for PostgreSQL to Scale to 1 Billion Users
11 0.079345971 30 high scalability-2007-07-26-Product: AWStats a Log Analyzer
12 0.074731074 70 high scalability-2007-08-22-How many machines do you need to run your site?
13 0.074403331 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App
14 0.07416337 176 high scalability-2007-12-07-Synchronizing databases in different geographic locations
15 0.071640767 1424 high scalability-2013-03-15-Stuff The Internet Says On Scalability For March 15, 2013
16 0.071259119 253 high scalability-2008-02-19-Building a email communication system
17 0.070964187 117 high scalability-2007-10-08-Paper: Understanding and Building High Availability-Load Balanced Clusters
18 0.070118017 259 high scalability-2008-02-25-Any Suggestions for the Architecture Template?
19 0.070118017 260 high scalability-2008-02-25-Architecture Template Advice Needed
20 0.069023766 448 high scalability-2008-11-22-Google Architecture
topicId topicWeight
[(0, 0.119), (1, 0.047), (2, -0.015), (3, -0.105), (4, 0.009), (5, -0.007), (6, 0.025), (7, -0.068), (8, 0.005), (9, 0.052), (10, -0.005), (11, 0.059), (12, -0.003), (13, -0.011), (14, 0.037), (15, 0.09), (16, 0.03), (17, 0.059), (18, -0.044), (19, 0.068), (20, 0.092), (21, -0.019), (22, -0.138), (23, -0.065), (24, 0.027), (25, 0.029), (26, 0.064), (27, 0.009), (28, -0.061), (29, -0.046), (30, -0.001), (31, -0.023), (32, -0.141), (33, -0.108), (34, -0.027), (35, -0.005), (36, -0.043), (37, -0.011), (38, 0.059), (39, 0.021), (40, 0.098), (41, 0.028), (42, 0.015), (43, -0.063), (44, -0.054), (45, -0.027), (46, 0.039), (47, -0.031), (48, -0.019), (49, 0.006)]
simIndex simValue blogId blogTitle
1 0.95947164 57 high scalability-2007-08-03-Scaling IMAP and POP3
Introduction: Just thought I'd drop a brief suggestion to anyone building a large mail system. Our solution for scaling mail pickup was to develop a sharded architecture whereby accounts are spread across a cluster of servers, each with imap/pop3 capability. Then we use a cluster of reverse proxies (Perdition) speaking to the backend imap/pop3 servers. The benefit of this approach is that you can simply use round-robin or HA load balancing on the Perdition servers that end users connect to (e.g. admins can easily move accounts around on the backend storage servers without affecting end users). Perdition manages routing users to the appropriate backend servers and has MySQL support. What we also liked about this approach was that it had no dependency on a distributed or networked filesystem, so less chance of corruption or data consistency issues. When an individual server reaches capacity, we just offload users to a less-used server. If any server goes offline, it only affects the fraction of users
same-blog 2 0.94378549 81 high scalability-2007-09-06-Scaling IMAP and POP3
Introduction: Another scalability strategy brought to you by Erik Osterman: Just thought I'd drop a brief suggestion to anyone building a large mail system. Our solution for scaling mail pickup was to develop a sharded architecture whereby accounts are spread across a cluster of servers, each with imap/pop3 capability. Then we use a cluster of reverse proxies (Perdition) speaking to the backend imap/pop3 servers. The benefit of this approach is that you can simply use round-robin or HA load balancing on the Perdition servers that end users connect to (e.g. admins can easily move accounts around on the backend storage servers without affecting end users). Perdition manages routing users to the appropriate backend servers and has MySQL support. What we also liked about this approach was that it had no dependency on a distributed or networked file system, so less chance of corruption or data consistency issues. When an individual server reaches capacity, we just offload users to a less u
3 0.72698945 80 high scalability-2007-09-06-Product: Perdition Mail Retrieval Proxy
Introduction: Perdition is a fully featured POP3 and IMAP4 proxy server. It is able to handle both SSL and non-SSL connections and redirect users to a real server based on a database lookup. Perdition supports modular database access. ODBC, MySQL, PostgreSQL, GDBM, POSIX Regular Expression and NIS modules ship with the distribution. The API for modules is open, allowing arbitrary modules to be written to access any data store. Perdition has many uses, including creating large mail systems where an end user's mailbox may be stored on one of several hosts, integrating different mail systems together, migrating between different email infrastructures, and bridging plain-text, SSL and TLS services. It can also be used as part of a firewall. The use of Perdition to scale mail services beyond a single box is discussed in high capacity email.
Introduction: In 1999 Dan Kegel issued a big hairy audacious challenge to web servers: It's time for web servers to handle ten thousand clients simultaneously, don't you think? After all, the web is a big place now. This became known as the C10K problem. Engineers solved the C10K scalability problems by fixing OS kernels and moving away from threaded servers like Apache to event-driven servers like Nginx and Node. Today we are considering an even bigger goal, how to support 10 Million Concurrent Connections, which requires even more radical techniques. No similar challenge was issued for managing servers in a datacenter, but according to Dave Neary from Red Hat, in a recent FLOSS Weekly episode, we have passed the 10K barrier for server management with 10,000 or more servers managed per sysadmin. Should we let this milestone pass without mention? Absolutely not! It’s a stunning accomplishment with 200x-2000x increases in productivity. Dave said h
5 0.62181884 63 high scalability-2007-08-09-Lots of questions for high scalability - high availability
Introduction: Hey, I have a website that I would like to scale. Right now we have 10 servers, but this does not scale well. I know how to deal with my Apache web servers but have problems with the SQL servers. I would like to use a "scale out" system and add servers when we need them. We have over 100 GB of data for MySQL and we tried to have around 20 GB per server. It works well, except that if a server goes down then 1/5 of the users can't access the website. We could use replication, but we would need to at least double the SQL servers to replicate each server. And in the future that might not be enough; we might need 3 slaves per master ... well, I don't really like this idea. I would prefer to have 8 servers that all deal with data from the 5 servers we have right now, and then we could add new servers when we need them. I looked at NFS but that does not seem to be a good idea for SQL servers. Can you confirm?
7 0.59062481 308 high scalability-2008-04-22-Simple NFS failover solution with symbolic link?
8 0.58868897 253 high scalability-2008-02-19-Building a email communication system
9 0.57630163 140 high scalability-2007-11-02-How WordPress.com Tracks 300 Servers Handling 10 Million Pageviews
10 0.57350349 473 high scalability-2008-12-20-Second Life Architecture - The Grid
11 0.57281929 21 high scalability-2007-07-23-GoogleTalk Architecture
12 0.56265843 42 high scalability-2007-07-30-Product: GridLayer. Utility computing for online application
13 0.55734515 297 high scalability-2008-04-05-Skype Plans for PostgreSQL to Scale to 1 Billion Users
14 0.55232346 1593 high scalability-2014-02-10-13 Simple Tricks for Scaling Python and Django with Apache from HackerEarth
15 0.55133915 68 high scalability-2007-08-20-TypePad Architecture
16 0.550053 516 high scalability-2009-02-19-Heavy upload server scalability
17 0.54006696 389 high scalability-2008-09-23-How to Scale with Ruby on Rails
18 0.5358901 251 high scalability-2008-02-18-How to deal with an I-O bottleneck to disk?
19 0.52922195 176 high scalability-2007-12-07-Synchronizing databases in different geographic locations
20 0.52872109 70 high scalability-2007-08-22-How many machines do you need to run your site?
topicId topicWeight
[(1, 0.169), (2, 0.207), (47, 0.446), (61, 0.032), (79, 0.018)]
simIndex simValue blogId blogTitle
1 0.90352595 57 high scalability-2007-08-03-Scaling IMAP and POP3
Introduction: Just thought I'd drop a brief suggestion to anyone building a large mail system. Our solution for scaling mail pickup was to develop a sharded architecture whereby accounts are spread across a cluster of servers, each with imap/pop3 capability. Then we use a cluster of reverse proxies (Perdition) speaking to the backend imap/pop3 servers. The benefit of this approach is that you can simply use round-robin or HA load balancing on the Perdition servers that end users connect to (e.g. admins can easily move accounts around on the backend storage servers without affecting end users). Perdition manages routing users to the appropriate backend servers and has MySQL support. What we also liked about this approach was that it had no dependency on a distributed or networked filesystem, so less chance of corruption or data consistency issues. When an individual server reaches capacity, we just offload users to a less-used server. If any server goes offline, it only affects the fraction of users
same-blog 2 0.87129635 81 high scalability-2007-09-06-Scaling IMAP and POP3
Introduction: Another scalability strategy brought to you by Erik Osterman: Just thought I'd drop a brief suggestion to anyone building a large mail system. Our solution for scaling mail pickup was to develop a sharded architecture whereby accounts are spread across a cluster of servers, each with imap/pop3 capability. Then we use a cluster of reverse proxies (Perdition) speaking to the backend imap/pop3 servers. The benefit of this approach is that you can simply use round-robin or HA load balancing on the Perdition servers that end users connect to (e.g. admins can easily move accounts around on the backend storage servers without affecting end users). Perdition manages routing users to the appropriate backend servers and has MySQL support. What we also liked about this approach was that it had no dependency on a distributed or networked file system, so less chance of corruption or data consistency issues. When an individual server reaches capacity, we just offload users to a less u
3 0.84005398 163 high scalability-2007-11-21-n-phase commit for FS writes, reads stay local
Introduction: I am trying to find a Linux FS that will allow me to replicate all writes synchronously to n nodes in a web server cluster, while keeping all reads local. It should not require specialized hardware.
4 0.83342659 94 high scalability-2007-09-17-Blog: Adding Simplicity by Dan Pritchett
Introduction: Dan has genuine insight into building software and large scale scalable systems in particular. You'll always learn something interesting reading his blog. A Quick Hit of What's Inside: Inverting the Reliability Stack, In Support of Non-Stop Software, Chaotic Perspectives, Latency Exists, Cope!, A Real eBay Architect Analyzes Part 3, Avoiding Two Phase Commit, Redux. Site: http://www.addsimplicity.com/
5 0.80557007 756 high scalability-2009-12-30-Terrastore - Scalable, elastic, consistent document store.
Introduction: Terrastore is a new-born document store which provides advanced scalability and elasticity features without sacrificing consistency. Here are a few highlights: Ubiquitous: based on the universally supported HTTP protocol. Distributed: nodes can run and live everywhere on your network. Elastic: you can add and remove nodes dynamically to/from your running cluster with no downtime and no changes at all to your configuration. Scalable at the data layer: documents are partitioned and distributed among your nodes, with automatic and transparent re-balancing when nodes join and leave. Scalable at the computational layer: query and update operations are distributed to the nodes which actually hold the queried/updated data, minimizing network traffic and spreading computational load. Consistent: providing per-document consistency, you're guaranteed to always get the latest value of a single document, with read committed isolation for concurrent modifications. Schemales
6 0.79797971 760 high scalability-2010-01-13-10 Hot Scalability Links for January 13, 2010
7 0.78052485 708 high scalability-2009-09-17-Infinispan narrows the gap between open source and commercial data caches
8 0.7737143 852 high scalability-2010-07-07-Strategy: Recompute Instead of Remember Big Data
9 0.75034767 1326 high scalability-2012-09-20-How Vimeo Saves 50% on EC2 by Playing a Smarter Game
10 0.74738586 676 high scalability-2009-08-08-Yahoo!'s PNUTS Database: Too Hot, Too Cold or Just Right?
11 0.72573346 1166 high scalability-2011-12-30-Stuff The Internet Says On Scalability For December 30, 2011
12 0.7177214 550 high scalability-2009-03-30-Ebay history and architecture
13 0.70487273 144 high scalability-2007-11-07-What CDN would you recommend?
14 0.69020039 1062 high scalability-2011-06-15-101 Questions to Ask When Considering a NoSQL Database
15 0.66996175 1054 high scalability-2011-06-06-NoSQL Pain? Learn How to Read-write Scale Without a Complete Re-write
16 0.66607463 1530 high scalability-2013-10-11-Stuff The Internet Says On Scalability For October 11th, 2013
17 0.65202636 276 high scalability-2008-03-15-New Website Design Considerations
18 0.60897136 1440 high scalability-2013-04-15-Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years
19 0.59958655 461 high scalability-2008-12-05-Sprinkle - Provisioning Tool to Build Remote Servers
20 0.59831375 683 high scalability-2009-08-18-Hardware Architecture Example (geographical level mapping of servers)