high_scalability high_scalability-2008 high_scalability-2008-207 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: In a recent project, I utilized RoR's cookie-based session storage to shard geographically distinct user groups. My technique for doing so was unique and, although it was a premature optimization, it is nonetheless an idea worth exploring.
sentIndex sentText sentNum sentScore
1 In a recent project, I utilized RoR's cookie-based session storage to shard geographically distinct user groups. [sent-1, score-1.692]
2 My technique for doing so was unique and, although it was a premature optimization, it is nonetheless an idea worth exploring. [sent-2, score-1.293]
wordName wordTfidf (topN-words)
[('premature', 0.348), ('utilized', 0.348), ('ror', 0.344), ('distinct', 0.298), ('geographically', 0.276), ('exploring', 0.272), ('technique', 0.254), ('although', 0.246), ('optimization', 0.221), ('shard', 0.215), ('recent', 0.196), ('session', 0.183), ('worth', 0.168), ('unique', 0.162), ('project', 0.158), ('idea', 0.115), ('user', 0.093), ('storage', 0.083)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 207 high scalability-2008-01-10-Sharding with Cookie-Based Session Storage
Introduction: In a recent project, I utilized RoR's cookie-based session storage to shard geographically distinct user groups. My technique for doing so was unique and, although it was a premature optimization, it is nonetheless an idea worth exploring.
Introduction: Michael Stonebraker sure knows how to stir up a storm. Unlike for others, that doesn't make him a troll in my mind, he's way too accomplished in the field to be that, but he does have a bit of Barnum & Bailey in him, which serves to get the discussion flowing, and that's a good thing. A lot of previously hidden wisdom and passion unlocks, which we'll try to capture here. This disturbance in the force is over OldSQL vs NoSQL vs NewSQL. Warning, these are not crisp categories, there's leakage all over the place, watch your step: OldSQL (Oracle, MySQL, etc) refers to what some want to term legacy relational databases, like MySQL, that don't scale out horizontally with aplomb. NoSQL (CouchDB, Redis, Cassandra, HBase, MongoDB, Riak, Neo4j, etc) refers to, well, a collection of technologies that aren't OldSQL, these often are designed to scale out horizontally, aren't on ACID, and use schemaless non-relational datamodels. NewSQL (Xeround, Clustrix, NimbusDB, GenieDB, Sc
3 0.11778684 240 high scalability-2008-02-05-Handling of Session for a site running from more than 1 data center
Introduction: If using a DB to store sessions (used by some app servers, e.g. WebSphere), how would an enterprise-class site that is housed in 2 different data centers (that are live/live) maintain the session between both data centers? The problem as I see it is that since each data center has its own session database, if I were to flip the users to only access Data Center 1 (by changing the DNS records for the site or some other load-balancing technique), then that would cause all previous Data Center 2 users to lose their sessions. What would be some pure hardware-based solutions to this that are being used now? That way the applications supporting the web site can be abstracted from this. As I see it now, a solution is to possibly have the session databases in both centers somehow replicate the data to each other. I just don't see the best way to accomplish this, since you are not even guaranteed that the session IDs will be unique, as there are 2 different application server tiers (again WebSphere)
4 0.10999719 97 high scalability-2007-09-18-Session management in highly scalable web sites
Introduction: Hi, Every application server has its own session management implementation for supporting high scalability. But an application architect/developer has to design and implement the application to make the best use of it. What are the guiding principles and patterns for session state management? The WebSphere System Management redbook mentions that "Session management performance is optimum when session data per user is around 2Kb. It degrades if session data is more than that". I have the following questions. 1. How do you measure session data per user? 2. It is generally recommended that you keep all the session state in the database and keep only the keys in the HttpSession object. Then every time a web request is processed, session data is fetched from the database. This way all the data remains in memory only until the request is processed, and the actual data in HttpSession is very small (only a few keys). What is the general practice? At what point you should be switching fr
5 0.10963133 672 high scalability-2009-08-06-An Unorthodox Approach to Database Design : The Coming of the Shard
Introduction: Update 4: Why you don’t want to shard. by Morgon on the MySQL Performance Blog. Optimize everything else first, and then if performance still isn’t good enough, it’s time to take a very bitter medicine. Update 3: Building Scalable Databases: Pros and Cons of Various Database Sharding Schemes by Dare Obasanjo. Excellent discussion of why and when you would choose a sharding architecture, how to shard, and problems with sharding. Update 2: Mr. Moore gets to punt on sharding by Alan Rimm-Kaufman of 37signals. Insightful article on design tradeoffs and the evils of premature optimization. With more memory, more CPU, and new tech like SSD, problems can be avoided before more exotic architectures like sharding are needed. Add features not infrastructure. Jeremy Zawodny says he's wrong wrong wrong. We're running multi-core CPUs at slower clock speeds. Moore won't save you. Update: Dan Pritchett shares some excellent Sharding Lessons: Size Your Shards, Use Math on Shard C
6 0.10305678 185 high scalability-2007-12-13-Is premature scalation a real disease?
7 0.10255577 476 high scalability-2008-12-28-How to Organize a Database Table’s Keys for Scalability
8 0.098750077 152 high scalability-2007-11-13-Flickr Architecture
9 0.093844749 304 high scalability-2008-04-19-How to build a real-time analytics system?
10 0.091864213 194 high scalability-2007-12-26-Golden rule of web caching
11 0.08833269 323 high scalability-2008-05-19-Twitter as a scalability case study
12 0.08448305 358 high scalability-2008-07-26-Sharding the Hibernate Way
13 0.082062304 345 high scalability-2008-06-11-Pyshards aspires to build sharding toolkit for Python
14 0.081938013 845 high scalability-2010-06-22-Exploring the software behind Facebook, the world’s largest site
15 0.081283458 555 high scalability-2009-04-04-Performance Anti-Pattern
16 0.079730324 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?
17 0.079301126 1126 high scalability-2011-09-27-Use Instance Caches to Save Money: Latency == $$$
18 0.075490139 472 high scalability-2008-12-19-How to measure memory required for a user session
19 0.075330049 203 high scalability-2008-01-07-How Ruby on Rails Survived a 550k Pageview Digging
20 0.069458902 701 high scalability-2009-09-10-When optimizing - don't forget the Java Virtual Machine (JVM)
topicId topicWeight
[(0, 0.05), (1, 0.028), (2, -0.009), (3, -0.031), (4, -0.009), (5, 0.018), (6, -0.016), (7, -0.022), (8, 0.017), (9, 0.017), (10, 0.006), (11, 0.029), (12, -0.032), (13, 0.044), (14, -0.01), (15, 0.021), (16, 0.001), (17, 0.003), (18, 0.012), (19, 0.032), (20, 0.016), (21, -0.003), (22, -0.024), (23, -0.001), (24, -0.055), (25, -0.016), (26, -0.013), (27, -0.082), (28, -0.037), (29, 0.061), (30, 0.003), (31, 0.019), (32, 0.037), (33, 0.003), (34, 0.031), (35, -0.026), (36, 0.045), (37, 0.016), (38, -0.037), (39, -0.027), (40, -0.033), (41, 0.13), (42, 0.033), (43, -0.004), (44, -0.008), (45, -0.01), (46, -0.054), (47, -0.023), (48, 0.015), (49, -0.018)]
simIndex simValue blogId blogTitle
same-blog 1 0.97965336 207 high scalability-2008-01-10-Sharding with Cookie-Based Session Storage
Introduction: In a recent project, I utilized RoR's cookie-based session storage to shard geographically distinct user groups. My technique for doing so was unique and, although it was a premature optimization, it is nonetheless an idea worth exploring.
2 0.74598336 24 high scalability-2007-07-24-Product: Hibernate Shards
Introduction: If you want to adopt a shard architecture, but don't want to start from scratch, you may want to consider Hibernate's sharding system. Hibernate Shards is a framework that is designed to encapsulate and minimize this complexity by adding support for horizontal partitioning to Hibernate Core. Hibernate Shards key features: Standard Hibernate programming model - Hibernate Shards allows you to continue using the Hibernate APIs you know and love: SessionFactory, Session, Criteria, Query. If you already know how to use Hibernate, you already know how to use Hibernate Shards. Flexible sharding strategies - Distribute data across your shards any way you want. Use one of the default strategies we provide or plug in your own application-specific logic. Support for virtual shards - Think your sharding strategy is never going to change? Think again. Adding new shards and redistributing your data is one of the toughest operational challenges you will face once you've deployed your
3 0.68828624 358 high scalability-2008-07-26-Sharding the Hibernate Way
Introduction: Update: A very nice JavaWorld podcast interview with Google engineer Max Ross on Hibernate Shards. Max defines Hibernate Shards (horizontal partitioning), how it works (pretty well), virtual shards (don't ask), what they need to do in the future (query, replication, operational tools), and how it relates to Google AppEngine (not much). To scale you are supposed to partition your data. Sounds good, but how do you do it? When you actually sit down to work out all the details it’s not that easy. Hibernate Shards to the rescue! Hibernate Shards is: an extension to the core Hibernate product that adds facilities for horizontal partitioning. If you know the core Hibernate API you know the shards API. No learning curve at all. Here is what a few members of the core group had to say about the Hibernate Shards open source project. Although there are some limitations, from the sound of it they are doing useful stuff in the right way and it’s very much worth looking at, especially if you us
4 0.62948191 476 high scalability-2008-12-28-How to Organize a Database Table’s Keys for Scalability
Introduction: The key (no pun intended) to understanding how to organize your dataset’s data is to think of each shard not as an individual database, but as part of one large singular database. Just as in a normal single-server database setup where you have a unique key for each row within a table, each row key within each individual shard must be unique across the whole dataset partitioned across all shards. There are a few different ways we can accomplish uniqueness of row keys across a shard cluster. Each has its pros and cons, and the one chosen should be specific to the problems you’re trying to solve.
5 0.55923319 345 high scalability-2008-06-11-Pyshards aspires to build sharding toolkit for Python
Introduction: I've been interested in sharding concepts since first hearing the term "shard" a few years back. My interest had been piqued earlier, the first time I read about Google's original approach to distributed search. It was described as a hashtable-like system in which independent physical machines play the role of the buckets. More recently, I needed the capacity and performance of a sharded system, but did not find helpful libraries or toolkits which would assist with the configuration for my language of preference these days, which is Python. And, since I had a few weeks on my hands, I decided I would begin the work of creating these tools. The result of my initial work is the Pyshards project, a still-incomplete Python and MySQL based horizontal partitioning and sharding toolkit. HighScalability.com readers will already know that horizontal partitioning is a data segmenting pattern in which distinct groups of physical row-based datasets are distributed across multiple partitions. Whe
6 0.54430372 672 high scalability-2009-08-06-An Unorthodox Approach to Database Design : The Coming of the Shard
7 0.53669375 152 high scalability-2007-11-13-Flickr Architecture
8 0.52773136 546 high scalability-2009-03-20-Alternate strategy for database sharding
9 0.49208602 1514 high scalability-2013-09-09-Need Help with Database Scalability? Understand I-O
10 0.48924395 775 high scalability-2010-02-10-ElasticSearch - Open Source, Distributed, RESTful Search Engine
11 0.47621265 847 high scalability-2010-06-23-Product: dbShards - Share Nothing. Shard Everything.
12 0.46796152 857 high scalability-2010-07-13-DbShards Part Deux - The Internals
13 0.46103138 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?
14 0.45903778 383 high scalability-2008-09-10-Shard servers -- go big or small?
15 0.45035094 472 high scalability-2008-12-19-How to measure memory required for a user session
16 0.44167945 561 high scalability-2009-04-08-N+1+caching is ok?
17 0.43764567 97 high scalability-2007-09-18-Session management in highly scalable web sites
18 0.42393336 1046 high scalability-2011-05-23-Evernote Architecture - 9 Million Users and 150 Million Requests a Day
19 0.41190681 583 high scalability-2009-04-26-Scale-up vs. Scale-out: A Case Study by IBM using Nutch-Lucene
20 0.40054238 482 high scalability-2009-01-04-Alternative Memcache Usage: A Highly Scalable, Highly Available, In-Memory Shard Index
topicId topicWeight
[(1, 0.099), (2, 0.186), (31, 0.431), (61, 0.091)]
simIndex simValue blogId blogTitle
1 0.94553626 62 high scalability-2007-08-08-Partial String Matching
Introduction: Is there any alternative to LIKE '%...%' OR LIKE '%...%' in MySQL if you have to offer partial string matching on a large dataset?
same-blog 2 0.88659942 207 high scalability-2008-01-10-Sharding with Cookie-Based Session Storage
Introduction: In a recent project, I utilized RoR's cookie-based session storage to shard geographically distinct user groups. My technique for doing so was unique and, although it was a premature optimization, it is nonetheless an idea worth exploring.
3 0.79901171 615 high scalability-2009-06-01-HotPads on AWS
Introduction: HotPads abandoned our managed hosting in December and took the leap over to EC2 and its siblings. The presentation has a lot of detail on costs and other things to watch out for, so if you're currently planning your "cloud" architecture, you'll find some of this really helpful.
4 0.67360663 1651 high scalability-2014-05-20-It's Networking. In Space! Or How E.T. Will Phone Home.
Introduction: What will the version of the Internet that follows us to the stars look like? Yes, people are really thinking seriously about this sort of thing. Specifically the InterPlanetary Networking Special Interest Group (IPNSIG). Ansible-like faster-than-light communication it isn't. There's no magical warp drive. Nor is it a network of telepaths acting as a 'verse-spanning telegraph system. It's more mundane than that. And in many ways more interesting, as it's sort of like the old Internet on steroids, the one that was based on UUCP and dial-up connections, but over vastly longer distances and with much longer delays: The Interplanetary Internet (based on IPN, also called InterPlaNet) is a conceived computer network in space, consisting of a set of network nodes which can communicate with each other.[1][2] Communication would be greatly delayed by the great interplanetary distances, so the IPN needs a new set of protocols and technology that are tolerant to large delays and
5 0.63164872 368 high scalability-2008-08-17-Wuala - P2P Online Storage Cloud
Introduction: How do you design a reliable distributed file system when the expected availability of the individual nodes is only ~1/5? That is the case for P2P systems. Dominik Grolimund, the founder of the Swiss startup Caleido, will show you how! They have launched Wuala, the social online storage service which scales as new nodes join the P2P network. The goal of Wua.la is to provide distributed online storage that is large, scalable, reliable, and secure, by harnessing the idle resources of participating computers. This challenge is an old dream of computer science. In fact, as Andrew Tanenbaum wrote in 1995: "The design of a world-wide, fully transparent distributed filesystem for simultaneous use by millions of mobile and frequently disconnected users is left as an exercise for the reader" After three years of research and development at ETH Zurich, the Swiss Federal Institute of Technology, on a distributed storage system, Caleido is ready to unveil the resu
6 0.62064993 892 high scalability-2010-09-02-Distributed Hashing Algorithms by Example: Consistent Hashing
7 0.56270909 785 high scalability-2010-02-26-MySQL and Memcached: End of an Era?
8 0.55929214 1255 high scalability-2012-06-01-Stuff The Internet Says On Scalability For June 1, 2012
9 0.54499787 702 high scalability-2009-09-11-The interactive cloud
10 0.53215301 294 high scalability-2008-04-01-How to update video views count effectively?
11 0.46411449 877 high scalability-2010-08-12-Designing Web Applications for Scalability
12 0.45861876 685 high scalability-2009-08-20-Dependency Injection and AOP frameworks for .NET
13 0.45843044 754 high scalability-2009-12-22-Incremental deployment
14 0.45777887 887 high scalability-2010-08-24-Sponsored Post: deviantART, Okta, EzRez, Cloud Sigma, ManageEngine, Site24x7
15 0.45698258 568 high scalability-2009-04-14-Designing a Scalable Twitter
16 0.45596629 64 high scalability-2007-08-10-How do we make a large real-time search engine?
17 0.45190173 86 high scalability-2007-09-09-Clustering Solution
18 0.45174107 379 high scalability-2008-09-04-Database question for upcoming project
19 0.45161945 240 high scalability-2008-02-05-Handling of Session for a site running from more than 1 data center
20 0.45108116 1135 high scalability-2011-10-31-15 Ways to Make Your Application Feel More Responsive under Google App Engine