high_scalability high_scalability-2008 high_scalability-2008-359 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Ehcache is a pure Java cache with the following features: fast, simple, small footprint, minimal dependencies, provides memory and disk stores for scalability into gigabytes, scalable to hundreds of caches, is a pluggable cache for Hibernate, tuned for high concurrent load on large multi-cpu servers, provides LRU, LFU and FIFO cache eviction policies, and is production tested. Ehcache is used by LinkedIn to cache member profiles. The user guide says it's possible to get a 2.5 times system speedup for persistent Object Relational Caching, a 1000 times system speedup for Web Page Caching, and a 1.6 times system speedup for Web Page Fragment Caching. From the website: Introduction Ehcache is a cache library. Before getting into ehcache, it is worth stepping back and thinking about caching generally. About Caches Wiktionary defines a cache as "a store of things that will be required in future, and can be retrieved rapidly". That is the nub of it. In computer science terms, a cac
sentIndex sentText sentNum sentScore
1 The user guide says it's possible to get a 2.5 times system speedup for persistent Object Relational Caching, a 1000 times system speedup for Web Page Caching, and a 1.6 times system speedup for Web Page Fragment Caching. [sent-4, score-0.456]
2 Before getting into ehcache, it is worth stepping back and thinking about caching generally. [sent-7, score-0.366]
3 In computer science terms, a cache is a collection of temporary data which either duplicates data located elsewhere or is the result of a computation. [sent-10, score-0.383]
4 Why caching works Locality of Reference While ehcache concerns itself with Java objects, caching is used throughout computing, from CPU caches to the DNS system. [sent-12, score-1.097]
5 While there is a small list of popular items, there is a long tail of less popular ones. [sent-18, score-0.278]
6 The medium answer is that it often depends on whether it is CPU bound or I/O bound. [sent-26, score-0.743]
7 If an application is I/O bound, then the time taken to complete a computation depends principally on the rate at which data can be obtained. [sent-27, score-0.784]
8 If it is CPU bound, then the time taken principally depends on the speed of the CPU and main memory. [sent-28, score-0.416]
9 While the focus for caching is on improving performance, it is also worth realizing that it reduces load. [sent-29, score-0.437]
10 Speeding up CPU bound Applications CPU bound applications are often sped up by: * improving algorithm performance * parallelizing the computations across multiple CPUs (SMP) or multiple machines (Clusters). [sent-32, score-1.127]
11 An example from ehcache would be large web pages that have a high rendering cost. [sent-35, score-0.349]
12 Another is the caching of authentication status, where authentication requires cryptographic transforms. [sent-36, score-0.464]
13 Speeding up I/O bound Applications Many applications are I/O bound, either by disk or network operations. [sent-37, score-0.376]
14 Hard disks are sped up by their own caching of blocks in memory. [sent-41, score-0.306]
15 Network operations can be bound by a number of factors: * time to set up and tear down connections * latency, or the minimum round trip time * throughput limits * marshalling and unmarshalling overhead. The caching of data can often help a lot with I/O bound applications. [sent-42, score-1.293]
16 Some examples of ehcache uses are: * Data Access Object caching for Hibernate * Web page caching, for pages generated from databases. [sent-43, score-0.655]
17 In this case, caching may be able to reduce the workload required. [sent-47, score-0.306]
18 If caching can cause 90 of that 100 to be cache hits and not even get to the database, then the database can scale 10 times higher than otherwise. [sent-48, score-0.622]
19 Therefore the speed up mostly depends on how much reuse a piece of data gets. [sent-51, score-0.427]
20 In a system where data is reused a lot, the speed up is large. [sent-53, score-0.321]
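The arithmetic behind sentences 18 to 20 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the 1 ms cache time and 100 ms source time are made-up figures, not numbers from the Ehcache user guide.

```python
def database_scale_factor(requests: int, hits: int) -> float:
    """Return how many times more traffic the same database could absorb.

    Only cache misses reach the database, so the factor is requests / misses.
    """
    misses = requests - hits
    if requests <= 0 or hits < 0 or misses <= 0:
        raise ValueError("need requests > 0 and 0 <= hits < requests")
    return requests / misses


def average_access_ms(requests: int, hits: int,
                      cache_ms: float, source_ms: float) -> float:
    """Mean time per request when hits cost cache_ms and misses cost source_ms."""
    misses = requests - hits
    return (hits * cache_ms + misses * source_ms) / requests


if __name__ == "__main__":
    # 90 of 100 requests served from the cache, as in sentence 18.
    print(database_scale_factor(100, 90))
    # Assumed timings: 1 ms for a cache hit, 100 ms for the original source.
    print(average_access_ms(100, 90, cache_ms=1.0, source_ms=100.0))
    print(average_access_ms(100, 10, cache_ms=1.0, source_ms=100.0))
```

With 90 hits out of 100, the database sees a tenth of the traffic, and under the assumed timings the mean access time falls from 100 ms to 10.9 ms. With only 10 hits it stays at 90.1 ms, which is why the speedup depends mostly on how much each piece of data is reused.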
wordName wordTfidf (topN-words)
[('bound', 0.376), ('ehcache', 0.349), ('caching', 0.306), ('depends', 0.175), ('reused', 0.171), ('cache', 0.168), ('principally', 0.162), ('speedup', 0.145), ('tail', 0.119), ('law', 0.104), ('piece', 0.102), ('cpu', 0.101), ('answer', 0.101), ('hibernate', 0.098), ('long', 0.096), ('often', 0.091), ('ecommerce', 0.086), ('computations', 0.084), ('times', 0.083), ('factors', 0.081), ('speed', 0.079), ('authentication', 0.079), ('throughout', 0.074), ('duplicates', 0.073), ('alleviated', 0.073), ('cachingin', 0.073), ('lfu', 0.073), ('marshalling', 0.073), ('nub', 0.073), ('overheadthe', 0.073), ('vernacular', 0.073), ('data', 0.071), ('improving', 0.071), ('short', 0.07), ('eviction', 0.069), ('sped', 0.069), ('anderson', 0.066), ('scalabilitythe', 0.066), ('hits', 0.065), ('term', 0.064), ('obtaining', 0.063), ('small', 0.063), ('used', 0.062), ('coined', 0.061), ('fifo', 0.061), ('reduces', 0.06), ('items', 0.06), ('stepping', 0.06), ('multitude', 0.06), ('parallelizing', 0.06)]
simIndex simValue blogId blogTitle
same-blog 1 0.9999997 359 high scalability-2008-07-29-Ehcache - A Java Distributed Cache
2 0.19831581 1646 high scalability-2014-05-12-4 Architecture Issues When Scaling Web Applications: Bottlenecks, Database, CPU, IO
Introduction: This is a guest repost by Venkatesh CM at Architecture Issues Scaling Web Applications. I will cover architecture issues that show up while scaling and performance tuning large-scale web applications in this blog. Let's start by defining a few terms to create a common understanding and vocabulary. Later on I will go through different issues that pop up while scaling a web application, such as: architecture bottlenecks, scaling the database, CPU-bound applications, and IO-bound applications. Determining the optimal thread pool size of a web application will be covered in the next blog. Performance The term performance of a web application is used to mean several things. Most developers are primarily concerned with response time and scalability. Response Time This is the time taken by a web application to process a request and return a response. Applications should respond to requests (response time) within an acceptable duration. If an application is taking beyond the acceptable time, it is said to
3 0.1808539 1118 high scalability-2011-09-19-Big Iron Returns with BigMemory
Introduction: This is a guest post by Greg Luck Founder and CTO, Ehcache Terracotta Inc. Note: this article contains a bit too much of a product pitch, but the points are still generally valid and useful. The legendary Moore’s Law, which states that the number of transistors that can be placed inexpensively on an integrated circuit doubles approximately every two years, has held true since 1965. It follows that integrated circuits will continue to get smaller, with chip fabrication currently at a minuscule 22nm process (1). Users of big iron hardware, or servers that are dense in terms of CPU power and memory capacity, benefit from this trend as their hardware becomes cheaper and more powerful over time. At some point soon, however, density limits imposed by quantum mechanics will preclude further density increases. At the same time, low-cost commodity hardware influences enterprise architects to scale their applications horizontally, where processing is spread across clusters of l
4 0.17887495 360 high scalability-2008-08-04-A Bunch of Great Strategies for Using Memcached and MySQL Better Together
Introduction: The primero recommendation for speeding up a website is almost always to add cache and more cache. And after that add a little more cache just in case. Memcached is almost always given as the recommended cache to use. What we don't often hear is how to effectively use a cache in our own products. MySQL hosted two excellent webinars (referenced below) on the subject of how to deploy and use memcached. The star of the show, other than MySQL of course, is Farhan Mashraqi of Fotolog. You may recall we did an earlier article on Fotolog in Secrets to Fotolog's Scaling Success , which was one of my personal favorites. Fotolog, as they themselves point out, is probably the largest site nobody has ever heard of, pulling in more page views than even Flickr. Fotolog has 51 instances of memcached on 21 servers with 175G in use and 254G available. As a large successful photo-blogging site they have very demanding performance and scaling requirements. To meet those requirements they've developed a
5 0.16148674 194 high scalability-2007-12-26-Golden rule of web caching
Introduction: Effective content caching is one of the key features of scalable web sites. Although there are several out-of-the-box options for caching with modern web technologies, a custom built cache still provides the best performance.
6 0.14405571 495 high scalability-2009-01-17-Intro to Caching,Caching algorithms and caching frameworks part 1
7 0.1334139 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?
8 0.13184494 602 high scalability-2009-05-17-Scaling Django Web Apps by Mike Malone
10 0.12996998 1346 high scalability-2012-10-24-Saving Cash Using Less Cache - 90% Savings in the Caching Tier
11 0.12723848 274 high scalability-2008-03-12-YouTube Architecture
12 0.1267772 367 high scalability-2008-08-17-Strategy: Drop Memcached, Add More MySQL Servers
13 0.12124052 373 high scalability-2008-08-29-Product: ScaleOut StateServer is Memcached on Steroids
14 0.12094379 721 high scalability-2009-10-13-Why are Facebook, Digg, and Twitter so hard to scale?
15 0.12079987 911 high scalability-2010-09-30-More Troubles with Caching
16 0.11776892 1207 high scalability-2012-03-12-Google: Taming the Long Latency Tail - When More Machines Equals Worse Results
17 0.11726677 661 high scalability-2009-07-25-Latency is Everywhere and it Costs You Sales - How to Crush it
18 0.11719766 910 high scalability-2010-09-30-Facebook and Site Failures Caused by Complex, Weakly Interacting, Layered Systems
19 0.1156773 662 high scalability-2009-07-27-Handle 700 Percent More Requests Using Squid and APC Cache
20 0.11314775 1413 high scalability-2013-02-27-42 Monster Problems that Attack as Loads Increase
topicId topicWeight
[(0, 0.199), (1, 0.104), (2, -0.047), (3, -0.085), (4, -0.023), (5, 0.062), (6, 0.047), (7, 0.02), (8, -0.115), (9, 0.024), (10, -0.011), (11, -0.06), (12, -0.015), (13, 0.106), (14, -0.086), (15, -0.076), (16, -0.034), (17, -0.055), (18, 0.047), (19, -0.006), (20, -0.079), (21, 0.081), (22, 0.093), (23, 0.013), (24, -0.051), (25, 0.046), (26, -0.022), (27, 0.036), (28, -0.035), (29, 0.025), (30, -0.031), (31, 0.031), (32, 0.035), (33, -0.024), (34, -0.015), (35, 0.016), (36, 0.01), (37, 0.019), (38, 0.025), (39, -0.013), (40, 0.02), (41, 0.032), (42, -0.017), (43, -0.019), (44, 0.006), (45, 0.018), (46, -0.002), (47, 0.007), (48, -0.006), (49, -0.039)]
simIndex simValue blogId blogTitle
same-blog 1 0.96964747 359 high scalability-2008-07-29-Ehcache - A Java Distributed Cache
2 0.853356 1346 high scalability-2012-10-24-Saving Cash Using Less Cache - 90% Savings in the Caching Tier
Introduction: In a paper delivered at HotCloud '12 by a group from CMU and Intel Labs, Saving Cash by Using Less Cache (slides, pdf), they show it may be possible to use less DRAM under low load conditions to save on operational costs. There are some issues with this idea, but in a give me more cache era, it could be an interesting source of cost savings for your product. Caching is used to: Reduce load on the database. Reduce latency. Problem: RAM in the cloud is quite expensive. A third of costs can come from the caching tier. Solution: Shrink your cache when the load is lower. Their work shows when the load drops below a certain point you can throw away 50% of your cache while still maintaining performance. A few popular items often account for most of your hits, implying you can remove the cache for the long tail. Use two tiers of servers, the Retiring Group, which is the group of servers you want to get rid of. The Primary Group is the group of servers you
3 0.85160482 174 high scalability-2007-12-05-Product: Tugela Cache
Introduction: Tugela Cache is a cache system like memcached, but instead of storing data just in RAM, it stores data in the file system using a b-tree. You trade latency in order to have a very large cache. It's useful for sites that have caching requirements that exceed their available memory. It uses the same wire protocol as memcached so it can be dropped in without a hassle. From the website: As large MediaWiki deployments may gain performance using Memcached, at some level cost of RAM to store all objects becomes too high. In order to balance resource usage and make more use of our Apache server disks, Tugela, the distributed cached on-disk hash database, has arrived. Tugela Cache is derived from Memcached. Much of the code remains the same, but notably, these changes: Internal slab allocator replaced by BerkeleyDB B-Tree database. Expiry policy management moved to external program tugela-expire. Much statistics code made obsolete. An interesting point brought up in the comme
4 0.85005397 436 high scalability-2008-11-02-Strategy: How to Manage Sessions Using Memcached
Introduction: Dormando shows an enlightened middle way for storing sessions in cache and the database. Sessions are a perfect cache candidate because they are transient, smallish, and since they are usually accessed on every page access removing all that load from the database is a good thing. But as Dormando points out session caches have problems. If you remove expiration times from the cache and you run out of memory then no more logins. If a cache server fails or needs to be upgraded then you just logged out a bunch of potentially angry users. The middle ground Dormando proposes is using both the cache and the database: Reads: read from the cache first, then the database. Typical cache logic. Writes: write to memcached every time, write to the database every N seconds (assuming the data has changed). There's a small chance of data loss, but you've still greatly reduced the database load while providing reliability. Nice solution.
5 0.83075899 673 high scalability-2009-08-07-Strategy: Break Up the Memcache Dog Pile
Introduction: Update: Asynchronous HTTP cache validations . A proposed HTTP caching extension: if your application can afford to show slightly out of date content, then stale-while-revalidate can guarantee that the user will always be served directly from the cache, hence guaranteeing a consistent response-time user-experience. Caching is like aspirin for headaches. Head hurts: pop a 'sprin. Slow site: add caching. Facebook must have a lot of headaches because they popped 805 memcached servers between 10,000 web servers and 1,800 MySQL servers and they reportedly have a 99% cache hit rate. But what's the best way for you to cache for your application? It's a remarkably complex and rich topic. Alexey Kovyrin talks about one common caching problem called the Dog Pile Effect in Dog-pile Effect and How to Avoid it with Ruby on Rails . Glenn Franxman also has a Django solution in MintCache . Data is usually cached because it's too expensive to calculate for every hit. Maybe it's a gnarly S
6 0.82643336 360 high scalability-2008-08-04-A Bunch of Great Strategies for Using Memcached and MySQL Better Together
8 0.81090963 836 high scalability-2010-06-04-Strategy: Cache Larger Chunks - Cache Hit Rate is a Bad Indicator
9 0.79538941 662 high scalability-2009-07-27-Handle 700 Percent More Requests Using Squid and APC Cache
10 0.79516214 467 high scalability-2008-12-16-[ANN] New Open Source Cache System
11 0.79246962 577 high scalability-2009-04-22-Gear6 Web cache - the hardware solution for working with Memcache
12 0.78454345 1467 high scalability-2013-05-30-Google Finds NUMA Up to 20% Slower for Gmail and Websearch
13 0.7775538 149 high scalability-2007-11-12-Scaling Using Cache Farms and Read Pooling
14 0.77605319 911 high scalability-2010-09-30-More Troubles with Caching
15 0.77424657 495 high scalability-2009-01-17-Intro to Caching,Caching algorithms and caching frameworks part 1
16 0.77175248 1633 high scalability-2014-04-16-Six Lessons Learned the Hard Way About Scaling a Million User System
17 0.76473051 602 high scalability-2009-05-17-Scaling Django Web Apps by Mike Malone
18 0.76444721 1321 high scalability-2012-09-12-Using Varnish for Paywalls: Moving Logic to the Edge
19 0.76028687 1246 high scalability-2012-05-16-Big List of 20 Common Bottlenecks
20 0.75838476 1407 high scalability-2013-02-15-Stuff The Internet Says On Scalability For February 15, 2013
topicId topicWeight
[(1, 0.063), (2, 0.327), (10, 0.033), (14, 0.011), (22, 0.011), (26, 0.018), (30, 0.031), (40, 0.012), (44, 0.144), (61, 0.072), (77, 0.02), (79, 0.11), (85, 0.017), (94, 0.044)]
simIndex simValue blogId blogTitle
same-blog 1 0.95125431 359 high scalability-2008-07-29-Ehcache - A Java Distributed Cache
2 0.94350004 817 high scalability-2010-04-29-Product: SciDB - A Science-Oriented DBMS at 100 Petabytes
Introduction: Scientists are doing it for themselves. Doing what? Databases. The idea is that most databases are designed to meet the needs of businesses, not science, so scientists are banding together at scidb.org to create their own Domain Specific Database, for science. The goal is to be able to handle datasets in the 100PB range and larger. SciDB, Inc. is building an open source database technology product designed specifically to satisfy the demands of data-intensive scientific problems. With the advice of the world's leading scientists across a variety of disciplines including astronomy, biology, physics, oceanography, atmospheric sciences, and climatology, our computer scientists are currently designing and prototyping this technology. The scientists that are participating in our open source project believe that the SciDB database, when completed, will dramatically impact their ability to conduct their experiments faster and more efficiently and further improve the qual
3 0.92467827 660 high scalability-2009-07-21-Paper: Parallelizing the Web Browser
Introduction: There have been reports that software engineering is dead . Maybe, like the future, software engineering is simply not evenly distributed? When you read this paper I think you'll agree there is some real engineering going on, it's just that most of the things we need to build do not require real engineering. Much like my old childhood tree fort could be patched together and was "good enough." This brings to mind the old joke: If a software tree falls in the woods would anyone hear it fall? Only if it tweeted on the way down... What this paper really showed me is we need not only to change programming practices and constructs, but we also need to design solutions that allow for deep parallelism to begin with. Grafting parallelism on later is difficult. Parallel execution requires knowing precisely how components are dependent on each other and that level of precision tends to go far beyond the human attention span. In particular this paper deals with how to parallelize the browser on
4 0.92041516 1413 high scalability-2013-02-27-42 Monster Problems that Attack as Loads Increase
Introduction: For solutions take a look at: 7 Life Saving Scalability Defenses Against Load Monster Attacks. This is a look at all the bad things that can happen to your carefully crafted program as loads increase: all hell breaks loose. Sure, you can scale out or scale up, but you can also choose to program better. Make your system handle larger loads. This saves money because fewer boxes are needed and it will make the entire application more reliable and have better response times. And it can be quite satisfying as a programmer. Large Number Of Objects We usually get into scaling problems when the number of objects gets larger. Clearly resource usage of all types is stressed as the number of objects grows. Continuous Failures Makes An Infinite Event Stream During large network failure scenarios there is never time for the system to recover. We are in a continual state of stress. Lots of High Priority Work For example, rerouting is a high priority activity. If there is a large amount
5 0.92002481 1387 high scalability-2013-01-15-More Numbers Every Awesome Programmer Must Know
Introduction: Colin Scott, a Berkeley researcher, updated Jeff Dean’s famous Numbers Everyone Should Know with his Latency Numbers Every Programmer Should Know interactive graphic. The interactive aspect is cool because it has a slider that lets you see numbers back from as early as 1990 to the far far future of 2020. Colin explained his motivation for updating the numbers: The other day, a friend mentioned a latency number to me, and I realized that it was an order of magnitude smaller than what I had memorized from Jeff’s talk. The problem, of course, is that hardware performance increases exponentially! After some digging, I actually found that the numbers Jeff quotes are over a decade old. Since numbers without interpretation are simply data, take a look at Google Pro Tip: Use Back-Of-The-Envelope-Calculations To Choose The Best Design. The idea is back-of-the-envelope calculations are estimates you create using a combination of thought experiments and common perfor
7 0.91911751 1418 high scalability-2013-03-06-Low Level Scalability Solutions - The Aggregation Collection
9 0.91696739 1421 high scalability-2013-03-11-Low Level Scalability Solutions - The Conditioning Collection
10 0.91679907 910 high scalability-2010-09-30-Facebook and Site Failures Caused by Complex, Weakly Interacting, Layered Systems
11 0.91656584 221 high scalability-2008-01-24-Mailinator Architecture
12 0.91650373 1456 high scalability-2013-05-13-The Secret to 10 Million Concurrent Connections -The Kernel is the Problem, Not the Solution
13 0.91593009 533 high scalability-2009-03-11-The Implications of Punctuated Scalabilium for Website Architecture
14 0.91380703 721 high scalability-2009-10-13-Why are Facebook, Digg, and Twitter so hard to scale?
15 0.91357762 960 high scalability-2010-12-20-Netflix: Use Less Chatty Protocols in the Cloud - Plus 26 Fixes
16 0.91332668 1017 high scalability-2011-04-06-Netflix: Run Consistency Checkers All the time to Fixup Transactions
17 0.91255695 1001 high scalability-2011-03-09-Google and Netflix Strategy: Use Partial Responses to Reduce Request Sizes
18 0.91236293 844 high scalability-2010-06-18-Paper: The Declarative Imperative: Experiences and Conjectures in Distributed Logic
19 0.91175914 124 high scalability-2007-10-16-How Scalable are Single Page Ajax Apps?
20 0.91155469 76 high scalability-2007-08-29-Skype Failed the Boot Scalability Test: Is P2P fundamentally flawed?