high_scalability high_scalability-2007 high_scalability-2007-71 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Hi, Some of the articles on the site claim profiling is essential. Is there any established approach to profiling WEB apps? Or does it depend too much on the technologies used?
sentIndex sentText sentNum sentScore
1 Hi, Some of the articles on the site claim profiling is essential. [sent-1, score-1.449]
2 Is there any established approach to profiling WEB apps? [sent-2, score-1.146]
wordName wordTfidf (topN-words)
[('profiling', 0.679), ('claims', 0.427), ('established', 0.326), ('depends', 0.287), ('articles', 0.218), ('technologies', 0.182), ('apps', 0.176), ('approach', 0.141), ('site', 0.125), ('much', 0.092), ('used', 0.085), ('web', 0.073)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 71 high scalability-2007-08-22-Profiling WEB applications
2 0.098599873 1197 high scalability-2012-02-21-Pixable Architecture - Crawling, Analyzing, and Ranking 20 Million Photos a Day
Introduction: This is a guest post by Alberto Lopez Toledo, PHD, CTO of Pixable, and Julio Viera, VP of Engineering at Pixable. Pixable aggregates photos from across your different social networks and finds the best ones so you never miss an important moment. That means currently processing the metadata of more than 20 million new photos per day: crawling, analyzing, ranking, and sorting them along with the other 5+ billion that are already stored in our database. Making sense of all that data has challenges, but two in particular rise above the rest: How to access millions of photos per day from Facebook, Twitter, Instagram, and other services in the most efficient manner. How to process, organize, index, and store all the meta-data related to those photos. Sure, Pixable’s infrastructure is changing continuously, but there are some things that we have learned over the last year. As a result, we have been able to build a scalable infrastructure that takes advantage of today’s tools,
3 0.087562062 464 high scalability-2008-12-13-Strategy: Facebook Tweaks to Handle 6 Time as Many Memcached Requests
Introduction: Our latest strategy is taken from a great post by Paul Saab of Facebook, detailing how, with changes Facebook has made to memcached, they have: ...been able to scale memcached to handle 200,000 UDP requests per second with an average latency of 173 microseconds. The total throughput achieved is 300,000 UDP requests/s, but the latency at that request rate is too high to be useful in our system. This is an amazing increase from 50,000 UDP requests/s using the stock version of Linux and memcached. To scale, Facebook has hundreds of thousands of TCP connections open to their memcached processes. First, this is still amazing. It's not so long ago you could have never done this. Optimizing connection use was always a priority because the OS simply couldn't handle large numbers of connections or large numbers of threads or large numbers of CPUs. To get to this point is a big accomplishment. Still, at that scale there are problems that are often solved. Some of the problem Facebook faced a
4 0.084964961 570 high scalability-2009-04-15-Implementing large scale web analytics
Introduction: Does anyone know of any articles or papers that discuss the nuts and bolts of how web analytics is implemented at organizations with large volumes of web traffic and a critical business need to analyze that data - e.g. places like Amazon.com, eBay, and Google? Just as a fun project I'm planning to build my own web log analysis app that can effectively index and query large volumes of web log data (i.e. TB range). But first I'd like to learn more about how it's done in the organizations whose lifeblood depends on this stuff. Even just a high level architectural overview of their approaches would be nice to have.
5 0.084224023 1003 high scalability-2011-03-14-6 Lessons from Dropbox - One Million Files Saved Every 15 minutes
Introduction: Dropbox saves one million files every 15 minutes, more tweets than even Twitterers tweet. That mind blowing statistic was revealed by Rian Hunter, a Dropbox Engineer, in his presentation How Dropbox Did It and How Python Helped at PyCon 2011. The first part of the presentation is some Dropbox lore, origin stories and other foundational myths. We learn that Dropbox is a startup company located in San Francisco that has probably one of the most popular file synchronization and sharing tools in the world, shipping Python on the desktop and supporting millions of users and growing every day. About halfway through, the talk turns technical. Not a lot of info on how Dropbox handles this massive scale was dropped, but there were a number of good lessons to ponder: Use Python. 99.9% of their code is in Python. Used on the server backend, desktop client, website controller logic, API backend, and analytics. Can't use Python on the Android due to memory constraints. Runs
6 0.078562766 668 high scalability-2009-08-01-15 Scalability and Performance Best Practices
7 0.075041376 1354 high scalability-2012-11-05-Are we seeing the renaissance of enterprises in the cloud?
8 0.065892734 359 high scalability-2008-07-29-Ehcache - A Java Distributed Cache
9 0.064596385 1032 high scalability-2011-05-02-Stack Overflow Makes Slow Pages 100x Faster by Simple SQL Tuning
10 0.062390015 1462 high scalability-2013-05-22-Strategy: Stop Using Linked-Lists
11 0.05641257 223 high scalability-2008-01-25-Google: Introduction to Distributed System Design
12 0.05603008 375 high scalability-2008-09-01-A Scalability checklist?
13 0.05576111 194 high scalability-2007-12-26-Golden rule of web caching
14 0.054545045 632 high scalability-2009-06-15-starting small with growth in mind
15 0.054089565 72 high scalability-2007-08-22-Wikimedia architecture
17 0.048292965 66 high scalability-2007-08-16-What tech is used to build your favorite site?
18 0.047616418 1102 high scalability-2011-08-22-Strategy: Run a Scalable, Available, and Cheap Static Site on S3 or GitHub
19 0.047085166 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
20 0.047029167 883 high scalability-2010-08-20-Hot Scalability Links For Aug 20, 2010
topicId topicWeight
[(0, 0.052), (1, 0.007), (2, 0.006), (3, -0.036), (4, 0.022), (5, -0.017), (6, -0.019), (7, -0.003), (8, -0.002), (9, 0.032), (10, -0.041), (11, -0.031), (12, -0.01), (13, -0.004), (14, 0.009), (15, -0.019), (16, 0.007), (17, -0.008), (18, 0.039), (19, 0.007), (20, -0.018), (21, 0.004), (22, -0.003), (23, 0.005), (24, 0.001), (25, -0.039), (26, -0.037), (27, -0.02), (28, 0.022), (29, -0.024), (30, -0.009), (31, 0.031), (32, -0.012), (33, -0.002), (34, -0.027), (35, 0.006), (36, 0.02), (37, -0.004), (38, -0.014), (39, 0.03), (40, -0.019), (41, 0.007), (42, -0.014), (43, 0.006), (44, -0.01), (45, -0.036), (46, -0.005), (47, -0.015), (48, -0.007), (49, -0.014)]
simIndex simValue blogId blogTitle
same-blog 1 0.93740457 71 high scalability-2007-08-22-Profiling WEB applications
2 0.72261351 66 high scalability-2007-08-16-What tech is used to build your favorite site?
Introduction: Find out with Builtwith.com. It scans a site and guesses how the site is built. I ran it on this site and it said: Apache, Windows, PHP, Adsense, RSS, CSS, Javascript, and UTF-8 encoding. Correct, yet I think it should have guessed Drupal was the CMS and it should have been able to determine which AJAX library is used. Though it's kind of cool to see which sites use PHP and other technologies.
3 0.69387543 611 high scalability-2009-05-31-Need help on Site loading & database optimization - URGENT
Introduction: Hi Friends, I need some help making site access fast. On average my site gets 2,500 hits per day, and on 16th May it had 60,000 hits. On that day the site loaded very slowly and even timed out. I also checked the running processes using the "top" command, which indicated that mysql was taking too much load. There are around 166 tables (including the PHPBB forum) in my database. All content on the site is displayed by fetching it from the database. I have also added indexes to the respective tables where required. Plain PHP/HTML coding is used. Technology: PHP -- 5.2 MYSQL -- 5.0 Apache -- 2.0 Linux Following are the server details of my site: CPU : Single Socket Dual Core AMD Opteron 1212HE Memory: 2GB DDR RAM Hard Drive: 250GB SATA Ethernet: 100Mb Primary Ethernet Card (/var/log) # uname -a Linux 2.6.9-67.0.15.ELsmp #1 SMP Tue Apr 22 13:50:33 EDT 2008 i686 athlon i386 GNU/Linux kernel version: 2.6.9-67.0.15.ELsmp (/var/log) # free -m total used
4 0.6610508 232 high scalability-2008-01-29-When things aren't scalable
Introduction: OK, I know this site is for scalable web site design. But as there aren't any sites I can find for graceful failure under "slashdotted" like pressure I'll ask here. Does anyone have a sensible way, once you have a "web application" that either won't scale or can't scale, to give some users a good, consistent experience and bounce other users to a busy-site page? I have seen sites do this to varying degrees, some of which work better than others, but no explanations beyond simply bouncing requests to a "we're busy page server" when you have more than a given number of connections. This is obviously useless, as a web page likely requires multiple connections (ignoring keep-alive, pipelining, etc.) to render completely. The normal problem is users getting a page but not the "furniture" for that page, like images or CSS. Other problems are having to wait ages to get the busy page or the site being slow even if you do "get in". And some site let
5 0.65877217 632 high scalability-2009-06-15-starting small with growth in mind
Introduction: Hello all, I'm working on a web site that might totally flop or it might explode to be the next facebook/flickr/digg/etc. Since I really don't know how popular the site will be I don't want to spend a ton of money on the hardware/hosting right away but I want to be able to scale it easily if it does grow rapidly. With this in mind, what would be the best approach to launch the site? Thanks, Dan
6 0.65443611 321 high scalability-2008-05-17-WebSphere Commerce High Availability and Performance Configurations
7 0.63380814 10 high scalability-2007-07-15-Book: Building Scalable Web Sites
8 0.62806499 265 high scalability-2008-03-03-Two data streams for a happy website
9 0.62387294 730 high scalability-2009-10-28-GemFire: Solving the hardest problems in data management
10 0.61861557 8 high scalability-2007-07-12-Should I use LAMP or Windows?
11 0.6107803 91 high scalability-2007-09-13-Design Preparations for Scaling
12 0.6023069 1 high scalability-2007-07-06-Start Here
13 0.59655607 965 high scalability-2010-12-29-Pinboard.in Architecture - Pay to Play to Keep a System Small
14 0.58754224 175 high scalability-2007-12-05-how to: Load Balancing with iis
15 0.58397323 159 high scalability-2007-11-18-Reverse Proxy
16 0.58301437 1102 high scalability-2011-08-22-Strategy: Run a Scalable, Available, and Cheap Static Site on S3 or GitHub
17 0.578713 572 high scalability-2009-04-16-Paper: The End of an Architectural Era (It’s Time for a Complete Rewrite)
18 0.57525331 298 high scalability-2008-04-07-Lazy web sites run faster
19 0.57254982 571 high scalability-2009-04-15-Using HTTP cache headers effectively
20 0.57165498 711 high scalability-2009-09-22-How Ravelry Scales to 10 Million Requests Using Rails
topicId topicWeight
[(1, 0.152), (2, 0.164), (83, 0.429)]
simIndex simValue blogId blogTitle
same-blog 1 0.84937644 71 high scalability-2007-08-22-Profiling WEB applications
2 0.64361745 963 high scalability-2010-12-23-Paper: CRDTs: Consistency without concurrency control
Introduction: For a great Christmas read forget The Night Before Christmas, a heartwarming poem written by Clement Moore for his children, that created the modern idea of Santa Claus we all know and anticipate each Christmas eve. Instead, curl up with some potent eggnog, nog being any drink made with rum, and read CRDTs: Consistency without concurrency control by Mihai Letia, Nuno Preguiça, and Marc Shapiro, which talks about CRDTs (Commutative Replicated Data Type), a data type whose operations commute when they are concurrent. From the introduction, which also serves as a nice concise overview of distributed consistency issues: Shared read-only data is easy to scale by using well-understood replication techniques. However, sharing mutable data at a large scale is a difficult problem, because of the CAP impossibility result [5]. Two approaches dominate in practice. One ensures scalability by giving up consistency guarantees, for instance using the Last-Writer-Wins (LWW) approach [
3 0.57540393 1026 high scalability-2011-04-18-6 Ways Not to Scale that Will Make You Hip, Popular and Loved By VCs
Introduction: This is a hilarious presentation by Josh Berkus, called Scale Fail, given at O'Reilly MySQL CE 2011. Josh is entertaining, well spoken, and cleverly hides insight inside chaos. And he makes some dang good points along the way. Josh has a problem, you see: Josh has learned how to make sites that are both scalable and reliable. So he's puzzled why companies "whose downtime interfaces (Twitter) are more well known than their uptime interfaces" get all the attention, respect, and money for being failures. Just doing your job doesn't make you a hero. You need these self-inflicted wounds in order to have the war stories to share at conferences. They get the attention. Just doing your job is boring. This is so unfair in that way life can be. So if you want to turn the tables and take the low road to fame and fortune, here's Josh's program for learning how not to scale: Be trendy. Use the tool that has the most buzz: NoSQL, Cloud, MapReduce, Rails, RabbitMQ. It helps you no
4 0.54829806 236 high scalability-2008-02-03-Ideas on how to scale a shared inventory database???
Introduction: We have a database today that holds all of our shared inventory. How do we scale out? We run into concurrency issues today, as multiple users may want to access the same inventory, etc. I'm sure it's a common problem. So how do folks implement this while also getting faster responses for available inventory and ensuring no downtime? Thanks
5 0.51729089 355 high scalability-2008-07-21-Eucalyptus - Build Your Own Private EC2 Cloud
Introduction: Update: InfoQ links to a few excellent Eucalyptus updates: Velocity Conference Video by Rich Wolski and a Visualization.com interview Rich Wolski on Eucalyptus: Open Source Cloud Computing. Eucalyptus is generating some excitement on the Cloud Computing group as a potential vendor neutral EC2 compatible cloud platform. Two reasons why Eucalyptus is potentially important: private clouds and cloud portability: Private clouds. Let's say you want a cloud like infrastructure for architectural purposes but you want it to run on your own hardware in your own secure environment. How would you do this today? Hm.... Cloud portability. With the number of cloud offerings increasing how can you maintain some level of vendor neutrality among this "swarm" of different options? Portability is a key capability for cloud customers as the only real power customers have is in where they take their business and the only way you can change suppliers is if there's a ready market of fun
6 0.50649214 1203 high scalability-2012-03-02-Stuff The Internet Says On Scalability For March 2, 2012
8 0.46311879 734 high scalability-2009-10-30-Hot Scalabilty Links for October 30 2009
10 0.45953685 866 high scalability-2010-07-27-Sponsored Post: Okta, EzRez, VoltDB, Digg, Cloud Sigma, Applications Manager, Site24x7
11 0.45874619 887 high scalability-2010-08-24-Sponsored Post: deviantART, Okta, EzRez, Cloud Sigma, ManageEngine, Site24x7
12 0.45479983 1633 high scalability-2014-04-16-Six Lessons Learned the Hard Way About Scaling a Million User System
13 0.4517546 821 high scalability-2010-05-03-MocoSpace Architecture - 3 Billion Mobile Page Views a Month
14 0.44630325 310 high scalability-2008-04-29-High performance file server
15 0.44533315 1517 high scalability-2013-09-16-The Hidden DNS Tax - Cascading Timeouts and Errors
16 0.4450618 754 high scalability-2009-12-22-Incremental deployment
17 0.44425708 407 high scalability-2008-10-10-The Art of Capacity Planning: Scaling Web Resources
18 0.44360358 701 high scalability-2009-09-10-When optimizing - don't forget the Java Virtual Machine (JVM)
19 0.44286126 1641 high scalability-2014-05-01-Paper: Can Programming Be Liberated From The Von Neumann Style?
20 0.44165426 983 high scalability-2011-02-02-Piccolo - Building Distributed Programs that are 11x Faster than Hadoop