high_scalability high_scalability-2008 high_scalability-2008-282 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: [Tim O'Reilly] Continuing my series of queries about how "Web 2.0" companies used databases, I asked Cal Henderson of Flickr to tell me "how the folksonomy model intersects with the traditional database. How do you manage a tag cloud?"
sentIndex sentText sentNum sentScore
1 [Tim O'Reilly] Continuing my series of queries about how "Web 2. [sent-1, score-0.281]
2 0" companies used databases, I asked Cal Henderson of Flickr to tell me "how the folksonomy model intersects with the traditional database. [sent-2, score-1.239]
wordName wordTfidf (topN-words)
[('intersects', 0.443), ('henderson', 0.397), ('cal', 0.397), ('continuing', 0.306), ('tim', 0.274), ('tag', 0.267), ('flickr', 0.232), ('asked', 0.207), ('series', 0.159), ('tell', 0.154), ('traditional', 0.152), ('manage', 0.129), ('queries', 0.122), ('companies', 0.12), ('databases', 0.104), ('model', 0.1), ('cloud', 0.072), ('used', 0.063), ('web', 0.054)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 282 high scalability-2008-03-18-Database War Stories #3: Flickr
Introduction: [Tim O'Reilly] Continuing my series of queries about how "Web 2.0" companies used databases, I asked Cal Henderson of Flickr to tell me "how the folksonomy model intersects with the traditional database. How do you manage a tag cloud?"
2 0.15283956 152 high scalability-2007-11-13-Flickr Architecture
Introduction: Update: Flickr hits 2 Billion photos served. That's a lot of hamburgers. Flickr is both my favorite bird and the web's leading photo sharing site. Flickr has an amazing challenge, they must handle a vast sea of ever expanding new content, ever increasing legions of users, and a constant stream of new features, all while providing excellent performance. How do they do it? Site: http://www.flickr.com Information Sources Flickr and PHP (an early document) Capacity Planning for LAMP Federation at Flickr: Doing Billions of Queries a Day by Dathan Pattishall. Building Scalable Web Sites by Cal Henderson from Flickr. Database War Stories #3: Flickr by Tim O'Reilly Cal Henderson's Talks . A lot of useful PowerPoint presentations. Platform PHP MySQL Shards Memcached for a caching layer. Squid in reverse-proxy for html and images. Linux (RedHat) Smarty for templating Perl PEAR for XML and Email parsing ImageMagick, for ima
3 0.15283127 188 high scalability-2007-12-19-How can I learn to scale my project?
Introduction: This is a question asked on the ycombinator list and there are some good responses. I gave a quick response, but I particularly like neilk's knock out of the park insightful answer: Read Cal Henderson's book. (I'd add in Theo's book and Release It! too) The center of your design should be the data store, not a process. You transition the data store from state to state, securely and reliably, in small increments. Avoid globals and session state. The more "pure" your function is, the easier it will be to cache or partition. Don't make your data store too smart. Calculations and renderings should happen in a separate, asynchronous process. The data store should be able to handle lots of concurrent connections. Minimize locking. (Read about optimistic locking). Protect your algorithm from the implementation of the data store, with a helper class or module or whatever. But don't (DO NOT) try to build a framework for any conceivable query. Just the ones your algorithm needs. V
4 0.12220416 41 high scalability-2007-07-30-Product: Flickr
Introduction: Flickr offers a free basic account with limited upload bandwidth and limited storage. Download bandwidth is unlimited. Upgrading to a paid Pro account for $25/year removes all upload and storage restrictions. Flickr's terms of use warn that "professional or corporate uses of Flickr are prohibited", and all external images require a link back to Flickr.
5 0.12179486 285 high scalability-2008-03-19-Serving JavaScript Fast
Introduction: Cal Henderson writes at thinkvitamin.com: "With our so-called "Web 2.0" applications and their rich content and interaction, we expect our applications to increasingly make use of CSS and JavaScript. To make sure these applications are nice and snappy to use, we need to optimize the size and nature of content required to render the page, making sure we’re delivering the optimum experience. In practice, this means a combination of making our content as small and fast to download as possible, while avoiding unnecessarily refetching unmodified resources." A lot of good comments too.
6 0.097360209 348 high scalability-2008-07-09-Federation at Flickr: Doing Billions of Queries Per Day
7 0.070871986 543 high scalability-2009-03-17-Sun to Announce Open Cloud APIs at CommunityOne
8 0.066340387 122 high scalability-2007-10-14-Product: The Spread Toolkit
9 0.062538534 204 high scalability-2008-01-08-Virus Scanning for Uploaded content
10 0.061777681 65 high scalability-2007-08-16-Scaling Secret #2: Denormalizing Your Way to Speed and Profit
11 0.055349007 918 high scalability-2010-10-12-The CIO’s Problem: Cloud “Mess” or Cloud “Mash”
12 0.05474893 1514 high scalability-2013-09-09-Need Help with Database Scalability? Understand I-O
13 0.051450346 407 high scalability-2008-10-10-The Art of Capacity Planning: Scaling Web Resources
15 0.050463974 560 high scalability-2009-04-08-Learned lessons from the largest player (Flickr, YouTube, Google, etc)
16 0.049904108 539 high scalability-2009-03-16-Books: Web 2.0 Architectures and Cloud Application Architectures
17 0.049499851 325 high scalability-2008-05-25-How do you explain cloud computing to your grandma?
18 0.049440864 584 high scalability-2009-04-27-Some Questions from a newbie
19 0.047928829 589 high scalability-2009-05-05-Drop ACID and Think About Data
20 0.046648134 1 high scalability-2007-07-06-Start Here
topicId topicWeight
[(0, 0.054), (1, 0.003), (2, 0.026), (3, 0.002), (4, -0.002), (5, 0.015), (6, -0.023), (7, -0.045), (8, 0.011), (9, 0.011), (10, 0.001), (11, 0.018), (12, -0.045), (13, 0.026), (14, 0.005), (15, -0.018), (16, 0.005), (17, -0.009), (18, 0.019), (19, -0.001), (20, -0.022), (21, -0.015), (22, -0.001), (23, -0.004), (24, -0.012), (25, -0.006), (26, -0.023), (27, 0.024), (28, 0.029), (29, 0.015), (30, 0.002), (31, 0.01), (32, 0.044), (33, -0.005), (34, 0.0), (35, -0.006), (36, 0.008), (37, 0.019), (38, -0.034), (39, -0.015), (40, -0.045), (41, 0.028), (42, -0.01), (43, -0.028), (44, -0.032), (45, 0.019), (46, -0.011), (47, -0.019), (48, -0.02), (49, -0.016)]
simIndex simValue blogId blogTitle
same-blog 1 0.92014503 282 high scalability-2008-03-18-Database War Stories #3: Flickr
Introduction: [Tim O'Reilly] Continuing my series of queries about how "Web 2.0" companies used databases, I asked Cal Henderson of Flickr to tell me "how the folksonomy model intersects with the traditional database. How do you manage a tag cloud?"
2 0.62415737 407 high scalability-2008-10-10-The Art of Capacity Planning: Scaling Web Resources
Introduction: Update 3: The book was released! Find it on Amazon at The Art of Capacity Planning . Update 2: Maybe the iPhone can use a little capacity planning? What's Behind the iPhone 3G Glitches : One source says Apple programmed the Infineon chip to demand a more powerful 3G signal than the iPhone really requires. So if too many people try to make a call or go on the Internet in a given area, some of the devices will decide there's insufficient power and switch to the slower network—even if there is enough 3G bandwidth available. Update: To get a taste of what will be served, mySQL DBA has a nice post titled Capacity Planning, Architecture, Scaling, Response time, Throughput . You learn how to figure out when your application will break by building a 3rd order polynomial. Cool stuff! John Allspaw who is the Operations Engineering Manager at Flickr is about to publish a book with O'Reilly. There are not many details so far but it seems interesting and relev
3 0.55534089 539 high scalability-2009-03-16-Books: Web 2.0 Architectures and Cloud Application Architectures
Introduction: I am excited about the upcoming release of two books on Web 2.0 and Cloud Application Architectures by O'Reilly. Web 2.0 Architectures (estimated release in May 2009) What entrepreneurs and information architects need to know Using several high-profile Web 2.0 companies as examples, authors Duane Nickull, Dion Hinchcliffe, and James Governor have distilled the core patterns of Web 2.0 coupled with an abstract model and reference architecture. The result is a base of knowledge that developers, business people, futurists, and entrepreneurs can understand and use as a source of ideas and inspiration. Featured architectures include Google, Flickr, BitTorrent, MySpace, Facebook, and Wikipedia. Cloud Application Architectures (estimated release in April 2009) Building Applications and Infrastructure in the Cloud This book by George Reese offers tested techniques for creating web applications on cloud computing infrastructures and for migrating existing system
4 0.54187453 974 high scalability-2011-01-18-Paper: Relational Cloud: A Database-as-a-Service for the Cloud
Introduction: The Relational Cloud Project is an effort by a group of researchers at MIT to investigate technologies and challenges related to Database-as-a-Service within cloud-computing . They are trying to figure out how the advantages of the DaaS (Database-as-a-Service) model, that we've seen arise in other areas like OLAP and NoSQL, can be applied to relational databases. The DaaS advantages as they see them are: 1) predictable costs, proportional to the quality of service and actual workloads, 2) lower technical complexity, thanks to a unified and simplified service access interface, and 3) virtually infinite resources ready at hand. An interesting description of their approach is explained in the paper Relational Cloud: A Database-as-a-Service for the Cloud . From the abstract: This paper introduces a new transactional “database-as-a-service” (DBaaS) called Relational Cloud. A DBaaS promises to move much of the operational burden of provisioning, configuration, scaling, performance tun
5 0.53291482 428 high scalability-2008-10-24-11 Secrets of a Cloud Scale Consultant That They Don't Want You to Know
Introduction: OK, there is no "they" and "they" wouldn't care if you knew anyway. After all, this isn't a blog about really important stuff like investing, acne cures, or cheap natural cleansing products. But the secrets are real. Super cloud scaling consultant Kent Langley has put together a comprehensive checklist to consider when developing for the cloud: ORM for Data Partitioning and Query Splitting - Split queries between updates and deletes from the start Monitoring process, resources, and uptime - Process Monitoring, Resource Monitoring, UpTime Monitoring Performance Testing and Capacity Planning - Can't make good decisions without doing some degree of Performance Testing and Capacity planning. Static vs. Dynamic Content splitting / CDN - Reverse Proxy, Splitting Static and Dynamic content Bundling and Compressing JS and CSS - Bundle them, compress, version, and then properly cache those bundles Logging - Log appropriately and monitor those logs Pragmatic Cach
6 0.53281462 348 high scalability-2008-07-09-Federation at Flickr: Doing Billions of Queries Per Day
7 0.51193565 89 high scalability-2007-09-10-Is there a difference between partitioning and federation and sharding?
8 0.50819778 1654 high scalability-2014-06-05-Cloud Architecture Revolution
9 0.50433713 452 high scalability-2008-12-01-An Open Source Web Solution - Lighttpd Web Server and Chip Multithreading Technology
10 0.49320006 65 high scalability-2007-08-16-Scaling Secret #2: Denormalizing Your Way to Speed and Profit
11 0.49294263 1087 high scalability-2011-07-26-Web 2.0 Killed the Middleware Star
12 0.48920307 757 high scalability-2010-01-04-11 Strategies to Rock Your Startup’s Scalability in 2010
13 0.4880605 358 high scalability-2008-07-26-Sharding the Hibernate Way
14 0.48755261 797 high scalability-2010-03-19-Hot Scalability Links for March 19, 2010
15 0.48166263 139 high scalability-2007-10-30-Paper: Dynamo: Amazon’s Highly Available Key-value Store
16 0.47637191 584 high scalability-2009-04-27-Some Questions from a newbie
18 0.47477409 524 high scalability-2009-03-04-Its time for auto scaling – avoid peak load provisioning for web applications
19 0.47156769 31 high scalability-2007-07-26-Product: Symfony a Web Framework
20 0.47117925 285 high scalability-2008-03-19-Serving JavaScript Fast
topicId topicWeight
[(1, 0.177), (4, 0.546), (79, 0.073)]
simIndex simValue blogId blogTitle
same-blog 1 0.82698435 282 high scalability-2008-03-18-Database War Stories #3: Flickr
Introduction: [Tim O'Reilly] Continuing my series of queries about how "Web 2.0" companies used databases, I asked Cal Henderson of Flickr to tell me "how the folksonomy model intersects with the traditional database. How do you manage a tag cloud?"
2 0.67909116 469 high scalability-2008-12-17-Scalability Strategies Primer: Database Sharding
Introduction: This article is a primer, intended to shine some much needed light on the logical, process oriented implementations of database scalability strategies in the form of a broad introduction. More specifically, the intent is to elaborate on the majority of these implementations by example.
3 0.60346115 12 high scalability-2007-07-15-Isilon Clustered Storage System
Introduction: The Isilon IQ family of clustered storage systems was designed from the ground up to meet the needs of data-intensive enterprises and high-performance computing environments. By combining Isilon's OneFS® operating system software with the latest advances in industry-standard hardware, Isilon delivers modular, pay-as-you-grow, enterprise-class clustered storage systems. OneFS, with TrueScale™ technology, powers the industry's first and only storage system that enables linear or independent scaling of performance and capacity. This new flexible and tunable system, featuring a robust suite of clustered storage software applications, provides customers with an "out of the box" solution that is fully optimized for the widest range of applications and workflow needs. * Scales from 4 TB to 1 PB * Throughput of up to 10 GB per second * Linear scaling * Easy to manage Related Articles Inside Skinny On Isilon by StorageMojo
4 0.47599757 1343 high scalability-2012-10-18-Save up to 30% by Selecting Better Performing Amazon Instances
Introduction: If you like the idea of exploiting market inconsistencies to lower your costs then you will love this paper and video from the Hot Cloud '12 conference: Exploiting Hardware Heterogeneity within the Same Instance Type of Amazon EC2 . The conclusion is interesting and is a source of good guidance: Amazon EC2 uses diversified hardware to host the same type of instance. The hardware diversity results in performance variation. In general, the variation between the fast instances and slow instances can reach 40%. In some applications, the variation can even approach up to 60%. By selecting fast instances within the same instance type, Amazon EC2 users can acquire up to 30% of cost saving, if the fast instances have a relatively low probability. The abstract: Cloud computing providers might start with near-homogeneous hardware environment. Over time, the homogeneous environment will most likely evolve into heterogeneous one because of possible upgrades and replac
5 0.46329054 1157 high scalability-2011-12-14-Virtualization and Cloud Computing is Changing the Network to East-West Routing
Introduction: It’s called “east-west” networking, which when compared to its predecessor, “north-south” networking, evinces images of maelstroms and hurricane winds and tsunamis for some reason. It could be the subtle correlation between the transformative shift this change in networking patterns has on the data center with that of El Niño’s transformative power upon the weather patterns across the globe. Traditionally, data center networks have focused on North-South network traffic. The assumption is that clients on the edge would mainly communicate with servers at the core, rather than across the network to other clients. But server virtualization changes all this, with servers, virtual appliances and even virtual desktops scattered across the same physical infrastructure. These environments are also highly dynamic, with workloads moving to different physical locations on the network as virtual servers are migrated (in the case of data center networks) and clients move
6 0.40292969 40 high scalability-2007-07-30-Product: Amazon Elastic Compute Cloud
8 0.38964224 309 high scalability-2008-04-23-Behind The Scenes of Google Scalability
9 0.3828606 1164 high scalability-2011-12-27-PlentyOfFish Update - 6 Billion Pageviews and 32 Billion Images a Month
10 0.37095308 79 high scalability-2007-09-01-On-Demand Infinitely Scalable Database Seed the Amazon EC2 Cloud
11 0.36635587 916 high scalability-2010-10-07-Hot Scalability Links For Oct 8, 2010
12 0.36280447 1213 high scalability-2012-03-22-Paper: Revisiting Network I-O APIs: The netmap Framework
15 0.35707194 919 high scalability-2010-10-14-I, Cloud
16 0.33103547 1094 high scalability-2011-08-08-Tagged Architecture - Scaling to 100 Million Users, 1000 Servers, and 5 Billion Page Views
17 0.31464887 184 high scalability-2007-12-13-Amazon SimpleDB - Scalable Cloud Database
18 0.31437796 410 high scalability-2008-10-13-SQL Server 2008 Database Performance and Scalability
19 0.31360766 11 high scalability-2007-07-15-Coyote Point Load Balancing Systems
20 0.31247017 305 high scalability-2008-04-21-Google App Engine - what about existing applications?