high_scalability high_scalability-2007 high_scalability-2007-182 knowledge-graph by maker-knowledge-mining
Introduction: People sometimes wonder why Oracle isn't mentioned on this site more. Maybe it will be now: Michael Nygard reports that Oracle 11g does read/write splitting with its Active Data Guard product. Average replication latency was 1 second, and the splitting is accomplished with standard Oracle JDBC drivers. They see a 250% increase in transactions for the read-write service and a 110% improvement in tps for the read-only service. The new setup also changes the hardware architecture. They now recommend a primary and multiple standby servers, a single controller per server, and a single set of disks in RAID1. Previously the recommendation was a primary and a secondary server, with two controllers per server and a set of mirrored disks per controller. The changes increase performance, availability, and hardware utilization. They also have a useful-looking best practices document for high availability called Maximum Availability Architecture (MAA).
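To make the read/write splitting idea concrete, here is a minimal sketch in Python of the routing decision only. It is not how Active Data Guard or the Oracle JDBC drivers implement it; the `ReadWriteRouter` class and the `primary-db` and `standby-N` names are hypothetical, and the "starts with SELECT" test is a deliberately crude assumption about what counts as a read.

```python
from itertools import cycle


class ReadWriteRouter:
    """Send writes to the primary and spread reads across standbys.

    Illustrative only: the class and the server names used with it are
    hypothetical and do not correspond to any Oracle API.
    """

    def __init__(self, primary, standbys):
        self.primary = primary
        # With no standby configured, reads fall back to the primary.
        self._readers = cycle(standbys or [primary])

    def target_for(self, sql):
        text = sql.lstrip().lower()
        # Crude assumption: a statement is a read only if it starts with
        # SELECT and does not take row locks via FOR UPDATE.
        if text.startswith("select") and "for update" not in text:
            return next(self._readers)  # round-robin over the standbys
        return self.primary
```

For example, a router built as `ReadWriteRouter("primary-db", ["standby-1", "standby-2"])` alternates plain `SELECT` statements between the two standbys and sends everything else to `primary-db`. Given the roughly 1 second average replication latency reported above, a read served from a standby may be slightly stale, so reads that must see a just-committed write would still need to go to the primary.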
sentIndex sentText sentNum sentScore
1 People sometimes wonder why Oracle isn't mentioned on this site more. [sent-1, score-0.47]
2 Maybe it will now as Michael Nygard reports Oracle 11g now does read/write splitting with their Active Data Guard product. [sent-2, score-0.154]
3 Average replication latency was 1 second and it's accomplished with standard Oracle JDBC drivers. [sent-3, score-0.499]
4 They see a 250% increase in transactions per service for read-write service. [sent-4, score-0.63]
5 And a 110% improvement in tps for read-only service was found. [sent-5, score-0.439]
6 You see a change in hardware architecture with the new setup. [sent-6, score-0.379]
7 They now recommend using a primary and multiple standby servers, a single controller per server, and a single set of disks in RAID1. [sent-7, score-1.553]
8 Previously the recommendation was to have a primary and secondary server with two controllers per server and a set of mirrored disks per controller. [sent-8, score-1.997]
9 They also have a useful looking best practices document for High Availability called Maximum Availability Architecture (MAA) . [sent-10, score-0.462]
wordName wordTfidf (topN-words)
[('nygard', 0.249), ('disks', 0.247), ('mirrored', 0.224), ('primary', 0.215), ('availability', 0.212), ('jdbc', 0.211), ('standby', 0.205), ('per', 0.197), ('recommendation', 0.196), ('tps', 0.194), ('controllers', 0.192), ('oracle', 0.189), ('accomplished', 0.185), ('recommend', 0.172), ('increase', 0.16), ('controller', 0.16), ('splitting', 0.154), ('improvement', 0.146), ('wonder', 0.145), ('mentioned', 0.145), ('previously', 0.141), ('secondary', 0.138), ('document', 0.125), ('hardware', 0.125), ('practices', 0.124), ('sometimes', 0.12), ('server', 0.115), ('set', 0.113), ('maybe', 0.103), ('average', 0.1), ('service', 0.099), ('architecture', 0.098), ('standard', 0.098), ('single', 0.095), ('see', 0.092), ('useful', 0.086), ('replication', 0.084), ('transactions', 0.082), ('changes', 0.076), ('looking', 0.068), ('second', 0.067), ('latency', 0.065), ('change', 0.064), ('site', 0.06), ('best', 0.059), ('multiple', 0.054), ('people', 0.048), ('two', 0.048), ('servers', 0.041), ('high', 0.04)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 182 high scalability-2007-12-12-Oracle Can Do Read-Write Splitting Too
Introduction: People sometimes wonder why Oracle isn't mentioned on this site more. Maybe it will be now: Michael Nygard reports that Oracle 11g does read/write splitting with its Active Data Guard product. Average replication latency was 1 second, and the splitting is accomplished with standard Oracle JDBC drivers. They see a 250% increase in transactions for the read-write service and a 110% improvement in tps for the read-only service. The new setup also changes the hardware architecture. They now recommend a primary and multiple standby servers, a single controller per server, and a single set of disks in RAID1. Previously the recommendation was a primary and a secondary server, with two controllers per server and a set of mirrored disks per controller. The changes increase performance, availability, and hardware utilization. They also have a useful-looking best practices document for high availability called Maximum Availability Architecture (MAA).
Introduction: “Data is everywhere, never be at a single location. Not scalable, not maintainable.” –Alex Szalay While Galileo played life and death doctrinal games over the mysteries revealed by the telescope, another revolution went unnoticed, the microscope gave up mystery after mystery and nobody yet understood how subversive would be what it revealed. For the first time these new tools of perceptual augmentation allowed humans to peek behind the veil of appearance. A new new eye driving human invention and discovery for hundreds of years. Data is another material that hides, revealing itself only when we look at different scales and investigate its underlying patterns. If the universe is truly made of information, then we are looking into truly primal stuff. A new eye is needed for Data and an ambitious project called Data-scope aims to be the lens. A detailed paper on the Data-Scope tells more about what it is: The Data-Scope is a new scientific instrum
3 0.14035201 157 high scalability-2007-11-16-Product: lbpool - Load Balancing JDBC Pool
Introduction: From the website: The lbpool project provides a load balancing JDBC driver for use with DB connection pools. It wraps a normal JDBC driver providing reconnect semantics in the event of additional hardware availability, partial system failure, or uneven load distribution. It also evenly distributes all new connections among slave DB servers in a given pool. Each time connect() is called it will attempt to use the best server with the least system load. The biggest scalability issue with large applications that are mostly READ bound is the number of transactions per second that the disks in your cluster can handle. You can generally solve this in two ways. 1. Buy bigger and faster disks with expensive RAID controllers. 2. Buy CHEAP hardware on CHEAP disks but lots of machines. We prefer the cheap hardware approach and lbpool allows you to do this. Even if you *did* manage to use cheap hardware most load balancing hardware is expensive, requires a redundant balancer (if it
Introduction: This is a guest post by Rajkumar Iyer, a Member of Technical Staff at Aerospike. About a year ago, Aerospike embarked upon a quest to increase in-memory database performance - 1 Million TPS on a single inexpensive commodity server. NoSQL has the reputation of speed, and we saw great benefit from improving latency and throughput of cacheless architectures. At that time, we took a version of Aerospike delivering about 200K TPS, improved a few things - performance went to 500k TPS - and published the Aerospike 2.0 Community Edition. We then used kernel tuning techniques and published the recipe for how we achieved 1 M TPS on $5k of hardware. This year we continued the quest. Our goal was to achieve 1 Million database transactions per second per server; more than doubling previous performance. This compares to Cassandra’s boast of 1M TPS on over 300 servers in Google Compute Engine - at a cost of $2 million per year. We achieved this without kernel tuning. This article d
5 0.11292823 1279 high scalability-2012-07-09-Data Replication in NoSQL Databases
Introduction: This is the third guest post (part 1, part 2) of a series by Greg Lindahl, CTO of blekko, the spam free search engine. Previously, Greg was Founder and Distinguished Engineer at PathScale, at which he was the architect of the InfiniPath low-latency InfiniBand HCA, used to build tightly-coupled supercomputing clusters. blekko's home-grown NoSQL database was designed from the start to support a web-scale search engine, with 1,000s of servers and petabytes of disk. Data replication is a very important part of keeping the database up and serving queries. Like many NoSQL database authors, we decided to keep R=3 copies of each piece of data in the database, and not use RAID to improve reliability. The key goal we were shooting for was a database which degrades gracefully when there are many small failures over time, without needing human intervention. Why don't we like RAID for big NoSQL databases? Most big storage systems use RAID levels like 3, 4, 5, or 10 to improve relia
6 0.11181126 621 high scalability-2009-06-06-Graph server
7 0.10710015 102 high scalability-2007-09-27-Product: Sequoia Database Clustering Technology
8 0.10375573 1565 high scalability-2013-12-16-22 Recommendations for Building Effective High Traffic Web Software
9 0.10192687 1004 high scalability-2011-03-14-Twitter by the Numbers - 460,000 New Accounts and 140 Million Tweets Per Day
10 0.10050969 661 high scalability-2009-07-25-Latency is Everywhere and it Costs You Sales - How to Crush it
11 0.094953127 271 high scalability-2008-03-08-Product: DRBD - Distributed Replicated Block Device
12 0.094885491 1041 high scalability-2011-05-15-Building a Database remote availability site
13 0.094157077 331 high scalability-2008-05-27-eBay Architecture
14 0.09364666 383 high scalability-2008-09-10-Shard servers -- go big or small?
15 0.093086526 513 high scalability-2009-02-16-Handle 1 Billion Events Per Day Using a Memory Grid
16 0.091407329 274 high scalability-2008-03-12-YouTube Architecture
17 0.090314686 308 high scalability-2008-04-22-Simple NFS failover solution with symbolic link?
18 0.089763731 72 high scalability-2007-08-22-Wikimedia architecture
19 0.089409433 1180 high scalability-2012-01-24-The State of NoSQL in 2012
20 0.086691141 829 high scalability-2010-05-20-Strategy: Scale Writes to 734 Million Records Per Day Using Time Partitioning
topicId topicWeight
[(0, 0.145), (1, 0.046), (2, -0.03), (3, -0.056), (4, -0.027), (5, 0.03), (6, 0.034), (7, -0.035), (8, -0.009), (9, 0.006), (10, -0.014), (11, -0.033), (12, -0.009), (13, 0.014), (14, -0.007), (15, 0.07), (16, 0.033), (17, 0.027), (18, -0.007), (19, 0.031), (20, 0.02), (21, 0.044), (22, -0.034), (23, -0.069), (24, -0.044), (25, -0.049), (26, -0.042), (27, -0.045), (28, 0.033), (29, 0.041), (30, 0.056), (31, 0.001), (32, -0.01), (33, -0.054), (34, -0.013), (35, 0.064), (36, 0.024), (37, 0.017), (38, -0.037), (39, 0.017), (40, 0.022), (41, -0.059), (42, 0.038), (43, -0.006), (44, 0.045), (45, -0.02), (46, 0.002), (47, 0.004), (48, -0.021), (49, 0.014)]
simIndex simValue blogId blogTitle
same-blog 1 0.97231352 182 high scalability-2007-12-12-Oracle Can Do Read-Write Splitting Too
2 0.68522692 72 high scalability-2007-08-22-Wikimedia architecture
Introduction: Wikimedia is the platform on which Wikipedia, Wiktionary, and the other seven wiki dwarfs are built. This document is just excellent for the student trying to scale the heights of giant websites. It is full of details and innovative ideas that have been proven on some of the most used websites on the internet. Site: http://wikimedia.org/ Information Sources Wikimedia architecture http://meta.wikimedia.org/wiki/Wikimedia_servers scale-out vs scale-up in the from Oracle to MySQL blog. Platform Apache Linux MySQL PHP Squid LVS Lucene for Search Memcached for Distributed Object Cache Lighttpd Image Server The Stats 8 million articles spread over hundreds of language projects (english, dutch, ...) 10th busiest site in the world (source: Alexa) Exponential growth: doubling every 4-6 months in terms of visitors / traffic / servers 30 000 HTTP requests/s during peak-time 3 Gbit/s of data traffic 3 data centers: Tampa, A
3 0.68079692 1511 high scalability-2013-09-04-Wide Fast SATA: the Recipe for Hot Performance
Introduction: This is a guest post by Brian Bulkowski, CTO and co-founder of Aerospike, a leading clustered NoSQL database, who has worked in the area of high performance commodity systems since 1989. This blog post will tell you exactly how to build a multi-terabyte high throughput datacenter server. A fast, reliable multi-terabyte data tier can be used for recent behavior (messages, tweets, plays, actions), or anywhere that today you use Redis or Memcache. You need to know: which SSDs work, which chassis work, and how to configure your RAID cards. Intel’s SATA solutions, combined with a high capacity storage server like the Dell R720xd, a host bus adapter based on the LSI 2208, and a Flash optimized database like Aerospike, enable high throughput and low latency. In a wide configuration, with 12 to 20 drives per 2U server, individual servers can cost-effectively serve at high throughput with 16T at $2.50 per GB with the s3700, or $1.25 with the s3500. Other SSD of
4 0.66903108 68 high scalability-2007-08-20-TypePad Architecture
Introduction: TypePad is considered the largest paid blogging service in the world. After experiencing problems because of their meteoric growth, they eventually transitioned to an architecture patterned after their sister company, LiveJournal. Site: http://www.typepad.com/ The Platform MySQL Memcached Perl MogileFS Apache Linux The Stats As of 2005 TypePad sends 250mbps of traffic using multiple network pipes for 3TB of traffic a day. They were growing by 10-20% each month. I was unable to find more recent statistics. The Architecture Original Architecture: - Single server running Linux, Apache, Postgres, Perl, mod_perl - Storage was NFS on a filer. A Devastating Crash Caused a New Direction - A RAID controller failed and spewed data across all RAID disks. - The database was corrupted and the backups were corrupted. - Their redundant filers suffered from "split brain" syndrome. They moved to a LiveJournal-type architecture, which isn't surprising
5 0.66783798 430 high scalability-2008-10-26-Should you use a SAN to scale your architecture?
Introduction: This is a question everyone must struggle with when building out their datacenter. Storage choices are always the ones I have the least confidence in. David Marks, in his blog You Can Change It Later!, asks the question Should I get a SAN to scale my site architecture? and answers no. A better solution is to use commodity hardware, directly attach storage on servers, and partition across servers to scale and for greater availability. David's reasoning is interesting: A SAN creates a SPOF (single point of failure) that is dependent on a vendor to fly in and fix when there's a problem. This can lead to long downtimes, and during the outage you have no access to your data at all. Using easily available commodity hardware minimizes risks to your company, it's not just about saving money. Zooming over to Fry's to buy emergency equipment provides the kind of agility startups need in order to respond quickly to ever changing situations. It's hard to beat the power and flexibility (backup
6 0.66374308 513 high scalability-2009-02-16-Handle 1 Billion Events Per Day Using a Memory Grid
7 0.65684247 511 high scalability-2009-02-12-MySpace Architecture
8 0.65223515 392 high scalability-2008-09-24-Building a Scalable Architecture for Web Apps
9 0.64371592 1369 high scalability-2012-12-10-Switch your databases to Flash storage. Now. Or you're doing it wrong.
10 0.63970745 671 high scalability-2009-08-05-Stack Overflow Architecture
11 0.63574558 998 high scalability-2011-03-03-Stack Overflow Architecture Update - Now at 95 Million Page Views a Month
12 0.63501483 473 high scalability-2008-12-20-Second Life Architecture - The Grid
13 0.63462716 1372 high scalability-2012-12-14-Stuff The Internet Says On Scalability For December 14, 2012
15 0.62762296 1527 high scalability-2013-10-04-Stuff The Internet Says On Scalability For October 4th, 2013
16 0.6231854 1281 high scalability-2012-07-11-FictionPress: Publishing 6 Million Works of Fiction on the Web
17 0.62167007 1521 high scalability-2013-09-23-Salesforce Architecture - How they Handle 1.3 Billion Transactions a Day
18 0.62115139 157 high scalability-2007-11-16-Product: lbpool - Load Balancing JDBC Pool
19 0.61716199 679 high scalability-2009-08-11-13 Scalability Best Practices
20 0.61419886 1046 high scalability-2011-05-23-Evernote Architecture - 9 Million Users and 150 Million Requests a Day
topicId topicWeight
[(1, 0.233), (2, 0.047), (30, 0.544), (79, 0.049)]
simIndex simValue blogId blogTitle
1 0.98610604 14 high scalability-2007-07-15-Web Analytics: An Hour a Day
Introduction: Web Analytics: An Hour A Day is the first book by an in-the-trenches practitioner of web analytics. It provides a unique insider’s perspective of the challenges and opportunities that web analytics presents to each person who touches the Web in your organization. Rather than spamming you with metrics and definitions, Web Analytics: An Hour A Day will enhance your mindset and teach you how to fish for yourself. Avinash Kaushik is an expert in web analytics and author of the top-rated blog Occam’s Razor (http://www.kaushik.net/avinash). In this book, he goes beyond web analytics concepts and definitions to provide a step-by-step guide to implementing a successful web analytics strategy. His revolutionary approach to web analytics challenges prevalent thinking about the field and guides readers to a solution that will provide truly informed and actionable insights.
2 0.94127864 131 high scalability-2007-10-25-Should JSPs be avoided for high scalability?
Introduction: I just heard about some web sites where Velocity templates are used to render HTML instead of JSPs, and all the processing is performed in servlets. Can JSPs cause issues with scalability? Thanks, Unmesh
3 0.90638483 991 high scalability-2011-02-16-Paper: An Experimental Investigation of the Akamai Adaptive Video Streaming
Introduction: Video is hot on the Internet and people are really interested in knowing how to make it work. Dan Rayburn has a post pointing to a fascinating paper: An Experimental Investigation of the Akamai Adaptive Video Streaming, which talks in some detail about the protocols big players like YouTube, Skype and Akamai use to serve video over an inherently video-unfriendly medium like the Internet. For Akamai they found: Each video is encoded in five versions at different bit rates and stored in separate files. The client sends commands to the server with an average inter-departure time of about 2 s, i.e. the control algorithm is executed on average every 2 seconds. Akamai uses only the video level to adapt the video source to the available bandwidth, whereas the frame rate of the video is kept constant. When a sudden drop in the available bandwidth occurs, short interruptions of the video playback can occur due to a large actuation delay. For a sudden increase of the avai
4 0.9021852 1016 high scalability-2011-04-04-Scaling Social Ecommerce Architecture Case study
Introduction: A recent study showed that over 92 percent of executives from leading retailers are focusing their marketing efforts on Facebook and subsequent applications. Furthermore, over 71 percent of users have confirmed they are more likely to make a purchase after “liking” a brand they find online. (source) Sears Architect Tomer Gabel provides an insightful overview on how they built a Social Ecommerce solution for Sears.com that can handle complex relationship queries in real time. Tomer goes through: the architectural considerations behind their solution why they chose memory over disk how they partitioned the data to gain scalability why they chose to execute code with the data using GigaSpaces Map/Reduce execution framework how they integrated with Facebook why they chose GigaSpaces over Coherence and Terracotta for in-memory caching and scale In this post I tried to summarize the main takeaway from the interview. You can also watch the full interview (highly reco
5 0.86827952 16 high scalability-2007-07-16-Book: High Performance MySQL
Introduction: As users come to depend on MySQL, they find that they have to deal with issues of reliability, scalability, and performance--issues that are not well documented but are critical to a smoothly functioning site. This book is an insider's guide to these little understood topics. Author Jeremy Zawodny has managed large numbers of MySQL servers for mission-critical work at Yahoo!, maintained years of contacts with the MySQL AB team, and presents regularly at conferences. Jeremy and Derek have spent months experimenting, interviewing major users of MySQL, talking to MySQL AB, benchmarking, and writing some of their own tools in order to produce the information in this book. In High Performance MySQL you will learn about MySQL indexing and optimization in depth so you can make better use of these key features. You will learn practical replication, backup, and load-balancing strategies with information that goes beyond available tools to discuss their effects in real-life environments. And you
same-blog 6 0.86303705 182 high scalability-2007-12-12-Oracle Can Do Read-Write Splitting Too
7 0.86107528 1459 high scalability-2013-05-16-Paper: Warp: Multi-Key Transactions for Key-Value Stores
8 0.83038872 308 high scalability-2008-04-22-Simple NFS failover solution with symbolic link?
9 0.82036424 43 high scalability-2007-07-30-Product: ImageShack
10 0.81132579 500 high scalability-2009-01-22-Heterogeneous vs. Homogeneous System Architectures
11 0.77831173 334 high scalability-2008-05-29-Amazon Improves Diagonal Scaling Support with High-CPU Instances
12 0.73066986 44 high scalability-2007-07-30-Product: Photobucket
13 0.71851069 261 high scalability-2008-02-25-Make Your Site Run 10 Times Faster
14 0.70123094 783 high scalability-2010-02-24-Hot Scalability Links for February 24, 2010
15 0.69288903 263 high scalability-2008-02-27-Product: System Imager - Automate Deployment and Installs
16 0.68490624 336 high scalability-2008-05-31-Biggest Under Reported Story: Google's BigTable Costs 10 Times Less than Amazon's SimpleDB
17 0.67944545 831 high scalability-2010-05-26-End-To-End Performance Study of Cloud Services
18 0.67637116 1284 high scalability-2012-07-16-Cinchcast Architecture - Producing 1,500 Hours of Audio Every Day
19 0.60901701 788 high scalability-2010-03-04-How MySpace Tested Their Live Site with 1 Million Concurrent Users
20 0.60350817 291 high scalability-2008-03-29-20 New Rules for Faster Web Pages