high_scalability high_scalability-2007 high_scalability-2007-151 knowledge-graph by maker-knowledge-mining

151 high scalability-2007-11-12-a8cjdbc - Database Clustering via JDBC


meta info for this blog

Source: html

Introduction: Hardly any software project nowadays could survive without a database (DBMS) backend storing all the business data that is vital to you and/or your customers. When projects grow larger, the amount of data usually grows exponentially. So you start moving the DBMS to a separate server to gain more speed and capacity. That is all good and healthy, but you do not gain any extra safety for this business data. You might be backing up your database once a day, so in case the database server crashes you don't lose EVERYTHING, but how much can you really afford to lose?  Well, clearly this depends on what kind of data you are storing. In our case the users of our solutions use our software products to do their everyday (all day) work. They have "everything" they need for their business stored in the database we are providing. So is 24 hours of data loss acceptable? No, not really. One hour? Maybe. But what we really want is a second database running with the EXACT same data.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Hardly any software project nowadays could survive without a database (DBMS) backend storing all the business data that is vital to you and/or your customers. [sent-1, score-0.757]

2 When projects grow larger, the amount of data usually grows exponentially. [sent-2, score-0.209]

3 So you start moving the DBMS to a separate server to gain more speed and capacity. [sent-3, score-0.163]

4 That is all good and healthy, but you do not gain any extra safety for this business data. [sent-4, score-0.626]

5 You might be backing up your database once a day, so in case the database server crashes you don't lose EVERYTHING, but how much can you really afford to lose? [sent-5, score-1.246]

6 Well, clearly this depends on what kind of data you are storing. [sent-6, score-0.166]

7 In our case the users of our solutions use our software products to do their everyday (all day) work. [sent-7, score-0.318]

8 They have "everything" they need for their business stored in the database we are providing. [sent-8, score-0.371]

9 But what we really want is a second database running with the EXACT same data. [sent-13, score-0.457]

10 We mostly use PostgreSQL, which does not have built-in database replication. [sent-14, score-0.239]

11 There are solutions based on triggers to replicate the data from one database to another. [sent-15, score-0.448]
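
As an aside, here is a minimal sketch of what such a trigger-based setup involves when driven over plain JDBC against PostgreSQL. The orders table, the log table and all names are hypothetical; this is a generic illustration, not the particular trigger solution the authors tried. Note that every replicated table needs its own trigger and log table, which is a big part of why the next point calls the setup complicated.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Sketch: install a trigger that records every change to a hypothetical
// "orders" table in a change-log table; a separate process would replay
// that log against the second database. One such trigger is needed per table.
public class TriggerReplicationSetup {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:postgresql://primary/mydb", "app", "secret");
             Statement st = con.createStatement()) {

            // Change-log table holding the rows still to be replicated
            st.execute("CREATE TABLE orders_log (" +
                       " id bigserial PRIMARY KEY," +
                       " op char(1) NOT NULL," +
                       " payload text NOT NULL)");

            // Trigger function: serialise the changed row into the log
            st.execute("CREATE FUNCTION log_orders() RETURNS trigger AS $$ " +
                       "BEGIN " +
                       "  INSERT INTO orders_log(op, payload) " +
                       "  VALUES (substr(TG_OP, 1, 1), NEW::text); " +
                       "  RETURN NEW; " +
                       "END; $$ LANGUAGE plpgsql");

            st.execute("CREATE TRIGGER orders_repl AFTER INSERT OR UPDATE " +
                       "ON orders FOR EACH ROW EXECUTE PROCEDURE log_orders()");
        }
    }
}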

12 We have learned that setting all this up on an existing database with plenty of tables is rather complicated, and changing the database structure afterwards cannot be done with simple create/alter statements anymore. [sent-16, score-1.153]

13 And since we ARE running solutions that constantly change and improve, we need to be able to deploy updates including database structure changes quickly and easily. [sent-17, score-0.653]

14 So what we really wanted was a transparent JDBC layer that does the replication for us. [sent-18, score-0.263]
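
The appeal of that approach can be sketched in a few lines: the application keeps issuing ordinary JDBC calls, and a thin wrapper underneath fans every write out to all backing databases. The class below is a simplified illustration under assumed names and URLs; it is not a8cjdbc's actual API.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch of a replicating JDBC layer: every write is executed against all
// backends so both databases hold the exact same data; reads need only one.
public class ReplicatingConnectionSketch {
    private final Connection primary;
    private final Connection replica;

    public ReplicatingConnectionSketch(String primaryUrl, String replicaUrl,
                                       String user, String pw) throws Exception {
        primary = DriverManager.getConnection(primaryUrl, user, pw);
        replica = DriverManager.getConnection(replicaUrl, user, pw);
    }

    // Writes go to every backend, keeping the replicas in lock step
    public int executeUpdate(String sql) throws Exception {
        int rows = 0;
        for (Connection con : new Connection[] { primary, replica }) {
            try (Statement st = con.createStatement()) {
                rows = st.executeUpdate(sql);
            }
        }
        return rows;
    }

    // Reads hit a single backend; a real driver could load-balance here
    public ResultSet executeQuery(String sql) throws Exception {
        return primary.createStatement().executeQuery(sql);
    }
}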

15 We tested a great solution called "Sequoia", but it is also a rather heavyweight product with a lot of features that did not really help in the performance department and that we didn't need anyway. [sent-19, score-0.42]

16 Key points: for backups, the ability to detach one server, do the backup on that machine, and then reattach the server; automatic and transparent failover / failsafe; fast In-VM replication with no serialisation; easy integration. Site: http://www. [sent-21, score-0.256]
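
To make the failover point concrete, here is a generic sketch of the retry logic that an automatic, transparent failover driver spares the application from writing by hand. The URLs are hypothetical; a real replicating driver performs this (and the detach/reattach bookkeeping for backups) internally, behind one logical connection.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Sketch: if the first backend is unreachable (down, or detached for a
// backup), fall back to the next one instead of failing the application.
public class FailoverConnectSketch {
    static final String[] URLS = {
        "jdbc:postgresql://db1/mydb",
        "jdbc:postgresql://db2/mydb"
    };

    public static Connection connect(String user, String pw) throws SQLException {
        SQLException last = null;
        for (String url : URLS) {
            try {
                return DriverManager.getConnection(url, user, pw);
            } catch (SQLException e) {
                last = e;  // remember the failure, try the next backend
            }
        }
        throw last;  // every backend failed
    }
}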


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('jdbc', 0.261), ('database', 0.239), ('safety', 0.211), ('dbms', 0.2), ('gain', 0.163), ('lose', 0.163), ('sequoia', 0.154), ('nowadays', 0.149), ('transparent', 0.146), ('afterwards', 0.145), ('multiply', 0.141), ('vital', 0.133), ('business', 0.132), ('practically', 0.131), ('everyday', 0.126), ('structure', 0.126), ('statements', 0.123), ('healthy', 0.12), ('larger', 0.118), ('triggers', 0.118), ('reads', 0.118), ('really', 0.117), ('department', 0.116), ('crashes', 0.111), ('ability', 0.11), ('backing', 0.109), ('acceptable', 0.105), ('survive', 0.104), ('solutions', 0.103), ('running', 0.101), ('rather', 0.1), ('afford', 0.099), ('driver', 0.097), ('waste', 0.095), ('exact', 0.094), ('postgresql', 0.092), ('grows', 0.091), ('plenty', 0.091), ('replicate', 0.091), ('backups', 0.09), ('setting', 0.09), ('case', 0.089), ('tested', 0.087), ('loss', 0.087), ('depends', 0.085), ('constantly', 0.084), ('hour', 0.082), ('everything', 0.081), ('clearly', 0.081), ('day', 0.08)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 151 high scalability-2007-11-12-a8cjdbc - Database Clustering via JDBC


2 0.18666673 102 high scalability-2007-09-27-Product: Sequoia Database Clustering Technology

Introduction: Sequoia is a transparent middleware solution offering clustering, load balancing and failover services for any database. Sequoia is the continuation of the C-JDBC project. The database is distributed and replicated among several nodes and Sequoia balances the queries among these nodes. Sequoia handles node and network failures with transparent failover. It also provides support for hot recovery, online maintenance operations and online upgrades. Features in a nutshell: No modification of existing applications or databases. Operational with any database providing a JDBC driver. High availability provided by advanced RAIDb technology. Transparent failover and recovery capabilities. Performance scalability with unique load balancing and query result caching features. Integrated JMX-based administration and monitoring. 100% Java implementation allowing portability across platforms with a JRE 1.4 or greater. Open source licensed under Apache v2 license. Professi

3 0.14072196 900 high scalability-2010-09-11-Google's Colossus Makes Search Real-time by Dumping MapReduce

Introduction: As the Kings of scaling, when Google changes its search infrastructure over to do something completely different, it's news. In Google search index splits with MapReduce, an exclusive interview by Cade Metz with Eisar Lipkovitz, a senior director of engineering at Google, we learn a bit more of the secret scaling sauce behind Google Instant, Google's new faster, real-time search system. The challenge for Google has been how to support a real-time world when the core of their search technology, the famous MapReduce, is batch oriented. Simple, they got rid of MapReduce. At least they got rid of MapReduce as the backbone for calculating search indexes. MapReduce still excels as a general query mechanism against masses of data, but real-time search requires a very specialized tool, and Google built one. Internally, the successor to Google's famed Google File System was code-named Colossus. Details are slowly coming out about their new goals and approach: Goal is to update the

4 0.1388111 1041 high scalability-2011-05-15-Building a Database remote availability site

Introduction: The AWS East Region outage showed all of us the importance of running our apps and databases across multiple Amazon regions (or multiple cloud providers). In this post, I’ll try to explain how to build a MySQL (or Amazon RDS) redundant site. For simplicity, we create a passive redundant site. This means that the site is not used during normal operation and only comes into action when the primary site crashes. There are many reasons for choosing such an architecture – it’s easy to configure, simple to understand, and minimizes the risk of data collision. The downside is that you have hardware just sitting around doing nothing. Still, it’s a common enough scenario. So what do we need to do to make it work? DATA SYNCHRONIZATION We need to synchronize the database. This is done by means of database replication. Now, there are two options for database replication: synchronous and asynchronous. Synchronous replication is great; it ensures that the backup database is identical to the

5 0.1330044 302 high scalability-2008-04-10-Mysql scalability and failover...

Introduction: Hi, I am the owner of a large community website and currently we are having problems with our database architecture. We are using 2 database servers and spread tables across them to divide read/writes. We have about 90% reads and 10% writes. We use Memcached on all our webservers to cache as much as we can, and traffic is load balanced between webservers. We have 2 extra servers ready to put to use! We have looked into a couple of solutions so far: Continuent Uni/Cluster aka Sequoia -> Commercial version way too expensive and Java isn't as fast as it's supposed to be. MySQL Proxy -> We couldn't find any good example on how to create a master-master with failover scenario. MySQL Clustering -> Seems not mature enough; we had a lot of performance issues when we tried to go online with it. MySQL DRBD HA -> Only good for failover, cannot be scaled! MySQL Replication -> Well don't get me started ;) So now I turn to you guys to help me out, I am with my hands in my hair a

6 0.12806454 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?

7 0.1231285 1508 high scalability-2013-08-28-Sean Hull's 20 Biggest Bottlenecks that Reduce and Slow Down Scalability

8 0.12108021 857 high scalability-2010-07-13-DbShards Part Deux - The Internals

9 0.11185152 1065 high scalability-2011-06-21-Running TPC-C on MySQL-RDS

10 0.11158547 1276 high scalability-2012-07-04-Top Features of a Scalable Database

11 0.10953024 1064 high scalability-2011-06-20-35+ Use Cases for Choosing Your Next NoSQL Database

12 0.10898114 589 high scalability-2009-05-05-Drop ACID and Think About Data

13 0.10790969 1589 high scalability-2014-02-03-How Google Backs Up the Internet Along With Exabytes of Other Data

14 0.10775528 297 high scalability-2008-04-05-Skype Plans for PostgreSQL to Scale to 1 Billion Users

15 0.10650861 331 high scalability-2008-05-27-eBay Architecture

16 0.10322217 1288 high scalability-2012-07-23-Ask HighScalability: How Do I Build My MegaUpload + Itunes + YouTube Startup?

17 0.10310515 70 high scalability-2007-08-22-How many machines do you need to run your site?

18 0.10163869 658 high scalability-2009-07-17-Against all the odds

19 0.10156133 313 high scalability-2008-05-02-Friends for Sale Architecture - A 300 Million Page View-Month Facebook RoR App

20 0.1009879 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.202), (1, 0.077), (2, -0.021), (3, -0.067), (4, 0.018), (5, 0.068), (6, 0.022), (7, -0.08), (8, 0.037), (9, -0.032), (10, -0.039), (11, 0.022), (12, -0.049), (13, 0.035), (14, 0.062), (15, 0.028), (16, 0.012), (17, 0.001), (18, -0.004), (19, 0.043), (20, 0.008), (21, -0.003), (22, -0.002), (23, 0.007), (24, 0.015), (25, 0.035), (26, -0.04), (27, -0.049), (28, 0.016), (29, 0.043), (30, -0.034), (31, 0.029), (32, -0.044), (33, -0.011), (34, -0.032), (35, 0.03), (36, -0.007), (37, 0.011), (38, 0.058), (39, 0.015), (40, 0.009), (41, -0.061), (42, -0.004), (43, 0.007), (44, 0.012), (45, -0.016), (46, 0.089), (47, 0.005), (48, -0.009), (49, 0.038)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97255254 151 high scalability-2007-11-12-a8cjdbc - Database Clustering via JDBC


2 0.83635849 297 high scalability-2008-04-05-Skype Plans for PostgreSQL to Scale to 1 Billion Users

Introduction: Skype uses PostgreSQL as their backend database. PostgreSQL doesn't get enough run in the database world so I was excited to see how PostgreSQL is used "as the main DB for most of [Skype's] business needs." Their approach is to use a traditional stored procedure interface for accessing data and on top of that layer proxy servers which hash SQL requests to a set of database servers that actually carry out queries. The result is a horizontally partitioned system that they think will scale to handle 1 billion users. Skype's goal is an architecture that can handle 1 billion plus users. This level of scale isn't practically solvable with one really big computer, so our masked superhero horizontal scaling comes to the rescue. Hardware is dual or quad Opterons with SCSI RAID. Followed common database progression: Start with one DB. Add new databases partitioned by functionality. Replicate read-mostly data for better read access. Then horizontally partition data across multiple nod
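
As a hedged aside, the hash-routing idea described in this excerpt can be sketched in a few lines; the shard URLs and keying scheme below are purely illustrative, not Skype's actual setup.

import java.util.List;

// Sketch: a proxy layer picks one of several database servers from a key
// in the request, so the same user always lands on the same shard.
public class HashRouterSketch {
    private final List<String> shards = List.of(
            "jdbc:postgresql://db0/app",
            "jdbc:postgresql://db1/app",
            "jdbc:postgresql://db2/app");

    public String shardFor(long userId) {
        // floorMod keeps the index non-negative for any userId
        int idx = (int) Math.floorMod(userId, (long) shards.size());
        return shards.get(idx);
    }
}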

3 0.82195133 782 high scalability-2010-02-23-When to migrate your database?

Introduction: Why migrate your database? Efficiency and availability problems are harming your business – reports are out of date, your batch processing window is nearing its limits, outages (unplanned/planned) frequently halt work. Database consolidation – remove the costs that result from a heterogeneous database environment (DBAs' time, database vendor pricing, database versions, hardware, OSs, patches, upgrades etc.). OK, so the driving forces for migration are clear; what now? Read more on BigDataMatters.com

4 0.81921685 380 high scalability-2008-09-05-Product: Tungsten Replicator

Introduction: With Tungsten Replicator Continuent is trying to deliver a better master/slave replication system. Their goal: scalability, reliability with seamless failover, no performance loss. From their website: The Tungsten Replicator implements open source database-neutral master/slave replication. Master/slave replication is a highly flexible technology that can solve a wide variety of problems including the following: * Availability - Failing over to a slave database if your master database dies * Performance Scaling - Spreading reads across many copies of data * Cross-Site Clustering - Maintaining active database replicas across WANs * Change Data Capture - Extracting changes to load data warehouses or update other systems * Zero Downtime Upgrade - Performing upgrades on a slave server which then becomes the master The Tungsten Replicator architecture is flexible and designed to support addition of new databases easily. It includes pluggable extractor and applier modules to help transf

5 0.79536659 1508 high scalability-2013-08-28-Sean Hull's 20 Biggest Bottlenecks that Reduce and Slow Down Scalability

Introduction: This article is a lightly edited version of 20 Obstacles to Scalability by Sean Hull (with permission) from the always excellent and thought-provoking ACM Queue. 1. TWO-PHASE COMMIT Normally when data is changed in a database, it is written both to memory and to disk. When a commit happens, a relational database makes a commitment to freeze the data somewhere on real storage media. Remember, memory doesn't survive a crash or reboot. Even if the data is cached in memory, the database still has to write it to disk. MySQL binary logs or Oracle redo logs fit the bill. With a MySQL cluster or distributed file system such as DRBD (Distributed Replicated Block Device) or Amazon Multi-AZ (Multi-Availability Zone), a commit occurs not only locally, but also at the remote end. A two-phase commit means waiting for an acknowledgment from the far end. Because of network and other latency, those commits can be slowed down by milliseconds, as though all the cars on a highway were slowe
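
As a toy illustration of the latency argument in this excerpt (the interfaces are illustrative, not any real database's internals): the coordinator below cannot declare a transaction committed until every participant has acknowledged both phases, so each phase costs at least one network round trip.

import java.util.List;

// Sketch of two-phase commit: phase 1 collects "yes" votes from every
// participant, phase 2 makes the commit durable everywhere. The blocking
// waits on remote acknowledgments are where the extra milliseconds go.
public class TwoPhaseCommitSketch {
    interface Participant {
        boolean prepare();  // phase 1: can you commit? (one round trip)
        void commit();      // phase 2: make it durable (another round trip)
        void rollback();
    }

    static boolean commitAll(List<Participant> participants) {
        for (Participant p : participants) {
            if (!p.prepare()) {           // any "no" vote aborts everyone
                participants.forEach(Participant::rollback);
                return false;
            }
        }
        participants.forEach(Participant::commit);
        return true;
    }
}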

6 0.78890598 351 high scalability-2008-07-16-The Mother of All Database Normalization Debates on Coding Horror

7 0.78249943 1041 high scalability-2011-05-15-Building a Database remote availability site

8 0.77158278 65 high scalability-2007-08-16-Scaling Secret #2: Denormalizing Your Way to Speed and Profit

9 0.76888567 847 high scalability-2010-06-23-Product: dbShards - Share Nothing. Shard Everything.

10 0.76853991 1065 high scalability-2011-06-21-Running TPC-C on MySQL-RDS

11 0.76709759 849 high scalability-2010-06-28-VoltDB Decapitates Six SQL Urban Myths and Delivers Internet Scale OLTP in the Process

12 0.76167148 671 high scalability-2009-08-05-Stack Overflow Architecture

13 0.75423133 817 high scalability-2010-04-29-Product: SciDB - A Science-Oriented DBMS at 100 Petabytes

14 0.75247777 675 high scalability-2009-08-08-1dbase vs. many and cloud hosting vs. dedicated server(s)?

15 0.75177217 1304 high scalability-2012-08-14-MemSQL Architecture - The Fast (MVCC, InMem, LockFree, CodeGen) and Familiar (SQL)

16 0.75130218 799 high scalability-2010-03-23-Digg: 4000% Performance Increase by Sorting in PHP Rather than MySQL

17 0.75033206 1597 high scalability-2014-02-17-How the AOL.com Architecture Evolved to 99.999% Availability, 8 Million Visitors Per Day, and 200,000 Requests Per Second

18 0.75016719 473 high scalability-2008-12-20-Second Life Architecture - The Grid

19 0.74861193 511 high scalability-2009-02-12-MySpace Architecture

20 0.74764633 235 high scalability-2008-02-02-The case against ORM Frameworks in High Scalability Architectures


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.18), (2, 0.267), (10, 0.063), (27, 0.012), (38, 0.098), (40, 0.026), (61, 0.086), (79, 0.12), (85, 0.056)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.98121786 1452 high scalability-2013-05-06-7 Not So Sexy Tips for Saving Money On Amazon

Introduction: Harish Ganesan, CTO of 8KMiles, has a very helpful blog, Cloud, Big Data and Mobile, where he shows a nice analytical bent which leads to a lot of practical advice and cost-saving tips: Use SQS Batch Requests to reduce the number of requests hitting SQS, which saves costs. Sending 10 messages in a single batch request in the example saves $30/month. Use SQS Long Polling to reduce extra polling requests, cutting down empty receives, which in the example saves ~$600 in empty receive leakage costs. Choose the right search technology to save costs in AWS by matching your activity pattern to the technology. For a small application with constant load, a heavily utilized search tier, or seasonal loads, Amazon Cloud Search looks like the cost-efficient play. Use Amazon CloudFront Price Class to minimize costs by selecting the right Price Class for your audience to potentially reduce delivery costs by excluding Amazon CloudFront’s more expensive edge locatio

same-blog 2 0.97034836 151 high scalability-2007-11-12-a8cjdbc - Database Clustering via JDBC


3 0.9670738 1121 high scalability-2011-09-21-5 Scalability Poisons and 3 Cloud Scalability Antidotes

Introduction: Sean Hull with two helpful posts: 5 Things That are Toxic to Scalability: Object Relational Mappers. Create complex queries that are hard to optimize and tweak. Synchronous, Serial, Coupled or Locking Processes. Locks are like stop signs; traffic circles keep the traffic flowing. Row level locking is better than table level locking. Use async replication. Use eventual consistency for clusters. One Copy of Your Database. A single database server is a choke point. Create parallel databases and let a driver select between them. Having No Metrics. Visualize what's happening to your system using one of the many monitoring packages. Lack of Feature Flags. Be able to turn off features via a flag so when a spike hits features can be turned off to reduce load. 3 Ways to Boost Cloud Scalability: Use Auto-scaling. Spin up new instances when a threshold is passed and back down again when traffic drops. Horizontally Scale the Database Tier. MySQL in a master

4 0.96323919 994 high scalability-2011-02-23-This stuff isn't taught, you learn it bit by bit as you solve each problem.

Introduction: "For the things we have to learn before we can do them, we learn by doing them." -- Aristotle A really nice Internet moment happened in the HackerNews thread Disqus: Scaling the World’s Largest Django Application, when David Kitchen crafted an awesome response to a question about how you learn to build scalable systems. It's so good I thought I would reproduce it here. Question: asked by grovulent: Not like this is a problem I have to worry about. But where on earth does one learn this stuff? The talk is useful - as an overview of what they use - but I know nothing of how to implement a single step. Answer: answered by David Kitchen of buro9: It's called experience. Which perhaps sounds rude, but it's not meant to be. This stuff isn't taught per se, you learn it bit by bit as you solve each problem that you face. I learned about HAProxy when my site load exceeded that which a single web server could manage.

5 0.95871323 340 high scalability-2008-06-06-Economies of Non-Scale

Introduction: Scalability forces us to think differently. What worked on a small scale doesn't always work on a large scale -- and costs are no different. If 90% of our application is free of contention, and only 10% is spent on shared resources, we will need to grow our compute resources by a factor of 100 to scale by a factor of 10! Another important thing to note is that 10x, in this case, is the limit of our ability to scale, even if more resources are added. 1. The cost of non-linearly scalable applications grows exponentially with the demand for more scale. 2. Non-linearly scalable applications have an absolute limit of scalability. According to Amdahl's Law, with 10% contention, the maximum scaling limit is 10. With 40% contention, our maximum scaling limit is 2.5 - no matter how many hardware resources we throw at the problem. This post discusses in further detail how to measure the true cost of non-linearly scalable systems and suggests a model for reducing that cost signifi
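
As a quick worked check of the numbers quoted in this excerpt (added here; not part of the original post), Amdahl's Law with serial fraction s and N-fold resources gives

S(N) = \frac{1}{s + (1 - s)/N}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{s}

so s = 0.1 caps the speedup at 1/0.1 = 10 and s = 0.4 caps it at 1/0.4 = 2.5, while at s = 0.1 and N = 100 the actual speedup is 1/(0.1 + 0.9/100) ≈ 9.2, roughly the hundredfold resources for tenfold scale claimed above.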

6 0.95809305 1401 high scalability-2013-02-06-Super Bowl Advertisers Ready for the Traffic? Nope..It's Lights Out.

7 0.95528787 274 high scalability-2008-03-12-YouTube Architecture

8 0.95478362 1000 high scalability-2011-03-08-Medialets Architecture - Defeating the Daunting Mobile Device Data Deluge

9 0.95464873 671 high scalability-2009-08-05-Stack Overflow Architecture

10 0.95456958 1041 high scalability-2011-05-15-Building a Database remote availability site

11 0.95456034 1118 high scalability-2011-09-19-Big Iron Returns with BigMemory

12 0.9543196 1511 high scalability-2013-09-04-Wide Fast SATA: the Recipe for Hot Performance

13 0.95425993 1008 high scalability-2011-03-22-Facebook's New Realtime Analytics System: HBase to Process 20 Billion Events Per Day

14 0.95407796 1470 high scalability-2013-06-05-A Simple 6 Step Transition Guide for Moving Away from X to AWS

15 0.95387262 1186 high scalability-2012-02-02-The Data-Scope Project - 6PB storage, 500GBytes-sec sequential IO, 20M IOPS, 130TFlops

16 0.9535324 1444 high scalability-2013-04-23-Facebook Secrets of Web Performance

17 0.95339912 160 high scalability-2007-11-19-Tailrank Architecture - Learn How to Track Memes Across the Entire Blogosphere

18 0.95323312 1177 high scalability-2012-01-19-Is it time to get rid of the Linux OS model in the cloud?

19 0.95275784 1519 high scalability-2013-09-18-If You're Programming a Cell Phone Like a Server You're Doing it Wrong

20 0.95204407 1075 high scalability-2011-07-07-Myth: Google Uses Server Farms So You Should Too - Resurrection of the Big-Ass Machines