high_scalability high_scalability-2007 high_scalability-2007-165 knowledge-graph by maker-knowledge-mining
Introduction: Hello all, does anyone have experience in scaling a European website to China? The main problem in China is the internet connectivity to websites outside China: latency and packet loss (and perhaps filtering) make things difficult. The options I see are: 1. Host your application in China, but where? I haven't got an answer from any Chinese ISP I contacted. On the other hand I don't really want to host in China. 2. Build your own CDN. Wikipedia shows how it goes. Get a bunch of machines (but where? see point 1), put Squid on them, implement intelligent cache invalidation and you're set. But where can I get machines in China? Where do I need them in China? There are some big ISPs with limited peering capability, so I'd need servers in every network. 3. Get professional CDN services. Akamai, ChinaCache, CDNetworks, etc. They all provide services in China. The problem is: they are all very expensive. 4. Amazon EC2/S3? Is it worth thinking about this
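For option 2, the "intelligent cache invalidation" step could be sketched roughly as below: when a page changes on the origin, send an HTTP PURGE for that URL to every edge cache. This is only a sketch under assumptions, not how Wikipedia does it: the edge hostnames are made up, the caches are assumed to be Squid running as a reverse proxy on port 80, and each Squid is assumed to be configured to accept PURGE from the origin's address.

```python
import http.client
from urllib.parse import urlsplit

# Hypothetical edge caches, one per Chinese network; replace with real hosts.
EDGE_NODES = ["edge-a.example.net", "edge-b.example.net"]


def build_purge_requests(url, edge_nodes):
    """Return one (edge_host, path, headers) tuple per edge for an origin URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # The Host header tells the reverse proxy which site the object belongs to.
    return [(edge, path, {"Host": parts.netloc}) for edge in edge_nodes]


def purge(url, edge_nodes=EDGE_NODES, timeout=5.0):
    """Send PURGE to every edge; return {edge: HTTP status or error text}."""
    results = {}
    for edge, path, headers in build_purge_requests(url, edge_nodes):
        conn = http.client.HTTPConnection(edge, 80, timeout=timeout)
        try:
            conn.request("PURGE", path, headers=headers)
            results[edge] = conn.getresponse().status
        except (OSError, http.client.HTTPException) as exc:
            # An unreachable edge must not block invalidation on the others.
            results[edge] = f"error: {exc}"
        finally:
            conn.close()
    return results
```

Calling `purge("http://www.example.com/style.css")` after a deploy would then report a status per edge. Squid rejects PURGE unless its access rules allow it, so the edges would need a matching rule, and failed edges should be retried, given the packet loss described above.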
sentIndex sentText sentNum sentScore
1 Hello all, does anyone have experience in scaling a European website to China? [sent-1, score-0.128]
2 The main problem in China is the internet connectivity to websites outside China: latency and packet loss (and perhaps filtering) make things difficult. [sent-2, score-1.237]
3 I haven't got an answer from any Chinese ISP I contacted. [sent-5, score-0.239]
4 On the other hand I don't really want to host in China. [sent-6, score-0.182]
5 see point 1), put Squid on them, implement intelligent cache invalidation and you're set. [sent-11, score-0.393]
6 There are some big ISPs with limited peering capability, so I'd need servers in every network. [sent-14, score-0.335]
7 My favourite way: Rent a bunch of Linux servers in 4-5 big cities in China in different networks and build my own CDN. [sent-27, score-1.373]
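With one rented server per network, each visitor has to be sent to the edge inside their own ISP, since peering between the big ISPs is limited. A minimal sketch of that lookup follows. Everything in it is an assumption for illustration: the address ranges are documentation placeholders (RFC 5737), not real Chinese ISP prefixes, and the hostnames are hypothetical.

```python
import ipaddress

# Hypothetical prefix-to-edge table; real ISP prefixes would come from
# routing data, not from this placeholder list.
NETWORK_TO_EDGE = {
    "192.0.2.0/24": "edge-a.example.net",
    "198.51.100.0/24": "edge-b.example.net",
}
DEFAULT_EDGE = "origin.example.net"  # fallback for clients in no known network


def pick_edge(client_ip, table=NETWORK_TO_EDGE, default=DEFAULT_EDGE):
    """Return the edge for the most specific prefix containing client_ip."""
    addr = ipaddress.ip_address(client_ip)
    best = None
    for cidr, edge in table.items():
        net = ipaddress.ip_network(cidr)
        # Prefer the longest matching prefix when ranges overlap.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, edge)
    return best[1] if best else default
```

A DNS server or an HTTP redirector could call `pick_edge` with the client's address. The linear scan is fine for a handful of prefixes; a large table would want a prefix tree instead.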
wordName wordTfidf (topN-words)
[('china', 0.775), ('connectivity', 0.183), ('bunch', 0.148), ('cdnetworks', 0.146), ('favourite', 0.146), ('chinese', 0.135), ('ireland', 0.135), ('isps', 0.114), ('host', 0.112), ('hello', 0.111), ('peering', 0.111), ('invalidation', 0.106), ('isp', 0.104), ('cities', 0.102), ('squid', 0.101), ('rent', 0.101), ('wikipedia', 0.099), ('akamai', 0.095), ('stuck', 0.086), ('filtering', 0.086), ('intelligent', 0.085), ('machines', 0.084), ('capability', 0.081), ('european', 0.072), ('hand', 0.07), ('perhaps', 0.062), ('professional', 0.061), ('cdn', 0.061), ('outside', 0.06), ('options', 0.059), ('limited', 0.056), ('get', 0.056), ('anyone', 0.056), ('problem', 0.055), ('etc', 0.055), ('shows', 0.054), ('big', 0.054), ('build', 0.054), ('answer', 0.053), ('main', 0.053), ('thinking', 0.053), ('worth', 0.052), ('got', 0.051), ('implement', 0.051), ('see', 0.05), ('websites', 0.049), ('linux', 0.048), ('way', 0.046), ('networks', 0.046), ('sure', 0.045)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 165 high scalability-2007-11-26-Scale to China
2 0.13031754 1585 high scalability-2014-01-24-Stuff The Internet Says On Scalability For January 24th, 2014
Introduction: Hey, it's HighScalability time: Gorgeous image from Scientific American's Your Brain by the Numbers Quotable Quotes: @jezhumble : Google does everything off trunk despite 10k devs across 40 offices. @KentLangley : "in 2016. When it goes online, the SKA is expected to produce 700 terabytes of data each day" Jonathan Marks : It's actually a talk about how NOT to be creative. And what he [John Cleese] describes is the way most international broadcasters operated for most of their existence. They were content factories, slave to an artificial transmission schedule. Because they didn't take time to be creative, then ended up sounding like a tape machine. They were run by a computer algorithm. Not a human soul. There was never room for a creative pause. Routine was the solution. And that's creativities biggest enemy. 40% better single-threaded performance in MariaDB . Using perf, cache misses were found and the fix was using the righ
3 0.12070976 647 high scalability-2009-07-02-Hypertable is a New BigTable Clone that Runs on HDFS or KFS
Introduction: Update 3 : Presentation from the NoSQL conference : slides , video 1 , video 2 . Update 2 : The folks at Hypertable would like you to know that Hypertable is now officially sponsored by Baidu , China’s Leading Search Engine. As a sponsor of Hypertable, Baidu has committed an industrious team of engineers, numerous servers, and support resources to improve the quality and development of the open source technology. Update : InfoQ interview on Hypertable Lead Discusses Hadoop and Distributed Databases . Hypertable differs from HBase in that it is a higher performance implementation of Bigtable. Skrentablog gives the heads up on Hypertable , Zvents' open-source BigTable clone. It's written in C++ and can run on top of either HDFS or KFS. Performance looks encouraging at 28M rows of data inserted at a per-node write rate of 7mb/sec .
4 0.090960555 1100 high scalability-2011-08-18-Paper: The Akamai Network - 61,000 servers, 1,000 networks, 70 countries
Introduction: Update : as of the end of Q2 2011, Akamai had 95,811 servers deployed globally. Akamai is the CDN to the stars. It claims to deliver between 15 and 30 percent of all Web traffic, with major customers like Facebook, Twitter, Apple, and the US military. Traditionally quite secretive, we get a peek behind the curtain in this paper: The Akamai Network: A Platform for High-Performance Internet Applications by Erik Nygren, Ramesh Sitaraman, and Jennifer Sun. Abstract: Comprising more than 61,000 servers located across nearly 1,000 networks in 70 countries worldwide, the Akamai platform delivers hundreds of billions of Internet interactions daily, helping thousands of enterprises boost the performance and reliability of their Internet applications. In this paper, we give an overview of the components and capabilities of this large-scale distributed computing platform, and offer some insight into its architecture, design principles, operation, and management. Delivering a
5 0.085764974 1068 high scalability-2011-06-27-TripAdvisor Architecture - 40M Visitors, 200M Dynamic Page Views, 30TB Data
Introduction: This is a guest post by Andy Gelfond , VP of Engineering for TripAdvisor. Andy has been with TripAdvisor for six and a half years, wrote a lot of code in the earlier days, and has been building and running a first class engineering and operations team that is responsible for the worlds largest travel site. There's an update for this article at An Epic TripAdvisor Update: Why Not Run On The Cloud? The Grand Experiment . For TripAdvisor , scalability is woven into our organization on many levels - data center, software architecture, development/deployment/operations, and, most importantly, within the culture and organization. It is not enough to have a scalable data center, or a scalable software architecture. The process of designing, coding, testing, and deploying code also needs to be scalable. All of this starts with hiring and a culture and an organization that values and supports a distributed, fast, and effective development and operation of a complex and highly scalable co
6 0.071463227 662 high scalability-2009-07-27-Handle 700 Percent More Requests Using Squid and APC Cache
7 0.069192141 134 high scalability-2007-10-26-Paper: Wikipedia's Site Internals, Configuration, Code Examples and Management Issues
8 0.065012828 1365 high scalability-2012-11-30-Stuff The Internet Says On Scalability For November 30, 2012
9 0.064553708 60 high scalability-2007-08-07-Can you profit from the coming Content Delivery Network wars?
10 0.062711224 1115 high scalability-2011-09-14-Big List of Scalabilty Conferences
12 0.062399149 74 high scalability-2007-08-23-Product: Varnish
13 0.061838999 200 high scalability-2008-01-02-WEB hosting Select
14 0.061804567 169 high scalability-2007-12-01-many website, one setup, many databases
15 0.060920428 576 high scalability-2009-04-21-What CDN would you recommend?
16 0.060636841 1106 high scalability-2011-08-26-Stuff The Internet Says On Scalability For August 26, 2011
17 0.060370389 1223 high scalability-2012-04-06-Stuff The Internet Says On Scalability For April 6, 2012
18 0.059589945 72 high scalability-2007-08-22-Wikimedia architecture
20 0.059478994 1431 high scalability-2013-03-29-Stuff The Internet Says On Scalability For March 29, 2013
topicId topicWeight
[(0, 0.088), (1, 0.025), (2, 0.002), (3, -0.035), (4, -0.014), (5, -0.046), (6, -0.024), (7, -0.0), (8, 0.005), (9, 0.011), (10, -0.029), (11, -0.027), (12, -0.017), (13, -0.01), (14, 0.019), (15, 0.001), (16, 0.01), (17, 0.005), (18, -0.028), (19, -0.02), (20, -0.023), (21, 0.023), (22, 0.036), (23, 0.025), (24, -0.007), (25, 0.014), (26, 0.044), (27, 0.001), (28, -0.018), (29, 0.008), (30, 0.008), (31, 0.002), (32, -0.0), (33, -0.011), (34, 0.005), (35, 0.018), (36, 0.028), (37, 0.017), (38, 0.007), (39, 0.023), (40, -0.002), (41, 0.01), (42, -0.013), (43, -0.034), (44, 0.006), (45, 0.025), (46, -0.021), (47, 0.034), (48, -0.016), (49, -0.039)]
simIndex simValue blogId blogTitle
same-blog 1 0.93163693 165 high scalability-2007-11-26-Scale to China
2 0.68692416 181 high scalability-2007-12-11-Hosting and CDN for startup video sharing site
Introduction: This question is for all the gurus here. Please help this novice x I am starting a video sharing site like YouTube in India. I want to offer the best quality possible, at minimum cost. Nothing new about it, right? :). I have done some research on the dedicated hosting services and CDN services available and I have some basic knowledge on these. Following are my requirements 1) My budget is $500 to $1000 per month for hosting (including CDN if and as applicable). 2) I will need around 500GB of storage and 1TB per month of bandwidth in first 2-3 months and then about 10TB of storage and 5TB per month of bandwidth. And more ... depending on how big it gets (I can afford more when it gets big) 3) 90% of my viewers are in India. Other 10% are in US and UK. Based on the above, could you please answer my following questions? 1) Can I go with just a good dedicated server to start with and get a CDN service later on when the site gets big? Or do you think its wise
3 0.67044902 267 high scalability-2008-03-05-Oprah is the Real Social Network
Introduction: A lot of new internet TV station startups are in the wind these days and there's a question about how they can scale their broadcasts. Today's state of the art shows you can't yet mimic the reach of broadcast TV with internet tech. But as Oprah proves, you can still capture a lot of eyeballs, if you are Oprah... Oprah drew a stunning 500,000 simultaneous viewers for an Eckhart Tolle webcast. Move Networks and Limelight Networks hosted the "broadcast" where traffic peaked at 242Gbps. A variable bitrate scheme was used so depending on their connection, a viewer could have seen 150Kbps or as high as 750Kbps. Dan Rayburn thinks The big take away from this webcast is that it shows proof that the Internet is not built to handle TV like distribution and those who think that live TV shows will be broadcast on the Internet with millions and millions of people watching, it's just not going to happen. To handle more users comments suggested capping the bitrate at 300K, using P2P
4 0.66140139 375 high scalability-2008-09-01-A Scalability checklist?
Introduction: Hi everyone, I'm researching on Scalability for a college paper, and found this site great, but it has too many tips, articles and the like, but I can't see a hierarchical organization of subjects, I would need something like a checklist of things or fields, or technologies to take into account when assesing scalability. So far I've identified these: - Hardware scalability: - scale out - scale up - Cache What types of cache are there? app-level, os-level, network-level, I/O-level? - Load Balancing - DB Clustering Am I missing something important? (I'm sure I am) I don't expect you to give a lecture here, but maybe point some things out, give me some useful links... Thanks!
5 0.65874726 120 high scalability-2007-10-11-How Flickr Handles Moving You to Another Shard
Introduction: Colin Charles has cool picture showing Flickr's message telling him they'll need about 15 minutes to move his 11,500 images to another shard. One, that's a lot of pictures! Two, it just goes to show you don't have to make this stuff complicated. Sure, it might be nice if their infrastructure could auto-balance shards with no down time and no loss of performance, but do you really need to go to all the extra complexity? The manual system works and though Colin would probably like his service to have been up, I am sure his day will still be a pleasant one.
6 0.64530069 193 high scalability-2007-12-26-Finding an excellent LAMP developer
7 0.64115793 60 high scalability-2007-08-07-Can you profit from the coming Content Delivery Network wars?
8 0.63625115 1070 high scalability-2011-06-29-Second Hand Seizure : A New Cause of Site Death
9 0.63373345 198 high scalability-2008-01-01-HOW CDN works
10 0.63367522 488 high scalability-2009-01-08-file synchronization solutions
11 0.63095427 1131 high scalability-2011-10-24-StackExchange Architecture Updates - Running Smoothly, Amazon 4x More Expensive
12 0.62370056 1402 high scalability-2013-02-07-Ask HighScalability: Web asset server concept - 3rd party software available?
13 0.62010902 1037 high scalability-2011-05-10-Viddler Architecture - 7 Million Embeds a Day and 1500 Req-Sec Peak
14 0.61329657 266 high scalability-2008-03-04-Manage Downtime Risk by Connecting Multiple Data Centers into a Secure Virtual LAN
15 0.61167252 1552 high scalability-2013-11-22-Stuff The Internet Says On Scalability For November 22th, 2013
17 0.60841638 23 high scalability-2007-07-24-Major Websites Down: Or Why You Want to Run in Two or More Data Centers.
18 0.60707313 576 high scalability-2009-04-21-What CDN would you recommend?
19 0.60080105 1584 high scalability-2014-01-22-How would you build the next Internet? Loons, Drones, Copters, Satellites, or Something Else?
20 0.59937316 1506 high scalability-2013-08-23-Stuff The Internet Says On Scalability For August 23, 2013
topicId topicWeight
[(1, 0.082), (2, 0.169), (10, 0.029), (56, 0.041), (61, 0.045), (64, 0.267), (79, 0.225)]
simIndex simValue blogId blogTitle
1 0.91059041 1202 high scalability-2012-03-01-Grace Hopper to Programmers: Mind Your Nanoseconds!
Introduction: Computing pioneer Grace Hopper , inventor of the compiler , searched for a concrete way to create an intuitive understanding of just how fast is a nanosecond, a billionth of a second, which was the speed of their new computer circuits. As an illustration she settled on the length of wire that is as long as light can travel in one nanosecond. The length is a very portable 11.8 inches . A microseconds worth of wire is a still portable, but a much bulkier 984 feet. In one millisecond light travels 186 miles, which only Hercules could carry. In today's terms, at a 3.06 GHz clock speed , there's .33 nanoseconds between ticks, or 3.73 inches of light travel. Understanding the profligate ways of programmers, she suggests that every programmer wear a necklace of a microseconds worth of wire so they know what they are wasting when they throw away microseconds. And if a General is busting your chops about satellite messages taking too long to send, you can bust out your piece of wire and e
2 0.9039889 1560 high scalability-2013-12-09-In Memory: Grace Hopper to Programmers: Mind Your Nanoseconds!
Introduction: This is an article published last year, but as today is Grace Hopper's birthday I thought it would be a good time to share again an amazing talk from this amazing woman. Computing pioneer Grace Hopper , inventor of the compiler , searched for a concrete way to create an intuitive understanding of just how fast is a nanosecond, a billionth of a second, which was the speed of their new computer circuits. As an illustration she settled on the length of wire that is as long as light can travel in one nanosecond. The length is a very portable 11.8 inches . A microseconds worth of wire is a still portable, but a much bulkier 984 feet. In one millisecond light travels 186 miles, which only Hercules could carry. In today's terms, at a 3.06 GHz clock speed , there's .33 nanoseconds between ticks, or 3.73 inches of light travel. Understanding the profligate ways of programmers, she suggests that every programmer wear a necklace of a microseconds worth of wire so they know what they are
same-blog 3 0.89674497 165 high scalability-2007-11-26-Scale to China
4 0.79394823 914 high scalability-2010-10-04-Paper: An Analysis of Linux Scalability to Many Cores
Introduction: An Analysis of Linux Scalability to Many Cores , by a number of MIT researchers, is a refreshingly practical paper on what it takes to scale Linux and common applications like Exim, memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce to run on 48 core systems. A very timely paper given moderately massive multicore systems are reportedly the near future of computing. This paper must have taken a lot of work. They both tracked down bottlenecks in a number of applications and the Linux kernel and they also tried to fix them. Modestly speaking the authors said they made "modest" changes to the kernel and applications, but there's nothing modest about what they did. It's excellent work. After the next bit, which is the abstract, there is a list of the problems they found and how they fixed them. The abstract: This paper analyzes the scalability of seven system applications (Exim, memcached, Apache, PostgreSQL, gmake, Psearchy, and MapReduce) running on Linux on a 48-core
5 0.73325098 1420 high scalability-2013-03-08-Stuff The Internet Says On Scalability For March 8, 2013
Introduction: Hey, it's HighScalability time: Quotable Quotes: @ibogost : Disabling features of SimCity due to ineffective central infrastructure is probably the most realistic simulation of the modern city. antirez : The point is simply to show how SSDs can't be considered, currently, as a bit slower version of memory. Their performance characteristics are a lot more about, simply, "faster disks". @jessenoller : I only use JavaScript so I can gain maximum scalability across multiple cores. Also unicorns. Paint thinner gingerbread @liammclennan : high-scalability ruby. Why bother? @scomma : Problem with BitCoin is not scalability, not even usability. It's whether someone will crack the algorithm and render BTC entirely useless. @webclimber : Amazing how often I find myself explaining that scalability is not magical @mvmsan : Flash as Primary Storage - Highest Cost, Lack of HA, scalability and management features #flas
6 0.73277092 1494 high scalability-2013-07-19-Stuff The Internet Says On Scalability For July 19, 2013
7 0.73219764 1403 high scalability-2013-02-08-Stuff The Internet Says On Scalability For February 8, 2013
8 0.73127645 871 high scalability-2010-08-04-Dremel: Interactive Analysis of Web-Scale Datasets - Data as a Programming Paradigm
9 0.73110467 1048 high scalability-2011-05-27-Stuff The Internet Says On Scalability For May 27, 2011
10 0.72979569 786 high scalability-2010-03-02-Using the Ambient Cloud as an Application Runtime
11 0.72953081 448 high scalability-2008-11-22-Google Architecture
12 0.72949505 680 high scalability-2009-08-13-Reconnoiter - Large-Scale Trending and Fault-Detection
13 0.72672129 1485 high scalability-2013-07-01-PRISM: The Amazingly Low Cost of Using BigData to Know More About You in Under a Minute
14 0.7230112 627 high scalability-2009-06-11-Yahoo! Distribution of Hadoop
15 0.72283143 289 high scalability-2008-03-27-Amazon Announces Static IP Addresses and Multiple Datacenter Operation
16 0.71880412 1328 high scalability-2012-09-24-Google Spanner's Most Surprising Revelation: NoSQL is Out and NewSQL is In
17 0.71531123 867 high scalability-2010-07-27-YeSQL: An Overview of the Various Query Semantics in the Post Only-SQL World
18 0.71455884 601 high scalability-2009-05-17-Product: Hadoop
19 0.71432936 1476 high scalability-2013-06-14-Stuff The Internet Says On Scalability For June 14, 2013
20 0.71376091 323 high scalability-2008-05-19-Twitter as a scalability case study