high_scalability high_scalability-2007 high_scalability-2007-168 knowledge-graph by maker-knowledge-mining

168 high scalability-2007-11-30-Strategy: Efficiently Geo-referencing IPs

meta infos for this blog

Source: html

Introduction: A lot of apps need to map IP addresses to locations. Jeremy Cole, in On efficiently geo-referencing IPs with MaxMind GeoIP and MySQL GIS, succinctly explains the many uses for such a feature: "Geo-referencing IPs is, in a nutshell, converting an IP address, perhaps from an incoming web visitor, a log file, a data file, or some other place, into the name of some entity owning that IP address. There are a lot of reasons you may want to geo-reference IP addresses to country, city, etc., such as in simple ad targeting systems, geographic load balancing, web analytics, and many more applications." This is difficult to do efficiently; at least, it gives me a bit of brain freeze. In the same post Jeremy nicely explains where to get the geo-referencing data, how to load it, and how different approaches to IP address searching perform. It's a great practical introduction to the subject.
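
One common way to make this kind of lookup efficient is to convert each address to an integer and binary-search a sorted list of range starts. The sketch below illustrates that idea only; it is not the MySQL GIS approach from Jeremy's post. The ranges and labels are hypothetical (they use the reserved documentation address blocks rather than real GeoIP data), and the function names are invented for this example.

```python
import bisect
import ipaddress

# Hypothetical sample data: (first IP, last IP, label). Real data would come
# from a GeoIP database; these are reserved documentation ranges.
SAMPLE_RANGES = [
    ("203.0.113.0", "203.0.113.255", "CC"),
    ("192.0.2.0", "192.0.2.255", "AA"),
    ("198.51.100.0", "198.51.100.255", "BB"),
]


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


def build_index(ranges):
    """Return (starts, rows): rows sorted by range start, plus the start keys."""
    rows = sorted((ip_to_int(lo), ip_to_int(hi), label) for lo, hi, label in ranges)
    starts = [row[0] for row in rows]
    return starts, rows


def lookup(starts, rows, ip):
    """Find the label of the range containing ip, or None if no range matches.

    Assumes the ranges do not overlap, so the only candidate is the range
    with the greatest start that is <= the address.
    """
    n = ip_to_int(ip)
    i = bisect.bisect_right(starts, n) - 1
    if i >= 0 and rows[i][1] >= n:
        return rows[i][2]
    return None


starts, rows = build_index(SAMPLE_RANGES)
print(lookup(starts, rows, "192.0.2.17"))
print(lookup(starts, rows, "8.8.8.8"))
```

Each lookup costs O(log n) comparisons instead of a scan over every range, which is the property an indexed database query would also aim for. The end-of-range check matters: without it, an address that falls in a gap between two ranges would be wrongly assigned to the preceding range.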

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 A lot of apps need to map IP addresses to locations. [sent-1, score-0.437]

2 There are a lot of reasons you may want to geo-reference IP addresses to country, city, etc. [sent-3, score-0.367]

3 , such as in simple ad targeting systems, geographic load balancing, web analytics, and many more applications. [sent-4, score-0.552]

4 This is difficult to do efficiently; at least it gives me a bit of brain freeze. [sent-5, score-0.387]

5 In the same post Jeremy nicely explains where to get the geo-referencing data, how to load data, and the performance of different approaches for IP address searching. [sent-6, score-0.663]

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('ip', 0.468), ('ips', 0.279), ('jeremy', 0.269), ('geoip', 0.219), ('addresses', 0.212), ('nutshell', 0.206), ('explains', 0.203), ('cole', 0.174), ('owning', 0.17), ('efficiently', 0.16), ('visitor', 0.143), ('targeting', 0.143), ('country', 0.141), ('converting', 0.138), ('address', 0.137), ('geographic', 0.131), ('city', 0.126), ('entity', 0.121), ('nicely', 0.114), ('file', 0.108), ('reference', 0.108), ('subject', 0.107), ('brain', 0.105), ('introduction', 0.103), ('incoming', 0.103), ('ad', 0.097), ('map', 0.095), ('approaches', 0.094), ('practical', 0.094), ('reasons', 0.089), ('perhaps', 0.088), ('name', 0.083), ('analytics', 0.077), ('difficult', 0.075), ('balancing', 0.073), ('bit', 0.073), ('log', 0.073), ('load', 0.071), ('least', 0.071), ('place', 0.069), ('feature', 0.066), ('lot', 0.066), ('apps', 0.064), ('gives', 0.063), ('many', 0.056), ('web', 0.054), ('uses', 0.051), ('data', 0.049), ('mysql', 0.048), ('post', 0.044)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 168 high scalability-2007-11-30-Strategy: Efficiently Geo-referencing IPs

2 0.37850624 289 high scalability-2008-03-27-Amazon Announces Static IP Addresses and Multiple Datacenter Operation

Introduction: Amazon is fixing two of their major problems: no static IP addresses and single datacenter operation. By adding these two new features, developers can finally build a no-apology system on Amazon. Before, you always had to throw in an apology or two. No, we don't have low failover times because of the silly DNS games and unacceptable DNS update and propagation times and no, we don't operate in more than one datacenter. No more. Now Amazon is adding Elastic IP Addresses and Availability Zones. Elastic IP addresses are far better than normal IP addresses because they are both in tight with Jessica Alba and they are: Static IP addresses designed for dynamic cloud computing. An Elastic IP address is associated with your account, not a particular instance, and you control that address until you choose to explicitly release it. Unlike traditional static IP addresses, however, Elastic IP addresses allow you to mask instance or availability zone failures by programmatica

3 0.14699973 290 high scalability-2008-03-28-How to Get DNS Names of a Web Server

Introduction: For some special reason, I'm trying to make a web server able to get all the DNS names mapped to its IP. Let me explain more: I'm creating a website that will run in a web farm, and every web server in the farm will have some subdomains mapped to its IP. What I want is that, whenever my application starts on a web server, it is able to get all the subdomains mapped/assigned to that server, e.g. sub1.mydomain.com, sub2.mydomain.com. I understand that I have to use reverse DNS lookup (i.e. give the IP, get the domain name), but I also want to get all the subdomains, not just the first one that maps to that IP. I've been reading about DNS on the internet but I can't seem to find any information on how to achieve what I want. Normally you use DNS to get the IP of a domain, but I'm not sure that all servers enable reverse lookup. The problem is that I'm still not sure whether I'll host my own DNS server or use the services of some company (many companies offer DNS hosting services), so, my qu

4 0.12078713 907 high scalability-2010-09-23-Working With Large Data Sets

Introduction: This is an excerpt from my blogpost Working With Large Data Sets ... For the past 18 months I’ve moved from working on the SMTP proxy to working on our other systems, all of which make use of the data we collect from each connection. It’s a fair amount of data and it can be up to 2Kb in size for each connection. Our servers receive approximately 1000 of these pieces of data per second, which is fairly sustained due to our global distribution of customers. If you compare that to Twitter’s peak of 3,283 tweets per second (maximum of 140 characters), you can see it’s not a small amount of data that we are dealing with here. I recently set out to scientifically prove the benefits of throttling, which is our technology for slowing down connections in order to detect spambots, who are kind enough to disconnect quite quickly when they see a slow connection. Due to the nature of the data we had, I needed to work with a long range of data to show evidence that an IP that appeared on Spam

5 0.10946524 221 high scalability-2008-01-24-Mailinator Architecture

Introduction: Update: A fun exploration of applied searching in How to search for the word "pen1s" in 185 emails every second. When indexOf doesn't cut it you just trie harder. Has a drunken friend ever inspired you to create a first-of-its-kind internet service that is loved by millions, deemed subversive by thousands, all while handling over 1.2 billion emails a year on one rickety old server? That's how Paul Tyma came to build Mailinator. Mailinator is a free no-setup web service for thwarting evil spammers by creating throw-away registration email addresses. If you don't give web sites your real email address they can't spam you. They spam Mailinator instead :-) I love design with a point-of-view and Mailinator has a big giant hairy one: performance first, second, and last. Why? Because Mailinator is free and that allows Paul to showcase his different perspective on design. While competitors buy big iron to handle load, Paul uses a big idea instead: pick the right problem and create a

6 0.10735284 798 high scalability-2010-03-22-7 Secrets to Successfully Scaling with Scalr (on Amazon) by Sebastian Stadil

7 0.10696024 645 high scalability-2009-06-30-Hot New Trend: Linking Clouds Through Cheap IP VPNs Instead of Private Lines

8 0.10572389 1158 high scalability-2011-12-16-Stuff The Internet Says On Scalability For December 16, 2011

9 0.1030728 427 high scalability-2008-10-22-Server load balancing architectures, Part 2: Application-level load balancing

10 0.098479643 841 high scalability-2010-06-14-How scalable could be a cPanel Hosting service?

11 0.096725792 39 high scalability-2007-07-30-Product: Akamai

12 0.089070469 37 high scalability-2007-07-28-Product: Web Log Storming

13 0.087819099 114 high scalability-2007-10-07-Product: Wackamole

14 0.084120259 1000 high scalability-2011-03-08-Medialets Architecture - Defeating the Daunting Mobile Device Data Deluge

15 0.081660531 16 high scalability-2007-07-16-Book: High Performance MySQL

16 0.081647336 303 high scalability-2008-04-18-Scaling Mania at MySQL Conference 2008

17 0.077452108 371 high scalability-2008-08-24-A Scalable, Commodity Data Center Network Architecture

18 0.076956116 1359 high scalability-2012-11-15-Gone Fishin': Justin.Tv's Live Video Broadcasting Architecture

19 0.076656446 796 high scalability-2010-03-16-Justin.tv's Live Video Broadcasting Architecture

20 0.075001925 1159 high scalability-2011-12-19-How Twitter Stores 250 Million Tweets a Day Using MySQL

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.101), (1, 0.035), (2, -0.017), (3, -0.034), (4, -0.013), (5, 0.003), (6, 0.01), (7, -0.047), (8, 0.013), (9, 0.018), (10, 0.017), (11, -0.038), (12, 0.016), (13, -0.041), (14, 0.059), (15, 0.009), (16, 0.042), (17, 0.028), (18, -0.014), (19, -0.009), (20, 0.037), (21, 0.053), (22, -0.048), (23, 0.018), (24, 0.077), (25, 0.02), (26, -0.027), (27, 0.007), (28, -0.006), (29, -0.025), (30, -0.042), (31, -0.076), (32, 0.005), (33, 0.027), (34, -0.039), (35, -0.02), (36, -0.016), (37, 0.061), (38, -0.001), (39, -0.026), (40, -0.007), (41, 0.096), (42, 0.019), (43, 0.027), (44, 0.06), (45, 0.108), (46, 0.002), (47, -0.025), (48, 0.047), (49, -0.018)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94089186 168 high scalability-2007-11-30-Strategy: Efficiently Geo-referencing IPs

2 0.65279138 290 high scalability-2008-03-28-How to Get DNS Names of a Web Server

3 0.63082904 878 high scalability-2010-08-12-Strategy: Terminate SSL Connections in Hardware and Reduce Server Count by 40%

Introduction: This is an interesting tidbit from near the end of the Packet Pushers podcast Show 15 – Saving the Web With Dinky Putt Putt Firewalls. The conversation was about how SSL connections need to terminate before they can be processed by a WAF (Web Application Firewall), which inspects HTTP for security problems like SQL injection and cross-site scripting exploits. Much was made that if programmers did their job better these appliances wouldn't be necessary, but I digress. To terminate SSL most shops run SSL connections into Intel-based Linux boxes running Apache. This setup is convenient for developers, but it's not optimized for SSL, so it's slow and costly. Much of the capacity of these servers is unnecessarily consumed processing SSL. Load balancers, on the other hand, have crypto cards that terminate SSL very efficiently in hardware. Efficiently enough that if you are willing to get rid of the general purpose Linux boxes and use your big iron load balancers, your server count c

4 0.62603492 29 high scalability-2007-07-25-Product: lighttpd

Introduction: Lighttpd (pronounced "lighty") is a web server which is designed to be secure, fast, standards-compliant, and flexible while being optimized for speed-critical environments. Its low memory footprint (compared to other web servers), light CPU load and its speed goals make lighttpd suitable for servers that are suffering load problems, or for serving static media separately from dynamic content. lighttpd is free software / open source, and is distributed under the BSD license. lighttpd runs on GNU/Linux and other Unix-like operating systems and Microsoft Windows. Features: load-balancing FastCGI, SCGI and HTTP-proxy support; chroot support; select()-/poll()-based web server; support for more efficient event notification schemes like kqueue and epoll; conditional rewrites (mod_rewrite); SSL and TLS support, via OpenSSL; authentication against an LDAP server; rrdtool statistics; rule-based downloading with possibility of a script handling only authentication; server-side includes supp

5 0.59952015 509 high scalability-2009-02-05-Product: HAProxy - The Reliable, High Performance TCP-HTTP Load Balancer

Introduction: Update: Load Balancing in Amazon EC2 with HAProxy. Grig Gheorghiu writes a nice post on HAProxy functionality and configuration: emulating virtual servers, logging, SSL, load balancing algorithms, session persistence with cookies, server health checks, etc. Adapted from the website: HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer7 processing. Supporting tens of thousands of connections is clearly realistic with today's hardware. Its mode of operation makes its integration into existing architectures very easy and riskless, while still offering the possibility not to expose fragile web servers to the Net. Currently, two major versions are supported: * version 1.1 - maintains critical sites online since 200 The most stable and reliable, has reached years of uptime. Receive

6 0.59067076 300 high scalability-2008-04-07-Scalr - Open Source Auto-scaling Hosting on Amazon EC2

7 0.57692212 289 high scalability-2008-03-27-Amazon Announces Static IP Addresses and Multiple Datacenter Operation

8 0.56139684 1517 high scalability-2013-09-16-The Hidden DNS Tax - Cascading Timeouts and Errors

9 0.56084865 109 high scalability-2007-10-03-Save on a Load Balancer By Using Client Side Load Balancing

10 0.56033653 1329 high scalability-2012-09-26-WordPress.com Serves 70,000 req-sec and over 15 Gbit-sec of Traffic using NGINX

11 0.55195987 14 high scalability-2007-07-15-Web Analytics: An Hour a Day

12 0.54662406 427 high scalability-2008-10-22-Server load balancing architectures, Part 2: Application-level load balancing

13 0.53617078 9 high scalability-2007-07-15-Blog: Occam’s Razor by Avinash Kaushik

14 0.53278053 1615 high scalability-2014-03-19-Strategy: Three Techniques to Survive Traffic Surges by Quickly Scaling Your Site

15 0.52163196 138 high scalability-2007-10-30-Feedblendr Architecture - Using EC2 to Scale

16 0.52144736 987 high scalability-2011-02-10-Dispelling the New SSL Myth

17 0.51652467 570 high scalability-2009-04-15-Implementing large scale web analytics

18 0.51338196 228 high scalability-2008-01-28-Product: ISPMan Centralized ISP Management System

19 0.50583047 117 high scalability-2007-10-08-Paper: Understanding and Building High Availability-Load Balanced Clusters

20 0.49788418 80 high scalability-2007-09-06-Product: Perdition Mail Retrieval Proxy

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.099), (2, 0.198), (61, 0.071), (79, 0.171), (85, 0.047), (93, 0.286)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.89088953 403 high scalability-2008-10-06-Paper: Scaling Genome Sequencing - Complete Genomics Technology Overview

Introduction: Although the problem of scaling human genome sequencing is not exactly about building bigger, faster and more reliable websites, it is most interesting in terms of scalability. The paper describes a new technology by the startup company Complete Genomics to sequence the full human genome for a fraction of the cost of earlier possibilities. Complete Genomics is building the world's largest commercial human genome sequencing center to provide turnkey, outsourced complete human genome sequencing to customers worldwide. By 2010, their data center will contain approximately 60,000 processors with 30 petabytes of storage running their sequencing software on Linux clusters. Do you find this interesting and relevant to HighScalability.com?

2 0.84982276 349 high scalability-2008-07-10-Can cloud computing smite down evil zombie botnet armies?

Introduction: In the more cool stuff I've never heard of before department is something called Self Cleansing Intrusion Tolerance (SCIT). Botnets are created when vulnerable computers live long enough to become infected with the will to do the evil bidding of their evil masters. Security is almost always about removing vulnerabilities (a process which to outside observers often looks like a dog chasing its tail). SCIT takes a different approach: it works on the availability angle. Something I never thought of before, but which makes a great deal of sense once I thought about it. With SCIT you stop and restart VM instances every minute (or whatever, depending on your desired window of vulnerability).... This short exposure window means worms and viruses do not have long enough to fully infect a machine and carry out a coordinated attack. A machine is up for a while. Does work. And then is torn down again only to be reborn as a clean VM with no possibility of infection (unless of course the VM

3 0.83428943 58 high scalability-2007-08-04-Product: Cacti

Introduction: Cacti is a network statistics graphing tool designed as a frontend to RRDtool's data storage and graphing functionality. It is intended to be intuitive and easy to use, as well as robust and scalable. It is generally used to graph time-series data like CPU load and bandwidth use. The frontend is written in PHP; it can handle multiple users, each with their own graph sets, so it is sometimes used by web hosting providers (especially dedicated server, virtual private server, and colocation providers) to display bandwidth statistics for their customers. It can be used to configure the data collection itself, allowing certain setups to be monitored without any manual configuration of RRDtool.

same-blog 4 0.82987016 168 high scalability-2007-11-30-Strategy: Efficiently Geo-referencing IPs

5 0.82707489 1450 high scalability-2013-05-01-Myth: Eric Brewer on Why Banks are BASE Not ACID - Availability Is Revenue

Introduction: In NoSQL: Past, Present, Future Eric Brewer has a particularly fine section explaining the often hard to understand ideas of BASE (Basically Available, Soft State, Eventually Consistent), ACID (Atomicity, Consistency, Isolation, Durability), and CAP (Consistency, Availability, Partition Tolerance), in terms of a pernicious long-standing myth about the sanctity of consistency in banking. Myth: Money is important, so banks must use transactions to keep money safe and consistent, right? Reality: Banking transactions are inconsistent, particularly for ATMs. ATMs are designed to have a normal case behaviour and a partition mode behaviour. In partition mode Availability is chosen over Consistency. Why? 1) Availability correlates with revenue and consistency generally does not. 2) Historically there was never an idea of perfect communication so everything was partitioned. Your ATM transaction must go through so Availability is more important than

6 0.81039363 1513 high scalability-2013-09-06-Stuff The Internet Says On Scalability For September 6, 2013

7 0.8080132 166 high scalability-2007-11-27-Solving the Client Side API Scalability Problem with a Little Game Theory

8 0.74648112 1330 high scalability-2012-09-28-Stuff The Internet Says On Scalability For September 28, 2012

9 0.74642915 1198 high scalability-2012-02-24-Stuff The Internet Says On Scalability For February 24, 2012

10 0.74054706 1279 high scalability-2012-07-09-Data Replication in NoSQL Databases

11 0.73312676 944 high scalability-2010-11-17-Some Services are More Equal than Others

12 0.72094762 1233 high scalability-2012-04-25-The Anatomy of Search Technology: blekko’s NoSQL database

13 0.72078073 573 high scalability-2009-04-16-Serving 250M quotes-day at CNBC.com with aiCache

14 0.71949005 1637 high scalability-2014-04-25-Stuff The Internet Says On Scalability For April 25th, 2014

15 0.7086826 538 high scalability-2009-03-16-Are Cloud Based Memory Architectures the Next Big Thing?

16 0.70281529 38 high scalability-2007-07-30-Build an Infinitely Scalable Infrastructure for $100 Using Amazon Services

17 0.70280761 366 high scalability-2008-08-17-Many updates against MySQL

18 0.70162302 687 high scalability-2009-08-24-How Google Serves Data from Multiple Datacenters

19 0.70069301 601 high scalability-2009-05-17-Product: Hadoop

20 0.69872558 1535 high scalability-2013-10-21-Google's Sanjay Ghemawat on What Made Google Google and Great Big Data Career Advice