Copyright Tenereillo, Inc. 2004
As the paper “Why DNS Based GSLB Doesn’t Work” makes clear, the use of multiple A records debilitates GSLB solutions.
The specific reason that multiple A records are needed is that client browsers and other applications ignore DNS TTLs, caching A records for some fixed amount of time. If the site that is pointed to by the cached A record (or IP address) becomes unavailable, the browser (or other application) will not re-query until the DNS cache is cleared.
The question has been raised: “Under what circumstances is the DNS cache cleared?” The implication is:
If we can count on user behavior to clear cached DNS responses, the requirement for multiple A records can be relaxed, and GSLB can function properly.
The purpose of this addendum is to address that question.
If a DNS-based GSLB replies with the IP address of a site, and later the Internet connection, power, or SLB/switching equipment at that site fails (or there is a total site loss), it is desirable for the GSLB device to be re-queried so that it can reply with the IP address of a functioning site and users may continue to conduct business. DNS reply caching can prevent such re-querying. DNS replies may be cached in one or more places, as shown in the following diagram:
1) At the client computer
2) At the client’s proxy server
3) At the client’s DNS server
To reduce the effect of “3”, most GSLB devices by default return very short TTLs (some return TTLs of zero). Most DNS servers honor low (or zero) TTLs; therefore, a GSLB device has reasonably good control over caching in DNS servers.
Many proxy servers such as Squid and MS ISA perform DNS caching, and by default either ignore TTLs, or overwrite low or zero TTLs. In these cases the GSLB device has no control over DNS caching.
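As an illustration of how a proxy can overwrite low or zero TTLs, the following minimal sketch encodes the rule that the footnote on Squid at the end of this document describes. The function name `effective_ttl` is hypothetical, and the two constants simply restate the defaults quoted in that footnote; neither has been checked against any particular Squid release.

```python
# Defaults as quoted in this document's footnote on Squid (assumed, not verified).
POSITIVE_DNS_TTL = 6 * 60 * 60  # seconds (6 hours)
NEGATIVE_DNS_TTL = 60           # seconds (1 minute)


def effective_ttl(record_ttl: int) -> int:
    """Return how long (seconds) a proxy of this kind would cache an A record."""
    if record_ttl == 0:
        # A zero TTL is replaced outright.
        return POSITIVE_DNS_TTL
    if record_ttl < 60:
        # A TTL under one minute is raised to the configured floor.
        return NEGATIVE_DNS_TTL
    # Anything else is honored as sent.
    return record_ttl


for ttl in (0, 10, 300):
    print(ttl, "->", effective_ttl(ttl))
```

Under these assumptions, a GSLB that returns a zero TTL in the hope of avoiding caching ends up with the longest cache lifetime of all.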
Many client computer operating systems have built-in DNS caches. These DNS caches may or may not observe TTLs. Users have control over operating system caching, including the ability to disable it completely (which would of course be the best possible scenario for GSLB H/A).
Client browser applications also have DNS caches. The details of DNS caching are different for various types of browsers, and the implications vary depending on user behavior. This addendum discusses aspects of browser caching pertinent to GSLB.
The Microsoft site mentions DNS caching in Internet Explorer (IE), for example:
It does not, however, go into further detail about the exact behavior. This section attempts to provide more details about DNS caching in IE. The reader is encouraged to experiment with these various scenarios.
In the case of IE, there is one DNS cache per browser process, as shown in the diagram below:
The blue box shows the IE process running in the client computer. Within the IE process, yellow and magenta boxes represent in-memory objects, such as the DNS cache and the session cookie cache. This example shows one IE process, and only one open window. There may be many browser processes, and each browser process may have many open windows, all of which share the same DNS cache as shown in the following diagram:
Three browser instances are shown above.
1) An instance with windows open to CNN and Time
2) An instance with a Hotmail window
3) An instance with two windows open to Google
The same windows could be open, all sharing the same browser process (and therefore DNS caches) as shown below:
The diagram above shows windows to CNN, Time, Hotmail, and Google, all of which share the same DNS cache.
So the question is, “under what circumstances do windows share instances of IE and therefore the DNS caches?” This depends on the method by which the window was created.
Cases where a new instance of IE is started and an independent DNS cache is used:
Cases where a new window is created using a shared instance:
Note: Most Web sites will launch a new window for external links. Some even explain why. Here is a tell-tale excerpt from the CNN site:
“… external sites will open in a new browser. CNN.com does not endorse external sites.”
In the case of Netscape Navigator, the DNS cache is always shared between all windows (or instances). It does not matter how the user opened the windows.
The following is based on recent test results from the author and others. To the best of my knowledge there is no formal documentation of this behavior, and all of this is subject to change with minor releases of the IE software.
The behavior of the internal DNS cache in IE is different for versions that run on Windows XP. Windows XP partially observes TTLs set by DNS servers. There appears to be a rolling (or inactivity) timer of approximately two minutes: if a client is unable to make a TCP connection to the IP address given in an A record, the TTL for the A record is, say, zero, and the user does not intervene (i.e. does not click “Refresh” or again click a bookmark, link, etc.), then once the timer expires the browser will request a new DNS resolution, which solves the problem described here. This does not help matters much, however, as a user would be quite likely to click “Refresh” or a different link on the site more often than once every two minutes (until finally giving up and surfing off to some other site). Every time a link to the same site is clicked, another two minutes is added, thereby potentially extending the effective TTL to “forever”.
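The effect of the rolling timer can be sketched as a small simulation. This is a model of the behavior as observed above, not of any documented IE internals; the function name `requery_time` and the click schedules are made up for illustration, and the 120-second figure is the approximate value reported in the text.

```python
from typing import Iterable

ROLLING_WINDOW = 120  # seconds; the approximate figure reported in the text


def requery_time(click_times: Iterable[int], window: int = ROLLING_WINDOW) -> int:
    """Seconds after the failed connection at which the browser re-resolves.

    click_times: ascending times (seconds) at which the user clicks
    "Refresh" or another link on the same site.
    """
    expires_at = window            # timer starts when the connection fails (t = 0)
    for t in click_times:
        if t >= expires_at:        # user was idle long enough; entry already gone
            break
        expires_at = t + window    # each click restarts the two-minute timer
    return expires_at


patient = requery_time([])                     # user waits and does nothing
impatient = requery_time(range(90, 601, 90))   # user retries every 90 s for 10 min
print(patient, impatient)
```

In this model the patient user gets a fresh resolution after two minutes, while the user who keeps retrying every 90 seconds pushes the re-query out to two minutes after the last click, however long the retrying goes on.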
Here’s how to disable the DNS cache in Windows XP:
Windows XP users can also clear the DNS cache with the command line:
Nevertheless, the majority of users will never tamper with operating system DNS caching.
The behavior of IE as it relates to DNS caching changed as of SP2, such that the “2 minute rolling window” described above is gone (i.e. if connection attempts to all IP addresses in the A records for that FQDN fail, a new DNS resolution occurs immediately, regardless of user interaction).
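The SP2 behavior described above amounts to "try every cached address, and re-resolve at once if all of them fail". A minimal sketch of that logic follows, assuming the resolver and the connection attempt are passed in as functions so that the example runs without a network. All names, the host name, and the documentation-range IP addresses are hypothetical.

```python
from typing import Callable, List, Optional


def connect_with_requery(
    hostname: str,
    cached_addrs: List[str],
    resolve: Callable[[str], List[str]],
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Return the first address that accepts a connection, or None."""
    # First pass: use whatever A records are already cached.
    for addr in cached_addrs:
        if try_connect(addr):
            return addr
    # Every cached address failed: re-resolve at once, with no timer
    # and no dependence on what the user does next.
    for addr in resolve(hostname):
        if try_connect(addr):
            return addr
    return None


# Stand-ins for the network, so the sketch runs offline.
live_sites = {"198.51.100.20"}   # the surviving datacenter
resolver_calls: List[str] = []


def fake_resolve(name: str) -> List[str]:
    resolver_calls.append(name)
    return ["198.51.100.20"]     # the GSLB now answers with the live site


def fake_connect(addr: str) -> bool:
    return addr in live_sites


chosen = connect_with_requery(
    "www.example.com", ["192.0.2.10"], fake_resolve, fake_connect
)
print(chosen, resolver_calls)
```

With a client that behaves this way, a single cached A record pointing at a failed site costs one failed connection attempt rather than an open-ended outage.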
As shown in the following diagram, a “dead” A record could potentially be cached in several places, depending on which windows were used to access the site.
That said, a user would likely see a failed connection in only one window. Simply closing the browser window that shows the error does not clear the DNS cache.
Take the following example: a user has windows open to several sites. These windows have been opened over the course of a workday. The user now clicks a bookmark to access online banking:
During the course of the online banking session, the datacenter the user is accessing fails:
The user now closes this browser window, and again uses the bookmark in a different window:
The online banking site still appears failed!
Clearly this example shows only one of an infinite number of possible scenarios, but it suffices to say that:
User behavior cannot be relied upon to clear the browser DNS cache; therefore, multiple A records are a critical component of any browser-based multi-site high-availability application.
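The online banking scenario above can be sketched with a toy model of one browser process whose windows share a single DNS cache. This is a simplification: cache expiry is ignored, each name maps to a single address, and the class `BrowserProcess`, the host names, and the documentation-range IP addresses are all hypothetical.

```python
from typing import Callable, Dict, Set


class BrowserProcess:
    """One browser process; every window it owns shares one DNS cache."""

    def __init__(self, resolve: Callable[[str], str]) -> None:
        self._resolve = resolve
        self.dns_cache: Dict[str, str] = {}
        self.open_windows: Set[int] = set()
        self._next_window = 1

    def open_window(self, hostname: str) -> int:
        # Resolve only on a cache miss; otherwise reuse the cached address.
        if hostname not in self.dns_cache:
            self.dns_cache[hostname] = self._resolve(hostname)
        window = self._next_window
        self._next_window += 1
        self.open_windows.add(window)
        return window

    def close_window(self, window: int) -> None:
        # Closing a window leaves the process, and so its DNS cache, alive.
        self.open_windows.discard(window)


# What the GSLB would answer at any given moment.
gslb_answer = {"news.example": "203.0.113.5", "bank.example": "192.0.2.10"}


def gslb(name: str) -> str:
    return gslb_answer[name]


ie = BrowserProcess(gslb)
ie.open_window("news.example")                  # opened earlier in the workday
bank_window = ie.open_window("bank.example")    # user clicks the banking bookmark

gslb_answer["bank.example"] = "198.51.100.20"   # datacenter fails; GSLB fails over

ie.close_window(bank_window)                    # user closes the failed window
ie.open_window("bank.example")                  # and retries in the same process
after_retry = ie.dns_cache["bank.example"]

fresh = BrowserProcess(gslb)                    # a brand-new process, for contrast
fresh.open_window("bank.example")
fresh_addr = fresh.dns_cache["bank.example"]

print(after_retry, fresh_addr)
```

In this model the retry still uses the address of the failed datacenter, because the news window keeps the process and its cache alive; only a new process sees the address the GSLB is now returning.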
Note: If it can be assumed that all clients that might access a given site will behave as described in the section on Windows XP, SP2, then the return of a single A record with a low TTL may be a viable solution. Unfortunately, most commercial Web sites must also reliably serve clients other than Windows XP, SP2, and also clients behind Web caches.
An add-on is available that allows older versions of Squid to observe TTLs: http://www.squid-cache.org/Doc/FAQ/FAQ-2.html. The latest versions of Squid will replace a zero TTL with the value in the variable “positive_dns_ttl” (default of 6 hours), and a TTL with a value of less than 1 minute with the value in “negative_dns_ttl” (default of 1 minute). Source: Duane Wessels, Squid cache developer and author of the book “Web Caching”.