DNS troubleshooting tips

DNS: What Can Go Wrong and How You Can Fix It

DNS is a core component of the Internet. This rather complex system exists to perform one basic function: translating website domain names into IP addresses. It acts as the liaison between the names that users can understand and the address of the data center or destination where the site is hosted.

DNS is what shapes the modern Internet. It enables users of all skill sets and technical knowledge levels to use this incredibly advanced resource on a daily basis. Businesses that exist entirely on the Internet, like online banking and ecommerce companies, have not only emerged in the market but have steadily gained a global presence.

Despite its critical role, DNS isn’t impenetrable; even with redundancies and a fast, efficient design, it’s still vulnerable to performance issues and failures. When these issues occur, it means that your applications and websites are inaccessible and users are likely feeling frustrated. Performance degradation can have serious consequences that can affect your revenue and brand reputation, so it’s crucial that your DNS is reliable and consistent.

Regardless of where your DNS is hosted (internally vs. third-party such as OpenDNS, NS1, etc.), reliability should always be a priority. To ensure it’s reliable, you must use a DNS monitor that allows you to test and analyze the results so you can make improvements. Monitoring also gives you the ability to recognize performance trends and patterns, and identify areas of concern.
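As a rough illustration of what a very small DNS check might look like, here is a sketch in Python. It is not a replacement for a real monitoring product; the function names (timed_lookup, summarize) are my own, and it assumes that resolving through the operating system's resolver is good enough for the check.

```python
import socket
import time


def timed_lookup(hostname, resolve=socket.gethostbyname):
    """Resolve hostname once and report success, answer, and elapsed milliseconds."""
    start = time.perf_counter()
    try:
        answer = resolve(hostname)
        ok = True
    except OSError:
        # socket.gaierror is a subclass of OSError, so lookup failures land here.
        answer = None
        ok = False
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {"hostname": hostname, "ok": ok, "answer": answer, "elapsed_ms": elapsed_ms}


def summarize(samples):
    """Summarize timed_lookup results: number of checks, failure rate, slowest success."""
    failures = sum(1 for s in samples if not s["ok"])
    successes = [s["elapsed_ms"] for s in samples if s["ok"]]
    return {
        "checks": len(samples),
        "failure_rate": failures / len(samples) if samples else 0.0,
        "max_ms": max(successes) if successes else None,
    }
```

Running timed_lookup on a schedule and feeding the results to summarize gives a first view of failure rate and latency trends. The resolve parameter is there so a different lookup function can be swapped in.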

Let’s take a look at some of the real-world concerns that can cause DNS resolution issues.

Not using a multiple-name-server architecture:

DNS name servers are very important; they are the authoritative source for the records behind a hostname, which is what a client is looking for. These name servers can go down at times (e.g., because of DDoS attacks or network issues with the server). This is a serious concern if there is only one name server in place, because DNS resolution will fail and the end user won’t be able to visit the page at all. It’s always advisable to have multiple name servers answering DNS queries so that even if one fails, the others can answer.

Let’s say I have a hostname and I have configured only one name server to handle its DNS queries. When a request arrives, it gets the following information from the zone file:

   172,800   IN   NS

Now imagine this name server goes down: the DNS query will fail completely, and the client will not be able to reach the site because there isn’t a second name server to handle the failure.

Let’s look at another example: a domain with four name servers, which lets it handle a name server failure more effectively.

   172,800   IN   NS
   172,800   IN   NS
   172,800   IN   NS
   172,800   IN   NS

In this case, whenever any one of the four name servers fails, the other three keep answering DNS queries, so there is no complete outage of the DNS resolution process for the domain.
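The difference between one name server and several can be sketched in a few lines of Python. This is only a simulation: the name server names (ns1.example.com. and so on), the address 192.0.2.10, and the helper names are placeholders I made up, and fake_query stands in for a real network query.

```python
def first_answer(name_servers, query_fn):
    """Ask each name server in turn and return (server, answer) from the first that responds.

    query_fn(server) returns an answer, or raises TimeoutError if the server is unreachable.
    Returns None when every server fails, which is a complete resolution outage.
    """
    for server in name_servers:
        try:
            return server, query_fn(server)
        except TimeoutError:
            continue  # this server is down; try the next one
    return None


# Hypothetical outage: only the first name server is down.
DOWN = {"ns1.example.com."}


def fake_query(server):
    """Stand-in for a real DNS query; fails for servers listed in DOWN."""
    if server in DOWN:
        raise TimeoutError(f"{server} did not respond")
    return "192.0.2.10"


single = ["ns1.example.com."]
several = ["ns1.example.com.", "ns2.example.com.", "ns3.example.com.", "ns4.example.com."]

print(first_answer(single, fake_query))   # no server left to answer
print(first_answer(several, fake_query))  # a later server answers instead
```

With a single name server the outage is total. With four, the query simply moves on to the next server in the list.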

Absence of glue records:

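A glue record is an address record (A or AAAA) that gives the IP address of a name server and is published in the same zone file as the NS record that names it. It matters whenever a name server’s own name sits inside the domain it serves. This is a very important resource record considering how the whole DNS resolution process works. Let’s try to understand this with the help of an example: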

Say I have a hostname with three name servers, which are returned from my zone file when a DNS query comes in for it:

   172,800   IN   NS
   172,800   IN   NS
   172,800   IN   NS

Now consider the case in which I do not provide the IP addresses of the three name servers (that is, the glue records). The DNS resolver will pick one of the name servers to query; say NS1 is chosen. Since the IP address of NS1 is not present in that zone file, the resolver has to resolve NS1’s name first. It starts from the root, then goes to .com, and ends up with the same three name servers shown above, still without an address for any of them. The DNS resolution process is stuck in this loop until it finally fails. Hence, it is very important to add glue records while configuring the name server information in the zone file.

   172,800   IN   NS
   172,800   IN   NS
   172,800   IN   NS
   800   IN   A
   800   IN   A
   800   IN   A
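A quick way to spot this problem is to check that every name server living inside the zone also has an address record next to it. The Python sketch below assumes the records have already been parsed into plain lists and dicts; the zone example.com, the server names, and the function name missing_glue are all illustrative.

```python
def missing_glue(zone, ns_records, address_records):
    """Return name servers inside the zone that have an NS record but no address (glue) record.

    zone            -- the delegated domain, e.g. "example.com"
    ns_records      -- name server host names taken from the NS records
    address_records -- mapping of host name -> IP address taken from the A/AAAA records
    """
    zone = zone.lower().rstrip(".")
    have_glue = {name.lower().rstrip(".") for name in address_records}
    missing = []
    for ns in ns_records:
        host = ns.lower().rstrip(".")
        # Only name servers inside the zone itself need glue.
        in_zone = host == zone or host.endswith("." + zone)
        if in_zone and host not in have_glue:
            missing.append(host)
    return missing


ns = ["ns1.example.com.", "ns2.example.com.", "ns3.example.net."]
glue = {"ns1.example.com.": "192.0.2.1"}
print(missing_glue("example.com", ns, glue))  # ['ns2.example.com']
```

Here ns3.example.net. is not reported because it lives in a different zone and can be resolved independently, so it does not need glue.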

DNS cache poisoning:

To put it simply, DNS cache poisoning means that incorrect DNS information is being served. For example, if Facebook’s hostname starts pointing to a Google IP, users looking for Facebook content are sent to a Google machine instead. DNS cache poisoning can occur in multiple ways. For instance, if an attacker (a middle man) gets ahold of the zone file on the authoritative name servers, they can change the values of the A records and cause incorrect DNS mapping.

It can also spread through ISP caching: an ISP’s resolver caches a wrong DNS entry obtained from a compromised server, and the bad entry can then propagate from one ISP to another and cause serious poisoning issues.
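One simple way to watch for poisoned answers is to compare what different resolvers return against the addresses you actually published. The Python sketch below assumes the answers have already been collected from each resolver; the resolver labels, the IP addresses, and the function name suspicious_resolvers are made up for illustration.

```python
def suspicious_resolvers(expected_ips, answers_by_resolver):
    """Return resolvers whose answers include an address the zone owner never published.

    expected_ips        -- the addresses published in the authoritative zone
    answers_by_resolver -- mapping of resolver label -> list of addresses it returned
    """
    expected = set(expected_ips)
    flagged = {}
    for resolver, ips in answers_by_resolver.items():
        unexpected = set(ips) - expected
        if unexpected:
            flagged[resolver] = sorted(unexpected)
    return flagged


published = ["192.0.2.10", "192.0.2.11"]
observed = {
    "isp-a": ["192.0.2.10"],
    "isp-b": ["192.0.2.11", "203.0.113.66"],  # one address nobody published
}
print(suspicious_resolvers(published, observed))  # {'isp-b': ['203.0.113.66']}
```

A mismatch is not proof of poisoning (stale caches after a legitimate change look the same), but it is a good signal to investigate.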

Insecure zone transfer:

As explained earlier, having multiple name servers for a domain gives it a better chance of handling any kind of name server failure. It’s also very important for all the name servers to have up-to-date zone file information, which is why you have to make sure zone transfers leave all the name servers with the same information.

An insecure zone transfer can reveal a lot of critical information about a domain. Although zone transfers take place over TCP to make the transfer more reliable, an improperly configured name server can allow a zone transfer to a third party (a middle man), which means the third party will have all the critical information related to that domain. Hence, zone information should only be transferred between the name servers of the same domain. For example, if a domain has two name servers (NS1, NS2), then NS1 should allow zone transfers only to NS2, and discard all other third-party requests for transfer.
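The rule itself is just an allow list. Here is a minimal Python sketch of the decision a name server should make before answering a transfer request. It is not how any particular DNS server implements it; the addresses and the function name transfer_allowed are assumptions for the example.

```python
import ipaddress


def transfer_allowed(requester_ip, allowed_secondaries):
    """Return True only if the requester is one of the domain's own secondary name servers.

    allowed_secondaries -- IP addresses or CIDR networks of the secondaries (e.g. NS2)
    """
    requester = ipaddress.ip_address(requester_ip)
    return any(
        requester in ipaddress.ip_network(entry)
        for entry in allowed_secondaries
    )


# Hypothetical setup: NS1 only accepts transfer requests from NS2 at 192.0.2.53.
allow_list = ["192.0.2.53"]
print(transfer_allowed("192.0.2.53", allow_list))    # True: NS2 may transfer
print(transfer_allowed("203.0.113.99", allow_list))  # False: third party is refused
```

Real name servers express the same idea in their own configuration (an access list on the transfer setting), often combined with signed transfers so that a spoofed source address is not enough.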

These are just a few examples that demonstrate why monitoring is important: it lets organizations keep an eye on any anomalies in their DNS service. Stay tuned for a follow-up article that will examine other DNS scenarios, and how to handle the issues that can occur.