DNS Requests to the Internet Fail suddenly
Windows Server 2000 SP4
PIX 501 Firewall version 6.3(5)
I'm experiencing a strange DNS resolution problem. Resolution of Internet names from the LAN will suddenly stop working. We are using the ISP's DNS servers as forwarders, listed on the Forwarders tab of the DNS console on the Win2k server.
Here is some data I've collected:
There are NO errors in the DNS event log. There do not seem to be any correlating events in any of the other logs either (Application and System).
After enabling DNS logging via dns.log, I see several entries that seem to indicate problems:
o Some response queries show SERVFAIL with RCODE 2
o Some of the queries (it appears like the first one when using nslookup) made from the DNS server itself have the internal FQDN appended to it. Later queries in the log for the same name from the server do not have this. It appears that when using nslookup, this is normal.
o Overall, there appear to be more queries than query responses.
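One way to quantify the last observation is to pair each query with its response by DNS transaction ID and list the queries that never got an answer. The following is a minimal Python sketch, not a parser for the real dns.log format: the (direction, transaction_id, name) tuples and the function name unanswered_queries are assumptions made for illustration, and those fields would first have to be extracted from your own log or capture.

```python
from collections import Counter


def unanswered_queries(records):
    """Return (transaction_id, name) pairs for queries with no matching response.

    `records` is an iterable of hypothetical (direction, transaction_id, name)
    tuples, where direction is "query" or "response".
    """
    pending = Counter()
    for direction, txid, name in records:
        key = (txid, name.lower())
        if direction == "query":
            pending[key] += 1
        elif direction == "response" and pending[key] > 0:
            pending[key] -= 1
    # Anything still counted here was sent but never answered.
    return sorted(key for key, count in pending.items() if count > 0)


if __name__ == "__main__":
    # Made-up sample data: the second query never receives a response.
    sample = [
        ("query", 0x1A2B, "www.example.com"),
        ("response", 0x1A2B, "www.example.com"),
        ("query", 0x1A2C, "www.example.org"),
    ]
    print(unanswered_queries(sample))
```

A long list of unanswered outbound queries during a failure window would point at replies being lost between the server and the Internet rather than at the server failing to send.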
If I enter into nslookup and set the server to another internet facing DNS server, it will resolve external names. If you exit nslookup, queries continue to not work.
Packet captures also show the SERVFAIL RCODE2 responses. What appears interesting to me is that queries are made but there are no corresponding responses.
PIX firewall does not appear to reflect that DNS requests are blocked.
PIX Fixup has been adjusted to allow DNS packets up to 5000 bytes.
The other interesting thing is that both of the following will temporarily resolve the issue, but it always comes back:
Restarting the Windows 2000 DNS Server service
Restarting the PIX firewall also appears to work (I've only tested this once)
Your help is appreciated.
July 30th, 2009 5:49pm
Hello,
Is the DNS server multihomed? Please post an unedited ipconfig /all from it and also from a problem client.
Best regards
Meinolf Weber
Disclaimer: This posting is provided "AS IS" with no warranties, and confers
no rights.
July 30th, 2009 6:21pm
It's not multihomed (aside from an unused PPP adapter). Here's the output. All data collected has been from the server itself so the output is from there.
Windows 2000 IP Configuration
Host Name . . . . . . . . . . . . : w2ksrv1
Primary DNS Suffix . . . . . . . : geo-w-drummond.com
Node Type . . . . . . . . . . . . : Hybrid
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No
DNS Suffix Search List. . . . . . : geo-w-drummond.com
Ethernet adapter Broadcom NetXtreme Gigabit Ethernet Adapter - onboard 1:
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : Broadcom NetXtreme Gigabit Ethernet
Physical Address. . . . . . . . . : 00-0B-DB-AC-05-45
DHCP Enabled. . . . . . . . . . . : No
IP Address. . . . . . . . . . . . : 10.0.0.5
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . : 10.0.0.1
DNS Servers . . . . . . . . . . . : 10.0.0.5
PPP adapter {66DBB24D-2AC5-4E1D-825E-2CB60F8C2606}:
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : WAN (PPP/SLIP) Interface
Physical Address. . . . . . . . . : 00-53-45-00-00-00
DHCP Enabled. . . . . . . . . . . : No
IP Address. . . . . . . . . . . . : 192.168.234.235
Subnet Mask . . . . . . . . . . . : 255.255.255.255
Default Gateway . . . . . . . . . :
DNS Servers . . . . . . . . . . . : 127.0.0.1
NetBIOS over Tcpip. . . . . . . . : Disabled
July 30th, 2009 8:31pm
Hello,
Even if it is not used, completely remove the PPP adapter configuration. At the moment the server is multihomed. Also make sure that only the 10.x.x.x address is registered in DNS and not the other one from the PPP adapter.
Best regards
Meinolf Weber
July 31st, 2009 1:57am
Not sure how to proceed with that. In Network and Dial-up connections, only 2 network interfaces are listed (the 2 that ship with the server): the main LAN interface and a disabled second adapter. No Dial-up adapter. However, I think this is for a RAC that ships with the server; it's listed under Modems and I don't think I should remove it.
July 31st, 2009 2:26am
Hello KevinTE,
Would you please contact your ISP to check whether their DNS server runs BIND? If it does, please ask them to clear its cache.
Based on our experience, when a Windows 2000 DNS server forwards to a BIND server, it can suddenly stop responding after several hours. When we stop and restart the DNS service, the server works again.
To troubleshoot this issue, please refer to the following suggestions to see if they help.
1. Check that "Secure cache against pollution" is set.
2. If possible, have the cache cleared on the server we are forwarding to.
3. Please also clear the cache on the local DNS server:
dnscmd /clearcache
Hope it helps.
July 31st, 2009 10:31am
Thanks David! I'm not sure how far I'll get with the ISP but I can try. I can certainly manage the rest. The "Secure cache against pollution" value IS set. Is this a Windows 2000 problem or would it exist in 2003/2008 as well? I've never encountered it before.
Thanks.
July 31st, 2009 3:21pm
Hi KevinTE,
Thank you for the reply.
Based on my research, this problem is probably related to the ISP's DNS server. A DNS server based on Windows Server 2003 or Windows Server 2008 with the same configuration as the Windows 2000 server could also encounter this kind of issue if the forwarder is a BIND server.
Hope it helps.
August 3rd, 2009 3:15pm
Thanks David. Can you point me to some reference on what the issue is with BIND?
Thanks.
Kevin
August 4th, 2009 4:35pm
Hi Kevin,
As you mentioned:
After enabling DNS logging via dns.log, I see several entries that seem to indicate problems:
Some response queries show SERVFAIL with RCODE 2
Packet captures also show the SERVFAIL RCODE2 responses.
Based on our experience, SERVFAIL tells us that the DNS server possibly experienced an error: a timeout occurred while it was forwarding the DNS request to the ISP's BIND server.
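For reference, the RCODE is the low four bits of the flags field in the 12-byte DNS header (RFC 1035), and value 2 is SERVFAIL. A minimal Python sketch for decoding it from a raw DNS message, for example bytes exported from a packet capture, might look like this. The function name decode_rcode is illustrative only, and the header in the example is hand-built rather than taken from the captures in this thread.

```python
import struct

# RCODE values from RFC 1035, section 4.1.1, with their common mnemonics.
RCODE_NAMES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}


def decode_rcode(message):
    """Return (is_response, rcode, rcode_name) for a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS message must contain a 12-byte header")
    # Header layout: ID (16 bits), flags (16 bits), then four 16-bit counts.
    _txid, flags = struct.unpack("!HH", message[:4])
    is_response = bool(flags & 0x8000)  # QR bit
    rcode = flags & 0x000F              # low four bits
    return is_response, rcode, RCODE_NAMES.get(rcode, "RCODE %d" % rcode)


if __name__ == "__main__":
    # Hand-built header: ID 0x1234, QR=1, RD=1, RA=1, RCODE=2, one question.
    header = struct.pack("!HHHHHH", 0x1234, 0x8182, 1, 0, 0, 0)
    print(decode_rcode(header))
```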
For testing purposes, you can use the nslookup tool on the DNS server and set the ISP DNS server as the server to see whether a name is resolved properly:
nslookup
server <ISP DNS server>
Please collect a network monitor trace while this issue is reproduced.
http://www.microsoft.com/downloads/details.aspx?FamilyID=983b941d-06cb-4658-b7f6-3088333d062f&displaylang=en
1. Enable the Capture Filter "IPv4.Address == <ip of the client>" and start capture.
2. Restart one of the clients to reproduce the issue.
3. Stop capture and save to *.cap file.
How to use Network Monitor to capture network traffic
http://support.microsoft.com/kb/812953
Please send us the cap file via tfwst@microsoft.com
I have found another reference for you.
bug in bind-9.3.2-P2 - SERVFAIL?
https://lists.isc.org/pipermail/bind-users/2007-August/067396.html
Hope it helps.
Disclaimer
This response contains a reference to a third party World Wide Web site. Microsoft can make no representation concerning the content of these sites. Microsoft is providing this information only as a convenience to you: this is to inform you that Microsoft has not tested any software or information found on these sites and therefore cannot make any representations regarding the quality, safety, or suitability of any software or information found there. There are inherent dangers in the use of any software found on the Internet, and Microsoft cautions you to make sure that you completely understand the risk before retrieving any software on the Internet.
August 5th, 2009 1:37pm
I'll be sure to go through the links you listed. In the meantime, a few things:
As indicated in my original post, I did the nslookup tests you recommend. I wrote, "If I enter into nslookup and set the server to another internet facing DNS server, it will resolve external names. If you exit nslookup, queries continue to not work."
To be even clearer, if I use nslookup against the ISP's DNS server that appears in the forwarders list (24.153.23.114), I CAN successfully resolve names. If I exit and retry, it doesn't work. This seems to indicate that the DNS server is not forwarding properly, but the packet capture shows the request being made, which tells me it is. So what does this mean? Well, when I turned to the PIX to see if it was dropping the requests, the logs gave no indication of that, but I'm no Cisco guru, so maybe it's just not logging it? Logging is enabled on the PIX, but I don't know at what level it will log dropped packets. Can you tell me? Will it be at the Notification level?
I reconfigured the DNS server to use root hints instead of forwarders. The issue still recurred.
I did some capturing directly on the PIX, and the outside interface is showing the DNS requests. While the issue was occurring, there were NO replies except when I went into nslookup, set the server to my ISP's DNS server IP, and queried a name. After restarting the DNS server service, the same PIX-based capture on the outside interface shows queries and responses without my having to go into nslookup, set the server, and query a name.
Any ideas what all this means? Using root hints, I can't see this being a BIND issue unless the root servers also experience the same issues. But I don't experience this anywhere else. I think there's something wrong with the DNS server or the PIX. I'd tend to blame the PIX first as we recently changed ISPs and ran through the configuration wizard. Perhaps it changed something unexpected? How could I tell?
Kevin
August 5th, 2009 4:49pm
I too have suddenly started to experience this problem in the past few days, on a Win2k server that has been running fine for years. We are also using ISP forwarders. Nothing about the server configuration has changed recently.
Starting 3 days ago, the DNS service fails to resolve queries. You can use nslookup on the server, and no queries will resolve. Restarting the DNS server service immediately clears the problem, but within 6 to 24 hours the problem happens again, and it persists until the DNS server service is restarted.
Nothing in the System, Application, or DNS event logs or the dns.log file gives any clue as to the cause of this trouble. Interestingly, the dns.log file stops being updated as soon as the problem occurs; it does not even log client requests. As soon as the DNS server service is restarted, the new dns.log file resumes updating until the problem recurs.
I should also point out that in my case, when the DNS server stops working, not only can it not resolve foreign hosts, it also cannot resolve hosts for domains that it is authoritative for. Since it is a slave DNS server on an Active Directory domain, it is authoritative for the hosts in that domain and replicates the zones from the domain controllers. However, when the DNS server stops working, it can no longer resolve any host names in that domain.
This manifests itself in all manner of trouble. Since the server is using itself for DNS queries, it can no longer do NTP updates with the DCs, it cannot contact the license logging server, it cannot contact a domain controller, and so on. All of this is logged in the appropriate logs, but these are just symptoms of the DNS server dying off.
For now I am using a script to detect persistent query failures on this server and automatically restart the DNS server service when failures are detected. It is not the preferred long-term solution, but at least it keeps the users in this small, remote office from ringing our help desk lines off the hook every 6 to 24 hours.
Any suggestions would be greatly appreciated!
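The script itself was not posted, so the following is only a sketch of what such a watchdog could look like, written in Python for illustration. The failure threshold, polling interval, and function names are assumptions. "net stop dns" and "net start dns" are the usual commands for restarting the Windows DNS Server service. socket.gethostbyname goes through the operating system's resolver, which on this server points at the local DNS service; a client-side cache could mask a failure, so a real version would probably query the server directly.

```python
import socket
import subprocess
import time


def can_resolve(name):
    """Return True if the OS resolver (here, the local DNS server) resolves name."""
    try:
        socket.gethostbyname(name)
        return True
    except OSError:
        return False


def restart_dns_service():
    """Restart the Windows DNS Server service (service name 'dns')."""
    subprocess.call(["net", "stop", "dns"])
    subprocess.call(["net", "start", "dns"])


def watch(probe, restart, threshold=3, interval=60.0, max_checks=None,
          sleep=time.sleep):
    """Call restart() after `threshold` consecutive failed probes.

    Returns the number of restarts performed. max_checks=None runs forever.
    """
    failures = 0
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if probe():
            failures = 0
        else:
            failures += 1
            if failures >= threshold:
                restart()
                restarts += 1
                failures = 0
        sleep(interval)
    return restarts
```

It could be started with something like watch(lambda: can_resolve("www.example.com"), restart_dns_service), where www.example.com is a placeholder for a name that should always resolve. Requiring several consecutive failures before restarting avoids bouncing the service on a single slow lookup.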
August 6th, 2009 11:36pm
Hi Kevin,
To help analyze this specific DNS issue, would you please capture Network Monitor traces in the following 2 scenarios?
Scenario 1. Use nslookup against the ISP's DNS server that is listed in the forwarders list (24.153.23.114).
Scenario 2. Exit nslookup, reproduce the issue, and capture the traffic with Network Monitor.
You can send us the cap files via tfwst@microsoft.com
I appreciate your time and effort.
August 7th, 2009 2:11pm
And I appreciate yours, David. I already have those captures, so I'll send them to your email.
In the meantime, since we still have the old ISP's business PPPoE connection, I've switched back to it and reconfigured the PIX accordingly. It has not yet been 24 hours, so I'm going to leave it like this for the weekend and see if the issue happens. If it doesn't, then I'm inclined to think that it is a PIX configuration problem, though what specifically I couldn't say (I'll compare config files and see if something jumps out).
With the new ISP (business cable) I have used both the old and new ISPs' DNS servers as forwarders, and even configured the internal DNS server to use root hints, and the issue still always occurs. So it doesn't sound to me like an ISP issue or a DNS forwarding problem.
August 7th, 2009 3:46pm
Hi KevinTE,
Thank you for your attention.
You are right. From DNSBad2.cap, it seems that name resolution is working on the DNS server 10.X.X.X. However, when we checked the Network Monitor captures Pixdnsadoutside and Pixdnsadoutside, which were captured on the inside and outside interfaces of the PIX, we found that most of the queries for Internet names received no response.
Personally, I suspect the root cause of the issue most probably lies in the PIX configuration. If possible, please roll the PIX firewall back to its previous configuration and monitor whether the issue is fixed.
Hope this can be helpful.
August 10th, 2009 11:38am
Hi,
I want to see if the information provided was helpful. Please keep us posted on your progress and let us know if you have any additional questions or concerns.
August 17th, 2009 9:27am
As mentioned, we had switched back to the old ISP and restored the PIX using a backup from that configuration. After leaving it for a week, we didn't have any DNS problems. I then switched them back to their new ISP (Rogers), but instead of running the PIX configuration wizard, I just changed the particulars: PPPoE to static IP, the IP, mask, gateway, and some inbound access rules. So far, no problems, but I'll be keeping an eye on it.
I also compared the PIX configuration files from the PPPoE setup and the recent "bad" configuration and there was nothing notable, just the PPPoE vs. static differences, so I see no odd configuration that would have caused this issue. Is there anything else I can do to try to pin down what the root cause was, i.e. if this was a problem in the PIX, what specifically was the problem?
Thanks again for all your help.
Kevin
August 17th, 2009 7:40pm
Hi Kevin,
Thanks for the update.
I am glad to hear that you have resolved this issue.
Since the issue appears to be fixed after you adjusted the PIX configuration, I would suggest that you continue to use the current settings and keep monitoring it.
If this issue recurs, please feel free to let me know.
August 18th, 2009 12:27pm
Well, it recurred this morning. I'm going to speak with the ISP and see if they have any suggestions. I'm out of ideas unless you have any more?
Kevin
August 18th, 2009 3:40pm
I think we need to try to answer the following question:
What would cause DNS resolution to fail when using root hints OR forwarders but succeed when going into nslookup and assigning an external DNS server (the same one used when forwarders were configured) for the "server" value?
For example:
C:\Documents and Settings\Administrator>nslookup www.google.com
Server: w2ksrv1.myinternaldomain.com
Address: 10.0.0.5
DNS request timed out.
timeout was 2 seconds.
*** Request to w2ksrv1.myinternaldomain.com timed-out
//
nslookup
>server 24.153.23.114
>www.google.com
Server: [24.153.23.114]
Address: 24.153.23.114
Non-authoritative answer:
Name: www.l.google.com
Addresses: 209.85.225.147, 209.85.225.103, 209.85.225.104, 209.85.225.99
Aliases: www.google.com
//
How can this occur?
Thanks!
Kevin
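The nslookup comparison above can also be reproduced outside nslookup by sending the same question over UDP port 53 to each server with a 2-second timeout and noting which one answers. This is a minimal Python sketch under stated assumptions: the function names are illustrative, it asks only a single A-record question, and it reports just a timeout or the numeric RCODE of the reply.

```python
import socket
import struct


def build_query(name, txid=0x1234):
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    question = b""
    for label in name.rstrip(".").split("."):
        encoded = label.encode("ascii")
        question += struct.pack("!B", len(encoded)) + encoded
    question += b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def udp_exchange(server, payload, timeout):
    """Send payload to server:53 over UDP; return the reply, or None on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(payload, (server, 53))
        reply, _addr = sock.recvfrom(4096)
        return reply
    except socket.timeout:
        return None
    finally:
        sock.close()


def probe(server, name, timeout=2.0, exchange=udp_exchange):
    """Return 'timeout', 'malformed reply', or 'rcode N' for one query."""
    reply = exchange(server, build_query(name), timeout)
    if reply is None:
        return "timeout"
    if len(reply) < 12:
        return "malformed reply"
    flags = struct.unpack("!H", reply[2:4])[0]
    return "rcode %d" % (flags & 0x000F)
```

Calling probe("10.0.0.5", "www.google.com") and probe("24.153.23.114", "www.google.com") would correspond to the two nslookup runs shown above; a timeout from the first and "rcode 0" from the second would match the symptom described.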
August 18th, 2009 4:01pm
In the meantime, I'm going to replace the PIX with another router and see what happens. I (still) welcome your responses and really appreciate your continued support.
Kevin
August 18th, 2009 6:26pm
Hi Kevin,
As for your question about why name resolution times out and fails when we use root hints or forwarders: based on my experience, if the DNS server forwards the query to the root hints on the Internet or to another DNS server, the query may time out.
For example, if you enter www.contoso.com, nslookup appends the suffix contoso.com. to the end and sends out the query to check for the record under the zone contoso.com. The query is then forwarded to a public DNS server on the Internet if you have configured root hints or a forwarder. Waiting for the response from the Internet may take more than 2 seconds, which may cause the problem. This is the usual cause of the timeout. So generally, a long suffix and a forwarder to another DNS server may cause a delay longer than 2 seconds.
However, when you assign another external DNS server, that DNS server may respond within the timeout. Thus, you successfully get the result.
Hope this can be helpful.
August 20th, 2009 6:22am
Thanks. So are we saying that my DNS suffix is too long? If that were the case, I'd expect the problem to be intermittent, not to work fine for a while and then fail outright and not recover until either the DNS server or the PIX is restarted.
I replaced the PIX and so far, 3 days later, no issue. The question is, what's wrong with the PIX configuration with this ISP and not the previous one? I can't see a setting in the config file that would affect this. The new ISP maintains there is no problem on their end. I'd hate to get rid of the PIX because of this, but as I appear to have isolated the problem to the PIX itself and haven't found a PIX setting that causes or fixes the issue, I might have no choice.
Any suggestions on why the PIX might cause this and what setting I could tweak or fix? Do I have to alter the ruleset for successful and consistent name resolution? Should I make a specific change to the inspection system and, if so, what?
Kevin
August 21st, 2009 7:26pm
Just curious, what was the final resolution? I am experiencing similar problems but I do not have a PIX
July 6th, 2010 6:31pm