Strange Kerberos phenomenon
Hi,
We are using IIS 6.0 on W2k3 SP2 with Kerberos delegation to access file shares via CIFS and to authenticate users in one of our intranet apps, which connects to different SQL Servers using delegated authentication.
Roughly every three months the delegation fails because IIS authentication falls back to NTLM for all clients. The next day, authentication works as usual with Negotiate (Kerberos). The IIS configuration seems to be correct (SPNs registered, site set up, etc.) because it works 89 days out of 90. The application pool that hosts the app and the CIFS passthrough both run as Local System. The clients and the server are in the same AD site and use the same two DCs. I have been seeing this phenomenon for about a year, and I guess the next breakdown will be in August.
I don't know how to debug this problem because I only have a time window of about 8 hours every three months. I thought it might be associated with a password change of the computer account, as we use the Local System account to host the applications.
Is there a way to verify the date of the last change?
I tried to enable Kerberos debugging as explained in http://support.microsoft.com/kb/262177/en-us but the messages I received didn't help me. Do you have any further pointers for me?
Thank you very much!
Bye
Daniel
June 7th, 2010 3:06pm
Verify that the time is synchronized between the IIS server and the domain.
http://tldp.org/HOWTO/Kerberos-Infrastructure-HOWTO/time-sync.html
I hope this information is useful.
July 30th, 2010 11:57pm
Hi Roy and other followers,
sorry for the late response, but I didn't get any mail about your reply. I have to check my notification settings... In any case, I had no news about the problem until today.
Today we had the same issue. First, to answer your reply: I know that all involved computers (client, DCs, mid tier and backend server) had synced clocks today. They are all members of the same Active Directory domain and are located in the same site with the same NTP clock source.
But I could find some new information about the issue:
The machine account password was changed that morning (verified via the AD attribute pwdLastSet of the computer account).
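For anyone who wants to read pwdLastSet by hand: the attribute is stored as a Windows FILETIME value, i.e. the number of 100-nanosecond intervals since 1601-01-01 UTC. The following is only a minimal Python sketch of that conversion; the function names are hypothetical helpers, and the raw value has to be read from the computer account with whatever LDAP tool is at hand.

```python
from datetime import datetime, timedelta, timezone

# Windows FILETIME epoch: 1601-01-01 00:00:00 UTC
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a FILETIME value (100-ns ticks since 1601-01-01 UTC) to an aware datetime."""
    # 10 ticks of 100 ns make one microsecond
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def days_since_password_change(pwd_last_set: int, now: datetime) -> float:
    """Age of the machine password in days, relative to the aware datetime `now`."""
    return (now - filetime_to_datetime(pwd_last_set)).total_seconds() / 86400


if __name__ == "__main__":
    # 116444736000000000 is the FILETIME of the Unix epoch; used here only as a sanity check
    print(filetime_to_datetime(116444736000000000))  # 1970-01-01 00:00:00+00:00
```

Comparing the result with the time at which the delegation started to fail shows whether the two events coincide.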
I used DelegConfig (http://blogs.iis.net/bretb/archive/2008/03/27/How-to-Use-DelegConfig.aspx) to check what was going on and got the first error on the first step of the HTTP service test, which told me:
Service account NETZWERKDIENST (Sxxx$) is not a domain account.
(It's a German Win2003 SP2, and NETZWERKDIENST translates to Network Service; Sxxx is the correct machine name.)
All subsequent tests failed too because they depend on the AD account. Under Overall status the first two checks succeeded (Client user domain account, Authentication Method is Kerberos).
But the computer still had a valid machine account. For example, I was able to browse file shares via UNC path, and the local delivery of web pages via IIS worked (our intranet is hosted on that server) from the same client where I was not able to open the applications that use delegation.
I spent more or less my whole working day with Google and experimenting with diagnostic tools. For example, I found nltest.exe (very nice tool) and tried nltest /sc_reset, without any change.
I found this KB article (http://support.microsoft.com/kb/918442/en-us) which describes exactly my problem, but the server has SP2 installed and the version string of kerberos.dll shows a greater version than the one mentioned in the KB article.
Another piece of information I noticed today: it only seems to happen when we have Active Directory structure changes. The last time I saw the phenomenon we had some DC changes at branch offices (hardware replacements), and yesterday a colleague integrated the first Windows 2008 DC into our environment (the schema update was done long before). It's very odd that we saw the problem every three months before my last post (I have tracked the dates).
Does anyone have any further pointers for me? Now I'll leave the office, and if it's business as usual (or expected? ;-)) our apps with delegation will work this evening or tomorrow morning, as they did the last months.
Best regards
Daniel
March 15th, 2011 1:00pm
I don't think the KB you mention solves your problem, as you are talking about the Network Service account, which is not Local System and which, according to my understanding, is not affected by the mentioned KB.
As you say that SC_RESET does not improve things, have you run SC_RESET several times? Did you see the DC name that you were connected to change on every SC_RESET?
You also say that the "delegation" fails, and that the AppPool is running under Network Service. So we can try to investigate further. What if that was due to the server's password change? OK, I can imagine the following scenario:
- a user accesses the web site and the Network Service asks a DC for the user's delegated TGS.
- the Network Service session receives a TGS with the following parameters: who=TheUser where=CanToDB, and the TGS is encrypted with the SERVER's password (in this case password-A).
- the ticket is cached in the Network Service's AppPool logon session for the next 10 hours.
- further authentications of the same user do not require the Network Service to go to the DC again but can reuse the previously cached user TGS, which is encrypted with password-A
- later the machine changes its own password to password-B
- everything works fine because its own tickets, which are generated LATER, are encrypted with password-B
- and I can imagine that the machine also purges all its own tickets when it is changing its password from A to B
- it may happen that the "delegated-tickets-cache", which contains some delegated TGS tickets still encrypted with the original password-A, is neither purged nor re-encrypted.
So we can perform a simple test to confirm the behavior: just let a user delegate, enforce the computer password change (NETDOM RESETPWD), and try to access the web again with the same user and with another one as well.
I am going to try it myself. this is veeery interesting.
ondrej.
March 17th, 2011 9:03am
Machines change their passwords every month, so why the 3-month period? Because the tickets expire in 10 hours. As only every third password change hits the users' business hours, you observe the behavior.
Still more interesting!
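The timing argument can be made concrete: under this hypothesis a cached delegated ticket is only at risk if the password change falls inside that ticket's lifetime. What follows is just a rough Python sketch of that check, assuming the 10-hour lifetime mentioned above; the function name and the sample times are made up for illustration.

```python
from datetime import datetime, timedelta

# Ticket lifetime as quoted in the thread (assumed, not read from any policy)
TICKET_LIFETIME = timedelta(hours=10)


def ticket_spans_change(issued: datetime, change: datetime,
                        lifetime: timedelta = TICKET_LIFETIME) -> bool:
    """True if a ticket issued at `issued` is still cached when the password changes."""
    return issued <= change < issued + lifetime


if __name__ == "__main__":
    issued = datetime(2011, 3, 15, 8, 0)  # user logs on at 08:00 (sample value)
    # A change at 11:00 the same day hits the cached ticket
    print(ticket_spans_change(issued, datetime(2011, 3, 15, 11, 0)))  # True
    # A change at 03:00 the next night does not, the ticket expired at 18:00
    print(ticket_spans_change(issued, datetime(2011, 3, 16, 3, 0)))   # False
```

A password change during the night therefore goes unnoticed, while one during business hours meets many cached tickets at once.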
ondrej.
March 17th, 2011 9:06am
hi, so I am back with some really interesting results:
I have the following setup:
CLIENT (win7) ----- WEB (iis 7.5, 2008 r2, winauth, Kernel Mode Auth ON, apppool under Network Service) ------ DB (sql 2008, win auth)
The web application (classic ASP) is delegating (constrained delegation) to the DB
Everything works normally until:
1) access the WEB's site under a user-A, confirm that it delegates correctly
2) fact: the user-A's TGS for DB is cached in the Network Service's cache on the WEB
3) fact: the server WEB has a current computer password WEB-PWD-A
4) reset the WEB's machine password with NETDOM RESETPWD to something else (WEB-PWD-B)
5) try the delegation and it still works (the WEB computer remembers actually two passwords, both the current and the previous one)
6) reset the WEB's computer password again with NETDOM RESETPWD (now it will be WEB-PWD-C)
7) fact: the delegation stops working for user-A until his TGSforDB expires
8) fact: the delegation works for any new users, such as user-B who have not yet accessed the WEB server previously
9) try purging the Network Service's TGS cache (on the WEB computer) manually with KLIST -LI 3e4 PURGE
10) try the delegation again and it works again even for the original user-A
11) fact: after we have purged the WEB computer's ticket cache, it must have obtained the delegated user tickets again
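The observed behavior in steps 1) to 11) can be reproduced with a toy model. This is only a Python sketch of the hypothesis, not of how Windows actually implements its ticket cache: the class and method names are invented, password versions are plain integers, and ticket expiry is left out.

```python
from dataclasses import dataclass, field


@dataclass
class WebServer:
    """Toy model: the machine remembers its current and its previous password."""
    current: int = 0                           # version number of the machine password
    cache: dict = field(default_factory=dict)  # user -> password version the cached ticket was encrypted with

    def reset_password(self) -> None:
        # Like NETDOM RESETPWD in the test above; the ticket cache is NOT purged
        self.current += 1

    def delegate(self, user: str) -> bool:
        # Reuse a cached ticket if there is one, otherwise obtain a fresh one
        version = self.cache.setdefault(user, self.current)
        # The ticket is usable only with the current or the previous password
        return version in (self.current, self.current - 1)

    def purge(self) -> None:
        # Like KLIST -LI 3e4 PURGE
        self.cache.clear()


if __name__ == "__main__":
    web = WebServer()
    print(web.delegate("user-A"))  # True  (step 1)
    web.reset_password()
    print(web.delegate("user-A"))  # True  (step 5, previous password still known)
    web.reset_password()
    print(web.delegate("user-A"))  # False (step 7)
    print(web.delegate("user-B"))  # True  (step 8)
    web.purge()
    print(web.delegate("user-A"))  # True  (step 10)
```

The model fails for user-A only after the second reset, which matches the test: one password change alone is not enough to invalidate the cached delegated tickets.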
So, for your own troubleshooting:
it may be that your server sometimes experiences password change problems and performs the password change twice, which renders the cached delegated TGSs useless, but the cache is not purged until the tickets expire.
YOUR ACTION:
enable NETLOGON logging on the WEB and either watch what happens with NETDOM RESETPWD, whether there are any errors coming from the password change, or leave it enabled until the problem reappears.
To enable NETLOGON logging, run: NLTEST /dbflag:0x2080ffff (according to http://support.microsoft.com/kb/109626)
ondrej.
March 17th, 2011 11:37am
Hi Ondrej,
thank you very much for your responses. I don't think that we have a problem with cached tickets, because the first thing I tried was to clean the ticket cache on my computer for HTTP/intranet... and HTTP/servername. After accessing the intranet site with IE again I had a new ticket but still a problem with the delegation.
Just to clarify: I can authenticate on the hosted website itself on that server; ONLY the delegation to CIFS and MSSQL (we don't use delegation for other services) fails.
I'll try your suggested tests with logging activated during the evening hours because I have to do the test on a production machine. I'll answer as soon as I have new information.
Just to add: the morning after my last post the delegation worked as expected.
Best regards
Daniel
March 21st, 2011 3:38am
Hi Ondrej,
I have a problem. I'm on a Windows 2003 R1 SP2 domain member server and netdom tells me the following:
NETDOM RESETPWD Resets the machine account password for the domain controller
on which this command is run. Currently there is no support for resetting
the machine password of a remote machine or a member server. All parameters
must be specified.
Is there any other command line to reset the computer's password?
Thank you!
bye
Daniel
March 21st, 2011 1:34pm
You must be misunderstanding the parameters. The /Server parameter should point to a DC. NETDOM resets the password of the server on which it is running against the DC specified by the /Server parameter.
ondrej.
March 23rd, 2011 9:39am
Yes, this fits my previous thoughts exactly. You should be able to authenticate the users, as the USER tickets coming to the server are valid. The user DELEGATED tickets that are stored in the SERVER's cache, however, are encrypted with the previous-previous password, and the server cannot decrypt them to use them for the delegation.
ondrej.
March 23rd, 2011 9:41am
Hi Ondrej,
back again. I tried to follow your description above on our production server, but I still have some problems because it's a Win 2003 system. I don't agree that it's possible to use resetpwd on a member server. I tried it and got the following error:
The machine account password for the local machine could not be reset.
followed by a German description which tells me that the specified domain isn't available or that a connection to it isn't possible. As the netdom.exe help itself says that resetpwd is only for domain controllers, I really doubt that I can use it on the web server, which is a member server.
I also tried to use klist -li 3e4, but it doesn't work with every version of klist.exe I tried. On a Win 2008 x64 it worked, but I can't use that binary to list the tickets on the Win 2003 because it is an x86 installation, and I doubt that the required DLLs are the same on 2008 and 2003. Do you know any download links for the most current version of klist.exe for Win 2003? I downloaded and installed the Resource Kit Tools, but that klist only accepts tickets, tgt or purge. No other parameters.
I agree that your explanation of our problem sounds really plausible, but I need a version of klist for Win 2003 to verify this. Another thought about your explanation: the problem remains even after rebooting the web server. I think a reboot should clean the ticket cache, shouldn't it?
Bye
Daniel
March 25th, 2011 9:51pm
hello,
I am not sure what could be the problem with NETDOM, but it should work. I have sometimes had problems similar to yours, so here are some notes to test:
- pass the /UserD parameter with the user's domain name: /UserD:mydomain\domainadmin
- use NETDOM from exactly the Win2003 SP2 Support Tools
- have NETLOGON logging enabled during the netdom call and look into the netlogon.log file for details
As for KLIST, that is actually a problem on 2k3. You may download psexec and start CMD under Network Service credentials; the purge command should then work from there: PSEXEC -I -D -U "NT Authority\Network Service" CMD
ondrej.
March 26th, 2011 2:20pm
Although the problem seems to be very interesting, the quick solution would be to switch the IIS pools to a dedicated service account that doesn't have a password expiration. You just have to be careful with the SPNs.
March 28th, 2011 5:04pm