cluster fails to reset CNO password in AD

We have a WS2012 Hyper-V cluster. The cluster has the DNS name hvcluster.domain.local, a CNO in AD called hvcluster$, and two nodes: node1.domain.local (computer account node1$) and node2.domain.local (computer account node2$).

The cluster CNO is in a failed state. As a consequence, its dynamic DNS record is missing and Live Migration doesn't work. The primary problem is that when I use the Repair option on the CNO, the repair fails with the following error:

"There was an error repairing the active directory object for 'Cluster Name'. Details: There was an error resetting the active directory password for 'Cluster name'. Error code: 0x80005000"

This isn't a new cluster; it has been running for about 2 years, and the problem appeared only recently. I'm aware of the AD requirements for the cluster, and for testing purposes I've additionally granted Full Access on the hvcluster computer account to the cluster computer account itself and to both cluster nodes' computer objects (through a group that both nodes are members of).

The account I used for the Repair action (and all other actions) is a member of the Domain Admins group.

Since that didn't help, I've checked that the Authenticated Users group is a member of the local "Users" group on the cluster nodes. Additionally, I've tried modifying local group policy per http://blogs.technet.com/b/askcore/archive/2013/04/04/new-network-name-resource-fails-to-come-online.aspx. That didn't help either.

I've also checked that http://support.microsoft.com/kb/2838043 is installed on both cluster nodes.

From the cluster log (excerpt):

000014a8.00001014::2015/03/03-12:52:32.368 INFO  [RES] Network Name <Cluster Name>: AccountAD: OU name for VCO is OU=Hyper-V,DC=domain,DC=local
000014a8.00001014::2015/03/03-12:52:32.383 INFO  [RES] Network Name:  [NN] Setting crypto access members for decrypt. New container = false.
000014a8.00001014::2015/03/03-12:52:32.383 INFO  [RES] Network Name: [NNLIB] Priming local KDC cache to \\DC01.domain.local for domain domain.local
000014a8.00001014::2015/03/03-12:52:32.383 INFO  [RES] Network Name: [NNLIB] PopulateKerbKDCLookupCache - DC flags 0
000014a8.00001014::2015/03/03-12:52:32.383 INFO  [RES] Network Name: [NNLIB] LsaCallAuthenticationPackage success with a request of size 100, result size 0 (status: 0, subStatus: 0)
000014a8.00001014::2015/03/03-12:52:32.383 INFO  [RES] Network Name: [NNLIB] Priming local KDC cache to \\DC01.domain.local for domain label domain
000014a8.00001014::2015/03/03-12:52:32.383 INFO  [RES] Network Name: [NNLIB] LsaCallAuthenticationPackage success with a request of size 78, result size 0 (status: 0, subStatus: 0)
000014a8.0000227c::2015/03/03-12:52:32.399 INFO  [RES] Network Name <Cluster Name>: Getting Read/Write private properties
000014a8.00001014::2015/03/03-12:52:32.414 WARN  [RES] Network Name: [NNLIB] LogonUserEx fails for user HVCLUSTER$: 1326 (useSecondaryPassword: 0)
000014a8.0000227c::2015/03/03-12:52:32.430 INFO  [RES] Network Name <Cluster Name>: Getting Read only private properties
000014a8.00001014::2015/03/03-12:52:32.446 WARN  [RES] Network Name: [NNLIB] LogonUserEx fails for user HVCLUSTER$: 1326 (useSecondaryPassword: 1)
000014a8.00001014::2015/03/03-12:52:32.446 INFO  [RES] Network Name: [NNLIB] Logon failed for user HVCLUSTER$ (Error 1326), DC \\DC01.domain.local, domain domain.local
000014a8.00001014::2015/03/03-12:52:32.446 ERR   [RES] Network Name:  [NN] GetToken - Logging on as the CNO failed with error 1326
000014a8.00001014::2015/03/03-12:52:32.446 INFO  [RES] Network Name <Cluster Name>: AccountAD: End of Slow Operation, state: Initializing/Writing, prevWorkState: Writing
000014a8.00001014::2015/03/03-12:52:32.446 WARN  [RES] Network Name <Cluster Name>: AccountAD: Slow operation has exception ERROR_INVALID_HANDLE(6)' because of '::ImpersonateLoggedOnUser( GetToken() )'
000014a8.0000227c::2015/03/03-12:52:32.446 INFO  [RES] Network Name: Agent: OnInitializeReply, Failure on (6b0ee668-0731-4252-b066-dd657fd23f25,AccountAD): 6
000014a8.0000227c::2015/03/03-12:52:32.446 INFO  [RES] Network Name <Cluster Name>: Configuration: InitializeReplyCreation of NetName (type Singleton), result: 6, IsCanceled: false
00001fdc.000018ac::2015/03/03-12:52:32.446 INFO  [GEM] Sending 1 messages as a batched GEM message
000014a8.0000227c::2015/03/03-12:52:32.446 INFO  [RES] Network Name <Cluster Name>: Configuration: Setting 'StatusKerberos' in clusdb returned status 0
000014a8.0000227c::2015/03/03-12:52:32.446 INFO  [RES] Network Name <Cluster Name>: Configuration: Deleting ResourceData, CreatingDC, ObjectGUID for a newly created netname from cluster database
00001fdc.000018ac::2015/03/03-12:52:32.446 INFO  [GEM] Sending 1 messages as a batched GEM message
000014a8.000021c4::2015/03/03-12:52:32.461 INFO  [RES] Network Name <Cluster Name>: Getting Read/Write private properties
00001fdc.000018ac::2015/03/03-12:52:32.461 INFO  [GEM] Sending 1 messages as a batched GEM message
000014a8.0000227c::2015/03/03-12:52:32.477 INFO  [RES] Network Name: Agent: OnInitializeReply, Failure on (6b0ee668-0731-4252-b066-dd657fd23f25,Configuration): 6
000014a8.0000227c::2015/03/03-12:52:32.477 INFO  [RES] Network Name <Cluster Name>: SyncReplyHandler Configuration, result: 6
000014a8.00001568::2015/03/03-12:52:32.477 INFO  [RES] Network Name <Cluster Name>: PerformOnline - Initialization of Configuration module finished with result: 6
000014a8.00001568::2015/03/03-12:52:32.477 ERR   [RES] Network Name <Cluster Name>: Online thread Failed: ERROR_SUCCESS(0)' because of 'Initializing netname configuration for Cluster Name failed with error 6.'
000014a8.00001568::2015/03/03-12:52:32.477 INFO  [RES] Network Name <Cluster Name>: All resources offline. Cleaning up.
000014a8.00001568::2015/03/03-12:52:32.477 ERR   [RHS] Online for resource Cluster Name failed.
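
For anyone skimming the excerpt above, here is a minimal Python sketch (not part of the original post) that keeps only the WARN/ERR lines and names the Win32 codes in them. The line layout assumed by the regex (pid.tid::timestamp LEVEL message) is inferred from this excerpt only, and `parse_line`, `problems` and `annotate` are hypothetical helper names.

```python
import re

# Win32 error codes that appear in the excerpt, with their standard names.
WIN32_ERRORS = {
    6: "ERROR_INVALID_HANDLE",
    1326: "ERROR_LOGON_FAILURE",
}

# Assumed layout, inferred from the excerpt: pid.tid::timestamp LEVEL message
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def problems(lines):
    """Keep only the parsed WARN and ERR entries, in their original order."""
    parsed = (parse_line(line) for line in lines)
    return [rec for rec in parsed if rec and rec["level"] in ("WARN", "ERR")]


def annotate(msg):
    """Name the known Win32 codes that appear as standalone numbers in msg."""
    names = []
    for code, name in WIN32_ERRORS.items():
        # Lookarounds stop "6" from matching inside "1326" or inside a GUID.
        if re.search(r"(?<![0-9A-Za-z-])%d(?![0-9A-Za-z-])" % code, msg):
            names.append(name)
    return names


# Four lines copied from the excerpt above.
SAMPLE = [
    "000014a8.00001014::2015/03/03-12:52:32.383 INFO  [RES] Network Name: [NNLIB] PopulateKerbKDCLookupCache - DC flags 0",
    "000014a8.00001014::2015/03/03-12:52:32.414 WARN  [RES] Network Name: [NNLIB] LogonUserEx fails for user HVCLUSTER$: 1326 (useSecondaryPassword: 0)",
    "000014a8.00001014::2015/03/03-12:52:32.446 ERR   [RES] Network Name:  [NN] GetToken - Logging on as the CNO failed with error 1326",
    "000014a8.00001014::2015/03/03-12:52:32.446 WARN  [RES] Network Name <Cluster Name>: AccountAD: Slow operation has exception ERROR_INVALID_HANDLE(6)' because of '::ImpersonateLoggedOnUser( GetToken() )'",
]

if __name__ == "__main__":
    for rec in problems(SAMPLE):
        print(rec["ts"], rec["level"], annotate(rec["msg"]))
```

Read this way, the first problem in the excerpt is the 1326 logon failure for HVCLUSTER$ (with both the primary and the secondary password); the later ERROR_INVALID_HANDLE (6) results appear to follow from the failed GetToken call rather than being a separate fault.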

Any ideas? Btw. I've been through many articles like: https://support.microsoft.com/kb/2838043/, https://social.technet.microsoft.com/forums/windowsserver/en-us/2ad0afaf-8d86-4f16-b748-49bf9ac447a3/ws2012-cluster-network-dns-issues, http://blogs.technet.com/b/askcore/archive/2013/04/04/new-network-name-resource-fails-to-come-online.aspx, http://blogs.technet.com/b/askcore/archive/2012/09/25/cno-blog-series-increasing-awareness-around-the-cluster-name-object-cno.aspx etc.

March 3rd, 2015 1:30pm

Hi MarkosP,

Error 0x80005000 often occurs when the CNO is corrupt in AD, or when the CNO and the VCO are not in the same OU. Please open Active Directory Users and Computers and check whether they are in the same OU. If they are not, move them to the default Computers container and give the CNO full permission on that container.
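
As a side note on the code itself: 0x80005000 is the ADSI error E_ADS_BAD_PATHNAME (an invalid ADSI pathname was passed), which is a different value from the access-denied HRESULT 0x80070005. The small Python sketch below, which assumes only the standard HRESULT bit layout, splits such a value into its fields; `decode_hresult` and the `KNOWN` table are illustrative names, not part of any Windows API.

```python
# HRESULT values mentioned in this note; names are from the Windows SDK headers.
KNOWN = {
    0x80005000: "E_ADS_BAD_PATHNAME",
    0x80070005: "E_ACCESSDENIED",
}


def decode_hresult(value):
    """Split a 32-bit HRESULT into severity, facility and code fields."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),        # bit 31 set means failure
        "facility": (value >> 16) & 0x7FF,   # bits 16-26
        "code": value & 0xFFFF,              # bits 0-15
        "name": KNOWN.get(value, "unknown"),
    }


if __name__ == "__main__":
    print(decode_hresult(0x80005000))
```

This only identifies the error; it does not by itself say why the Repair action hits it.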

More information:

Recovering a Deleted Cluster Name Object (CNO) in a Windows Server 2008 Failover Cluster

http://blogs.technet.com/b/askcore/archive/2009/04/27/recovering-a-deleted-cluster-name-object-cno-in-a-windows-server-2008-failover-cluster.aspx

I'm glad to be of help to you!

March 4th, 2015 5:47am

Hi Alex. I've moved the cluster CNO object to the default Computers container in AD, gave the CNO full access (with full inheritance) on the container, but that didn't help either. I got the same error when I tried the Repair action or when I moved core cluster resources from node to node.
March 4th, 2015 7:50am

Did you try removing the CNO and prestaging it again?

https://technet.microsoft.com/en-us/library/dn466519.aspx

March 4th, 2015 10:44am

I'm not so sure you can just delete the CNO for an EXISTING cluster and create a new one (with different SID/objectGUID). Are you sure about that?

Why would there be articles about recovering deleted CNOs like http://blogs.technet.com/b/askcore/archive/2009/04/27/recovering-a-deleted-cluster-name-object-cno-in-a-windows-server-2008-failover-cluster.aspx
  • Edited by MarkosP Wednesday, March 04, 2015 11:09 AM
March 4th, 2015 11:08am

I'm sorry, don't remove it, it's a huge mis
March 4th, 2015 11:17am

You had been using the cluster for 2 years and the issue only started appearing recently?

Has anything changed in your Active Directory environment (upgrades, updates, Group Policy...)?

March 4th, 2015 11:24am

I have run into this kind of issue many times. Things changed with Windows Server 2012, and the Active Directory configuration has to meet the failover cluster requirements.

Usually the cause is one of:

- Permissions on the CNO

- Group policies

Have you tried this: place the CNO and the nodes in a new OU, block inheritance, run a GPO update, and retest?

http://blogs.technet.com/b/askcore/archive/2012/03/27/why-is-the-cno-in-a-failed-state.aspx

March 4th, 2015 11:32am

I've tried this:

 - created new OU in the AD

 - granted Full Access permissions on this OU (with full inheritance) to the CNO and cluster nodes (computer accounts)

 - moved the CNO and the nodes' computer accounts to this OU

 - blocked GPO inheritance on this OU

 - ran gpupdate /force on both nodes

Then I re-ran the Repair action and also tried to move the core cluster resources from node to node, still getting the same error.

There was a problem with a CAU update of the cluster approx. 2 weeks ago. I had to go through several reboot cycles to get the cluster working properly, and that's when I noticed that the CNO was in a failed state.

March 4th, 2015 12:59pm

So this problem appeared after issues with Windows updates?

Are all the cluster nodes affected by this update issue?

What events are logged in the cluster console? Only this one?

"There was an error repairing the active directory object for 'Cluster Name'. Details: There was an error resetting the active directory password for 'Cluster name'. Error code: 0x80005000"

March 4th, 2015 1:06pm

Not sure if the problem appeared after the failed CAU update or before.

Actually, the Cluster Events view in FCM doesn't even contain the error I described above. It can only be seen when the "Repair" action on the CNO fails and you view the "Information details" of that event. I did find this event in the FailoverClustering-Manager/Admin log; it seems the Cluster Events view in FCM doesn't include events from that log. The cluster events log only records the generic problem with the CNO going online:

 - Cluster resource 'Cluster Name' of type 'Network Name' in clustered role 'Cluster Group' failed. (eventid 1069)

 - The Cluster service failed to bring clustered service or application 'Cluster Group' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application. (eventid 1205)

 - followed by (not surprisingly) Clustered role 'Cluster Group' has exceeded its failover threshold.  It has exhausted the configured number of failover attempts within the failover period of time allotted to it and will be left in a failed state.  (eventid 1254)

There are also 2 other problems logged, both regarding DNS registration - once for the CAU VCO (hvcluscau) and once for a FileServer cluster group VCO (FS01):

 - Cluster network name resource 'hvcluscau' failed registration of one or more associated DNS name(s) for the following reason: The handle is invalid. (eventid 1196)

 - Cluster network name resource 'FS01' failed registration of one or more associated DNS name(s) for the following reason: The handle is invalid. (eventid 1196)

Records for both (hvcluscau and fs01) are actually in DNS. Before you ask: DNS itself is working fine with no errors, and regular domain member servers can register and update records without problems.

March 4th, 2015 1:21pm

Hi MarkosP,

Please install the following update then monitor this issue again.

Recommended hotfixes and updates for Windows Server 2012-based failover clusters

http://support.microsoft.com/kb/2784261

I'm glad to be of help to you!

March 9th, 2015 8:09am

Hi Alex.

I've installed missing updates from that list (some were already installed, 1 was not applicable) on both nodes, however the issue didn't go away and I still get the same error.

March 9th, 2015 10:23pm

Hi MarkosP,

Which update was "not applicable"? We need to narrow down the problem area, so please tell us which update you could not install and the related error information. Most of the time an update cannot be installed because the system does not meet that update's prerequisites; you can search the internet for the update's requirements and fix them.

If you cannot install any updates, you can also reset your Windows Update components.

How do I reset Windows Update components?

http://support.microsoft.com/kb/971058

Best Regards,

March 12th, 2015 1:22am

KB976424 reported error "Installer encountered an error: 0x80096002. The certificate for the signer of the message is invalid or not found". I've tried redownloading this KB, but got the same error.

KB2913695 reported "The update was not applicable to your computer"

The following KBs were installed: KB2878635-v3, KB2894464, KB2916993, KB2929869-v2, KB3004098-v2

March 17th, 2015 1:26am

This topic is archived. No further replies will be accepted.
