Windows 2012 Cluster with Exchange 2013 DAG
Active/Active or Active/Standby?
  • Edited by Shamne_g Tuesday, June 16, 2015 4:24 PM grammar
June 16th, 2015 4:24pm

Hi,

I have a question regarding Windows 2012 failover clustering.

We have Windows 2012 running an Exchange 2013 DAG with 8 nodes and 1 witness server.

There have been a few instances where one of the nodes lost quorum due to network issues. Whenever that happens, the Cluster service goes into a restart loop (it keeps crashing). I tried setting the Cluster service to Manual and then starting it, but it just keeps crashing until I restart the server. After that it works fine, and the node is added back into the quorum without any issues.

My question: is it normal behavior that, when a node loses quorum, the Cluster service keeps restarting until you restart the server? Or is there a way to bring that server back into the quorum without restarting it?

clussvc.exe version 6.2.9200.21268

Error

The Cluster Service service terminated unexpectedly.  It has done this 15 time(s).  The following corrective action will be taken in 60000 milliseconds: Restart the service.
June 17th, 2015 12:17pm

June 17th, 2015 12:59pm

Well, it has always happened on VMs (VMware) running Windows 2012 with Exchange 2013 CU7. It happened only once on a physical server, which has NIC teaming in Active/Active.
June 17th, 2015 1:23pm

Hi,

Is there any error message when you restart the Cluster service? This issue might be caused by an Exchange-related component that is not working correctly.
If not, we can check the following points to monitor the health of the problematic server:
1. Run Test-ServiceHealth and Test-ReplicationHealth
2. Run Get-HealthReport -Server "MB1" | where {$_.HealthSet -eq "Search"} | FL
3. Run Get-ServerComponentState -Identity <ServerName>

If a component's state is Inactive, please run the command below to activate it:
Set-ServerComponentState <Identity> -Component <ComponentName> -Requester HealthAPI -State Active
For more details about server component states in Exchange 2013, please refer to:
http://blogs.technet.com/b/exchange/archive/2013/09/26/server-component-states-in-exchange-2013.aspx

Thanks

June 18th, 2015 2:27am

Hello Allen,

Thanks for your response.

After every incident of losing quorum and the Cluster service repeatedly crashing, I checked the health and the component state, which is always Active. I am certain this is not related to Exchange. It is Windows 2012 or the Cluster service itself. Whenever there is a slight glitch in the network, the Cluster service starts crashing. It starts, stops, starts again, and keeps crashing until I restart the server.

Log Name:      System
Source:        Service Control Manager
Date:          6/10/2015 11:08:54 AM
Event ID:      7031
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      Mailboxserver.company.com
Description:
The Cluster Service service terminated unexpectedly.  It has done this 1 time(s).  The following corrective action will be taken in 60000 milliseconds: Restart the service.

Thanks,

June 18th, 2015 12:55pm

Hi Allen,

Here are other errors: 

Error:

Log Name:      Microsoft-Windows-FailoverClustering/Diagnostic
Source:        Microsoft-Windows-FailoverClustering
Date:          6/10/2015 11:08:42 AM
Event ID:      2051
Task Category: None
Level:         Error
Keywords:
User:          SYSTEM
Computer:      MailboxExchange2013.company.com
Description:
[NODE] Node 6: Connection to Node 5 is broken. Reason Closed(1236)' because of 'channel to remote endpoint fe80::a427:7751:9098:beef%22:~6160~ has failed with status WSAETIMEDOUT(10060)'

======================

Log Name:      Microsoft-Windows-FailoverClustering/Diagnostic
Source:        Microsoft-Windows-FailoverClustering
Date:          6/10/2015 11:08:54 AM
Event ID:      2051
Task Category: None
Level:         Error
Keywords:
User:          SYSTEM
Computer:      Mailbox2013server.company.com
Description:
[QUORUM] Node 6: Lost quorum (6)

June 18th, 2015 4:38pm

Hi,

Based on the event log, this might be caused by a software or device compatibility problem. Have you upgraded the BIOS/firmware and device drivers to the latest versions?

If not, please try that as a test. Also, please generate the Cluster.log and search it for "ProcessingFailure" to get details about this issue.
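
In case it helps, one way to generate the log and search it from PowerShell is sketched here. The node name, time window, and output folder are placeholders I made up, so adjust them to your environment; this is a rough sketch, not a verified procedure for your cluster.

```powershell
# Sketch only: node name, time window, and folder are placeholder assumptions.
Import-Module FailoverClusters

$node   = 'MAILBOX04'            # hypothetical name of the affected node
$outDir = 'C:\Temp\ClusterLogs'  # hypothetical output folder
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

# Write the last 60 minutes of the cluster log for that node, with local time stamps.
Get-ClusterLog -Node $node -TimeSpan 60 -Destination $outDir -UseLocalTime

# Search the generated log file(s) for the string mentioned above.
Select-String -Path (Join-Path $outDir '*.log') -Pattern 'ProcessingFailure'

# Also list the first ERR/WARN entries, which are often a useful starting point.
Select-String -Path (Join-Path $outDir '*.log') -Pattern '\s(ERR|WARN)\s' |
    Select-Object -First 50
```

Running it shortly after an incident keeps the time window small and the log easier to read.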

June 18th, 2015 10:40pm

Hi Allen,

It happened last night at 22:16

I searched the log for "ProcessFailure", but it was not there.

Log lines=========

00000f20.000041d0::2015/06/18-22:16:20.573 INFO  </vector>
00000f20.000041d0::2015/06/18-22:16:20.573 INFO  .
00000f20.000041d0::2015/06/18-22:16:20.573 INFO  [CHM] Sending route weight vector for nodes (1 2 3 4 5 6 7 8) to nodes (1 2 3 5 6 7)
00000f20.00001210::2015/06/18-22:16:20.589 INFO  [IM] got event: LocalEndpoint 192.168.15.22:~3343~ has missed two consecutive heartbeats from 192.168.15.23:~3343~
00000f20.00001210::2015/06/18-22:16:20.589 INFO  [CHM] Received notification for two consecutive missed HBs to the remote endpoint 192.168.15.23:~3343~ from 192.168.15.22:~3343~
00000f20.000041d0::2015/06/18-22:16:20.589 INFO  [CHM] Sending route weight vector for nodes (1 2 3 4 5 6 7 8) to nodes (1 3 5 6 8)
00000f20.00001210::2015/06/18-22:16:20.605 INFO  [IM] got event: LocalEndpoint 192.168.15.22:~3343~ has missed two consecutive heartbeats from 192.168.15.26:~3343~
00000f20.00001210::2015/06/18-22:16:20.605 INFO  [CHM] Received notification for two consecutive missed HBs to the remote endpoint 192.168.15.26:~3343~ from 192.168.15.22:~3343~
00000f20.00001210::2015/06/18-22:16:20.605 INFO  [IM] got event: LocalEndpoint 192.168.15.22:~3343~ has missed two consecutive heartbeats from 192.168.15.24:~3343~
00000f20.00001210::2015/06/18-22:16:20.605 INFO  [CHM] Received notification for two consecutive missed HBs to the remote endpoint 192.168.15.24:~3343~ from 192.168.15.22:~3343~
00000f20.00001210::2015/06/18-22:16:20.605 INFO  [IM] got event: LocalEndpoint 192.168.15.22:~3343~ has missed two consecutive heartbeats from 192.168.15.25:~3343~
00000f20.00001210::2015/06/18-22:16:20.605 INFO  [CHM] Received notification for two consecutive missed HBs to the remote endpoint 192.168.15.25:~3343~ from 192.168.15.22:~3343~
00000f20.000041d0::2015/06/18-22:16:20.605 INFO  [CHM] Sending route weight vector for nodes (1 2 3 4 5 6 7 8) to nodes (1 2 5 6 7)
00000f20.000013e4::2015/06/18-22:16:21.119 INFO  [CHM] Incoming seq no is better than mine for node 1. Merging data
00000f20.000013e4::2015/06/18-22:16:21.119 INFO  [CHM] Incoming seq no is better than mine for node 2. Merging data
00000f20.000013e4::2015/06/18-22:16:21.119 INFO  [CHM] Incoming seq no is better than mine for node 3. Merging data
00000f20.000013e4::2015/06/18-22:16:21.119 INFO  [CHM] Incoming seq no is better than mine for node 5. Merging data
00000f20.000013e4::2015/06/18-22:16:21.119 INFO  [CHM] Incoming seq no is better than mine for node 6. Merging data
00000f20.000013e4::2015/06/18-22:16:21.119 INFO  [CHM] Incoming seq no is better than mine for node 7. Merging data
00000f20.000041d0::2015/06/18-22:16:21.119 INFO  [CHM] Sending route weight vector for nodes (1 2 3 4 5 6 7 8) to nodes (1 2 5 7 8)
00000f20.00001210::2015/06/18-22:16:21.135 INFO  [IM] got event: LocalEndpoint 192.168.15.22:~3343~ has missed two consecutive heartbeats from 192.168.15.21:~3343~
00000f20.00001210::2015/06/18-22:16:21.135 INFO  [CHM] Received notification for two consecutive missed HBs to the remote endpoint 192.168.15.21:~3343~ from 192.168.15.22:~3343~
00000f20.000041d0::2015/06/18-22:16:21.135 INFO  [CHM] Sending route weight vector for nodes (1 2 3 4 5 6 7 8) to nodes (1 2 5 7 8)
00000f20.00001210::2015/06/18-22:16:22.087 INFO  [IM] got event: LocalEndpoint 192.168.15.22:~3343~ has missed two consecutive heartbeats from 192.168.15.28:~3343~
00000f20.00001210::2015/06/18-22:16:22.087 INFO  [CHM] Received notification for two consecutive missed HBs to the remote endpoint 192.168.15.28:~3343~ from 192.168.15.22:~3343~
00000f20.0000850c::2015/06/18-22:16:22.087 INFO  [CHM] Sending route weight vector for nodes (1 2 3 4 5 6 7 8) to nodes (1 2 3 5 6 7)
00000f20.0000850c::2015/06/18-22:16:23.101 INFO  [CHM] Sending route weight vector for nodes (1 2 3 4 5 6 7 8) to nodes (1 3 5 6 8)
00000f20.000041d0::2015/06/18-22:16:25.113 INFO  [CHM] Sending route weight vector for nodes (1 2 3 4 5 6 7 8) to nodes (1 2 3 5 6 8)
00000f20.0000850c::2015/06/18-22:16:29.122 INFO  [CHM] Sending route weight vector for nodes (1 2 3 4 5 6 7 8) to nodes (1 2 5 6 7)
00000f20.000013bc::2015/06/18-22:16:30.979 WARN  [CHANNEL fe80::7d09:3165:7686:eaa1%17:~3343~] failure, status WSAETIMEDOUT(10060)
00000f20.000013bc::2015/06/18-22:16:30.979 INFO  [PULLER serverMAIL04] Parent stream has been closed.
00000f20.000013bc::2015/06/18-22:16:30.979 ERR   [NODE] Node 4: Connection to Node 8 is broken. Reason Closed(1236)' because of 'channel to remote endpoint fe80::7d09:3165:7686:eaa1%17:~3343~ has failed with status WSAETIMEDOUT(10060)'
00000f20.000013bc::2015/06/18-22:16:30.979 WARN  [NODE] Node 4: Initiating reconnect with n8.
00000f20.000013bc::2015/06/18-22:16:30.979 INFO  [MQ-serverMAIL04] Pausing
00000f20.0000850c::2015/06/18-22:16:30.979 INFO  [Reconnector-serverMAIL04] Reconnector from epoch 1 to epoch 2 waited 00.000 so far.
00000f20.000041d0::2015/06/18-22:16:32.991 INFO  [Reconnector-serverMAIL04] Reconnector from epoch 1 to epoch 2 waited 02.000 so far.
00000f20.00007dd0::2015/06/18-22:16:34.988 INFO  [Reconnector-serverMAIL04] Reconnector from epoch 1 to epoch 2 waited 04.000 so far.
00000f20.0000140c::2015/06/18-22:16:36.080 WARN  [CHANNEL fe80::a427:7751:9098:beef%17:~3343~] failure, status WSAETIMEDOUT(10060)
00000f20.0000140c::2015/06/18-22:16:36.080 INFO  [PULLER serverMAIL03] Parent stream has been closed.
00000f20.0000140c::2015/06/18-22:16:36.080 ERR   [NODE] Node 4: Connection to Node 5 is broken. Reason Closed(1236)' because of 'channel to remote endpoint fe80::a427:7751:9098:beef%17:~3343~ has failed with status WSAETIMEDOUT(10060)'
00000f20.0000140c::2015/06/18-22:16:36.080 WARN  [NODE] Node 4: Initiating reconnect with n5.
00000f20.0000140c::2015/06/18-22:16:36.080 INFO  [MQ-serverMAIL03] Pausing
00000f20.000073f0::2015/06/18-22:16:36.080 INFO  [Reconnector-serverMAIL03] Reconnector from epoch 1 to epoch 2 waited 00.000 so far.
00000f20.00001230::2015/06/18-22:16:36.548 DBG   [NETFTAPI] Signaled NetftRemoteUnreachable event, local address 192.168.15.22:3343 remote address 192.168.15.27:3343
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] got event: Remote endpoint 192.168.15.27:~3343~ unreachable from 192.168.15.22:~3343~
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Marking Route from 192.168.15.22:~3343~ to 192.168.15.27:~3343~ as down
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [NDP] Checking to see if all routes for route (virtual) local fe80::8551:33c:18d5:df80:~0~ to remote fe80::e890:12d3:4612:8310:~0~ are down
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [NDP] Route local 10.90.105.146:~3343~ to remote 10.90.105.153:~3343~ is up
00000f20.00001b48::2015/06/18-22:16:36.548 INFO  [DCM] HandleNetftRemoteRouteChange
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Route history 1: Old: 21.996, Message: Response, Route sequence: 478779, Received sequence: 478779, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:14.552, Ticks since last sending: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Route history 2: Old: 22.339, Message: Request, Route sequence: 478778, Received sequence: 478773, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:14.208, Ticks since last sending: 106
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Route history 3: Old: 23.993, Message: Response, Route sequence: 478778, Received sequence: 478778, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:12.555, Ticks since last sending: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Route history 4: Old: 24.336, Message: Request, Route sequence: 478777, Received sequence: 478772, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:12.212, Ticks since last sending: 106
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Route history 5: Old: 25.990, Message: Response, Route sequence: 478777, Received sequence: 478777, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:10.558, Ticks since last sending: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Route history 6: Old: 26.333, Message: Request, Route sequence: 478776, Received sequence: 478771, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:10.215, Ticks since last sending: 107
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Route history 7: Old: 28.002, Message: Response, Route sequence: 478776, Received sequence: 478776, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:08.546, Ticks since last sending: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Route history 8: Old: 28.345, Message: Request, Route sequence: 478775, Received sequence: 478770, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:08.202, Ticks since last sending: 106
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Route history 9: Old: 29.999, Message: Response, Route sequence: 478775, Received sequence: 478775, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:06.549, Ticks since last sending: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Route history 10: Old: 30.342, Message: Request, Route sequence: 478774, Received sequence: 478769, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:06.206, Ticks since last sending: 106
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.27:~0~, status: false, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.23:~0~, status: true, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.24:~0~, status: true, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.25:~0~, status: true, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.26:~0~, status: true, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Adding information for route Route from local 192.168.15.22:~3343~ to remote 192.168.15.21:~3343~, status: true, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.28:~0~, status: true, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.548 INFO  [IM] Sending connectivity report to leader (node 1): <class mscs::InterfaceReport>
00000f20.00001210::2015/06/18-22:16:36.548 INFO    <fromInterface>1dd99b6b-2f3a-4f0d-8a14-f936b086e80b</fromInterface>
00000f20.00001210::2015/06/18-22:16:36.548 INFO    <upInterfaces><vector len='7'>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>1dd99b6b-2f3a-4f0d-8a14-f936b086e80b</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>76579c42-8884-420f-9d9c-33bb468439f1</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>d5dbe17a-6bb9-48f4-91f1-54181d628756</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>d186a86a-257f-4ed5-94a1-42c0d0a16fef</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>2d6419a0-2259-4033-86b4-115bc77b3177</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>ca8f0d50-a700-471e-93c0-1a8ddf43bca5</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>43b1848e-a914-4c3e-b22b-a42015eae306</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO  </vector>
00000f20.00001210::2015/06/18-22:16:36.548 INFO  </upInterfaces>
00000f20.00001210::2015/06/18-22:16:36.548 INFO    <downInterfaces><vector len='1'>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>8884e678-51d7-4e0f-a226-1ba5865d345f</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO  </vector>
00000f20.00001210::2015/06/18-22:16:36.548 INFO  </downInterfaces>
00000f20.00001210::2015/06/18-22:16:36.548 INFO    <upRoutesType><vector len='6'>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO  </vector>
00000f20.00001210::2015/06/18-22:16:36.548 INFO  </upRoutesType>
00000f20.00001210::2015/06/18-22:16:36.548 INFO    <downRoutesType><vector len='1'>
00000f20.00001210::2015/06/18-22:16:36.548 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.548 INFO  </vector>
00000f20.00001210::2015/06/18-22:16:36.548 INFO  </downRoutesType>
00000f20.00001210::2015/06/18-22:16:36.548 INFO    <viewId>128401</viewId>
00000f20.00001210::2015/06/18-22:16:36.548 INFO    <localDisconnect>false</localDisconnect>
00000f20.00001210::2015/06/18-22:16:36.548 INFO  </class mscs::InterfaceReport>
00000f20.00001330::2015/06/18-22:16:36.548 INFO  [DCM] HandleRequest: dcm/netftRouteChange
00000f20.00001330::2015/06/18-22:16:36.548 INFO  [DCM] MultichannelManager::PoisonImp - Entering, local 192.168.15.22:3343, remote 192.168.15.27:3343, match source true
00000f20.00001330::2015/06/18-22:16:36.548 INFO  [DCM] MultichannelManager::PoisonImp - NsiAllocateAndGetTable returned status 0
00000f20.00001330::2015/06/18-22:16:36.548 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.548 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.548 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.548 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.548 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.548 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.548 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001230::2015/06/18-22:16:36.579 DBG   [NETFTAPI] Signaled NetftRemoteUnreachable event, local address 192.168.15.22:3343 remote address 192.168.15.23:3343
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] got event: Remote endpoint 192.168.15.23:~3343~ unreachable from 192.168.15.22:~3343~
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Marking Route from 192.168.15.22:~3343~ to 192.168.15.23:~3343~ as down
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [NDP] Checking to see if all routes for route (virtual) local fe80::8551:33c:18d5:df80:~0~ to remote fe80::a427:7751:9098:beef:~0~ are down
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [NDP] Route local 10.90.105.146:~3343~ to remote 10.90.105.147:~3343~ is up
00000f20.00001b48::2015/06/18-22:16:36.579 INFO  [DCM] HandleNetftRemoteRouteChange
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Route history 1: Old: 21.996, Message: Response, Route sequence: 478779, Received sequence: 478779, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:14.583, Ticks since last sending: 0
00000f20.00001330::2015/06/18-22:16:36.579 INFO  [DCM] HandleRequest: dcm/netftRouteChange
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Route history 2: Old: 22.043, Message: Request, Route sequence: 478778, Received sequence: 478778, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:14.536, Ticks since last sending: 125
00000f20.00001330::2015/06/18-22:16:36.579 INFO  [DCM] MultichannelManager::PoisonImp - Entering, local 192.168.15.22:3343, remote 192.168.15.23:3343, match source true
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Route history 3: Old: 23.993, Message: Response, Route sequence: 478778, Received sequence: 478778, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:12.586, Ticks since last sending: 0
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Route history 4: Old: 24.055, Message: Request, Route sequence: 478777, Received sequence: 478777, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:12.524, Ticks since last sending: 124
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Route history 5: Old: 25.990, Message: Response, Route sequence: 478777, Received sequence: 478777, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:10.589, Ticks since last sending: 0
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Route history 6: Old: 26.052, Message: Request, Route sequence: 478776, Received sequence: 478776, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:10.527, Ticks since last sending: 125
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Route history 7: Old: 28.002, Message: Response, Route sequence: 478776, Received sequence: 478776, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:08.577, Ticks since last sending: 0
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Route history 8: Old: 28.049, Message: Request, Route sequence: 478775, Received sequence: 478775, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:08.530, Ticks since last sending: 125
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Route history 9: Old: 29.999, Message: Response, Route sequence: 478775, Received sequence: 478775, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:06.580, Ticks since last sending: 0
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Route history 10: Old: 30.046, Message: Request, Route sequence: 478774, Received sequence: 478774, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:06.533, Ticks since last sending: 125
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.27:~0~, status: false, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.23:~0~, status: false, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.24:~0~, status: true, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.25:~0~, status: true, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.26:~0~, status: true, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Adding information for route Route from local 192.168.15.22:~3343~ to remote 192.168.15.21:~3343~, status: true, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Adding information for route Route from local 192.168.15.22:~0~ to remote 192.168.15.28:~0~, status: true, attributes: 0
00000f20.00001210::2015/06/18-22:16:36.579 INFO  [IM] Sending connectivity report to leader (node 1): <class mscs::InterfaceReport>
00000f20.00001210::2015/06/18-22:16:36.579 INFO    <fromInterface>1dd99b6b-2f3a-4f0d-8a14-f936b086e80b</fromInterface>
00000f20.00001210::2015/06/18-22:16:36.579 INFO    <upInterfaces><vector len='6'>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>1dd99b6b-2f3a-4f0d-8a14-f936b086e80b</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>d5dbe17a-6bb9-48f4-91f1-54181d628756</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>d186a86a-257f-4ed5-94a1-42c0d0a16fef</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>2d6419a0-2259-4033-86b4-115bc77b3177</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>ca8f0d50-a700-471e-93c0-1a8ddf43bca5</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>43b1848e-a914-4c3e-b22b-a42015eae306</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO  </vector>
00000f20.00001210::2015/06/18-22:16:36.579 INFO  </upInterfaces>
00000f20.00001210::2015/06/18-22:16:36.579 INFO    <downInterfaces><vector len='2'>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>8884e678-51d7-4e0f-a226-1ba5865d345f</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>76579c42-8884-420f-9d9c-33bb468439f1</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO  </vector>
00000f20.00001210::2015/06/18-22:16:36.579 INFO  </downInterfaces>
00000f20.00001210::2015/06/18-22:16:36.579 INFO    <upRoutesType><vector len='5'>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO  </vector>
00000f20.00001210::2015/06/18-22:16:36.579 INFO  </upRoutesType>
00000f20.00001210::2015/06/18-22:16:36.579 INFO    <downRoutesType><vector len='2'>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO      <item>1</item>
00000f20.00001210::2015/06/18-22:16:36.579 INFO  </vector>
00000f20.00001210::2015/06/18-22:16:36.579 INFO  </downRoutesType>
00000f20.00001210::2015/06/18-22:16:36.579 INFO    <viewId>128401</viewId>
00000f20.00001210::2015/06/18-22:16:36.579 INFO    <localDisconnect>false</localDisconnect>
00000f20.00001210::2015/06/18-22:16:36.579 INFO  </class mscs::InterfaceReport>
00000f20.00001330::2015/06/18-22:16:36.579 INFO  [DCM] MultichannelManager::PoisonImp - NsiAllocateAndGetTable returned status 0
00000f20.00001330::2015/06/18-22:16:36.579 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.579 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.579 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.579 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.579 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.579 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001330::2015/06/18-22:16:36.579 INFO  [DCM] Skipping client access network 7eaebfa7-61b6-4906-aa52-ec9bc2ac1495 for multichannel
00000f20.00001230::2015/06/18-22:16:36.595 DBG   [NETFTAPI] Signaled NetftRemoteUnreachable event, local address 192.168.15.22:3343 remote address 192.168.15.25:3343
00000f20.00001230::2015/06/18-22:16:36.595 DBG   [NETFTAPI] Signaled NetftRemoteUnreachable event, local address 192.168.15.22:3343 remote address 192.168.15.24:3343
00000f20.00001210::2015/06/18-22:16:36.595 INFO  [IM] got event: Remote endpoint 192.168.15.25:~3343~ unreachable from 192.168.15.22:~3343~
00000f20.00001210::2015/06/18-22:16:36.595 INFO  [IM] Marking Route from 192.168.15.22:~3343~ to 192.168.15.25:~3343~ as down
00000f20.00001210::2015/06/18-22:16:36.595 INFO  [NDP] Checking to see if all routes for route (virtual) local fe80::8551:33c:18d5:df80:~0~ to remote fe80::bd58:b930:d7da:718f:~0~ are down
00000f20.00001210::2015/06/18-22:16:36.595 INFO  [NDP] Route local 10.90.105.146:~3343~ to remote 10.90.105.149:~3343~ is up
00000f20.00001b48::2015/06/18-22:16:36.595 INFO  [DCM] HandleNetftRemoteRouteChange
00000f20.00001210::2015/06/18-22:16:36.595 INFO  [IM] Route history 1: Old: 21.996, Message: Response, Route sequence: 478779, Received sequence: 478779, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:14.598, Ticks since last sending: 0
00000f20.00001210::2015/06/18-22:16:36.595 INFO  [IM] Route history 2: Old: 22.105, Message: Request, Route sequence: 478778, Received sequence: 478778, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:14.489, Ticks since last sending: 121
00000f20.00001330::2015/06/18-22:16:36.595 INFO  [DCM] HandleRequest: dcm/netftRouteChange
00000f20.00001210::2015/06/18-22:16:36.595 INFO  [IM] Route history 3: Old: 23.993, Message: Response, Route sequence: 478778, Received sequence: 478778, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:12.602, Ticks since last sending: 0
00000f20.00001330::2015/06/18-22:16:36.595 INFO  [DCM] MultichannelManager::PoisonImp - Entering, local 192.168.15.22:3343, remote 192.168.15.25:3343, match source true
00000f20.00001210::2015/06/18-22:16:36.595 INFO  [IM] Route history 4: Old: 24.102, Message: Request, Route sequence: 478777, Received sequence: 478777, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:12.492, Ticks since last sending: 121
00000f20.00001210::2015/06/18-22:16:36.595 INFO  [IM] Route history 5: Old: 25.990, Message: Response, Route sequence: 478777, Received sequence: 478777, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:10.605, Ticks since last sending: 0
00000f20.00001210::2015/06/18-22:16:36.595 INFO  [IM] Route history 6: Old: 26.099, Message: Request, Route sequence: 478776, Received sequence: 478776, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:10.496, Ticks since last sending: 121
00000f20.00001210::2015/06/18-22:16:36.595 INFO  [IM] Route history 7: Old: 27.986, Message: Response, Route sequence: 478776, Received sequence: 478776, Heartbeats counter/threshold: 10/10, Error: Success, NtStatus: 0 Timestamp: 2015/06/18-22:16:08.608, Ticks since last sending: 0

June 19th, 2015 1:01pm

Hi Allen,

So even if the problem is only in the replication network, would the server get removed from the cluster?

We have 2 networks, one for MAPI and another for replication.

What would the process be if I need to change the IP address for the replication network? Should I just change it in the NIC properties, or do I need to run a command from the Exchange shell?

Thanks,

June 19th, 2015 7:32pm

Hi,
Sorry for the delay.
Please refer to the steps below to change the IP address on the DAG replication NICs:
1. Configure a new IP address on the replication network interface.
2. Configure the replication network for the DAG.
3. Remove the old IP address from the replication network interface.
After each step, please check that replication is working fine.
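
For the commands behind these steps, a rough sketch follows. The DAG network name, interface alias, addresses, and subnet are all made-up placeholders. I am also assuming that step 2 only needs Set-DatabaseAvailabilityGroupNetwork when the DAG uses manual network configuration; with automatic DAG network configuration (the Exchange 2013 default) a rediscovery may be enough, so please verify in a test window first.

```powershell
# Sketch only: every name, address, and subnet below is a placeholder assumption.
$dag      = 'DAG1'                              # hypothetical DAG name
$dagNet   = 'DAG1\ReplicationDagNetwork01'      # hypothetical DAG network identity
$nicAlias = 'Replication'                       # hypothetical NIC alias on the DAG member
$oldIp    = '192.0.2.22'                        # placeholder old replication address
$newIp    = '198.51.100.22'                     # placeholder new replication address

# Step 1 (Windows PowerShell on the DAG member): add the new address next to the old one.
New-NetIPAddress -InterfaceAlias $nicAlias -IPAddress $newIp -PrefixLength 24

# Step 2 (Exchange Management Shell): ask the DAG to rediscover its networks, then review them.
Set-DatabaseAvailabilityGroup -Identity $dag -DiscoverNetworks
Get-DatabaseAvailabilityGroupNetwork |
    Format-List Identity, Subnets, Interfaces, ReplicationEnabled

# Step 2, manual DAG network configuration only: point the replication network at the new subnet.
Set-DatabaseAvailabilityGroupNetwork -Identity $dagNet -Subnets '198.51.100.0/24' -ReplicationEnabled:$true

# Check replication before going further.
Test-ReplicationHealth
Get-MailboxDatabaseCopyStatus -Server $env:COMPUTERNAME

# Step 3 (Windows PowerShell on the DAG member): remove the old address.
Remove-NetIPAddress -InterfaceAlias $nicAlias -IPAddress $oldIp -Confirm:$false
```

Doing one member at a time, and re-running the two health checks after each step, keeps a bad change from affecting more than one database copy set.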

Here's a blog post about "Changing DAG & DAG members IP addresses", for your reference:
http://blogs.technet.com/b/pfemsgil/archive/2012/08/02/changing-dag-amp-dag-members-ip-addresses.aspx

June 23rd, 2015 1:26am

Hello Allen,

Thanks for your reply.

I have already read this blog and the steps, but I am still not clear on which commands I need to run for steps 2 and 3 for the replication network only.


June 23rd, 2015 2:04pm

A few things:

1. What do you mean by "server lost quorum"? If it is an 8-node DAG, you can have multiple server failures and the DAG should still work. Losing the witness server won't/shouldn't crash the cluster.

2. There is no need for a dedicated replication network in 2013; the recommendation is to have a single NIC (1 or 10 Gbps) handle both MAPI and replication traffic.
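
On point 1, a quick way to see how the cluster is currently counting votes is sketched here. It assumes the FailoverClusters module on a DAG member and uses only read-only cmdlets; the arithmetic at the end assumes the 8 nodes plus 1 witness described in this thread.

```powershell
# Read-only sketch: run on a DAG member that is currently part of the cluster.
Import-Module FailoverClusters

# Quorum model and witness resource currently in use.
Get-ClusterQuorum | Format-List Cluster, QuorumType, QuorumResource

# Per-node state and votes. NodeWeight is the configured vote;
# DynamicWeight is the vote the cluster is currently counting (dynamic quorum).
Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight -AutoSize

# With 8 node votes plus 1 witness vote there are 9 votes in total,
# so a majority is floor(9 / 2) + 1 = 5.
$totalVotes = 8 + 1
$majority   = [math]::Floor($totalVotes / 2) + 1
"Votes: $totalVotes, majority needed: $majority"
```

A single node that cannot reach the others sees only its own vote, which would be consistent with that node logging "Lost quorum" while the rest of the DAG keeps running.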

June 23rd, 2015 4:09pm

Hello Rajith,

I meant that I saw "Lost quorum" errors on the particular node that drops out of the failover cluster. The other nodes keep running, but the node that is out of the cluster shows those "Lost quorum" errors, which is why I used that term.

Losing the witness does not affect any nodes, but I have noticed that if I restart the witness server, the PAM moves to a different node at that time. I was always under the impression that in a DAG of 8 nodes, the witness server restarting or going offline should not make any difference. Again, I do not see any issues; it is just that the PAM moves to a different node when I look in the Failover Cluster console.

If we would like to remove the replication network from the DAG and use only the MAPI network, what would I need to do?

Thanks,


June 24th, 2015 12:33am

Hi Raman,

For the DAG to work properly, we need either both a replication network and a MAPI network, or a single network path that carries both. This is described in the Network Requirements section: https://technet.microsoft.com/en-us/library/dd638104(v=exchg.150).aspx#NR

Meanwhile, the PAM will move to another DAG member via failover. Please refer to "Switchovers and Failovers" for more details about the switchover process: https://technet.microsoft.com/en-us/library/dd298067(v=exchg.150).aspx
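
To check both points on a DAG member, something like the following read-only sketch may help. "DAG1" is a placeholder DAG name; the Exchange cmdlets run in the Exchange Management Shell, and Get-ClusterNetwork needs the FailoverClusters module.

```powershell
# Read-only sketch: 'DAG1' is a placeholder DAG name.

# Which member currently holds the Primary Active Manager (PAM) role, and the configured witness.
Get-DatabaseAvailabilityGroup -Identity 'DAG1' -Status |
    Format-List Name, PrimaryActiveManager, WitnessServer

# Networks as the failover cluster sees them, including the role assigned to each.
Import-Module FailoverClusters
Get-ClusterNetwork | Format-Table Name, Role, Address, AddressMask, State -AutoSize

# Networks as the DAG sees them, including which ones are enabled for replication.
Get-DatabaseAvailabilityGroupNetwork |
    Format-List Identity, Subnets, MapiAccessEnabled, ReplicationEnabled
```

Comparing the PAM holder before and after a witness restart shows whether the move you noticed lines up with that event.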

Thanks

June 27th, 2015 12:11am

This topic is archived. No further replies will be accepted.
