Good day!
We are facing a very troublesome problem with Exchange 2013 on VMware.
We have two Exchange 2013 servers that hold all roles, plus a third server for the witness share. We also have two Exchange 2010 Edge servers that reside in the DMZ.
All servers are running Windows Server 2012 R2 Standard.
VMware ESXi 5.5 Update 1 is running on two hosts, with the following VMs:
EXCH-01
EXCH-02
EDGE-01
EDGE-02
DAGFS
DC-01
DC-02
The servers are split evenly across the two ESXi hosts:
Host1 : DC-01 / EXCH-01 / EDGE-01 / DAGFS
Host2 : DC-02 / EXCH-02 / EDGE-02
Coming to the issue: one of the Exchange servers leaves the DAG automatically after subsequent packet drops of Replication / Production Network validation.
The network adapter type is VMXNET3 for all VMs in VMware.
Currently one of the Exchange servers shows as Down in Failover Cluster Manager, but other services on the same server, such as MAPI / IMAP / POP3, are still running.
We also changed the file share witness server after we got the event in the failover cluster, but the event is still being generated frequently.
A sheet of the events is attached.
Troubleshooting actions taken:
Changed the network adapter on all servers from E1000a to VMXNET3
Changed the DAG file share witness server
Removed one node from the DAG and re-joined it
Help from Microsoft on this would be very much appreciated.
What OS are you running Exchange on? When this happens are you running backups?
Hi kesa,
Thank you for your question.
As I understand it, this issue is caused by a network problem. Please contact your network administrator to make sure network connectivity is healthy.
Note: check that the heartbeats work correctly and that the network is stable.
Did you install both NLB and DAG?
NLB and DAG cannot both be configured on an all-in-one server.
Did you configure a cluster IP address, and does this IP conflict with another address?
If you have any questions regarding this issue, please feel free to let me know.
Best regards,
Hi All,
Good day!
Thanks first of all for your quick response.
The answers are below:
Are you sure the virtual networks are set up correctly?
--Yes, we have 4 physical NICs in ESXi [2 for the Exchange production network in high-availability mode, 1 dedicated to DAG replication, 1 for the DMZ Edge servers]
Can you ping each Exchange server from the other Exchange server using the IP of the replication network?
--Yes, all Exchange servers can ping each other without problems
What OS are you running Exchange on?
--All servers run Windows Server 2012 R2 DC
When this happens are you running backups?
--No issues while taking backups
Any issue in the Validate a Configuration wizard of Failover Clustering?
--Yes, we got a Network warning on Validate Multiple Subnet Properties
Message is : The HostRecordTTL property for network name 'Name: DAG' is set to 300 ( 5 minutes). For local clusters the suggested value is 1200 (20 minutes).
Check that the heartbeats work correctly and that the network is stable.
-- When we ping the heartbeat network continuously it is fine, but the Windows failover cluster event log shows: Cluster has missed two consecutive heartbeats for the local endpoint 172.19.100.4:3343 connected to remote endpoint 172.19.100.3:3343.
Did you install both NLB and DAG?
--Only DAG is configured; there is no NLB. We use DNS round robin for client access instead.
NLB and DAG cannot both be configured on an all-in-one server.
--Yes, that is true, but it does not apply in our case
Did you configure a cluster IP address, and does this IP conflict with another address?
-- Yes, we got an IP conflict message for this long ago (4 months back), but since then we have not seen it at all in the failover cluster events
Thanks in advance for a solution.
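To see which network link the missed heartbeats belong to, the event text can be tallied per endpoint pair. The following is a minimal Python sketch, assuming the events have been exported as plain text lines worded like the message quoted above; the function names are hypothetical and other cluster events may be worded differently.

```python
import re

# Pattern for the event text quoted above. Illustrative only: the wording
# of other cluster events may differ.
HEARTBEAT_RE = re.compile(
    r"missed two consecutive heartbeats for the local endpoint "
    r"(?P<local>\d+\.\d+\.\d+\.\d+):(?P<lport>\d+) "
    r"connected to remote endpoint "
    r"(?P<remote>\d+\.\d+\.\d+\.\d+):(?P<rport>\d+)"
)

def parse_missed_heartbeat(message):
    """Return (local_ip, local_port, remote_ip, remote_port) or None."""
    match = HEARTBEAT_RE.search(message)
    if match is None:
        return None
    return (match["local"], int(match["lport"]),
            match["remote"], int(match["rport"]))

def count_by_link(messages):
    """Count missed-heartbeat events per (local_ip, remote_ip) pair."""
    counts = {}
    for message in messages:
        parsed = parse_missed_heartbeat(message)
        if parsed is None:
            continue  # not a missed-heartbeat event
        link = (parsed[0], parsed[2])
        counts[link] = counts.get(link, 0) + 1
    return counts
```

If most of the counts fall on one address pair, that would point at one specific network (replication or production) rather than at the cluster as a whole.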
Hi kesa,
Has the issue been solved?
Could you give us more details about Exchange 2013 being removed automatically from the DAG?
Best regards,
Jim
Hi Jim,
Thanks for your reply.
Actually the second node has not been removed, but it shows as Down in the DAG as well as in the Failover Cluster Manager console.
This happened previously as well, but the node would come back up after some time, once the heartbeat or production network handshake over UDP port 3343 took place. We found this in the failover cluster events in the application log.
Now the situation is much worse: UDP communication is not happening, node 2 has been shown as Down for many days, and the server is also rebooting every day.
Thanks in advance.
Hi ,
There is no firewall between the servers; they are in the same data center location.
Thanks.
Hi kesa,
Please refer to the following link and check whether the issue persists:
https://technet.microsoft.com/en-us/library/cc773498%28v=ws.10%29.aspx
If you have any questions regarding this issue, please feel free to let me know.
Best regards,
Jim
Hi Jim,
Thanks for your post.
Can we reconfigure the quorum of the failover cluster if the cluster is managed by the Exchange DAG?
Please confirm.
Thank you.
Hi kesa,
Yes, you can do that.
You could move the FSW to another server to check whether the issue persists.
In addition, was the link I supplied helpful?
If you have any questions regarding this issue, please feel free to let me know.
Best regards,
Jim
Hi Jim,
We have already moved the FSW to another server, but we still get the same event afterwards; nothing has changed.
Please provide some other links to possible solutions.
Thanks.
Hi kesa,
Does Exchange really go down? Or does it just show the service as down in Failover Cluster Manager? Is it Exchange 01 or Exchange 02? Or just a random Exchange server?
Please tell us more about the "subsequent packet drops of Replication / Production Network validation".
You wrote "but other services are running on same server like MAPI / IMAP / POP3". Do you mean the Exchange server is in fact online?
If you have any questions regarding this issue, please feel free to let me know.
Best regards,
Jim
Hi Jim,
Thanks for your reply!
Does Exchange really go down?
--No, but it reboots every day at certain times
Does it just show the service as down in Failover Cluster Manager?
--Yes
Exchange 01 or Exchange 02?
--Only Exchange 02
Just a random Exchange server?
--No, not random
Please tell us more about the "subsequent packet drops of Replication / Production Network validation".
--Packets get dropped on both the Production and Replication adapters
You wrote "but other services are running on same server like MAPI / IMAP / POP3". Do you mean the Exchange server is in fact online?
--Yes, Exchange is online.
Thanks in advance.
Hi All,
Is the information below of any help? The details were collected from the Cluster\report folder.
File Name: ValidateStorage.log
m_FindFileOnSmbShare: EXIT: hr 0x80070035
CprepConnectToNewSmbShares3: ERROR: Failed calling FindFirstFile on share
NetFt has 0 existing routes
thanks
Hi kesa,
Let's do some basic checks on Exchange 02:
1. All the drivers are up to date
2. Servers are properly patched
3. Necessary exceptions are made in antivirus for the mailbox server
If you have any questions regarding this issue, please feel free to let me know.
Best regards,
Jim
Hi Jim,
Thanks for your reply.
Please find the answers below:
1. All the drivers are up to date
---All drivers are up to date
2. Servers are properly patched
---All patches are installed
3. Necessary exceptions are made in antivirus for the mailbox server
---There is no antivirus software installed on either server
FYI, there is a ValidateStorage.log available in the cluster\report folder on both Exchange servers.
In it we can see some error lines:
0000827c.00000794::2015/07/07-04:14:20.947 m_FindFileOnSmbShare: ERROR: Failed to open file enum for {\\[IPV6 address available here%14]\ClusterTestShare_{af797d7e-eb28-4da6-bb3f-3aa96bdef18d}\*}, error=80070035
0000827c.00000794::2015/07/07-04:14:20.949 m_FindFileOnSmbShare: EXIT: hr 0x80070035
0000827c.00000794::2015/07/07-04:14:20.951 CprepConnectToNewSmbShares3: ERROR: Failed calling FindFirstFile on share {\\[IPV6 address available here]}, adjusted pathname {\\[fe80::54b6:c3c3:eec8:a6ab%14]\ClusterTestShare_{af797d7e-eb28-4da6-bb3f-3aa96bdef18d}}, error=80070035.
0000827c.00000794::2015/07/07-04:14:20.953 m_FindFileOnSmbShare: ENTER
0000827c.00000794::2015/07/07-04:14:51.989 m_FindFileOnSmbShare: ERROR: Failed to open file enum for {\\169.254.2.206\ClusterTestShare_{af797d7e-eb28-4da6-bb3f-3aa96bdef18d}\*}, error=80070035
0000827c.00000794::2015/07/07-04:14:51.992 m_FindFileOnSmbShare: EXIT: hr 0x80070035
0000827c.00000794::2015/07/07-04:14:51.994 CprepConnectToNewSmbShares3: ERROR: Failed calling FindFirstFile on share {\\169.254.2.206}, adjusted pathname {\\169.254.2.206\ClusterTestShare_{af797d7e-eb28-4da6-bb3f-3aa96bdef18d}}, error=80070035.
0000827c.00000794::2015/07/07-04:14:51.996 m_FindFileOnSmbShare: ENTER
0000827c.00000794::2015/07/07-04:15:23.037 m_FindFileOnSmbShare: ERROR: Failed to open file enum for {\\169.254.166.171\ClusterTestShare_{af797d7e-eb28-4da6-bb3f-3aa96bdef18d}\*}, error=80070035
0000827c.00000794::2015/07/07-04:15:23.040 m_FindFileOnSmbShare: EXIT: hr 0x80070035
0000827c.00000794::2015/07/07-04:15:23.042 CprepConnectToNewSmbShares3: ERROR: Failed calling FindFirstFile on share {\\169.254.166.171}, adjusted pathname {\\169.254.166.171\ClusterTestShare_{af797d7e-eb28-4da6-bb3f-3aa96bdef18d}}, error=80070035.
0000827c.00000794::2015/07/07-04:15:23.044 CprepConnectToNewSmbShares3: EXIT: hr 0x80070035
0000827c.00000794::2015/07/07-04:15:23.052 CprepCreateNewSmbShares3 ENTER
If possible, please clarify why some of the tests run against different IP addresses that do not exist in our data center.
Thanks in advance.
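On the question of the unfamiliar addresses: 169.254.x.x and fe80:: addresses are link-local, meaning the machine assigns them to itself rather than taking them from the data center's own ranges. One possible source on cluster nodes is the Failover Cluster Virtual Adapter, though the thread does not confirm that. A minimal Python sketch that classifies the addresses from the log next to one of the heartbeat addresses (the `classify` helper is hypothetical, for illustration only):

```python
import ipaddress

def classify(address):
    """Label an IPv4/IPv6 address as link-local, private, or other."""
    # Drop an IPv6 zone index such as "%14" before parsing.
    bare = address.split("%", 1)[0]
    ip = ipaddress.ip_address(bare)
    if ip.is_link_local:   # 169.254.0.0/16 or fe80::/10
        return "link-local"
    if ip.is_private:      # e.g. 10/8, 172.16/12, 192.168/16
        return "private"
    return "other"

for addr in ("169.254.2.206",
             "169.254.166.171",
             "fe80::54b6:c3c3:eec8:a6ab%14",
             "172.19.100.4"):
    print(addr, "->", classify(addr))
```

The three addresses from ValidateStorage.log come out as link-local, while the heartbeat address 172.19.100.4 is an ordinary private address.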
Hi Andy David,
Good day!
Sorry for the delayed reply.
Following your link, we checked our setup and changed the settings below:
SameSubnetThreshold = 10 (from 5)
RouteHistoryLength = 20
But this also did not solve our issue.
Is there any other link we can use to check the issue?
Thanks.
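For context on the SameSubnetThreshold change: a node on the same subnet is treated as down once SameSubnetThreshold consecutive heartbeats, sent every SameSubnetDelay milliseconds, have been missed. The arithmetic can be sketched in Python as follows, assuming SameSubnetDelay is still at 1000 ms (this value is not stated in the thread, and the helper name is hypothetical):

```python
def heartbeat_tolerance_seconds(delay_ms, threshold):
    """Seconds of missed heartbeats tolerated before a node is marked down."""
    return delay_ms * threshold / 1000.0

# Assumption: SameSubnetDelay is 1000 ms; the thread does not state it.
ASSUMED_SAME_SUBNET_DELAY_MS = 1000

before = heartbeat_tolerance_seconds(ASSUMED_SAME_SUBNET_DELAY_MS, 5)
after = heartbeat_tolerance_seconds(ASSUMED_SAME_SUBNET_DELAY_MS, 10)
print(before, after)
```

Under that assumption the change widens the tolerance from about 5 seconds to about 10 seconds, so an outage that still takes the node down would have to last longer than that.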