Failover event: Expected user experience

Hi Guys,

I have 4 Exchange 2013 servers, 2 per site, with the CAS and MBX roles installed on all of them. Single DAG, 10 databases. In front of the servers I have an ARR server for reverse proxy and load balancing (actually I have 4, with 2 clustered per site) and a shared namespace. Clients are Outlook 2013.

We recently had a failover event where an administrator incorrectly tagged the port of one of the Exchange servers.

The databases on the server were dismounted and mounted on other nodes within 9 seconds - happy with that, all good.

However, the end-user experience was very mixed, with some clients barely noticing an impact and others reporting up to 15 minutes of disconnection before reconnecting...

Can anyone help me work out why it took some clients so long to reconnect? ARR notices via health checks that a server is offline within 2 minutes, so I'd expect up to 5 minutes of downtime for clients whilst ARR drops the server and clients establish a new connection - but 15 minutes is far longer than I would expect.
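To make those numbers concrete, here is a rough sketch of the timing budget I have in mind. The 2-minute detection figure, the 5-minute expectation and the 15-minute worst case are the ones quoted above; the class and field names are purely illustrative, and it assumes ARR detection and client reconnection happen one after the other rather than overlapping.

```python
from dataclasses import dataclass


@dataclass
class FailoverBudget:
    """Illustrative timing budget for a client reconnect after a server loss."""

    arr_detection_s: int = 120   # ARR health checks mark a server offline within 2 min
    expected_total_s: int = 300  # expectation: up to 5 min of client downtime

    def reconnect_allowance_s(self) -> int:
        # Time left for clients to establish a new connection once ARR
        # has dropped the failed server.
        return self.expected_total_s - self.arr_detection_s

    def unexplained_s(self, observed_s: int) -> int:
        # Portion of an observed outage that the budget does not account for.
        return max(0, observed_s - self.expected_total_s)


budget = FailoverBudget()
worst_observed_s = 15 * 60  # the slowest clients reported about 15 minutes

print("Reconnect allowance (s):", budget.reconnect_allowance_s())
print("Unexplained delay (s):", budget.unexplained_s(worst_observed_s))
```

On those assumptions, roughly 10 minutes of the worst-case outage is unaccounted for, which is the gap I'm trying to explain.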

Any ideas or suggestions on how I can investigate further would be great. I'm assuming it's related more to the CAS role than to the databases, given how quickly they mounted...

Thanks - Steve

April 17th, 2015 4:06am

