Availability in a 2-DAG scenario

Hi dear Exchange community members !

I'm working on a 2-DAG active/active scenario where both DAGs are stretched between 2 datacenters, using one namespace for both datacenters. DNS round robin is used to provide load balancing and availability.

I understand that if the datacenter link fails, every active mailbox database remains active in its own datacenter (this is the reason to use 2 DAGs, each having its active databases in a different datacenter).

However, if the link fails, what is the behaviour of the Outlook or OWA client?
If the client connects to the datacenter where the active mailbox is, no problem: the client gets access to its mailbox.
But if the client connects to one datacenter and the mailbox is in the other one (which is not reachable because the link between the datacenters has failed), what happens? The CAS is unable to proxy to the Mailbox server, right?

Thank you for your help!

Wish you a nice day !

May 7th, 2015 1:02pm

Hi,

According to your description, I understand you want to know the behavior of Outlook and OWA clients when a DAG fails over.
If I misunderstand your concern, please do not hesitate to let me know.

If the server that owns the cluster quorum resource fails, the PAM role automatically moves to a surviving server, which takes ownership of the cluster quorum resource.
As you mentioned above, this failover applies only to the database, not to the CAS server. As far as I know, the automatic redirection behavior when a single database fails over out of site was improved in Exchange 2010 Service Pack 2. Please apply SP2 if you haven't yet. For your reference: https://technet.microsoft.com/en-us/library/bb310763(v=exchg.141).aspx
Because all mailbox-related MAPI client connectivity goes through the RPC Client Access service on the Client Access server role, please upgrade to Exchange 2010 SP2 RU3 so that the system can change the RPCClientAccessServer value automatically. For more details, please refer to: http://blogs.technet.com/b/exchange/archive/2012/05/30/rpc-client-access-cross-site-connectivity-changes.aspx

Additionally, here's a similar thread about a DAG quorum scenario, for your reference:
https://social.technet.microsoft.com/Forums/office/en-US/b924ae58-e09f-4397-8324-89bcc3d75a3e/dag-quorum-scenario-query?forum=exchangesvravailabilityandisasterrecovery

Thanks

May 7th, 2015 11:22pm

Hello Allen, and thank you for your answer.

Unfortunately, my question is about Exchange 2013 not 2010.

What I would like to understand is the behaviour of clients when the link between the datacenters goes down in a 2-DAG active/active configuration.

Regards.

May 8th, 2015 2:50am

Hi,

Sorry for my misunderstanding. Please post more details about your environment.
If you have a 2-node DAG and a WAN outage, we need at least 2 votes to maintain quorum (using the formula V/2 + 1, where V is the number of voters and V/2 is rounded down), so one DAG member will retain quorum and the other DAG member will lose it.
For more details, see "Exchange 2010 High Availability Misconceptions Addressed", for your reference: http://blogs.technet.com/b/exchange/archive/2011/05/31/exchange-2010-high-availability-misconceptions-addressed.aspx
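
To make the quorum arithmetic concrete, here is a minimal Python sketch. It assumes a 2-node DAG plus a witness server (3 voters in total); the witness and the function names are my assumptions for illustration, not details of your environment.

```python
def votes_needed(voters: int) -> int:
    """Majority quorum: V/2 rounded down, plus 1."""
    return voters // 2 + 1


def has_quorum(reachable_votes: int, voters: int) -> bool:
    """True if the votes one side can still see form a majority."""
    return reachable_votes >= votes_needed(voters)


# Assumed layout: 2 DAG members + 1 witness = 3 voters.
VOTERS = 3

# During a WAN outage, the member that still reaches the witness sees 2 votes.
side_with_witness = has_quorum(2, VOTERS)

# The other member sees only its own vote.
side_without_witness = has_quorum(1, VOTERS)
```

With these assumptions `votes_needed(3)` is 2, so the member that can still reach the witness keeps quorum and the isolated member loses it.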

An OWA client will be disconnected and will need to log in again, after which it is proxied or redirected to an available server and connects to the database.
An Outlook client will lose its connection for a while, then reconnect to a new CAS server and database using the new RPCClientAccessServer value.

Thanks

May 10th, 2015 9:47pm

Thanks Allen for your feedback.

The link you sent in your reply is about Exchange 2010 and not 2013.
In my setup with 2 datacenters connected via high-speed links, I suppose that if the link goes down, each datacenter will remain active because there are 2 DAGs here in active/active.

But with round robin: what happens if an OWA/OA client lands in a datacenter whose Exchange CAS server accepts the connection, but the mailbox is in the other datacenter (and not reachable because the inter-datacenter link is down)?

May 11th, 2015 7:26am

Hi Jmo91,

It will be easier if you let us know more about the DAG setup.

Do you have passive database copies across sites? Where do you have the witness servers? Have you considered the split-brain scenario? How are you ensuring that you maintain quorum when the WAN link is down? (See Allen's post on this.) Based on your input we can talk more.

For now, based on the info you provided, I would assume both sites have Internet connectivity in addition to the inter-site WAN link.

Now, if the WAN link goes down, you are partially correct: a request that public DNS sends to the CAS in the wrong site (the one that doesn't host the user's active mailbox) cannot be proxied to the Mailbox server in the other site, so it would fail initially.

But Outlook and other modern clients can hold multiple IP addresses for a name. If one fails, the client automatically tries the next IP after about 20 seconds, which would succeed in your case.

Hence, hopefully your WAN failure would not affect much, other than inter-site mail flow and DAG replication, if any. Similarly, if your WAN is up and one of the Internet links fails, the same mechanism would keep your site alive.

"One of the changes in Exchange 2013 is to enable clients to have more than one place to go. Assuming the client has the ability to use more than one place to go (almost all the client access protocols in Exchange 2013 are HTTP based (examples include Outlook, Outlook Anywhere, EAS, EWS, OWA, and EAC), and all supported HTTP clients have the ability to use multiple IP addresses), thereby providing failover on the client side. You can configure DNS to hand multiple IP addresses to a client during name resolution. The client asks for mail.contoso.com and gets back two IP addresses, or four IP addresses, for example. However many IP addresses the client gets back will be used reliably by the client. This makes the client a lot better off because if one of the IP addresses fails, the client has one or more other IP addresses to try to connect to. If a client tries one and it fails, it waits about 20 seconds and then tries the next one in the list. Thus, if you lose the VIP for the Client Access server array, recovery for the clients happens automatically, and in about 21 seconds."
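
The client-side failover described in the quote above can be sketched roughly as follows. This is only an illustration in Python, not how Outlook or any browser is actually implemented: the function name, the injectable `resolver`/`connector` parameters, and the 20-second timeout are my assumptions. Note also that it covers only connection-level failures (an IP that does not answer); it says nothing about what a client does when a CAS accepts the connection but cannot proxy to the Mailbox server.

```python
import socket


def connect_first_available(host, port=443, timeout=20.0,
                            resolver=socket.getaddrinfo,
                            connector=socket.create_connection):
    """Try every address DNS returns for host, in order, until one connects.

    resolver and connector default to the real socket functions and are
    parameters only so the logic can be exercised without a network.
    """
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    failures = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        address = sockaddr[:2]  # (ip, port) for both IPv4 and IPv6 entries
        try:
            # A dead IP costs up to `timeout` seconds before the next is tried.
            return connector(address, timeout=timeout)
        except OSError as exc:
            failures.append((address[0], exc))
    raise ConnectionError(
        f"no address for {host} accepted a connection: {failures}"
    )
```

Called as `connect_first_available("mail.contoso.com")`, it would walk through the addresses round robin DNS hands back and return a socket for the first one that answers.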

References:

NameSpace Planning:

http://blogs.technet.com/b/exchange/archive/2014/02/28/namespace-planning-in-exchange-2013.aspx

Exchange 2013 Client Access Server Role:

http://blogs.technet.com/b/exchange/archive/2013/01/25/exchange-2013-client-access-server-role.aspx

May 11th, 2015 8:06am

Can I ask why you are using 2 DAGs and not one stretched across the datacenters?
May 11th, 2015 8:08am

