Global Catalog/DR Issue
Hi all
I hope you will be able to help
We are doing a DR exercise to see how, and if, we can access our applications. All our servers are VMs on VMware, so the storage team takes a snapshot of the storage and presents it at the location where we run the DR exercise. Of all the servers, they
make available 3 Exchange 2003 servers (we have 7 in production), 2 child domain DCs (we have 4 in production) and 1 root domain DC (we have 2 in production). The Exchange servers that are available are therefore only up to date to a specific point,
namely when the snapshots were taken.
I think we are having issues related to lookups against the GCs. All our child and root domain DCs are GCs (per Sites and Services). At first we couldn't look up the Member Of list of AD accounts. After some cleaning up of DNS records I can now look up Member Of.
The problem is that we cannot create mailboxes. It complains that the server is not operational, which refers to a GC. How do I get the isolated network there to use one of the DCs that are available in that site for GC lookups? It looks to me as
if GC lookup requests are going to a GC that is not available.
Is anyone able to help me? PLEASE?
April 18th, 2012 7:28am
Hi
Can you paste the output of event 2080 from one of your Exchange servers?
Cheers, Steve
April 18th, 2012 7:46am
Hi
I'm not at the site at the moment, but I have requested the event logs. The administrator there says he does not see 2080 but he does see 2081, with this description:
Process inetinfo.exe (1392) dsaccess will use this service from the following list of domain controllers <name of DC> Global Catalog <name of GC> The config DC is set to <name of DC>
All three names in angle brackets above are the same name. Is that event ID enough for you, or must I wait for the Event Viewer extract and look through it myself?
April 18th, 2012 8:13am
The 2080 event shows which DCs the Exchange server is aware of and provides some very basic availability information.
April 18th, 2012 8:22am
I don't understand what you are saying. Is the info I gave above not enough? Do you want event ID 2080 rather than 2081, or are you saying the DC/GC in my previous post is the one the Exchange server is trying to use?
April 18th, 2012 8:27am
There is an example of event 2080 on this page:
http://www.howexchangeworks.com/2010/07/exchange-2010-error-process.html
It would be helpful to have that information to help you to troubleshoot this issue.
April 18th, 2012 8:36am
OK, I will wait until I get the Event Viewer extract, see if there is a 2080 and post it.
So far he has only found event ID 2081, which I posted already.
Just remember, we have Exchange 2003 SP2 and not 2010 like in the screenshot.
April 18th, 2012 8:41am
Just received the whole log now. There is no event ID 2080, only 2081, which looks like this when I search for it:
The description for Event ID 2081 from source MSExchangeDSAccess cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
INETINFO.EXE
1392
<dc name>
<dc name>
<dc name>
April 18th, 2012 8:53am
Sorry, here is the correct description of 2081. On the Exchange server there is no event ID 2080.
Process INETINFO.EXE (PID=1392). DSAccess will use the servers from the following list:
Domain Controllers:
<dc name>
Global Catalogs:
<dc name>
The Configuration Domain Controller is set to <dc name>
April 18th, 2012 9:20am
You may need to increase the diagnostic logging level for DSAccess on your Exchange 2003 server.
In the server's properties in Exchange System Manager, click the Diagnostics Logging tab, click MSExchangeDSAccess Service in the left pane, and then click
Topology in the right pane. Set it to at least Medium.
Also, have you tried restarting your Exchange services to re-initiate the discovery?
net stop msexchangesa /y
net stop iisadmin /y
net stop winmgmt /y
James Chong MCITP | EA | EMA; MCSE | M+, S+ Security+, Project+, ITIL msexchangetips.blogspot.com
April 18th, 2012 10:27am
As James said, you may need to increase the logging level. Exchange 2003 does generate 2080 events.
April 18th, 2012 10:48am
Thanks guys. I will only be able to do this tomorrow, i.e. increase the log level and then reproduce the issue. I will post back the results.
No, I haven't restarted the services, but I have restarted the server, which I guess comes down to the same thing?
In the meantime I have voted your posts as helpful.
April 18th, 2012 11:18am
Yes, a restart is the same thing. I'm not sure exactly how you're simulating the DR, but maybe Exchange does not like something about the snapshots or the process of the DR simulation. Post the results tomorrow so we can see the reachability tests.
James Chong MCITP | EA | EMA; MCSE | M+, S+ Security+, Project+, ITIL msexchangetips.blogspot.com
April 18th, 2012 11:34am
I personally feel it is the way the DR is set up that is the problem, but how do I show or prove that? This is what we do when running a DR test:
We use VMware and EVA storage, so all our servers are storage based. At some stage the VMware/storage team takes a snapshot of the VMs and presents them at the location where we perform the DR. The servers they present are not all the servers in production,
only the critical ones. For example, we have 2 root DCs and they only present 1; we have 6 Exchange servers and they only present 3; and the same goes for the child domain DCs. So an Exchange server may have used a specific server as a GC, and when that Exchange server is started
up in the DR environment, that GC may not be one of the servers that are available. What then?
Does Exchange automatically choose an available GC to look up against, or can this be the cause of my issues?
Anyway, I hope you understand how we do our DRs. I will post the logs tomorrow.
April 18th, 2012 12:09pm
Did you try making all your DCs available in your isolated DR environment
and testing it that way?
I am not sure how your AD is laid out, but forget about Exchange for now: if you cannot bring AD up healthy you will run into many issues.
Have a look at the article below to make sure there are no surprises ahead when you snapshot DCs
and use them in a virtualized environment (USN rollback etc.). If I am not mistaken, MS does not support your scenario, rolling back to a snapshot, as far as Active Directory goes. Reverting a snapshot of a DC may leave you in a world of pain and headaches, and your
DR scenario might fail.
As good practice you should stay within supported scenarios and adjust your practices to accommodate this. You could perfectly well have a live DC in the second location (the DR location) and use SRM (Site Recovery Manager) to fail your servers over to the DR location. This
would make more sense, and it would work when a real DR happens (-:
Things to consider when you host Active Directory domain controllers in virtual hosting environments
http://support.microsoft.com/kb/888794
Good luck
ocd
Oz Casey, Dedeal MCITP (EMA), MCITP (EA), MCITP (SA)
Visit smtp25.blogspot.com Visit Telnet25.wordpress.com
This posting is provided AS-IS with no warranties or guarantees and confers no rights.
April 18th, 2012 12:12pm
On the question of making all DCs available, the answer is no; at this stage they won't make everything available. I will need to do that as a separate test to see if the results are different.
I'm starting to realise that it may not be a supported scenario, but then I need to know what is, take that to the business and see what they say. Live DC, hmmmm: yes, we have a live DC at the DR site, but even that one is snapshotted and then made available
in the DR location, which you say is not supported. Let me get the logs and post them.
April 19th, 2012 2:40am
OK, I set the logging level to Maximum and tried to create a mailbox again. As expected, I got the "server is not operational" error. The only errors in the event log are 2107 and 2123.
2107
Process MAD.EXE (PID=2532). DSAccess failed to obtain an IP address for DS server <root domain DC which is available> , error 11004. This host will not be used as a DS server by DSAccess.
2123
Process MAD.EXE (PID=2532). DSAccess is unable to connect to the Domain Controller <Root DC which is NOT available> although its service location (SRV) resource record was found in the DNS
The query was for the SRV record for _ldap._tcp.dc._msdcs.<root domain>
The following domain controllers were identified by the query:
<root DC available>
<root DC NOT available>
Common causes of this error include:
- Host (A) records that map the name of the domain controller to its IP addresses are missing or contain incorrect addresses.
- Domain controllers registered in DNS are not connected to the network or are not running.
For information about correcting this problem, type in the command line:
hh tcpip.chm::/sag_DNS_tro_dcLocator_messageHa.htm
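Error 11004 in event 2107 is the Winsock code WSANO_DATA, meaning the name lookup returned no address record, and event 2123 lists a missing or wrong host (A) record as a common cause. A minimal Python sketch of the same two checks, run from the Exchange server's point of view, might look like this. The function name check_dc and its parameters are made up for illustration; the sketch only tests name resolution and TCP reachability of the LDAP (389) and global catalog (3268) ports, and does not reproduce what DSAccess does internally.

```python
import socket

LDAP_PORT = 389   # LDAP port used when talking to a domain controller
GC_PORT = 3268    # LDAP port used when talking to a global catalog


def check_dc(host, ports=(LDAP_PORT, GC_PORT), timeout=3.0,
             resolve=socket.gethostbyname, connect=socket.create_connection):
    """Report whether host resolves to an address and which ports accept TCP.

    resolve and connect are injectable so the function can be exercised
    without touching a real network.
    """
    result = {"host": host, "address": None, "open_ports": [], "error": None}
    try:
        result["address"] = resolve(host)
    except OSError as exc:  # socket.gaierror and socket.herror are OSErrors
        result["error"] = f"name resolution failed: {exc}"
        return result
    for port in ports:
        try:
            conn = connect((result["address"], port), timeout=timeout)
        except OSError:
            continue  # closed, filtered or timed out
        conn.close()
        result["open_ports"].append(port)
    return result
```

Calling check_dc with the name of each DC that the SRV query returned should show quickly whether the available root DC fails at the name resolution step (matching 2107) or whether a DC resolves but does not answer on 389 or 3268 (matching 2123).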
April 19th, 2012 3:37am
I tried to create another mailbox, asked for more logs and got event IDs
2085, 2084 and 2080.
2085
Process EMSMTA.EXE (PID=3556). No Global Catalog server is up in the local site 'TS'. DSAccess will use the following out of site Global Catalog servers:
<DRDC>
2084
Process STORE.EXE (PID=1672). No Domain Controller server is up in the local site 'TS'. DSAccess will use the following out of site Domain Controller servers:
<DRDC>
2080
Process MAD.EXE (PID=2532). DSAccess has discovered the following servers with the following characteristics:
(Server name | Roles | Reachability | Synchronized | GC capable | PDC | SACL right | Critical Data | Netlogon | OS Version)
In-site:
<01 root DC NOT available> CD- 0 0 0 0 0 0 0 0
<03 DHCP DC available> CDG 7 7 1 0 1 1 0 1
<02 Child DC NOT available> CDG 0 0 1 0 0 0 0 0
<01 Child DC available> CDG 7 7 1 0 1 1 0 1
Out-of-site:
<DR DC> CDG 7 7 1 0 1 1 7 1
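For anyone reading the 2080 output above, here is a rough Python sketch that splits each server row into the columns named in the event header. It assumes, based on my understanding of Microsoft's description of this event, that Reachability, Synchronized and Netlogon are bit flags where 1 = global catalog, 2 = domain controller and 4 = configuration domain controller (so 7 means all three checks passed) and that the remaining columns are plain 0/1 flags. The names parse_row, decode_bits and diagnose are made up for this example.

```python
import re
from dataclasses import dataclass

# Bit flags shared by the Reachability, Synchronized and Netlogon columns
# (assumed meaning: 1 = GC, 2 = DC, 4 = configuration DC).
ROLE_BITS = {1: "GC", 2: "DC", 4: "Config DC"}

# Server name, three role letters (C/D/G or "-"), then eight numeric columns.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s+(?P<roles>[CDG-]{3})\s+(?P<nums>\d+(?:\s+\d+){7})\s*$"
)


@dataclass
class DsAccessRow:
    name: str
    roles: str
    reachability: int
    synchronized: int
    gc_capable: int
    pdc: int
    sacl_right: int
    critical_data: int
    netlogon: int
    os_version: int


def parse_row(line):
    """Parse one server row from event 2080, or return None if it is not a row."""
    match = ROW.match(line)
    if match is None:
        return None
    numbers = [int(n) for n in match.group("nums").split()]
    return DsAccessRow(match.group("name"), match.group("roles"), *numbers)


def decode_bits(value):
    """Translate a bit-flag column (for example 7) into the roles it covers."""
    return [label for bit, label in ROLE_BITS.items() if value & bit]


def diagnose(row):
    """Return a one-line summary of the most obvious problem for a row."""
    if row.reachability == 0:
        return f"{row.name}: not reachable on any LDAP port"
    if row.netlogon == 0:
        return f"{row.name}: reachable, but the Netlogon check failed"
    return f"{row.name}: looks usable"
```

Run against the rows above, this flags the two DCs marked as not available as unreachable, and the two in-site DCs that are available as reachable but failing the Netlogon check. Only the out-of-site DR DC passes every check.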
April 19th, 2012 4:19am
Hi
Both of the servers that you say are available have a 0 for Netlogon, which means that DSAccess could not connect to the Netlogon service on those DCs. Clearly your AD is not working in this DR site and you should follow Oz's advice above.
Cheers, Steve
April 19th, 2012 4:31am
Hi Steve
Thanks for your response and willingness to help, but your last post tells me nothing new. I know there is a problem, but why? What may be causing the issues?
What does it mean when the DCs that are available have a 0 for Netlogon?
Based on other things like password resets and changing users' group memberships, AD is working, so I need to know more about what you mean when you say it is not working.
I have also done a quick search on a 0 for Netlogon on DCs, and it mostly refers to Netlogon shares that are missing on DCs, but on the DCs that are available in the site the Netlogon share is present. Just for interest's sake.
April 19th, 2012 4:51am
You need to look at your domain controllers to find out why they are not accepting logons from the Exchange servers. I imagine that their event logs will have a few errors in them. Restart the Netlogon service on one of the DCs and check
which messages it logs. If you have reverted your DCs to a snapshot then they will not be functioning properly, and you will see error 2095 in their logs.
April 19th, 2012 5:05am
Thanks, I will check the logs and report back, and yes, we do work from snapshots. It's strange that you say they won't work when there are so many things that do work.
Have you experienced any specific things that do not work, or do you think it is replication in general?
April 19th, 2012 6:24am
The error I mentioned above (2095) is a USN Rollback error. This will prevent replication as the Update Sequence Numbers (USNs) will not correlate between your DCs. If you haven't seen it already this article discusses USN Rollbacks:
http://support.microsoft.com/kb/875495
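To illustrate why a reverted snapshot breaks replication, here is a toy Python model of the USN bookkeeping. It is only a sketch under simplifying assumptions and not how Active Directory is implemented: the class names, the single integer USN per DC and the "high watermark" dictionary are all invented for the example. The key assumption is that a hypervisor-level revert puts the database back in time without changing the identifier (invocation ID) that replication partners use to track that DC.

```python
class DomainController:
    """Toy DC: every local write gets the next update sequence number (USN)."""

    def __init__(self, name):
        self.name = name
        self.invocation_id = f"{name}-inv-1"  # identifies this database copy
        self.usn = 0                          # highest USN issued locally
        self.changes = {}                     # usn -> description of the change

    def write(self, description):
        self.usn += 1
        self.changes[self.usn] = description
        return self.usn

    def snapshot(self):
        return (self.invocation_id, self.usn, dict(self.changes))

    def revert(self, snap):
        # Hypervisor-style revert: the database goes back in time, but the
        # invocation ID stays the same, so partners cannot tell it happened.
        self.invocation_id, self.usn, self.changes = snap[0], snap[1], dict(snap[2])


class Partner:
    """Toy replication partner that remembers the highest USN it has received."""

    def __init__(self):
        self.high_watermark = {}  # invocation_id -> highest USN already pulled
        self.received = []

    def pull_from(self, dc):
        seen = self.high_watermark.get(dc.invocation_id, 0)
        if dc.usn < seen:
            # The source claims fewer changes than we already hold from it.
            raise RuntimeError(
                f"USN rollback on {dc.name}: USN {dc.usn} < watermark {seen}"
            )
        for usn in range(seen + 1, dc.usn + 1):
            self.received.append(dc.changes[usn])
        self.high_watermark[dc.invocation_id] = dc.usn
```

In this model, if the DC writes A and B, is snapshotted, writes C, replicates, and is then reverted, the partner's next pull fails because the DC's USN (2) is below the watermark (3). If the reverted DC then writes D and E before anyone notices, the partner resumes from USN 4 and receives E but never D, which is the kind of silent divergence that makes reverted DCs dangerous.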
I have seen a domain before where they had reverted the PDC/Schema master to a snapshot and Exchange stopped working.
If you have a live DC that has all the FSMO roles then shut it down and take a copy of the VHD/VMDK (or P2V it if it is physical) and use that in your DR site. If you were planning for DR you will need this type of backup anyway.
April 19th, 2012 6:35am
So, what would you say is the proper way for us to do this exercise?
In our environment we have 3 or 4 DCs and the roles are distributed across them. Must all the roles be transferred to one DC, that DC be shut down, a snapshot taken, and then the snapshot be taken to the DR site to be started up?
Must the same be done for the root domain?
April 23rd, 2012 7:04am
Hi,
Microsoft has written a related KB article, which I recommend you check:
How to detect and recover from a USN rollback in Windows Server 2003, Windows Server 2008, and Windows Server 2008 R2
http://support.microsoft.com/kb/875495
Besides, I recommend that you post the issue in the Windows forum and fix the Active Directory issue first, since Exchange depends heavily on Active Directory.
Xiu Zhang
TechNet Community Support
April 24th, 2012 3:28am