Exchange 2007 CCR missing email
I have an Exchange 2007 CCR cluster. The roles break down like this: 1 Edge, 2 HT/CAS, and 2 MBX. I think the active node went down last night due to updates. When it came back up, the cluster didn't come back online. When I first noticed this, I looked at the Event Viewer and saw the events below.

My problem is that I now have users who are missing email from yesterday. I attempted to run Update-StorageGroupCopy from the passive node, but it failed with this error: Log files already exist at 'location' they must be removed before storage group seeding or reseeding can be performed. It suggested using the -DeleteExistingFiles switch, but when I attempted that, it said I had to use Restore-StorageGroupCopy. When I attempted Restore-StorageGroupCopy, it errored, saying the databases had to be mounted (even though that command is supposed to flag the databases to be mounted). The reason noted was "this copy is the last mounted copy of the database in this storage group, to make this copy available use mount-database or move the clustered mailbox server to exmbx01." I think I then ran Restore-StorageGroupCopy on the current active node, mounted the databases on the passive node, and ran Update-StorageGroupCopy. The update failed, saying it already had log files and I had to add the -DeleteExistingFiles switch. I did, and the passive node came back up and the CCR shows as healthy, but I have data loss!

Do I have any way of getting back the mail missing from the users' mailboxes? Why did this happen when only 1 of my MBX roles failed? What am I doing wrong?

Log Name: Application
Source: MSExchangeRepl
Date: 10/15/2009 5:27:30 AM
Event ID: 2092
Task Category: Action
Level: Warning
Keywords: Classic
User: N/A
Computer: EXMBX01.aph.local
Description:
Clustered Mailbox Server: MBX
Physical Server: EXMBX01
The database in storage group MBX\First Storage Group will not be automatically mounted because the number of logs lost was greater than the amount specified by "AutoDatabaseMountDial".
* The log file generated before the move operation or failover was: 17247
* The log file successfully replicated to the passive node was: 16330
* "AutoDatabaseMountDial" is set to: BestAvailability
Attempts to copy the log files from the active node were not successful. The specific error returned is: The Microsoft Exchange Replication Service was unable to perform an incremental re-seed of the passive node for the clustered mailbox server 'MBX\First Storage Group' because the log files on the active node have diverged too widely from the log files on the passive node. A full re-seed of the passive node for this storage group is required. Re-seeding can be done by using the Update-StorageGroupCopy cmdlet in the Exchange Management Shell.. To resolve this error, do one of the following (1) use Restore-StorageGroupCopy to mount the database (this will eventually require a re-seed of the database on the passive node); (2) wait for the passive node to come online, which may make the missing log files available for copying; or (3) examine the event description above and see if you can manually copy the missing log files from the failed node.

Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="MSExchangeRepl" />
    <EventID Qualifiers="32772">2092</EventID>
    <Level>3</Level>
    <Task>5</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2009-10-15T12:27:30.000Z" />
    <EventRecordID>17529</EventRecordID>
    <Channel>Application</Channel>
    <Computer>EXMBX01.aph.local</Computer>
    <Security />
  </System>
  <EventData>
    <Data>MBX\First Storage Group</Data>
    <Data>MBX</Data>
    <Data>EXMBX01</Data>
    <Data>17247</Data>
    <Data>16330</Data>
    <Data>BestAvailability</Data>
    <Data>The Microsoft Exchange Replication Service was unable to perform an incremental re-seed of the passive node for the clustered mailbox server 'MBX\First Storage Group' because the log files on the active node have diverged too widely from the log files on the passive node. A full re-seed of the passive node for this storage group is required. Re-seeding can be done by using the Update-StorageGroupCopy cmdlet in the Exchange Management Shell.</Data>
  </EventData>
</Event>

I then went back in the logs and saw this warning and errors like the one below...

Log Name: Application
Source: MSExchange Search Indexer
Date: 10/15/2009 3:08:48 AM
Event ID: 107
Task Category: General
Level: Warning
Keywords: Classic
User: N/A
Computer: EXMBX01.aph.local
Description:
Exchange Search Indexer has temporarily disabled indexing of the Mailbox Database First Storage Group\Mailbox Database (GUID = 374f9946-fb18-4703-b81e-5e210e431976) due to an error (Microsoft.Mapi.MapiExceptionNetworkError: MapiExceptionNetworkError: Unable to read events. (hr=0x80040115, ec=-2147221227)
Diagnostic context:
 ......
 Lid: 9624 dwParam: 0x6BE Msg: EEInfo: Detection location: 292
 Lid: 13720 dwParam: 0x6BE Msg: EEInfo: Flags: 0
 Lid: 11672 dwParam: 0x6BE Msg: EEInfo: NumberOfParameters: 0
 Lid: 16280 dwParam: 0x6BE Msg: EEInfo: ComputerName: n/a
 Lid: 8600 dwParam: 0x6BE Msg: EEInfo: ProcessID: 2864
 Lid: 12696 dwParam: 0x6BE Msg: EEInfo: Generation Time: 2009-10-15 10:08:48:878
 Lid: 10648 dwParam: 0x6BE Msg: EEInfo: Generating component: 8
 Lid: 14744 dwParam: 0x6BE Msg: EEInfo: Status: 64
 Lid: 9624 dwParam: 0x6BE Msg: EEInfo: Detection location: 290
 Lid: 13720 dwParam: 0x6BE Msg: EEInfo: Flags: 0
 Lid: 11672 dwParam: 0x6BE Msg: EEInfo: NumberOfParameters: 1
 Lid: 12952 dwParam: 0x6BE Msg: EEInfo: prm[0]: Long val: 0
 Lid: 55369
 Lid: 28777 StoreEc: 0x80040115
 Lid: 20098
 Lid: 20585 StoreEc: 0x80040115
 at Microsoft.Mapi.MapiExceptionHelper.ThrowIfError(String message, Int32 hresult, Object objLastErrorInfo)
 at Microsoft.Mapi.MapiEventManager.ReadEvents(Int64 startCounter, Int32 eventCountWanted, Int32 eventCountToCheck, Restriction filter, ReadEventsFlags flags, Int64& endCounter)
 at Microsoft.Mapi.MapiEventManager.ReadEvents(Int64 startCounter, Int32 eventCountWanted)
 at Microsoft.Exchange.Search.RetriableOperations.ReadEvents(ThreadLocalCrawlData unused1, Guid unused2, MapiEventManager eventManager, Int64 watermark, Int32 eventCount)
 at Microsoft.Exchange.Search.RetriableOperations.DoRetriableMapiOperation[SourceType,ReturnType,Parameter1Type,Parameter2Type](ThreadLocalCrawlData crawlData, Guid mailboxGuid, SourceType source, Parameter1Type parameter1, Parameter2Type parameter2, MapiOperationDelegate`4 operationDelegate)
 at Microsoft.Exchange.Search.NotificationWatcher.GetMapiEvents(Int32 maxEvents, NotificationQueue notificationQueue)
 at Microsoft.Exchange.Search.NotificationWatcher.NotificationWatcherThread()).

Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="MSExchange Search Indexer" />
    <EventID Qualifiers="32772">107</EventID>
    <Level>3</Level>
    <Task>1</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2009-10-15T10:08:48.000Z" />
    <EventRecordID>15792</EventRecordID>
    <Channel>Application</Channel>
    <Computer>EXMBX01.aph.local</Computer>
    <Security />
  </System>
  <EventData>
    <Data>First Storage Group\Mailbox Database</Data>
    <Data>374f9946-fb18-4703-b81e-5e210e431976</Data>
    <Data>Microsoft.Mapi.MapiExceptionNetworkError: MapiExceptionNetworkError: Unable to read events. (hr=0x80040115, ec=-2147221227)
Diagnostic context:
 ......
 Lid: 9624 dwParam: 0x6BE Msg: EEInfo: Detection location: 292
 Lid: 13720 dwParam: 0x6BE Msg: EEInfo: Flags: 0
 Lid: 11672 dwParam: 0x6BE Msg: EEInfo: NumberOfParameters: 0
 Lid: 16280 dwParam: 0x6BE Msg: EEInfo: ComputerName: n/a
 Lid: 8600 dwParam: 0x6BE Msg: EEInfo: ProcessID: 2864
 Lid: 12696 dwParam: 0x6BE Msg: EEInfo: Generation Time: 2009-10-15 10:08:48:878
 Lid: 10648 dwParam: 0x6BE Msg: EEInfo: Generating component: 8
 Lid: 14744 dwParam: 0x6BE Msg: EEInfo: Status: 64
 Lid: 9624 dwParam: 0x6BE Msg: EEInfo: Detection location: 290
 Lid: 13720 dwParam: 0x6BE Msg: EEInfo: Flags: 0
 Lid: 11672 dwParam: 0x6BE Msg: EEInfo: NumberOfParameters: 1
 Lid: 12952 dwParam: 0x6BE Msg: EEInfo: prm[0]: Long val: 0
 Lid: 55369
 Lid: 28777 StoreEc: 0x80040115
 Lid: 20098
 Lid: 20585 StoreEc: 0x80040115
 at Microsoft.Mapi.MapiExceptionHelper.ThrowIfError(String message, Int32 hresult, Object objLastErrorInfo)
 at Microsoft.Mapi.MapiEventManager.ReadEvents(Int64 startCounter, Int32 eventCountWanted, Int32 eventCountToCheck, Restriction filter, ReadEventsFlags flags, Int64& endCounter)
 at Microsoft.Mapi.MapiEventManager.ReadEvents(Int64 startCounter, Int32 eventCountWanted)
 at Microsoft.Exchange.Search.RetriableOperations.ReadEvents(ThreadLocalCrawlData unused1, Guid unused2, MapiEventManager eventManager, Int64 watermark, Int32 eventCount)
 at Microsoft.Exchange.Search.RetriableOperations.DoRetriableMapiOperation[SourceType,ReturnType,Parameter1Type,Parameter2Type](ThreadLocalCrawlData crawlData, Guid mailboxGuid, SourceType source, Parameter1Type parameter1, Parameter2Type parameter2, MapiOperationDelegate`4 operationDelegate)
 at Microsoft.Exchange.Search.NotificationWatcher.GetMapiEvents(Int32 maxEvents, NotificationQueue notificationQueue)
 at Microsoft.Exchange.Search.NotificationWatcher.NotificationWatcherThread()</Data>
  </EventData>
</Event>

Log Name: Application
Source: MSExchangeRepl
Date: 10/15/2009 3:08:55 AM
Event ID: 2147
Task Category: Service
Level: Error
Keywords: Classic
User: N/A
Computer: EXMBX01.aph.local
Description:
There was a problem with 'exmbx02', which is an alternate name for 'EXMBX02'. The list of aliases is now 'exmbx02', and the alias 'was' removed from the list. The specific problem is 'The directory name \\exmbx02\8c54f77b-6e78-4fad-ad8b-d29ac98d3055$ is invalid.'.

Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="MSExchangeRepl" />
    <EventID Qualifiers="49156">2147</EventID>
    <Level>2</Level>
    <Task>1</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2009-10-15T10:08:55.000Z" />
    <EventRecordID>15793</EventRecordID>
    <Channel>Application</Channel>
    <Computer>EXMBX01.aph.local</Computer>
    <Security />
  </System>
  <EventData>
    <Data>exmbx02</Data>
    <Data>EXMBX02</Data>
    <Data>exmbx02</Data>
    <Data>was</Data>
    <Data>The directory name \\exmbx02\8c54f77b-6e78-4fad-ad8b-d29ac98d3055$ is invalid.</Data>
  </EventData>
</Event>
October 15th, 2009 6:19pm
No, you likely can't get the mail back unless you were lucky enough to have a backup run after the mail came in and before the failure. If so, you could restore to a Recovery Storage Group (RSG) and export the critical mail.

Are you saying that updates were applied to the active node of the cluster without your knowing about it (auto-install)? Updates are definitely a scenario you want to handle manually and in a planned fashion (as I'm sure you are ruing today).

Once the system has brought a copy of the database back online and started recording new transactions into it, getting back data that was in uncommitted logs is really, really tough, to say the least.
October 15th, 2009 10:32pm
If you absolutely require that no data be lost, you need to set the AutoDatabaseMountDial parameter so that only lossless failover occurs. The downside is that if the source server fails before all log files have been copied to the passive node, the passive will not mount until the source comes back up and the missing logs can be copied. You could override this behavior with the ForcedDatabaseMountAfter parameter, though. More information about the AutoDatabaseMountDial parameter is located here.

To minimize data loss, ensure that your Transport Dumpster is configured appropriately for your needs. There's some information here on Transport Server Storage Design, which covers the Transport Dumpster. The dumpster retains transport data so that, if a server fails over, the Mailbox server can obtain it from the Transport Dumpster and no data appears to be lost in a user's mailbox.

MVP | MCSE:M | MCITP: Enterprise Messaging Administrator | MCTS: OCS + Voice Specialization | http://www.shudnow.net
October 15th, 2009 11:30pm
Generally, I don't think you did anything wrong. The amount of missing mail is determined by your AutoDatabaseMountDial setting, BestAvailability.
To my knowledge, after event 2092, the number of lost logs is compared with the lossy-ness setting (AutoDatabaseMountDial) on your SG to determine whether the database can mount automatically. If a specific SG cannot be mounted, the replication service keeps running on the active node (the old passive). Every once in a while it "wakes up", tries to contact the passive (the old active), and copies the missing log files. If it can copy enough log files to reduce the "lossy-ness" to an acceptable amount, the SG will come online. Since your setting is BestAvailability (the default), the acceptable loss is 6 logs.
So I agree with Elan. If you need no data loss, choose Lossless. The passive node will then become the active node, but the databases won't come online until the original active node reappears; its log files are then copied and, one by one, the storage groups start coming online. To recover the lost mail, you first need to quantify the loss. The replication service keeps track of the last log that the store generated. Run Get-StorageGroupCopyStatus, check the value of LastLogGenerated, and compare it with the last log that was copied. The gap between them is the number of log files that were lost and still need to be copied.
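As a rough illustration of the comparison described above, here is a small Python sketch that works through the numbers from event 2092 in the original post. The function names and the threshold table are hypothetical, written only for this example; the 6-log figure for BestAvailability is the one quoted in this reply, and Lossless is taken to mean zero lost logs. It is not how the replication service itself is implemented.

```python
# Log-loss tolerance per AutoDatabaseMountDial value (illustrative table).
# BestAvailability = 6 is the figure quoted in this thread; Lossless = 0
# follows from its meaning. Other dial values are left out because the
# thread does not state their thresholds.
DIAL_THRESHOLDS = {"Lossless": 0, "BestAvailability": 6}


def lost_logs(last_generated: int, last_copied: int) -> int:
    """Number of log generations the passive copy never received."""
    return max(last_generated - last_copied, 0)


def can_auto_mount(lost: int, dial: str) -> bool:
    """True when the loss is within what the dial setting tolerates."""
    return lost <= DIAL_THRESHOLDS[dial]


# Values reported by event 2092 in the original post.
lost = lost_logs(last_generated=17247, last_copied=16330)
print(lost)                                      # 917
print(can_auto_mount(lost, "BestAvailability"))  # False
```

With 917 logs missing against a tolerance of 6, the database could not mount automatically, which matches the warning in the event description.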
Resources:
Lost Log Resilience and Transaction Log Activity in Exchange 2007
How to Tune Failover and Mount Settings for Cluster Continuous Replication
October 16th, 2009 10:46am
Andy, I've noticed the drives are filling up on the 2 mailbox servers. I need to perform some backups. Could you make a recommendation?
October 19th, 2009 11:55pm
Data Protection Manager would be a good choice for backing up Exchange data:
Protecting Exchange data with Microsoft System Center Data Protection Manager
Continuous Backup for Exchange Server with DPM 2007
Protect exchange server
October 20th, 2009 4:02am
I will probably set up a DPM backup solution, but I noticed I have less than 100GB left on both of my MBX servers, and it is filling up rather quickly. Is there any other simple built-in solution, like ntbackup with Exchange 2003? That was rather simple: run ntbackup, specify the Exchange storage groups, and you were set.
October 20th, 2009 9:17am
For a built-in option, you may refer to the resources below:
Details of Exchange 2007 SP2 in-box backup when running on Windows Server 2008
Uncovering the new Exchange 2007 SP2 Volume Snapshot (VSS) Plug-in
How to Upgrade a Clustered Mailbox Server in a CCR Environment to Exchange 2007 SP1 or SP2
October 20th, 2009 9:36am