Windows Server 2008 R2 DFS replication doesn't start initial replication

Hi there,

We have 2 datacenter locations with 1 fileserver (Windows Server 2008 R2) in each datacenter. We decided to add another fileserver in each datacenter and to replicate the data on the fileservers between the datacenters by using DFS replication. Once the replication is running smoothly we can start using DFS namespaces.

Fileserver1 is in datacenter1
Fileserver2 is in datacenter2
Fileserver3 (new) is in datacenter2
Fileserver4 (new) is in datacenter1

We would like to create 2 DFS replication groups: Fileserver1 with Fileserver3, and Fileserver2 with Fileserver4. The fileservers have a C:\ partition which contains the Windows installation and a D:\ partition which contains the data. On both Fileserver1 and Fileserver2 we have the folder D:\Homedirectories. We have approximately 40 customers on each fileserver. The folder D:\Homedirectories is approximately 300GB on each fileserver.

The replication group for the fileservers in datacenter1 works, finally. I figured out there was an active share on Fileserver1 for a folder that no longer existed. With Share and Storage Management I could remove the share. From that point on, Fileserver1 started the initial replication to Fileserver3. That was 3 weeks ago and it's still running smoothly.

I created the same kind of replication group for Fileserver2 and Fileserver4 in datacenter2. This replication group won't start the initial replication. First I ran a DFS Diagnostic Report. It showed that there were more than 100 files which could not be replicated. I found out these files had the temporary attribute set. All of them seemed to be pictures or other kinds of images (.png, .tif, .jpg). I received a PowerShell script that checks all files in D:\Homedirectories and its subfolders and removes the temporary attribute. This solved the problem for the >100 files that could not be replicated.
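The PowerShell script itself is not shown in this thread. As a rough illustration of the same idea, here is a minimal Python sketch that walks a folder tree, finds files with the temporary attribute set, and optionally clears it. The function names and the dry_run parameter are hypothetical, and it assumes it is run on the Windows fileserver with sufficient rights; treat it as a sketch, not as the script that was actually used.

```python
import ctypes
import os
import sys

# Win32 FILE_ATTRIBUTE_TEMPORARY bit; DFSR skips files that carry it.
FILE_ATTRIBUTE_TEMPORARY = 0x100


def without_temporary(attrs: int) -> int:
    """Return the attribute mask with the temporary bit cleared."""
    return attrs & ~FILE_ATTRIBUTE_TEMPORARY


def clear_temporary(root: str, dry_run: bool = True) -> list:
    """List files under root that have the temporary attribute.

    With dry_run=False the attribute is also cleared (hypothetical helper).
    """
    if sys.platform != "win32":
        raise OSError("Windows file attributes are only available on Windows")

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.SetFileAttributesW.argtypes = [ctypes.c_wchar_p, ctypes.c_uint32]

    flagged = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                attrs = os.stat(path).st_file_attributes
            except OSError:
                continue  # skip files that cannot be inspected
            if attrs & FILE_ATTRIBUTE_TEMPORARY:
                flagged.append(path)
                if not dry_run:
                    ok = kernel32.SetFileAttributesW(path, without_temporary(attrs))
                    if not ok:
                        raise ctypes.WinError(ctypes.get_last_error())
    return flagged
```

Calling clear_temporary(r"D:\Homedirectories") first with the default dry_run=True only reports the affected files, which makes it possible to review the list before changing anything.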

The Diagnostic Report showed another error, and this one was quite specific: there is a folder with "repeated sharing violations". See the content of the corresponding Windows Event Viewer event below (the *** replaces a censored customer name):

=====

The DFS Replication service failed to get folder information when walking the file system on a journal wrap or loss recovery due to repeated sharing violations encountered on a folder. The service cannot replicate the folder and files in that folder until the sharing violation is resolved.

Additional Information:

Folder: D:\Homedirectories\***\Tijdelijk\

Replicated Folder Root: D:\Homedirectories

File ID: {00000000-0000-0000-0000-000000000000}-v0

Replicated Folder Name: Homedirectories

Replicated Folder ID: CA05B7CE-AF89-4E43-8797-CDB9F871B778

Replication Group Name: Datacenter2

Replication Group ID: 2C182A82-041D-4F5A-A147-FE9107E954DE

Member ID: 4DD60FC5-4E4B-4FCE-929E-17995616F83C

=====

I checked Share and Storage Management to see if there were "corrupt" shares, as there had been for the first replication group. I did not see any corrupt shares, but just to be sure I deleted the share on the folder named in the Windows event. I restarted the DFS Replication service, but after 45 minutes the same Windows event showed up again, telling me there are "repeated sharing violations" on the folder D:\Homedirectories\***\Tijdelijk (keep in mind the *** is just to censor our customer's name; in our situation it's a folder name consisting of 3 alphabetic characters).

I would really like to know what else can cause the "repeated sharing violations" (Event 4312) on this exact folder. It's the only error left before the initial replication can start. I searched a lot to see if anyone has encountered the same problem and found some possible causes, but unfortunately none of them apply to my situation:
- Event 4312 can show up when the pagefile is on the same partition as the folder you want to replicate.
This does not apply here because the pagefile is located on the C:\ partition and the folder to be replicated is located on the D:\ partition.

- Event 4312 can show up when you have third party software trying to replicate the folder.
This does not apply here because I do not have third party software trying to replicate the folder. There are also no active Robocopy scripts, and no back-up software is running at that time. Even if that were the case, it would not explain why this exact folder keeps giving "repeated sharing violations".

- Event 4312 can show up when you do not have sufficient rights to access the folder.
This does not apply here because the share and NTFS permissions are the same as on the other folders, which do not give errors, and the same as on the folders on Fileserver1, which is actually replicating. When I copy this folder with Robocopy I do not get any errors. SYSTEM and Administrators have full control on this folder.

So again: what else can cause the error of "repeated sharing violations"?

Thanks in advance for any suggestions.
Roy

October 10th, 2013 5:25am

Hi,

Is there an Event ID 4004 in the DFS Replication event log? If so, you can install the hotfix from the KB article below on both servers to see if the issue still exists.

The DFS Replication service may stop responding when it initializes the replication process for the replicated folders on a computer that is running Windows Server 2003 R2, Windows Server 2008, or Windows Server 2008 R2
http://support.microsoft.com/kb/977381

CAUSE:

When the DFS Replication service initializes the replicated folders for the replication process, it traverses all related paths to check whether the replicated folders are reparse points that act as symbolic links or that act as mount points. 
The DFS Replication service expects to open synchronous handles to access these paths. However, it uses the asynchronous handles incorrectly. The DFS Replication service cannot handle the I/O requests that are held by a filter driver. Therefore, the DFS Replication service stops responding.

In the meantime, you could try moving the affected folder out of the replicated folder to see if replication starts working without it. A sharing violation can occur if a file is locked at the moment DFSR tries to replicate it.

Regards,

October 14th, 2013 6:34am

Hi Mandy Ye,

Thank you for your reply. I did find a 4004 error in the DFS Replication log. The last time this error occurred was 3 weeks ago. Since then I have restarted the DFS Replication service multiple times. Now the only error that still occurs is the 4312.
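For anyone who wants to check the same two event IDs without clicking through Event Viewer, here is a minimal Python sketch that wraps the built-in wevtutil command. It assumes the channel is named "DFS Replication" as it appears in Event Viewer, and the function names are hypothetical; it only reads events and changes nothing.

```python
import subprocess

# Assumed event log channel name, as displayed in Event Viewer.
DFSR_LOG = "DFS Replication"


def build_query(event_id: int, count: int = 5) -> list:
    """Build a wevtutil command that returns the newest matching events as text."""
    xpath = f"*[System[(EventID={event_id})]]"
    return [
        "wevtutil", "qe", DFSR_LOG,
        f"/q:{xpath}",   # filter on the event ID
        f"/c:{count}",   # maximum number of events to return
        "/rd:true",      # newest events first
        "/f:text",       # human-readable output
    ]


def latest_events(event_id: int, count: int = 5) -> str:
    """Run the query on the local server and return the raw text output."""
    result = subprocess.run(
        build_query(event_id, count),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Running latest_events(4312) and latest_events(4004) on the fileserver shows the timestamps of the most recent occurrences, which helps confirm whether an error stopped after a service restart.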

I will try the hotfix you proposed. The only problem is that I have to restart the fileserver after applying this hotfix. The fileserver is not redundant (yet, because DFS Replication is not working) and our customers will experience some downtime because of this restart. We cannot afford this, so I will have to wait until we have a maintenance window.

Regards,
Roy

October 14th, 2013 10:25am

Hi Mandy Ye,

As I figured Fileserver2 would not be restarted in the near future because of the inconvenience for our customers, I took another look at the "sharing violations". The problem had to have something to do with those and the given folder, D:\Homedirectories\***\Tijdelijk. Someone proposed using the tool "Handle.exe" on the folder D:\Homedirectories\***\Tijdelijk. It showed an executable file in that folder in use by the process "SYSTEM".
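The exact Handle.exe command line is not given in this thread. A minimal Python sketch of how such a check could be scripted is shown below, assuming the Sysinternals handle.exe is on the PATH and the script runs from an elevated prompt. The function names are hypothetical, and the path passed in is whatever folder the 4312 event reports.

```python
import subprocess


def build_handle_command(path_fragment: str, handle_exe: str = "handle.exe") -> list:
    """Build a Handle command that lists open handles whose name contains path_fragment."""
    # -accepteula suppresses the first-run license dialog of Sysinternals tools.
    return [handle_exe, "-accepteula", path_fragment]


def open_handles(path_fragment: str, handle_exe: str = "handle.exe") -> str:
    """Run Handle and return its raw output (process name, PID and handle per line)."""
    result = subprocess.run(
        build_handle_command(path_fragment, handle_exe),
        capture_output=True, text=True,
    )
    # The exit code is not checked here: the output is returned as-is for review.
    return result.stdout
```

The output is returned unparsed on purpose, since the goal is only to see which process holds a handle inside the folder named in the event.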

I investigated the file. I could not see its attributes, and I, as an Administrator, could not see the NTFS permissions. The owner information of this file could not be retrieved either, and I could not change the owner. I'm 99% sure this "corrupted" file is the root cause of DFS not starting the replication.

I used several methods and tools trying to move, rename or delete the corrupted file but every time it tells me I do not have sufficient rights to do so. I also used the tool Process Explorer to close the handle in Process SYSTEM. Again no success. I also stopped the DFS Replication service to see if this is the process that's "locking" the file but again, no success.

It seems like rebooting the server is the fastest way to unlock the file so I can delete or move it. I would still prefer not to reboot this server, as it is essential for our customers and some of them work at night too.

Maybe someone has been in this situation before and knows how to unlock this file. To be honest I doubt anybody knows how to unlock this file since I can't even close the handle using Process Explorer.

Kind Regards,
Roy

October 14th, 2013 4:19pm

Hi,

Since you don't want to restart Fileserver2, please refer to the similar thread below to help troubleshoot this issue:

Event 4312 - DFSR Corrput files stopping replication

http://social.technet.microsoft.com/Forums/windowsserver/en-US/42166341-5d44-4c87-a42e-295f1a3335dd/event-4312-dfsr-corrput-files-stopping-replication?forum=winserverfiles

Regards,

October 21st, 2013 5:57am

