Exchange 2007 SP1 in Hyper-V
I created my first Hyper-V server in a dev environment. It was just a stand-alone desktop that supported hardware virtualization, with a single IDE disk (only a single partition). I installed Server Core 2008 with Hyper-V, then installed Exchange 2007 SP1 (Mailbox role) into a guest VM on a single fixed VHD. I migrated all my mailboxes and used this dev environment in production for about two weeks, and it worked GREAT.
Once my new production server was ready to go with teamed NICs and Hyper-V installed, just the same as in the dev environment, I migrated the VM/VHD from the dev PC to the new server. I should mention the server has one large RAID array across 8 HDs. Just like in the dev environment, there is effectively only one disk. The host OS is on that disk, and the guest VHDs are just files in the "ProgramData\...\Hyper-V" folder on that disk.
Every few days the new server locks up. I cannot RDP to it or log on at the console, and Hyper-V Manager on a remote workstation cannot connect to the Hyper-V RPC service. The guest VMs work for a while, but then also lock up. After restarting the server I see a lot of Event ID 57 (VolMgr) and Event ID 51 (Disk) errors in the host OS event logs, and little issues in the guest VMs' event logs.
The server restarts fine, without errors, runs for a few more days, and then locks up again.
I can't seem to pin down why this is happening, or what to look at. After a reboot, VSSAdmin of course does not report issues with any of the VSS writers, and I can't get to VSSAdmin before the hang/crash.
Where do I start here? How do we identify what is causing the VolMgr issues?
Directory of Technology
October 30th, 2009 10:13pm
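One way to start on the "what to look at" question is to tally the Event ID 51 (Disk) and Event ID 57 (VolMgr) entries per hour, to see when they begin and whether they cluster around load. This is only a minimal sketch: it assumes the host's System log has already been exported to a CSV with `TimeCreated` (ISO 8601), `Id` and `ProviderName` columns. That export format, the function name `hourly_counts` and the sample rows are made up for illustration and do not come from the thread.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Assumption: the two event sources/IDs named in the post above.
WATCHED = {("disk", 51), ("volmgr", 57)}


def hourly_counts(csv_text):
    """Count watched events per (hour, provider, event id).

    csv_text is the exported log as text with the assumed columns
    TimeCreated, Id and ProviderName.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["ProviderName"].strip().lower(), int(row["Id"]))
        if key not in WATCHED:
            continue  # ignore everything except Disk 51 / VolMgr 57
        when = datetime.fromisoformat(row["TimeCreated"].strip())
        hour = when.replace(minute=0, second=0, microsecond=0)
        counts[(hour, key[0], key[1])] += 1
    return counts


# Made-up sample rows, only to show the expected shape of the input.
SAMPLE = """TimeCreated,Id,ProviderName
2009-11-03T13:00:05,51,Disk
2009-11-03T13:00:06,57,volmgr
2009-11-03T13:20:41,51,Disk
2009-11-03T14:02:10,7036,Service Control Manager
"""

if __name__ == "__main__":
    for (hour, provider, event_id), n in sorted(hourly_counts(SAMPLE).items()):
        print(hour.isoformat(), provider, event_id, n)
```

Comparing the first hour that shows errors against the last boot time and the backup or load schedule may help narrow down whether the trigger is time-based or load-based.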
This sounds like a hardware issue on the host system. I would run some hardware diagnostics to see if anything shows up.
Thanks,
Will
Will Shepherd - MCSE/MCITP/MCTS (Windows 2008, Exchange 2007, OCS 2007)
October 30th, 2009 10:30pm
Check and update the NIC drivers. Also run a diagnostics utility from the hardware vendor. I have seen this issue, and it was resolved by a driver update on the NIC.
Raj
October 30th, 2009 11:52pm
Rajnish - It's interesting that you suspect the NIC, because that is something 'unique' on this system. The server is a Dell R710 (brand new). Dell did/does not have an implementation suggestion for NIC teaming between the host OS and the core network on Server 2008 Core (no .NET). I 'ideally' wanted to team all four gigabit NICs back to the core network, and then set up one "virtual NIC" in Hyper-V bound only to this virtual team interface. I contacted Broadcom (the NIC manufacturer for Dell) and was directed to their command-line syntax for setting up the NIC team with their latest drivers on Server 2008 Core. As a result I now have the network implemented as I wanted it.
Just before publishing this post, I finally installed the Ethernet cables on two of the gigabit interfaces on the system. They were members of the team but had not yet been connected. In a normal stand-alone server this is usually of no consequence, as the team setup is just "Load Balancing and Failover".
What is the dependency between the NIC architecture and the host OS disk operations that could explain the relationship of this with my issue?
Directory of Technology
November 2nd, 2009 6:16pm
Yesterday at about 1pm, the Event 51 errors began again. The server had been running for 3 hours and 20 minutes. It was completely unresponsive at the console, but the VMs were, as usual, running without error.
Today I installed the Dell SAS 6i RAID driver on the server, to see if that would improve the situation. The OS was installed from volume-license Server 2008 Enterprise Edition media, not Dell OEM media; we purchased the server without the OS. So to date, the only drivers installed beyond Windows Setup are the NIC drivers, and now this storage controller driver.
I also considered updating the firmware on the RAID card, but upon launching the firmware batch file, the .exe that performs the update fails because of incompatibility with x64 Windows. So the firmware remains as shipped.
I will report in a few days if this has helped stabilize things.
Directory of Technology
November 4th, 2009 6:00pm
I have an idea to aid in troubleshooting: is there a way to 'force' the event that is causing my error?
Event 51 states: An error was detected on device \Device\Harddisk0\DR0 during a paging operation.
Event 57, which directly follows, states: The system failed to flush data to the transaction log. Corruption may occur.
Is there a command I can run to "flush data to the transaction log"?
Directory of Technology
November 4th, 2009 6:11pm
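The post above describes Event 57 as directly following Event 51. A minimal sketch of how that pairing could be checked across a whole log, assuming the events are already available as time-sorted `(timestamp, event_id)` tuples: the function name `pair_51_with_57`, the 5-second window and the sample timestamps are all hypothetical choices for illustration, not values from the thread.

```python
from datetime import datetime, timedelta


def pair_51_with_57(events, window=timedelta(seconds=5)):
    """Pair each Event 51 with the first Event 57 that follows it.

    events: list of (timestamp, event_id) tuples sorted by timestamp.
    Returns a list of (time_of_51, time_of_57) tuples; time_of_57 is
    None when no Event 57 appears within `window` of the Event 51.
    """
    pairs = []
    for i, (when, event_id) in enumerate(events):
        if event_id != 51:
            continue
        follow = None
        for later, later_id in events[i + 1:]:
            if later - when > window:
                break  # events are sorted, so nothing later can match
            if later_id == 57:
                follow = later
                break
        pairs.append((when, follow))
    return pairs


# Made-up sample data: one 51 followed by a 57, one 51 on its own.
EVENTS = [
    (datetime(2009, 11, 3, 13, 0, 5), 51),
    (datetime(2009, 11, 3, 13, 0, 6), 57),
    (datetime(2009, 11, 3, 13, 20, 41), 51),
    (datetime(2009, 11, 3, 13, 25, 0), 57),
]

if __name__ == "__main__":
    for t51, t57 in pair_51_with_57(EVENTS):
        print(t51.isoformat(), "->", t57.isoformat() if t57 else "no 57 in window")
```

If every Event 51 has a matching Event 57, the failed log flush is probably a consequence of the same failed disk I/O rather than a separate fault. Unpaired events in either direction would be worth a closer look.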
I checked a couple of suggestions on this, and they suggest (to my surprise) that it's not hardware, but probably the OS or the application.
http://www.hardwareanalysis.com/content/topic/12946/?o=80
Not sure if the link can provide a solution, but there are a lot of suggestions there that might help you resolve the issue.
Raj
November 5th, 2009 2:15pm
I'm having this exact problem. It started October 23rd. I have two virtual hosts, a Dell PE2950 and a Dell R805. Both have been running for nearly a year, and both started having this exact problem in the week of the 23rd. The PE2950 is running a PERC 5/E and the R805 is running a PERC 6/E. Both have a Dell MD1000 DAS connected to them, with just a pair of mirrored drives for the host OS. I upgraded the R805 to Windows 2008 R2, including updating the tools on all instances, and it still has the same problem. The R805 has this problem once a day and the PE2950 has it once a week, but it definitely started on both machines in the week of October 23rd. No changes had been made and no updates were installed prior to this problem appearing, and I already swapped out one of the PERC controllers for a brand new controller, but that has not made any difference. Any suggestions?
the internet? whats the internet?
November 16th, 2009 8:22pm
Mental note: mine aren't running Exchange, so I do not think Exchange is the problem.
the internet? whats the internet?
November 16th, 2009 9:38pm
Hi,
Your problem seems closer to a Windows Server or Hyper-V issue, so please repost it in the forums below as well.
Resource
Windows Server 2008 R2 Hyper-V
Windows Server
Regards,
Chinthaka Shameera | MCITP: EA | MCSE: M |
http://howtoexchange.wordpress.com/
November 18th, 2009 6:55am
Here is where this issue is at for me. My Exchange VM has been running on a Dell Optiplex workstation with no RAID or NIC teaming for the last 3 months, and my Dell R710 is the most expensive paperweight I have ever had.
At this moment I have supplied many DSET (Dell diagnostic) reports to Dell, and have upgraded the entire UAC software (Universal (Remote) Access Controller), which is supposed to ensure all of the latest updates on the hardware/firmware. We have additionally removed the Broadcom drivers downloaded from Broadcom.com from the Windows 2008 Core Server host OS, replaced them with the now-certified Dell drivers for Broadcom NICs (I believe the same driver, just a different package), and re-created the teams.
At this point I'm not interested in the 'risks' of returning my production Exchange VM to my server. However, I know for a fact that if the server does not have any load on it, the error won't occur (or at least won't for a very long time). So, at the recommendation of my 'paid M$ case support engineer', I have downloaded an Exch 2010 Beta/RC VHD and the LoadGen for Exch 2010 app, which I intend to run on a Vista VM on the same server.
The Exchange 2010 VM is installed and running, and I have not had the issue yet after three days, which as I stated is expected. LoadGen has not been installed yet to tax the server.
This issue has exhausted a lot of my time and has been pushed to a back burner as a result, because I just don't have the time to invest this much 'testing' in it; I'd run Linux if I wanted this much responsibility at this level. The adage that it's not anyone else's problem, so it doesn't hurt them or pressure them, apparently keeps my 'paid' support engineers from doing more than asking me what the status is.
I will do my best to report back to you folks after the LoadGen testing is complete.
Directory of Technology
January 23rd, 2010 10:09pm