Sudden issue with client communications in SCCM 2012
I'm not quite sure where to begin but here is some background info:
SCCM 2012 started with the initial release (clean install) and was upgraded to SP1 a little more than a month ago.
It is a large deployment (CAS, 2 primary sites with dedicated DB servers, and currently around 30 DPs).
Up until last Friday things were going well. Then, starting last Monday (2/18/2013), I got reports that endpoints being imaged via PXE and bootable media were failing to find task sequences. The machines would successfully boot into WinPE but report that no task sequences are available. In reviewing the SMSTS.log file on an endpoint, it shows that it successfully finds the preexisting SCCM computer object (the SCCM object GUID and name displayed in the log match what is shown in the SCCM console). It also shows a successful policy assignment retrieval but 0 policy assignments, 0 collection variables, and 0 machine variables, even though the SCCM object has more than one TS deployed to it and a machine variable. The object has the MAC address, and I added the BIOS GUID to rule out a GUID conflict (which is doubtful since other endpoints are affected too), and it still had the same result.
Any other ideas on what is going on?
February 21st, 2013 7:45pm
What does the SMSPXE log file on the server tell you about the problem?
February 21st, 2013 8:16pm
The SMSPXE log contains the following:
<![LOG[Client lookup reply: <ClientIDReply><Identification Unknown="0" ItemKey="16783246" ServerName="" ServerRemoteName=""><Machine><ClientID/><NetbiosName/></Machine></Identification></ClientIDReply>
]LOG]!><time="14:07:00.823+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="libsmsmessaging.cpp:6363">
<![LOG[D4:BE:D9:1B:E2:3A, 4C4C4544-0032-5910-8035-C6C04F425331: device is in the database.]LOG]!><time="14:07:00.823+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="database.cpp:479">
<![LOG[Client boot action reply: <ClientIDReply><Identification Unknown="0" ItemKey="16783246" ServerName="" ServerRemoteName=""><Machine><ClientID>GUID:3F19803F-8F84-4CF0-8D5C-DB8FE9C83784</ClientID><NetbiosName/></Machine></Identification><PXEBootAction
LastPXEAdvertisementID="" LastPXEAdvertisementTime="" OfferID="CM1200A9" OfferIDTime="1/16/2013 10:50:00 AM" PkgID="CM10018E" PackageVersion="" PackagePath="http://DP_FQDN/SMS_DP_SMSPKG$/CM10018C" BootImageID="CM10018C" Mandatory="0"/></ClientIDReply>
]LOG]!><time="14:07:01.088+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="libsmsmessaging.cpp:6561">
<![LOG[D4:BE:D9:1B:E2:3A, 4C4C4544-0032-5910-8035-C6C04F425331: found optional advertisement CM1200A9]LOG]!><time="14:07:01.088+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="database.cpp:479">
<![LOG[Client boot action reply: <ClientIDReply><Identification Unknown="0" ItemKey="16783246" ServerName="" ServerRemoteName=""><Machine><ClientID>GUID:3F19803F-8F84-4CF0-8D5C-DB8FE9C83784</ClientID><NetbiosName/></Machine></Identification><PXEBootAction
LastPXEAdvertisementID="" LastPXEAdvertisementTime="" OfferID="CM1200A9" OfferIDTime="1/16/2013 10:50:00 AM" PkgID="CM10018E" PackageVersion="" PackagePath="http://DP_FQDN/SMS_DP_SMSPKG$/CM10018C" BootImageID="CM10018C" Mandatory="0"/></ClientIDReply>
]LOG]!><time="14:07:02.555+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="libsmsmessaging.cpp:6561">
<![LOG[D4:BE:D9:1B:E2:3A, 4C4C4544-0032-5910-8035-C6C04F425331: found optional advertisement CM1200A9]LOG]!><time="14:07:02.555+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="database.cpp:479">
<![LOG[Looking for bootImage CM10018C]LOG]!><time="14:07:02.555+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="bootimagemgr.cpp:1760">
<![LOG[BootImage CM10018C needs to be updated (new packageID=CM10018C) VersionUpdate=true]LOG]!><time="14:07:02.555+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="bootimagemgr.cpp:1772">
<![LOG[PXE::CBootImageInfo::CBootImageInfo: key=CM10018C]LOG]!><time="14:07:02.555+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="bootimagecache.cpp:60">
<![LOG[Saving Media Variables to "D:\RemoteInstall\SMSTemp\2013.02.21.14.07.04.0003.{8C634EA6-CEEF-45F6-894B-EBFEFF482C86}.boot.var"]LOG]!><time="14:07:04.739+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="tsremovablemedia.cpp:186">
<![LOG[Looking for bootImage CM10018C]LOG]!><time="14:07:22.179+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="bootimagemgr.cpp:1760">
I'm not sure what the "needs to be updated" boot image message is about, since the boot image is based on the new (post-SP1) one, version 6.2.9200.16384.
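For anyone reading an excerpt like this without CMTrace, here is a minimal Python sketch that splits the raw text into entries so the messages are easier to scan. It assumes only the `<![LOG[...]LOG]!><key="value" ...>` layout visible in the lines above; the function name `parse_entries` and the `sample` string are illustrative, not part of any ConfigMgr tooling.

```python
import re

# One entry as it appears in the excerpt above:
# <![LOG[message]LOG]!><time="..." date="..." component="..." ...>
# DOTALL lets a message span several physical lines, as the XML replies do.
ENTRY_RE = re.compile(r'<!\[LOG\[(?P<msg>.*?)\]LOG\]!><(?P<attrs>[^>]*)>', re.DOTALL)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_entries(text):
    """Yield one dict per log entry: its key="value" attributes plus the message."""
    for match in ENTRY_RE.finditer(text):
        entry = dict(ATTR_RE.findall(match.group("attrs")))
        entry["message"] = match.group("msg").strip()
        yield entry


# Two entries copied from the SMSPXE.log excerpt in this thread.
sample = (
    '<![LOG[Looking for bootImage CM10018C]LOG]!><time="14:07:02.555+360" '
    'date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" '
    'file="bootimagemgr.cpp:1760">\n'
    '<![LOG[BootImage CM10018C needs to be updated (new packageID=CM10018C) '
    'VersionUpdate=true]LOG]!><time="14:07:02.555+360" date="02-21-2013" '
    'component="SMSPXE" context="" type="1" thread="1888" '
    'file="bootimagemgr.cpp:1772">\n'
)

for entry in parse_entries(sample):
    print(entry["time"], entry["component"], entry["message"])
```

Filtering the parsed entries on a keyword such as "advertisement" or "bootImage" is then a one-line list comprehension.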
February 21st, 2013 8:20pm
interesting
<![LOG[Looking for bootImage CM10018C]LOG]!><time="14:07:02.555+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="bootimagemgr.cpp:1760">
<![LOG[BootImage CM10018C needs to be updated (new packageID=CM10018C) VersionUpdate=true]LOG]!><time="14:07:02.555+360" date="02-21-2013" component="SMSPXE" context="" type="1" thread="1888" file="bootimagemgr.cpp:1772">
Have you tried redeploying the task sequence (see here)?
February 21st, 2013 8:25pm
So I deleted all the TS targeting the collection
Re-imported boot images based on the base one contained on the CAS
Deployed the new boot image after the necessary changes (debugging enabled and PXE distribution eligible)
Updated the TS with the new boot image and redeployed the TS
Restarted the WDS service on the DP.
The endpoint still fails to find a task sequence but the boot image update message is gone. The SMSPXE log shows that an optional advertisement is found but the SMSTS log file shows no policy assignments.
February 21st, 2013 9:06pm
So it turns out the issue was more than just imaging: clients in general were failing to get policies (they could send requests but got empty responses). The issue was tracked down to two corrupt policies (they contained null values) that caused the client to dump the entire policy table. Once those policies were removed from the database, things were flowing as normal. We are still tracking down how the policies were corrupted in the first place, but at least others with this issue may be able to get a clue to what is wrong.
The log that had the error in it was MP_Policy.log (I believe this is on the MP server), and the error was that it "detected at least one row in the result set from policyassignment table which does not have a signature, rejecting all rows."
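If you want to check quickly whether a management point is logging this same rejection, a small Python sketch like the one below can scan a copy of MP_Policy.log for it. It matches only the tail of the message quoted above, case-insensitively, since the exact capitalization in the log may differ. The function name and the commented-out path are hypothetical; the sample lines are text taken from this thread, not a real log capture.

```python
import re

# Tail of the MP_Policy.log error quoted above; case-insensitive on purpose.
SIGNATURE_ERROR = re.compile(
    r"does not have a signature, rejecting all rows", re.IGNORECASE
)


def find_signature_rejections(lines):
    """Return (line_number, text) pairs for lines that report unsigned policy rows."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SIGNATURE_ERROR.search(line)
    ]


# Illustrative input: one unrelated line and one line carrying the quoted error.
sample_lines = [
    "Looking for bootImage CM10018C\n",
    "detected at least one row in the result set from PolicyAssignment table "
    "which does not have a Signature, rejecting all rows.\n",
]

for number, text in find_signature_rejections(sample_lines):
    print(number, text)

# Against a real file (hypothetical path; point it at your own copy of the log):
# with open(r"C:\path\to\MP_Policy.log", errors="replace") as handle:
#     hits = find_signature_rejections(handle)
```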
Marked as answer by Keith P. Sumlar, Wednesday, February 27, 2013 7:39 PM
February 27th, 2013 7:39pm
Hi Keith,
Could you tell us how you found the corrupted policies?
March 23rd, 2013 8:08pm
Hey Keith,
I get the same error message in the MP_Policy.log on two remote management points.
"detected at least one row in the result set from PolicyAssignment table which does not have a Signature, rejecting all rows."
The installation is RTM updated to SP1 and then CU1.
Same effect: clients don't get any policies.
Can you please let me know what exactly you deleted and in which table (policy/policy assignment)?
Also, it would be good to know how you identified the corrupted policies.
Thanks a lot!
Silvan
June 14th, 2013 7:38am
Hi Silvan,
Are you still having this issue? Not sure, since it was posted in June. But FYI: we had the same issue, and after endless days/nights of troubleshooting, we finally engaged MSFT support. This is what we did to see if we had any bad policies, and if so, we then deleted them. You'll need to log into the SQL server that houses the SCCM DB, make sure (of course) that you're running the queries against the SCCM database, and run the following:
Here's the query to check for any bad policies:
SELECT * FROM ResPolicyMap WHERE machineid = 0 AND PADBID IN (SELECT PADBID FROM PolicyAssignment WHERE BodyHash IS NULL)
To delete:
DELETE FROM ResPolicyMap WHERE machineid = 0 AND PADBID IN (SELECT PADBID FROM PolicyAssignment WHERE BodyHash IS NULL)
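To make the check-then-delete pattern concrete, here is a rough Python sketch that rehearses both queries against an in-memory SQLite stand-in. Everything about the schema is an assumption: it models only the three columns the queries name (machineid, PADBID, BodyHash), whereas the real ConfigMgr tables live in SQL Server and have many more columns. The point is only to show counting the matching rows first and committing the delete only if the same number of rows was removed.

```python
import sqlite3

# Same predicates as the two queries above, with COUNT(*) for the check.
CHECK_SQL = """
    SELECT COUNT(*) FROM ResPolicyMap
    WHERE MachineID = 0
      AND PADBID IN (SELECT PADBID FROM PolicyAssignment WHERE BodyHash IS NULL)
"""
DELETE_SQL = """
    DELETE FROM ResPolicyMap
    WHERE MachineID = 0
      AND PADBID IN (SELECT PADBID FROM PolicyAssignment WHERE BodyHash IS NULL)
"""


def delete_bad_rows(conn):
    """Count matching rows, delete them, and commit only if the counts agree."""
    expected = conn.execute(CHECK_SQL).fetchone()[0]
    cursor = conn.execute(DELETE_SQL)
    if cursor.rowcount != expected:
        conn.rollback()
        raise RuntimeError("row count changed between check and delete")
    conn.commit()
    return expected


# Toy data (assumed, not from a real site database): assignments 2 and 3 have
# no BodyHash, and only rows with MachineID = 0 should be removed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PolicyAssignment (PADBID INTEGER PRIMARY KEY, BodyHash TEXT);
    CREATE TABLE ResPolicyMap (MachineID INTEGER, PADBID INTEGER);
    INSERT INTO PolicyAssignment VALUES (1, 'abc'), (2, NULL), (3, NULL);
    INSERT INTO ResPolicyMap VALUES (0, 1), (0, 2), (0, 3), (42, 2);
""")

deleted = delete_bad_rows(conn)
remaining = conn.execute("SELECT COUNT(*) FROM ResPolicyMap").fetchone()[0]
print(deleted, remaining)
```

On the real site database the same idea would be an explicit transaction in SQL Server, run after taking a backup.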
Hope this helps, though I can only imagine that if you hadn't found this resolution by now, you may have reinstalled SCCM.
Cheers,
Megs
Proposed as answer by Silvan Maeder, Thursday, October 17, 2013 9:43 PM
October 17th, 2013 6:24pm
Hi Megs,
Thanks a lot for sharing this! Much appreciated.
Since my installation was only a test environment, I ended up deleting records from the PolicyAssignment table, which eventually resolved the issue. But the queries you posted make it easy to identify and delete bad policies and can save a lot of time, I suppose ;-)
Have you been able to verify what caused the issue in the first place?
Thanks a lot.
Silvan
October 17th, 2013 9:43pm
Hi Silvan,
Glad to help. Unfortunately, we do not have the exact root cause. However, with the help and information from our support case with the Microsoft engineer, we do know that the policy was not signed for some reason, which was probably the result of some sort of hiccup or timing/network issue between ConfigMgr and SQL.
Cheers!
Megs
December 3rd, 2013 12:49pm
Thanks Megs!
Just want to say thank you for this post. We were having this exact same issue and deleting the bad policy records worked.
Cheers!
Dan. :-)
December 4th, 2013 2:29pm
Excellent news. Thanks for letting me know, Dan! Glad to help! :)
January 10th, 2014 9:46pm
Thank you very much, Keith P. Sumlar! Your answer helped us!
August 14th, 2014 9:54am
This resolved our issue also. Is there a better explanation as to why this happens and what exactly we are deleting?
Thank you
November 17th, 2014 7:47pm
Thanks for the query, this saved my day.
April 20th, 2015 7:44am
Thanks for the great help :)
April 29th, 2015 1:18am