Forefront TMG Schannel memory leak when Exchange is down

We are having an issue with our Forefront TMG array that only happens when our Exchange server is down (e.g., for updates). As soon as the Exchange server is unreachable, all TMG servers in the array are flooded with SCHANNEL errors (100+ per second) and quickly leak memory until no RAM is available; each server then becomes unresponsive and stops handling requests. During this time, the w3p process also spikes to 100% CPU on all servers. Once the Exchange server is reachable again, the CPU spikes stop immediately, and within ~10 minutes RAM usage returns to normal.

The errors received are:

36874: An unknown connection request was received from a remote client application, but none of the cipher suites supported by the client application are supported by the server. The SSL connection request has failed.

36888: The following fatal alert was generated: 10. The internal error state is 10.
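To put a number on the flood and to read the 36888 message, a small script can help. The following is only a sketch, not a diagnosis: it assumes the System log has been exported to CSV with hypothetical column names `TimeCreated` (one-second resolution) and `Id`, and it assumes the alert number in the 36888 text maps to the standard TLS alert codes (where 10 is `unexpected_message`). The timestamps in the sample are made up for illustration.

```python
import csv
import io
import re
from collections import Counter

# Subset of standard TLS alert descriptions (alert number -> name).
TLS_ALERTS = {
    10: "unexpected_message",
    20: "bad_record_mac",
    40: "handshake_failure",
    70: "protocol_version",
    80: "internal_error",
}

# The two Schannel event IDs quoted in this post.
SCHANNEL_IDS = {"36874", "36888"}


def events_per_second(csv_text):
    """Count Schannel events per (timestamp, event ID).

    Assumes hypothetical CSV columns 'TimeCreated' and 'Id'; adjust
    the names to match whatever your export actually produces.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Id"] in SCHANNEL_IDS:
            counts[(row["TimeCreated"], row["Id"])] += 1
    return counts


def decode_alert(message):
    """Pull the alert number out of a 36888 message and name it."""
    match = re.search(r"fatal alert was generated: (\d+)", message)
    if match is None:
        return None
    code = int(match.group(1))
    return code, TLS_ALERTS.get(code, "unknown")


if __name__ == "__main__":
    # Made-up sample rows, only to show the expected input shape.
    sample = (
        "TimeCreated,Id\n"
        "2015-04-16 19:30:01,36874\n"
        "2015-04-16 19:30:01,36888\n"
        "2015-04-16 19:30:01,36874\n"
    )
    for key, count in sorted(events_per_second(sample).items()):
        print(key, count)
    print(decode_alert(
        "The following fatal alert was generated: 10. "
        "The internal error state is 10."
    ))
```

Grouping by second makes it easy to confirm whether the rate really tracks the moment the back-end server goes offline.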

This issue is 100% repeatable and happens immediately when Exchange is shut down. It even happens if I disable all web listeners for Exchange (OWA, RPC and ActiveSync). I can also reproduce the issue in a completely separate domain/environment.

The TMG servers are running SP2 with all but the most recent CU installed (7.0.9193.500).

Any thoughts on what is causing this and how to resolve it? And please do not just suggest that I disable SCHANNEL logging in the registry, because that is not the issue. Thanks.

*edit* I have also completed the SSL hardening steps from this guide, because the default settings were causing our PCI tests to fail (http://www.isaserver.org/articles-tutorials/configuration-security/improving-ssl-security-forefront-threat-management-gateway-tmg-2010-published-web-sites.html), and I adjusted the cipher suites as described in the guide linked from that article.

  • Edited by Jsilveri Thursday, April 16, 2015 7:34 PM Additional Info
April 16th, 2015 7:30pm

It appears the issue is not limited to Exchange, but affects any published server using SSL. We did maintenance on our SharePoint server over the weekend, and within minutes of it going down the same issues started occurring.

It is very bad that a single web server going down can cause an entire production TMG array to stop responding.
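For anyone trying to correlate the flood with back-end availability, a simple probe run on a schedule can record when a published server stops answering TLS handshakes. This is a rough sketch using only the Python standard library; the host name in the usage line is a placeholder, and certificate verification is deliberately turned off because the goal here is only to see whether a handshake completes, not to validate the server.

```python
import socket
import ssl


def tls_probe(host, port=443, timeout=3.0):
    """Try one TLS handshake and report what happened.

    Returns a (status, detail, cipher) tuple where status is one of
    "ok", "tls_error" or "unreachable".
    """
    ctx = ssl.create_default_context()
    # Verification is disabled on purpose: this is a reachability
    # check only and must not be used to judge trust.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return "ok", tls.version(), tls.cipher()[0]
    except ssl.SSLError as exc:
        # TCP connected, but the handshake itself failed.
        return "tls_error", str(exc), None
    except OSError as exc:
        # Refused, timed out, name not resolved, and so on.
        return "unreachable", str(exc), None


if __name__ == "__main__":
    # Placeholder host name; substitute the published server.
    print(tls_probe("backend.example.invalid"))
```

Logging the result with a timestamp next to the Schannel event counts would show whether the errors begin at the exact moment the status changes from "ok".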

April 20th, 2015 9:45am

I will be glad to research this and see if it is a known issue. If you can reproduce this issue 100% of the time, it may be worthwhile opening a support case with us. If it turns out to be a code defect, the case would be free of charge. Since it involves SCHANNEL, it may be an OS-related issue. Issues like this can be quite involved and would likely require memory dumps taken once the server is in an unresponsive state.

If you do open a support case, let me know the number and I will take ownership of it.

April 20th, 2015 12:33pm

