High CPU TrustedInstaller in a VDI Environment

A client of mine recently reported regular bouts of high CPU usage in their virtual desktops whenever TrustedInstaller.exe appeared. They couldn’t fathom why it would crop up, and it seemed to happen at random.

It didn’t take long to track down offending machines, as their performance metrics in vCenter put their CPU usage at the top of the list. As soon as one cropped up, I started remotely querying the machine and trawling through the Application Event Logs.

Each of the VMs was a 2 vCPU Windows 7 machine and, as can be seen in the image above, the process would effectively take an entire vCPU (50% of the total available to the machine in this case).

After doing so for about 5 machines and correlating the CPU Time for the executable back to the Application Event Log time (in the example above, I worked back 49 minutes), they all pointed back to the execution of a Bomgar Support Customer Client that was being installed while the support team were helping customers with other issues. The irony was that once the support engineer had resolved the user’s issue, they were effectively leaving the user with half of their CPU, because the installation of the remote control agent triggered TrustedInstaller.exe.
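The correlation described here boils down to simple arithmetic: a process that has pinned one vCPU since it started accumulates CPU time at roughly wall-clock rate, so subtracting its CPU time from the time of observation gives an approximate start time to look for in the event log. A minimal sketch in Python, assuming the process has kept a single core fully busy since launch. The function names and the observation timestamp are illustrative, not taken from the affected machines.

```python
from datetime import datetime, timedelta


def estimated_start(observed_at: datetime, cpu_time: timedelta) -> datetime:
    """Approximate when a single-threaded, CPU-bound process started.

    Assumes the process has kept one core fully busy since launch, so its
    accumulated CPU time is roughly equal to its wall-clock run time.
    """
    return observed_at - cpu_time


def share_of_vm(vcpus: int, busy_cores: float = 1.0) -> float:
    """Percentage of a VM's total CPU consumed by `busy_cores` pinned cores."""
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    return 100.0 * busy_cores / vcpus


# Hypothetical observation: the process shows 49 minutes of CPU time.
observed = datetime(2017, 2, 9, 14, 0, 0)
start = estimated_start(observed, timedelta(minutes=49))
print(start.isoformat(sep=" "))  # approximate time to look for in the event log
print(share_of_vm(vcpus=2))      # share taken by one pinned core on a 2 vCPU VM
```

If the process was not fully busy for its whole lifetime, the real start time will be earlier than the estimate, so it is worth searching a window before it as well.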

I validated my findings by asking to be sent a remote support request and, lo and behold, the same problem appeared. At least now we had a way to reproduce the problem. After that, I ran another user-permitted executable that triggers the Windows Modules Installer service (TrustedInstaller.exe) and precisely the same problem appeared. I now knew it wasn’t limited to the support tool, but was a generic problem at the OS layer.

Knowing that TrustedInstaller writes to the log file C:\Windows\Logs\CBS\CBS.log, I took a look inside and found no specific issues reported other than a mention of “Scavenge”:

2017-02-09 12:45:14, Info                  CBS    Scavenge: Begin CSI Store
2017-02-09 12:45:14, Info                  CSI    00000012 Performing 1 operations; 1 are not lock/unlock and follow:  Scavenge (8): flags: 00000017

This task, however, never seemed to stop: despite leaving a machine for 24 hours, the scavenge never completed, so the service never got as far as terminating gracefully. At that point I realised something was possibly out of place with the Windows Update process, and in my frantic search I stumbled across the Windows Update Troubleshooter.

https://support.microsoft.com/en-gb/instantanswers/512a5183-ffab-40c5-8a68-021e32467565/windows-update-troubleshooter

I downloaded the CAB file and ran it against the Master image which in fact triggered TrustedInstaller.exe. At the end of the process, it stated it had fixed the Service Registration.

I checked the CBS.log file again and found the following repair entry:

2017-02-09 22:24:14, Info                  CSI    0000000a [SR] Beginning Verify and Repair transaction
2017-02-09 22:24:14, Info                  CSI    0000000b Repair results created:POQ 0 starts:     0: Move File: Source = [l:192{96}]"\SystemRoot\WinSxS\Temp\PendingRenames\a5c17e3e2383d20103000000a8043012._0000000000000000.cdf-ms", Destination = [l:104{52}]"\SystemRoot\WinSxS\FileMaps\_0000000000000000.cdf-ms"

After hitting “Close” on the troubleshooter, I noticed that TrustedInstaller.exe was still running, so I left the computer for about 10 minutes, at which point the process closed. I went back to the CBS.log file and found that a subsequent Scavenge job had run, but this time it had completed.

2017-02-09 22:34:35, Info                  CBS    Scavenge: Begin CSI Store
2017-02-09 22:34:35, Info                  CSI    00000012 Performing 1 operations; 1 are not lock/unlock and follow:  Scavenge (8): flags: 00000017           CBS    Reboot mark refs: 0
2017-02-09 22:34:35, Info                  CBS    Idle processing thread terminated normally
2017-02-09 22:34:35, Info                  CBS    Ending the TrustedInstaller main loop.
2017-02-09 22:34:35, Info                  CBS    Starting TrustedInstaller finalization.
2017-02-09 22:34:35, Info                  CBS    Ending TrustedInstaller finalization
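
For checking several machines, the comparison between the stuck log and the completed one can be automated. The sketch below is a rough illustration in Python, not a supported tool. It assumes CBS.log lines follow the "timestamp, level, component, message" layout shown in the excerpts in this post, and it treats a "Scavenge: Begin CSI Store" entry as unfinished unless an "Ending TrustedInstaller finalization" entry follows it. The function name and that completion rule are my own assumptions.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Matches lines such as:
# 2017-02-09 12:45:14, Info                  CBS    Scavenge: Begin CSI Store
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}), "
    r"(?P<level>\w+)\s+(?P<component>\w+)\s+(?P<message>.*)$"
)

SCAVENGE_BEGIN = "Scavenge: Begin CSI Store"
FINALIZED = "Ending TrustedInstaller finalization"


def unfinished_scavenge(lines: Iterable[str]) -> Optional[datetime]:
    """Return the start time of a scavenge with no later finalization entry.

    Returns None when every scavenge in the input is followed by a
    finalization entry, or when no scavenge was logged at all.
    """
    pending = None
    for line in lines:
        match = LINE_RE.match(line.rstrip("\r\n"))
        if not match:
            continue  # skip continuation lines and anything unrecognised
        message = match.group("message")
        if message.startswith(SCAVENGE_BEGIN):
            pending = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
        elif message.startswith(FINALIZED):
            pending = None
    return pending


# Sample lines copied from the log excerpts in this post.
stuck = [
    "2017-02-09 12:45:14, Info                  CBS    Scavenge: Begin CSI Store",
    "2017-02-09 12:45:14, Info                  CSI    00000012 Performing 1 operations; 1 are not lock/unlock and follow:  Scavenge (8): flags: 00000017",
]
completed = [
    "2017-02-09 22:34:35, Info                  CBS    Scavenge: Begin CSI Store",
    "2017-02-09 22:34:35, Info                  CBS    Ending the TrustedInstaller main loop.",
    "2017-02-09 22:34:35, Info                  CBS    Ending TrustedInstaller finalization",
]

print(unfinished_scavenge(stuck))      # a datetime: the scavenge never finished
print(unfinished_scavenge(completed))  # None: the scavenge ran to completion
```

The function accepts any iterable of lines, so an open file handle on a copy of CBS.log can be passed in directly.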

For good measure, I rebooted the Master Image and took a new snapshot, recomposed the pool and tested.

Problem solved! Phew…

Let me know if the post has been useful and/or fixed any similar problems you may have had with TrustedInstaller.

VMware Horizon View, Session in Session leads to poor performance

A client of mine wanted to dip their toe into the Cloud, and what better way than to start delivering applications from the cloud into an existing on-prem VDI environment?

The on-prem setup comprised nothing more complicated than a Windows 7 VDI, and the proposed application was a more recent flavour of the Microsoft Office suite with Windows 2012 under the hood. The environment was spun up painlessly as an extension of the existing Horizon 7.0.2 deployment and integrated as a separate site using the Cloud Pod Architecture capability of VMware Horizon. Application delivery was initially tested by launching the app from a desktop PC geared up with VMware Client 4.3, and it worked flawlessly as expected. Superb, or so I thought, until I tried to repeat the same application launch from the VDI platform.

The application started, but interactivity seemed sluggish, so I figured I’d misconfigured something on the VMware client: perhaps a protocol mismatch, or some sort of contention from the client on its journey into the cloud. Bearing in mind that the VDI session was established using PCoIP and application publishing was also configured to use PCoIP, I tested every possible combination (BLAST to the VDI with PCoIP to the app, PCoIP to the VDI with BLAST to the app, and so on), but with no success. (Note that it is not possible to use RDP for application publishing. I’m not entirely sure why, so please leave me a comment if you know the reason, but at the time of writing RDP as a protocol is only usable with desktop publishing.) I then RDP’d directly to the Windows 2012 server from within the VDI session using the native MSTSC client and performance was absolutely fine (I also tested with RemoteApp and it too was fine).
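
To make sure no pairing is missed when repeating this kind of test, the matrix can be enumerated up front. A small Python sketch, assuming only the two display protocols named above for both the outer desktop session and the published application (RDP is left out of the application side because it could not be used for application publishing). The names are illustrative only.

```python
from itertools import product

# Protocols named in the post. RDP is excluded from the application list
# because it was not available for application publishing at the time.
DESKTOP_PROTOCOLS = ["PCoIP", "BLAST"]
APP_PROTOCOLS = ["PCoIP", "BLAST"]


def protocol_matrix(desktop=DESKTOP_PROTOCOLS, app=APP_PROTOCOLS):
    """Return every (desktop protocol, application protocol) pairing."""
    return list(product(desktop, app))


for outer, inner in protocol_matrix():
    print(f"{outer} to the VDI, {inner} to the app")
```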

This left me no option but to log a ticket with VMware’s GSS, and their initial acknowledgement suggested this behaviour wasn’t unexpected. Whilst there is no definitive answer to the problem, it appears to be linked to the behaviour of a device that has both the Horizon Agent and the Horizon Client installed. This therefore seems to be a fairly big problem for companies looking to stagger their application migration into the cloud, certainly when using VMware as the core and sole EUC platform.

Update: 26/01/17

After setting up a bare-bones Citrix environment and repeating the same application publishing exercise, I found that the application was still subject to poor interactive performance! This was despite now running an HDX-delivered published application within a PCoIP session.

At that point I figured the problem must be down to the source VDI rather than the destination app. I subsequently configured a new test pool and started playing around with rendering options, including more VRAM and fewer monitors, until eventually the optimum configuration turned out to involve dropping 3D rendering altogether.

Oh no! Turning this off means the loss of Aero for Windows 7, which was the sole purpose for it being enabled in the first instance! Without GPU cards in the hosts (BL460 blades), software rendering is the only way to permit the client’s required configuration, so it seems that session-in-session publishing using anything other than RemoteApp simply isn’t able to meet their needs.