Instability issues will kill a virtual desktop deployment. It doesn’t matter how great the video playback is, or how practical it is to access your desktop on any device from anywhere at any time. If the environment doesn’t run perfectly all of the time, it will never be used.
To squash bugs, I need to know what’s going on all of the time rather than waiting for an error to reappear. This means detailed info from:
- ESXi hosts
- VMware View Connection server(s)
- VMware View Composer server(s)
- App Volumes server(s)
- Domain controller(s)
- All virtual desktops
- Windows Event logs
- App Volumes Agent logs
- User Environment Manager logs
- VMware View agent logs
If you open a VMware support case, you’ll be required to gather some or all of these logs. And that sucks…big time. Or you can use Log Insight and figure out the issue on your own with minimal effort.
Let’s start with an example.
Problem: My desktop doesn’t come back when I get disconnected
Where does the troubleshooting process start? Think of all the questions that come up:
- When does this happen?
- Is this a recurring issue?
- Why are other users unaffected? Or…
- Are other users having the same issue and just not saying anything?
- Is this simply user error (accidental logoff or something else)?
- Are there any corresponding Horizon events?
- Can I retrieve logs from the guest VM to see if anything stands out?
- Is there an automatic logoff policy in place or something else that triggers the event?
All of this takes time. How much time? Probably a week or two minimum to address these questions, plus a lot of waiting for it to happen again. Meanwhile, your users are asking for their old PCs back while you wait for your case to be escalated to the next level of support.
When I was unable to determine the root cause of this issue, I set up Log Insight in trial mode and added my ESXi host to it. Then, I waited for the issue to happen again. When it did, I took the name of the machine that had failed, along with an estimated time of the failure, and went into Interactive Analysis to see if anything would stick out. I typed in the name of the VM and sure enough, I found this:
PhysPageFault failed Failure: pgNum=0x3d108, mpn=0x3fffffffff
That looks like a serious problem. So I copied and pasted it into Google. The first result was a VMware KB with a workaround to resolve the issue. And just like that, problem solved.
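To make the idea concrete, here is a minimal Python sketch of the kind of search I ran: filter log lines by a term (the VM name) within a window around an estimated time. This is not how Log Insight works internally and it does not use any Log Insight API. The `search_logs` function, the log line layout (an ISO timestamp followed by a space), the VM names, and the timestamps are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta


def search_logs(lines, term, around, window_minutes=15):
    """Return lines containing `term` whose timestamp is near `around`.

    Assumes each line starts with a timestamp like 2015-06-01T14:31:07Z
    followed by a space (a hypothetical layout, not a real product format).
    """
    earliest = around - timedelta(minutes=window_minutes)
    latest = around + timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        stamp, _, message = line.partition(" ")
        try:
            when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if earliest <= when <= latest and term.lower() in message.lower():
            hits.append(line)
    return hits


# Hypothetical sample lines; the VM names and times are invented.
sample = [
    "2015-06-01T09:02:11Z desktop-042: power on",
    "2015-06-01T14:31:07Z desktop-042: PhysPageFault failed Failure: "
    "pgNum=0x3d108, mpn=0x3fffffffff",
    "2015-06-01T14:32:40Z desktop-017: heartbeat ok",
]

hits = search_logs(sample, "desktop-042", datetime(2015, 6, 1, 14, 30))
for hit in hits:
    print(hit)
```

With a machine name and a rough time, the search narrows three lines down to the one that matters. Doing that by hand across every host and desktop is the slow part that an indexed log tool takes off your plate.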
I can’t even imagine the reduction in support calls and the increase in user satisfaction if this product were included with the Horizon Enterprise suite, or even better, free with any VMware product. I have spent days, weeks and months on stability issues on both the desktop and server side, manually digging for log files, blindly looking for something that might stick out, and spending way too much time on Google looking for anything even slightly relevant that might provide a fix. Log Insight removes this cumbersome process entirely by indexing all log files so they’re searchable by literally any term. All you need is an error and a timestamp to solve your problem.
Seriously, just make this product free already.