Simplify GPU virtualization

Here’s my take on GPU virtualization: XenApp (XenDesktop Server OS) or Horizon View with RDSH can cover user requirements 95% of the time. Why? Does one user really hammer a GPU every hour of the day, every day of the week? Want to find out? Download GPU-Z on the machine and watch the GPU load for yourself. If they really do peg the sucker, then you’re one of the few who should consider dedicating a full-blown GPU to one user. If not, use RDSH with vDGA (GPU passthrough) and keep reading…

An NVIDIA GRID K1/K2 card, vDGA (or equivalent graphics passthrough) enabled, and Server 2012 R2 with the RDSH role is the ideal VDI deployment model if you’re looking at over 40 users. Why?

Let’s start with the physical card itself. There are plenty of options for GPU virtualization…so why GRID?

First, the K1 model packs four physical GPUs onto a single board. Second, NVIDIA has focused on hardware and software certification for the GRID lineup, ensuring not only that applications will work to their full potential, but also that the server hardware hosting the GRID cards is compatible. That’s no easy task, but it provides an unrivaled guarantee that everything will just work.

Next, GPU passthrough is ideal for two reasons: 1.) GPU drivers are written by the GPU vendor, not the hypervisor vendor. 2.) You get a choice between extreme scalability and extreme dedication. After running GPU-Z, you’ll come to one of two conclusions. The first: this user really needs a dedicated GPU. The second: these users use their GPU, but it’s only fully loaded a couple of times per day, if that. Rather than evaluating every GPU option (SVGA, vGPU, vDGA), focus on how the user will use a GPU in a virtual session. If they use it 100% of the time, all of the time (conclusion #1), assign the user a Windows Desktop (7/8) OS with a GPU in passthrough mode. Otherwise (conclusion #2), assign a Windows Server (2012 R2 if you’re smart) OS with a GPU in passthrough mode.
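That decision rule fits in a few lines. This Python sketch is purely illustrative (the function name and boolean input are my own stand-ins for watching GPU-Z yourself, not vendor tooling):

```python
def placement(gpu_pegged_all_day: bool) -> str:
    """Map the GPU-Z observation to one of the two conclusions.

    gpu_pegged_all_day is a stand-in for what GPU-Z shows you:
    True means the GPU is hammered constantly (conclusion #1),
    False means it only spikes a few times a day (conclusion #2).
    """
    if gpu_pegged_all_day:
        # Conclusion #1: extreme dedication, one user per GPU
        return "Windows Desktop (7/8) VM with a passthrough GPU"
    # Conclusion #2: extreme scalability, many RDSH users per GPU
    return "Windows Server 2012 R2 RDSH VM with a passthrough GPU"
```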

Most likely, you’ll decide to go with extreme scalability over extreme dedication. If that’s the case, here are additional benefits:
1.) Balance GPU workload evenly across all GPUs
2.) Reduce wasted CPU and RAM
3.) Go beyond the 2-socket Desktop OS limitation
4.) Scale it out to “the cloud”

Consider a server with two GRID K1 cards. Each physical K1 board has 4 GPUs, so that’s 8 GPUs in one server. To distribute load evenly, deploy 8 Windows Server VMs, preferably as linked clones, and attach a GPU to each one. Add them all to the same delivery group (XenDesktop) or desktop pool (Horizon View). The first 8 users who log on each get a dedicated GPU; add another 8 users and two users now share each physical GPU. That’s still overkill for most scenarios.
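Here’s a quick back-of-the-envelope sketch of how GPU sharing grows as the pool fills up, assuming sessions are balanced round-robin across one passthrough VM per GPU (illustrative Python, not connection-broker logic):

```python
import math

def users_sharing_each_gpu(logged_on_users: int, gpu_count: int = 8) -> int:
    """Worst-case number of sessions landing on any one GPU when users are
    load-balanced round-robin across one passthrough VM per GPU.
    Default of 8 GPUs matches the two-K1-card server above."""
    return math.ceil(logged_on_users / gpu_count)

# First 8 users each land on their own VM/GPU; the next 8 double up.
dedicated = users_sharing_each_gpu(8)
doubled = users_sharing_each_gpu(16)
```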

With only 8 VMs running on a physical server, make sure to use all of the available CPU and memory too. Consider a quad-socket, 12-core server with 512 GB of RAM. Give each VM all 48 vCPUs and let the hypervisor scheduler distribute processing evenly across the host. Good luck trying that with a Windows 7/Windows 8 deployment. This makeshift equation seems to work well for memory allocation: Total Server Memory / (# of GPUs + 2). In this case, each VM would get around 50 GB of memory. Finally, due to Microsoft licensing, your only real option for scaling your deployment to a public cloud provider, such as Azure or AWS, is a Windows Server OS instance. AWS already has GPU options available for deployment.
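The sizing arithmetic above can be sanity-checked with a few lines of Python (the helper names are mine; the numbers are this post’s example host):

```python
def gpu_count(k1_boards: int, gpus_per_board: int = 4) -> int:
    """A GRID K1 board carries four physical GPUs."""
    return k1_boards * gpus_per_board

def vm_memory_gb(total_memory_gb: float, gpus: int) -> float:
    """The makeshift rule: Total Server Memory / (# of GPUs + 2).
    The + 2 leaves head room for the hypervisor and overhead."""
    return total_memory_gb / (gpus + 2)

gpus = gpu_count(2)               # two K1 cards -> 8 GPUs -> 8 VMs
per_vm = vm_memory_gb(512, gpus)  # 512 GB host -> ~51 GB per VM
```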

Call me biased, but I can’t justify the use of vGPU or SVGA when there is so much potential with direct passthrough that just needs to be implemented properly.
