After way too much testing, I decided to go against VMware's recommendations and keep the capacity tier behind RAID instead of using HBA-mode (passthrough). Why? The NV cache on the PERC card significantly improved I/O in nearly every performance test I ran. Here are the observations from my unofficial testing:
- NV cache is essential in RAID 10 mode
- VMDK clone time is more than halved with cache on
- Write performance is dramatically higher with cache on
- A single SSD is fast, and super fast with cache on
- A single SSD with cache on is as good as, if not better than, 4 SSDs in either RAID 0 or RAID 10
- This isn’t the first test to show this. Here’s another much more detailed one.
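If you want to sanity-check the cache-on versus cache-off difference on your own hardware, a rough timing harness like the one below is one way to do it. This is only a sketch and not the tool behind the observations above: the function name `timed_sync_write`, the 64 MiB test size, and the use of a temporary file are all assumptions for illustration. Run it once with the controller cache on and once with it off, on a file that lives on the datastore under test, and compare the numbers.

```python
import os
import tempfile
import time


def timed_sync_write(path, total_bytes, block_size=1024 * 1024):
    """Write total_bytes of random data to path, fsync, and return
    (bytes_written, throughput_in_MiB_per_second).

    Hypothetical helper for illustration only.
    """
    # Random data avoids flattering results from devices that compress zeros.
    block = os.urandom(block_size)
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        # Force the data out of the OS page cache so the storage stack
        # (and any controller cache behind it) is actually exercised.
        f.flush()
        os.fsync(f.fileno())
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written, written / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # Assumption: the temp directory sits on the storage you want to measure.
    fd, test_path = tempfile.mkstemp()
    os.close(fd)
    try:
        size, mib_per_s = timed_sync_write(test_path, 64 * 1024 * 1024)
        print(f"wrote {size} bytes at {mib_per_s:.1f} MiB/s")
    finally:
        os.remove(test_path)
```

A single sequential write is a crude measure, so treat the result as a quick comparison between two controller settings rather than a benchmark.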
So here’s the new vSAN disk config per host:
- RAID-mode enabled, 4 logical disks
  - SSD 0 (logical disk 1)
    - RAID 0, cache on
  - SSD 1 (logical disk 2)
    - RAID 0, cache on
  - SSD 2 (logical disk 3)
    - RAID 0, cache on
  - SSD 3 (logical disk 4)
    - RAID 0, cache on
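The per-host layout above can be written down as data, which makes the key property easy to check: every logical disk is a single-disk RAID 0 backed by exactly one SSD, with cache on. The sketch below is just a model of the list above. The names `LogicalDisk`, `build_host_layout`, and `validate_layout` are made up for illustration and are not part of any Dell or VMware tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalDisk:
    number: int            # logical disk number (1-based, as in the list above)
    physical_ssd: int      # index of the one SSD backing this logical disk
    raid_level: int = 0    # single-disk RAID 0
    cache_on: bool = True  # controller cache enabled for this logical disk


def build_host_layout(ssd_count=4):
    """Return one single-disk RAID 0 logical disk per SSD."""
    return [LogicalDisk(number=i + 1, physical_ssd=i) for i in range(ssd_count)]


def validate_layout(layout):
    """Raise ValueError unless each SSD backs exactly one cached RAID 0 disk."""
    ssds = [ld.physical_ssd for ld in layout]
    if len(set(ssds)) != len(ssds):
        raise ValueError("an SSD is assigned to more than one logical disk")
    for ld in layout:
        if ld.raid_level != 0:
            raise ValueError(f"logical disk {ld.number} is not RAID 0")
        if not ld.cache_on:
            raise ValueError(f"logical disk {ld.number} has cache off")


if __name__ == "__main__":
    layout = build_host_layout()
    validate_layout(layout)
    for ld in layout:
        state = "on" if ld.cache_on else "off"
        print(f"SSD {ld.physical_ssd} (logical disk {ld.number}): "
              f"RAID {ld.raid_level}, cache {state}")
```

Because each logical disk maps to one physical SSD, vSAN still sees four separate devices per host, which is the point of the setup.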
And in vSAN, here’s how a single disk group is now set up:
My theory is that this is the best of both worlds. VMware recommends HBA-mode so that the SSDs pass through to the hypervisor and vSAN has full management capabilities over each disk. However, HBA-mode means the 2 GB cache on the PERC card goes to waste. In this setup, the PERC is still in RAID mode but presents each physical disk to vCenter as its own single-disk RAID 0 logical disk, as if it were in HBA-mode. As a result, vSAN still controls how data is distributed, but as soon as data is read from or written to the capacity tier, the cache on the PERC card is put to use.