I’ve been experiencing the disappearing drive act, more commonly known as Permanent Disk Failure, whereby under duress the host marks the SSD as failed simply because the drive can’t keep up and goes walkabout. This was almost reproducible on demand by either committing a large snapshot or powering everything on at the same time (basically, heavy IO).
After some research into what’s causing it (apart from my environment not being on any sort of HCL), it seems that the SATA AHCI controller on the Macs really can’t cope too well. Even though I thought I’d bought a decent SSD to complement vSAN (a Samsung 850 Pro), the drive actually appeared to be more of an Achilles heel than the controller. Rather than start replacing my lab with more power-hungry, noisier hardware to work around the issue, I thought I’d give it one more roll of the dice. Whilst again not technically on the HCL for an All Flash vSAN, I have purchased some Intel DC3700 SSDs to act as the vSAN cache tier, with the pre-existing 850 Pro SSDs as the capacity tier.
Goodbye Hitachi magnetic disk, hello Intel SSD.
If the 850s continue to give me problems, I’ll revert to SATA magnetic disk. In theory, I shouldn’t be driving the 850s hard enough now for their bottleneck to rear its ugly head, although having said that, in an All Flash vSAN all reads are directed to the capacity tier (gulp). Another option I had considered was ROBO; whilst vSAN 6.1 supports it, it doesn’t when using All Flash. For the time being I’ll be sticking with three Mac Mini hosts.