Do I need to buy a specific grade of SSD for my test environment or can I buy the cheapest SSDs? Do I need to buy enterprise grade SSDs for my POC? They last longer, but why should I bother for a POC? Do we go for consumer grade or enterprise grade flash devices? All valid questions that typically arise after a presentation about PernixData FVP, and I can imagine Duncan and Cormac receive the same questions when talking about VSAN.
Enterprise flash devices are known for their higher endurance, their data protection features and their increased speed compared to consumer grade flash devices. Although these features are very nice to have, they are not the most important ones when testing flash performance.
The most interesting features of enterprise flash devices are Wear Levelling (to reduce hot spots), Spare Capacity, Write Amplification Avoidance, Garbage Collection Efficiency and Wear-Out Prediction Management. Together these lead to I/O consistency, and I/O consistency is the Holy Grail for test, POC and production workloads.
One of the main differentiators of enterprise grade disks is spare capacity. The controller and disk use this spare capacity to reduce write amplification. Write amplification becomes a problem when the drive runs out of erased pages to write data to. In order to write data, the target page needs to be in an erased state, meaning that if (stale) data is present in that page, the drive needs to erase it first before writing the (fresh) data. The challenge with flash is that the controller can only erase per block, a collection of pages. It can happen that the block still contains pages with valid data, and that data needs to be written somewhere else before the controller can erase the block. That sequence is called write amplification, and it is something you want to keep to a minimum.
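To put a number on it: write amplification is usually expressed as a factor, the amount of data the controller physically writes to NAND divided by the amount of data the host asked it to write. Below is a minimal sketch of that calculation; the parameter names and example values are hypothetical, as real drives expose these counters through vendor-specific SMART attributes.

```python
# Minimal sketch: write amplification factor (WAF) = physical NAND writes / host writes.
# The parameter names and example values are illustrative; real drives report
# these counters through vendor-specific SMART attributes.

def write_amplification_factor(host_bytes_written: float, nand_bytes_written: float) -> float:
    """Return the write amplification factor; 1.0 is the ideal case."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written

# Example: the host wrote 1 TB, but garbage collection forced the controller to
# relocate valid pages, resulting in 3 TB of physical NAND writes -> WAF of 3.0
print(write_amplification_factor(host_bytes_written=1e12, nand_bytes_written=3e12))
```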
To solve this, flash vendors over-provision the device with flash cells. The more technically accurate term is “Reduced LBA access”. For example, the Intel DC S3700 flash disk series comes standard with 25 – 30% more flash capacity. This capacity is assigned to the controller, which uses it to manage background operations such as garbage collection, NAND disturb rules or erase blocks. Now the interesting part is how the controller handles these management operations. Enterprise controllers contain far more advanced algorithms to reduce the wear of blocks by reducing the movement of data, understanding which data is valid and which is stale (TRIM), and remapping logical to physical LBAs quickly and efficiently after moving valid data so the stale data can be erased. Please read this article to learn more about write amplification.
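As a back-of-the-envelope illustration of what that extra capacity means, here is a small sketch of the commonly used over-provisioning formula. The capacities are hypothetical round numbers, not exact drive specifications.

```python
# Minimal sketch of the commonly used over-provisioning formula:
# OP% = (physical NAND capacity - user-addressable capacity) / user-addressable capacity
# The capacities below are hypothetical round numbers, not exact drive specs.

def over_provisioning_pct(physical_gib: float, user_addressable_gib: float) -> float:
    """Return the over-provisioned capacity as a percentage of user space."""
    return (physical_gib - user_addressable_gib) / user_addressable_gib * 100

# A hypothetical enterprise drive: 256 GiB of raw NAND exposed as 200 GiB -> 28%
print(f"{over_provisioning_pct(256, 200):.0f}%")
# A hypothetical consumer drive: 128 GiB of raw NAND exposed as 120 GiB -> ~7%
print(f"{over_provisioning_pct(128, 120):.1f}%")
```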
Consumer grade flash
Consumer grade flash devices lack in these areas. Most of them have TRIM support, but how advanced is that algorithm? Most of them can move data around, but how fast and intelligent is the controller? The biggest question, however, is how many spare pages the drive has to reduce write amplification. In worst case scenarios, and that usually happens when running tests, the disk is saturated and the data keeps pouring in. Typically a consumer grade drive has 7% spare capacity and it will not use all that space for data movement. Due to the limited space available, the drive will allocate new blocks from its spare area first, eventually using up its spare capacity and ending up doing read-modify-write operations. At that point the controller and the device are fully engaged with household chores instead of providing service to the infrastructure. It’s almost like the disk is playing a sliding puzzle.
Anandtech.com performed similar tests and witnessed similar behaviour. They published their results in the article “Exploring the Relationship Between Spare Area and Performance Consistency in Modern SSDs”, an excellent read and highly recommended. In that test they used the default spare capacity of one of the best consumer grade SSD devices, the Samsung 840 PRO, and ran a workload with a single block size (which is an anomaly in real-life workload characteristics). The results are all over the place.
Seeing a scatter plot with results ranging between 200 and 100,000 IOPS is not a good foundation for understanding and evaluating a new software platform.
The moment they reduced the user-addressable space (reformatting the file system to use less space), performance went up and was far more stable. Almost every result is in the 25,000 to 30,000 IOPS range.
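If you want to quantify this kind of consistency in your own tests, a simple approach is to summarise a per-second IOPS trace (for example exported from fio or Iometer) and look at how far the minimum, maximum and standard deviation drift from the average. Here is a minimal sketch; the trace values are hypothetical and only illustrate the calculation.

```python
# Minimal sketch: quantifying I/O consistency from a per-second IOPS trace
# (e.g. exported from fio or Iometer). The sample list below is hypothetical
# and only illustrates the calculation.
import statistics

def consistency_report(iops_samples):
    """Summarise a per-second IOPS trace: the closer min, avg and max are,
    the more consistently the device behaves."""
    return {
        "min": min(iops_samples),
        "avg": round(statistics.mean(iops_samples)),
        "max": max(iops_samples),
        "stdev": round(statistics.stdev(iops_samples)),
    }

hypothetical_trace = [27000, 29000, 25500, 28000, 26500, 30000]
print(consistency_report(hypothetical_trace))
```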
Please note that both VSAN and FVP manage the flash devices at their own level; you cannot format the disk to create additional spare capacity.
Latency tests show exactly the same behaviour. I’ve tested some enterprise disks and consumer grade disks and the results were interesting to say the least. The consumer grade drive performance charts were not as pretty. The virtual machine running the read test was the only workload hitting the drive, and yet the drive had trouble providing steady response times.
I swapped the consumer grade disk for an enterprise disk and ran the same test again. This time the latency was consistent, providing predictable application response times.
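A quick way to see this difference in your own test data is to look at latency percentiles rather than the average, because the average hides the outliers that make a drive feel erratic. Here is a minimal sketch of such a summary; the latency trace is hypothetical and only illustrates the calculation.

```python
# Minimal sketch: summarising a latency trace (milliseconds per I/O) into
# percentiles. The gap between the median and the high percentiles is what
# makes a drive feel erratic. The trace below is hypothetical and only
# illustrates the calculation.
import math

def percentile(sorted_values, pct):
    """Nearest-rank percentile on a pre-sorted list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]

def latency_summary(latencies_ms):
    s = sorted(latencies_ms)
    return {f"p{p}": percentile(s, p) for p in (50, 95, 99, 99.9)}

hypothetical_trace_ms = [0.4, 0.5, 0.4, 0.6, 0.5, 8.2, 0.4, 0.5, 12.5, 0.5]
print(latency_summary(hypothetical_trace_ms))
```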
Why you want to use enterprise devices:
When testing and evaluating new software, let alone a new architecture, the last thing you want to do is start an investigation into why performance is so erratic. Is it the software? Is it the disk, the test pattern, or is the application acting weird? You need a stable, consistent and predictable hardware layer that acts as the foundation for the new architecture. You need a stable environment that allows you to baseline the performance of the device, so you can understand the characteristics of the workload, the software performance and the overall benefit of this new platform in your architecture.
Enterprise flash devices provide these abilities, and when doing a price comparison between enterprise and consumer grade the difference is not that extreme. In Europe you can get an Intel DC S3700 100GB for 200 Euros, and Amazon is offering the 200GB model for under 500 US dollars. 100-200GB is more than enough for testing purposes.