Evaluating an acceleration platform requires an understanding of the data flow. Comparing the I/O paths of traditional storage architectures with those of newer architectures built around an acceleration platform shows that traditional performance tests are not always suitable for determining real application performance gains.
Comparing I/O paths
The I/O path of data to and from a storage array is fairly straightforward. As the data exists in only one place, the I/O command follows a predictable path through the architecture. Performance is easy to understand, as the consistent path produces repeatable results. An acceleration platform adds a layer between the virtual machine and the storage array that keeps copies of recently read or written data to accelerate read and write operations. Because a request can now be served either from that layer or from the array, the path an I/O takes, and therefore its latency, depends on whether the data was accessed recently.
Various test methods
There are multiple ways to test storage performance. The two common approaches are to use real applications or to use synthetic workload generators. The best way to test application performance with an acceleration platform present is to use the applications that are deployed on your virtual infrastructure.
If it is not possible to incorporate real workloads, synthetic workload generators can be used. It is important to generate a workload pattern that is representative of the real workload and preserves its characteristics, such as I/O type (reads or writes), I/O size and the different types of dependencies. Ensure that data is processed, or at least accessed multiple times, mimicking real application behaviour.
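To make "representative" concrete, the characteristics above can be summarised from a list of observed I/Os. The following is a minimal sketch, not the output format of any particular tool: the record layout (an operation letter plus a size in bytes) and the function name summarize_io are assumptions made for illustration.

```python
from collections import Counter


def summarize_io(records):
    """Summarise I/O records into read/write ratio and I/O size shares.

    `records` is an iterable of (op, size_bytes) tuples, where op is
    "R" for a read or "W" for a write. This layout is hypothetical.
    """
    ops = Counter()
    sizes = Counter()
    for op, size in records:
        ops[op] += 1
        sizes[size] += 1

    total = sum(ops.values())
    if total == 0:
        raise ValueError("no I/O records supplied")

    return {
        "read_ratio": ops["R"] / total,
        "write_ratio": ops["W"] / total,
        # Share of all I/Os issued at each size, ordered by size.
        "size_share": {s: n / total for s, n in sorted(sizes.items())},
    }


if __name__ == "__main__":
    sample = [("R", 4096)] * 3 + [("W", 8192)]
    print(summarize_io(sample))
```

The resulting ratios and size shares are the values you would then carry over into the configuration of whichever workload generator you use.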
The most common dependencies are data dependency and inter-process dependency. These are not limited to a specific application and can exist in both complex multi-tiered systems and single-application landscapes. A data dependency occurs if the output of a previous data request is the input of the current data request. Inter-process dependencies are usually derived from data dependencies and express the order in which operations should be performed. It is primarily the lack of dependency simulation that distinguishes synthetic workload testing from realistic application workload testing. Real applications can generate spiky workloads, but in general they do not burst I/O 24/7, because there are processes that require time to complete their work.
Another popular way to simulate application performance is to use trace replays. A trace is a recording of a real-world workload that can be “replayed” in the virtual infrastructure by a replayer application. A popular trace tool for vSphere environments is vSCSIstats, a command line tool available at the ESXi command line. VMware I/O Analyzer can be used to replay vSCSIstats trace files and is available at VMware.com. However, caution is needed: collecting vSCSIstats traces is a heavy-duty operation, and the inter-process and inter-traffic dependencies won’t necessarily be captured, as vSCSIstats trace collection works on individual disks.
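A data dependency can be illustrated with a small sketch in which the content of each block decides which block is read next, with an optional pause standing in for the time a process needs to do its work. Everything here is an assumption for illustration: the 4 KB block size, the file layout, and the helper names build_chain_file and run_dependent_reads. The reads go through the operating system's page cache, so this shows the dependency pattern only and is not a benchmark.

```python
import os
import struct
import tempfile
import time

BLOCK_SIZE = 4096  # assumed I/O size; substitute the size your application issues


def build_chain_file(path, num_blocks):
    """Write num_blocks blocks; each block stores the index of the next block to read."""
    with open(path, "wb") as f:
        for i in range(num_blocks):
            next_index = (i * 7 + 3) % num_blocks  # arbitrary deterministic chain
            f.write(struct.pack("<Q", next_index).ljust(BLOCK_SIZE, b"\0"))


def run_dependent_reads(path, steps, think_time_s=0.0):
    """Issue reads where each offset depends on data returned by the previous read."""
    visited = []
    index = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        for _ in range(steps):
            f.seek(index * BLOCK_SIZE)
            block = f.read(BLOCK_SIZE)
            visited.append(index)
            # Data dependency: the next request cannot be issued until
            # this read has completed and its content has been parsed.
            index = struct.unpack("<Q", block[:8])[0]
            time.sleep(think_time_s)  # stand-in for application processing time
    elapsed = time.perf_counter() - start
    return visited, elapsed


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "chain.bin")
    build_chain_file(demo_path, 16)
    order, seconds = run_dependent_reads(demo_path, 5, think_time_s=0.01)
    print(order, round(seconds, 3))
```

Because each read must finish before the next offset is known, the requests cannot be queued in parallel. That is the behaviour a queue-saturating synthetic pattern leaves out.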
Don't use the pre-built workload patterns
Setting up an environment that closely resembles production can be time-consuming and difficult, as can gathering accurate workload traces from production runs. Synthetic workload generators such as IOrate, FIO and Iometer provide a quick and easy way to measure the performance of a storage solution.
A typical workload generator issues I/O but does not process the requested data in any shape or form. Fundamentally, the pre-built workload patterns only saturate the queues and connections to determine maximum IOPS, latency and throughput, without measuring what really matters to the end user: the time to complete their work (transactions, web requests, etc.). Such a "bandwidth" test does not provide meaningful results.
Take your time to investigate how your applications behave: determine which block sizes the application operates with and what I/O type ratio it uses. To properly test your storage acceleration platform, change the default test patterns so that the workload reads the same data multiple times. This comes closer to a real-life scenario, in which data dependencies and read-after-write (RAW) processes exist. In addition, size the dataset correctly so that it is served from disk instead of from the storage controller cache. If you want to simulate real-life performance, you cannot expect all the data of all your applications to be present in the storage array controller cache.
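As a rough sketch of both points, the code below builds an access pattern in which a configurable share of requests returns to a small, frequently re-read region, and checks that the dataset is larger than the controller cache. The function names, the idea of a fixed re-read region and the safety factor of 2 are all assumptions chosen for illustration; derive the real values from your own application analysis.

```python
import random


def build_access_pattern(total_blocks, hot_blocks, reread_ratio, num_ios, seed=0):
    """Return block indices where roughly `reread_ratio` of I/Os hit the re-read region.

    Blocks [0, hot_blocks) form the region that is read repeatedly;
    the remaining blocks are touched at random.
    """
    if not 0 < hot_blocks < total_blocks:
        raise ValueError("hot_blocks must be between 1 and total_blocks - 1")
    rng = random.Random(seed)  # seeded so the pattern is repeatable between runs
    pattern = []
    for _ in range(num_ios):
        if rng.random() < reread_ratio:
            pattern.append(rng.randrange(hot_blocks))
        else:
            pattern.append(rng.randrange(hot_blocks, total_blocks))
    return pattern


def dataset_exceeds_cache(dataset_bytes, controller_cache_bytes, factor=2.0):
    """Check that the dataset is at least `factor` times the controller cache.

    The factor is an illustrative assumption, not a vendor recommendation.
    """
    return dataset_bytes >= factor * controller_cache_bytes


if __name__ == "__main__":
    pattern = build_access_pattern(1000, 50, 0.7, 10_000)
    hot_share = sum(1 for b in pattern if b < 50) / len(pattern)
    print(f"share of I/Os in the re-read region: {hot_share:.2f}")
    print(dataset_exceeds_cache(8 * 2**30, 2**30))
```

The block indices can be multiplied by the application's block size to obtain offsets for a custom test pattern.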