The amount of abstraction in IT is amazing. Every level in the software and hardware stack attempts to abstract operations and details. And the industry is craving for more. Look at the impact “All Things Software Defined” has on todays datacenter. It touches almost every aspect, from design to operations. The user provides the bare minimum of inputs and the underlying structure automagically tunes itself to a working solution. Brilliant! However sometimes I get the feeling that this level of abstraction becomes an excuse to not understand the underlying technology. As an architect you need to do your due diligence. You need to understand the wheels and cogs that are turning when dialing a specific knob at the abstracted layer.
But sometimes it seems that the abstraction level becomes the right to refuse to answer questions. This was always an interesting discussion during a VCDX defense session. When candidates argued that they weren’t aware of the details because other groups were responsible for that design. I tend to disagree
What level of abstraction is sufficient?
I am in the lucky position to work with PernixData R&D engineers and before that VMware R&D engineers. They tend to go deep, right down to the core of things. Discussing every little step of a process. Is this the necessary level of understanding the applied technology and solutions for an architect? I don’t think so. It’s interesting to know, but on a day-to-day basis you don’t have to understand the function of ceiling when DRS calculates priority levels of recommendations. What is interesting is to understand what happens if you place a virtual machine at the same hierarchical level as a resource pool filled with virtual machines. What is the impact on the service levels of these various entities?
Something in the middle might be the NFS series of Josh Odgers. Josh goes in-depth about the technology involved using NFS datastores. Virtual SCSI Hard Drives are presented to virtual machines, even when ESXi is connected to an NFS datastore. How does this impact the integrity of I/O’s? How does the SCSI protocol emulation process affect write ordering and of I/O’s of business critical applications. You as the virtual datacenter architect should be able to discuss the impact of using this technology with application owners. You should understand the potential impact a selected technology has on the various levels throughout the stack and what impact it has on the service it provides.
Recently I published a series on databases and what impact their workload characteristics have on storage architecture design. Understanding the position of a solution in the business process allows an architect to design a suitable solution. Lets use the OLTP example. Typically OLTP databases are at the front of the process, customer-facing process, dramatically put they are in the line of fire. When the OLTP database is performing slow or is unavailable it will typically impact revenue-generating processes. This means that latency is a priority but also concurrency and availability. You can then tailor your design to provide the best services to this application. This is just a simplified example, but it shows that you have to understand multiple aspects of the technology. Not just the behavior of a single component. The idea is to get a holistic view and then design your environment to cater the needs of the business, cause that’s why we get hired.
Circling back to the abstraction and the power of software defined, I though the post from Bart Heungens was interesting. Bart argues that Software Defined Storage is not the panacea for all storage related challenges. Which is true. Bart illustrates an architecture that is comprised of heterogeneous components. In his example, he illustrates what happens when you combine two servers HP DL380, but from different generations. Different generations primarily noticeable from a storage controller perspective and especially the way software behave. This is interesting on so many levels, and it would be a very interesting discussion if this were a VCDX defense session.
SDS abstracts many things, but it still relies on the underlying structure to provide the services. From a VCDX defense perspective, Bart has a constraint. And that is the already available hardware and the requirement to use these different generation hardware in his design. VCDX is not about providing the ideal design, but showing how you deal with constrains, requirements and demonstrating your expertise on technology how it impacts the requested solution. He didn’t solve the problem entirely, but by digging in deeper he managed to squeeze out performance to provide a better architecture to service the customers applications. He states the following:
Conclusion: the design and the components of the solution is very important to make this SDS based SAN a success. I hear often companies and people telling that hardware is more and more commodity and so not important in the Software Defined Datacenter, well I am not convinced at all.
I like the idea of VMware that states that, to enable VSAN, you need and SAS and SSD storage (HCL is quite restricted), just to be sure that they can guarantee performance. The HP VSA however is much more open and has lower requirements, however do not start complaining that your SAN is slow. Because you should understand this is not the fault of the VSA but from your hardware.
So be cognizant about the fact that while you are not responsible for every decision being made when creating an architecture for a virtual datacenter, you should be able to understand the impact various components, software settings and business requirements have on your part of the design. We are moving faster and faster towards abstracting everything. However this abstraction process does not exonerate you from understanding the potential impact it has on your area of responsibility