vSphere Design Pocketbook v2 – the blog edition – Call for entries

Last year’s vSphere Design Pocketbook – “Tweet-sized Design Considerations for Your Software-Defined Datacenter” – was a big hit. Over 3,000 copies have been given away since last VMworld, and I don’t even know how many copies have been downloaded.

We knew other community members had loads of advice to share, and from that idea the vSphere Design Pocketbook was born. Now it is time for a successor!

The Blog edition
The design considerations featured in the first book are in tweet-sized format, limited to 200 characters. This edition expands beyond that limit and allows you to convey your thoughts at up to the length of a blog article. You can either select existing content, such as a published article, or create a new one.

  • Is there a maximum length? Not exactly; use as many words as necessary to describe your design consideration efficiently. If necessary, we will ask you to condense your material.
  • Can I use diagrams? Absolutely! Make sure you provide diagrams and screenshots that can be printed: at least 220 DPI, preferably 300 DPI. If you want to sanity-check your exports, see the sketch after this list. Looking for guidelines on making great diagrams? Please read this article.
  • Will I be credited? We will use the same format as the first book: your name, Twitter handle and, if available, your blog URL will be listed. In line with most blog sites, you are requested to provide a short bio of three sentences that will be printed alongside the article.
  • Do I need to be a blogger? You are not required to have a blog, nor to be a vExpert or VCDX. There are no prerequisites for submitting your design consideration articles.
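As an aside, if you want to verify the resolution of an exported diagram before submitting it, a quick script can read the image metadata. This is just a convenience sketch, not part of the submission requirements; it assumes Python with the Pillow library installed, and the filename is a made-up example.

```python
# Sanity-check the DPI metadata of an exported diagram (requires: pip install Pillow)
from PIL import Image

def check_dpi(path, minimum=220, preferred=300):
    with Image.open(path) as img:
        dpi = img.info.get("dpi")  # (horizontal, vertical) if the file records it
    if dpi is None:
        print(f"{path}: no DPI metadata found; re-export at {preferred} DPI")
    elif min(dpi) < minimum:
        print(f"{path}: {dpi} DPI is below the {minimum} DPI minimum")
    else:
        print(f"{path}: {dpi} DPI meets the requirement ({preferred} DPI preferred)")

check_dpi("host-design-diagram.png")  # hypothetical filename
```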

We are looking for content in the following categories:

  • Host design
  • Cluster design
  • vCenter design
  • Networking and Security design
  • Storage design
  • Generic design considerations – “Words of Wisdom”.

To avoid saturation, we accept no more than three articles per author. For example, you can provide us with three design consideration articles for the Host category, or you could provide one article for each of three different categories. Be aware that we would rather see one excellent design consideration article than three mediocre ones.

Call for entries
Please provide your blog post in Word or PDF format. If you use any diagrams in your article, please provide them separately in 300 DPI PNG or high-quality JPEG format. (Guidelines for designing diagrams can be found here.)

It’s about you, so please provide a short bio of yourself with your blog URL and Twitter handle; if you prefer, you can send a headshot as well.

You can email the content to Pocketbook@pernixdata.com

The call for entries closes Saturday, June 14th.

This book is free!

PernixData has generously offered to print the book. If your design consideration is included in the book, you will receive a copy. At its booth at VMworld, PernixData will have copies available for people who submitted a winning design consideration article. A limited number of books will be available for the community; more details will follow. After VMworld, an e-book version will be made publicly available.

Collateral benefit of an acceleration platform

Yesterday I visited a customer to review their experience of implementing FVP. They loved the fast response time and the incredible performance that server flash brings to the table. Placing flash resources in the host, as close to the application as possible, allows you to speed up the workloads you select: reducing the distance between the application and the storage device lowers latency, while the flash device itself delivers the raw performance. But what is really interesting is the “collateral benefit” that the FVP architecture provides to the rest of the infrastructure.

During the conversation the customer dropped their hero numbers on me. Hero numbers are the historical data points presented by the UI, such as IOPS saved from the datastore and bandwidth saved. We like to call them hero numbers because they indicate the impact on the environment, and they sure were impressive.

In one week’s time, FVP accelerated 1.2 billion I/Os in their environment (I/Os saved from the datastore).

[Screenshot: 1.2 billion I/Os saved from the datastore]

Please note that these are business workloads, not Iometer test workloads: I/Os generated by Oracle and MS SQL databases. Three hours later I received a new screenshot; it showed 24 million more I/Os saved during that time, indicating an average acceleration of 8 million I/Os per hour.

[Screenshot: three hours later, 24 million more I/Os saved]

That is 8 million I/Os per hour served by server flash instead of hitting the array. In total, almost 60 TB of data did not traverse the storage area network, allowing other workloads – such as virtual machines or physical servers connected to the array or SAN – to roam freely through the storage network. This reduction of I/O also lowers the CPU utilization of the storage controllers, freeing up resources for the non-accelerated workloads.
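For anyone who wants to check the arithmetic behind these hero numbers, a back-of-the-envelope calculation – my own, based purely on the figures quoted above – lines up:

```python
ios_per_week = 1.2e9      # I/Os saved from the datastore in one week
ios_in_3_hours = 24e6     # additional I/Os saved in the three-hour window

print(ios_in_3_hours / 3)        # 8,000,000 I/Os per hour, as observed
print(ios_per_week / (7 * 24))   # ~7.1 million I/Os per hour weekly average

# If the 60 TB of read traffic corresponds to the same 1.2 billion I/Os,
# the average read size works out to roughly 50 KB:
print(60e12 / ios_per_week)      # ~50,000 bytes per I/O
```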

The 60 TB is the amount of read I/O that FVP kept from hitting the storage area network and the array. When accelerating both reads and writes, we still send the write data to the array, as FVP is not a persistent storage layer (i.e. it does not provide datastore capabilities). When the virtual machine is in write-back mode, FVP tries to destage (write uncommitted data to the storage system) as fast as possible. If the storage system is busy, FVP destages uncommitted data at a rate the primary storage is comfortable receiving. The risk of data loss is averted by storing multiple replicas on other hosts in the FVP cluster, which allows FVP to destage in a more uniform write pattern. Being able to time-release I/Os permits FVP to absorb workload spikes and convert them into a write pattern more aligned with the performance capabilities of the entire storage area network.

I captured a spiky OLTP workload to show this phenomenon. The workload generated 8,800 IOPS (the green line). The flash device absorbed these writes and completed the I/O instantly, allowing the user to continue generating results. Although the application exhibits a spiky workload pattern, FVP does not need to mimic this behavior towards the array. Data is stored safely on multiple non-volatile devices, therefore the 8,800 IOPS are sent to the array at a rate that does not overwhelm it. The purple line indicates the number of IOPS sent to the array; the highest number in this example is 3,800 IOPS, 5,000 less than the spike produced by the application.

[Chart: absorbing writes – green line = IOPS generated by the application, purple line = IOPS sent to the array]
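To make the time-release idea concrete, here is a minimal sketch of a write-back destager – my own simplification in Python, not FVP code – that acknowledges writes at flash speed and drains them to the array at a capped rate. The spike and destage numbers are taken from the chart above.

```python
from collections import deque

class WriteBackDestager:
    """Toy model: complete writes instantly on flash, drain to the array at a capped rate."""

    def __init__(self, array_iops_limit):
        self.array_iops_limit = array_iops_limit  # rate the array comfortably absorbs
        self.uncommitted = deque()                # in real FVP, replicated on peer hosts

    def write(self, ios):
        # The flash device completes these I/Os immediately; they are now uncommitted.
        self.uncommitted.extend(ios)

    def destage_tick(self):
        # Each interval, send at most array_iops_limit I/Os to the array.
        n = min(self.array_iops_limit, len(self.uncommitted))
        return [self.uncommitted.popleft() for _ in range(n)]

# An 8,800-I/O spike is absorbed instantly, then drained at 3,800 per tick:
d = WriteBackDestager(array_iops_limit=3800)
d.write(range(8800))
while d.uncommitted:
    print(len(d.destage_tick()))  # 3800, 3800, 1200
```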

This behavior reduces the continuous stress on the storage area network and the array. It allows customers to get more mileage out of their arrays, as the array now focuses primarily on providing capacity and data services. Once you have accumulated enough data points over time, they can be used as input for your next array configuration. This generally results in a design with fewer spindles, bringing advantages such as lower cost, a smaller physical footprint and a reduced thermal signature.

Being able to accelerate both read and write operations goes beyond improving a specific workload; it generates an overall improvement for the entire datacenter architecture.

Want to hear more about the customer, their use case and the benefits they are experiencing implementing FVP? Vote for VMworld session 2583 – Case Study: Tata Steel Virtualizes Oracle Database, Gets Better Performance than Physical.

Data acceleration, more than just a pretty flash device

Sometimes I get the question whether it would make sense to place a flash appliance on the network and use this medium to accelerate data. This pool of flash would serve multiple workloads without disrupting any of them when added to the infrastructure. Justin Warren, a Storage Field Day delegate, recently came to the same conclusion. In my opinion this construction leads to an inferior point solution that does not allow for full leverage of resources and loses a lot of possibilities to grow into a more evolved architecture, one that provides performance where it’s needed, when it’s needed. Let’s take a closer look at what role software has in the act of accelerating data and why you need software to do this at scale.

Accelerating data is more than adding a faster medium to solve your problem. Adding a raw acceleration medium will only push out the moment you hit your new performance ceiling. Virtual datacenters are extremely dynamic. Virtualization isn’t about consolidating workloads onto a smaller number of servers anymore; it’s rapidly moving towards aligning IT with business strategies. Virtualization allows companies to respond to new demand on the fly, rapidly deploying environments that cater to the wishes of the customer while remaining in control of distributing the limited amount of resources that are available.

And in order to do this properly, one needs to have full control over the resources. To manage and utilize resources as efficiently as possible, you need to control the entire stack with the same set of controls, the same granularity, and preferably within a single pane of glass. Adding resources that require a different set of controls and a different method of management and distribution reduces efficiency and usually increases complexity. Minimizing management time AND reducing human touch points is of the essence. By using two distinct systems – one inside the hypervisor kernel and one outside it – chances are that they cannot be integrated into a single policy-based management process. That means manual labor needs to take place, which impacts the overall lead times of deployment and the agility of the services offered. Think availability of human resources, think level of expertise, and think permissions and access across multiple systems. Automation and policy-based management can help you avoid all these uncertainties and dependencies and control automation in a more orchestrated fashion. More and more signals are coming from within the industry that support an overall openness of APIs and frameworks, but unfortunately the industry is not there yet.

Control, integration and automation rely on a very important element: identity, in our case VM identity. You cannot distribute resources properly if you don’t know who is requesting the resource. You need to understand who that entity is, what its entitlement to the resource is and what its relative priority is amongst other workloads. When a workload exits the ESXi host, it is usually stripped of all its identity and becomes just a random stream of demand and utilization. Many have tried to solve this by carving up resources and dedicating them directly to a higher-level entity. For example, disk groups assigned to a particular cluster, or a VM placed on a separate datastore so that it has all the resources between the host and the datastore at its disposal. But in reality this works only for a short amount of time, hogs resources, and creates a static entity in an architecture that excels when algorithms are allowed to distribute resources dynamically. In short, it does not scale and typically prohibits a more mature method of IT service delivery.

Therefore it’s key to keep the intelligence as close to the application as possible and harvest the true power of software intelligence. Retaining the identity of workloads allows you to distribute resources whenever they are needed, with the correct priority and availability, just by selecting the correct availability profile. By using VM identity you can apply your IT services through a set of policies, for example RPO and resource availability. This is the true power of software! Software can utilize the available resources in the most efficient way. I’ve seen it, for example, with FVP F-squared, where the performance of the flash device increased by using a better, more intelligent way of presenting the workload of the VMs to the flash resource: better hardware performance by leveraging VM identity, control of resources and analytics, all done in the same domain of control.
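To illustrate what identity-driven, policy-based management could look like, here is a hypothetical model in Python – not the FVP interface, just a sketch of the concept – where the policy follows the VM rather than the datastore it happens to sit on. All names and values are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class AccelerationPolicy:
    write_mode: str   # "write-through" or "write-back"
    replicas: int     # peer-host copies protecting uncommitted write-back data
    priority: int     # relative weight amongst competing workloads

# Policies are keyed on VM identity, not on a carved-up disk group or datastore.
policies = {
    "oracle-prod-01": AccelerationPolicy("write-back", replicas=2, priority=10),
    "test-vm-07":     AccelerationPolicy("write-through", replicas=0, priority=1),
}

def policy_for(vm_name):
    # Unknown workloads fall back to a conservative default profile.
    return policies.get(vm_name, AccelerationPolicy("write-through", 0, 1))

print(policy_for("oracle-prod-01"))
```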

You can find the power of software in other industries as well. If you ever have the chance to talk to a software engineer from a MotoGP racing team, ask him what he can do with software in his controlled environment. By understanding the workload for a particular application (the track), they can control the suspension system, throttle response and engine behavior based on the position of the bike on the track, setting up the bike in the most optimal way for the upcoming corner. And it’s not just any corner: they know exactly which corner is coming and what impact it has on the bike, and they adjust accordingly. Whether they are allowed to use this in a race is a different debate, but it demonstrates the true power of software, workload analytics, and identity in a controlled system.

That type of analytics and power of resource distribution is exactly what you want for your applications, and the best way to get it is to retain VM identity. Use analytics, distributed resource management and advanced QoS to align the availability of high-performance resources with workload demand. Do it in such a way that it requires a minimal number of clicks to configure and manage the system. It is my belief that the only place to do this is within the hypervisor kernel: inside the kernel, multiple schedulers operate in harmony and understand, retain and respect VM identity while sitting on top of the resource, as close to the workload as possible.

Adding acceleration resources outside the kernel will not provide you this ability, and you have to wonder what you actually solve with that particular model. vSphere DRS maintenance mode allows workloads to be migrated seamlessly, transparently and non-disruptively to other hosts in the cluster, without impacting the workload in any way. This gives you the ability to install acceleration resources without impacting your IT service level. And if you exercise proper IT hygiene, it is recommended (dare I say best practice) to put a host in maintenance mode before connecting any device to it anyway, resulting in the same host and workload migration behavior.
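Putting a host into maintenance mode can of course be scripted as well. A minimal pyVmomi sketch might look like the following; the vCenter address, credentials and host name are placeholders, and certificate verification is disabled purely for lab convenience.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only; verify certificates in production
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)

# Find the host we are about to add a flash device to.
view = si.content.viewManager.CreateContainerView(
    si.content.rootFolder, [vim.HostSystem], True)
host = next(h for h in view.view if h.name == "esxi01.example.com")

# With DRS in fully automated mode, this evacuates the running VMs non-disruptively.
WaitForTask(host.EnterMaintenanceMode_Task(timeout=0))
Disconnect(si)
```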

PernixPro member update – The new class

Today I would like to announce the second class of PernixPros, enriching the first group. It was a tough process to go through all of the amazing submissions and select the best candidates. If you weren’t selected, there is no need to reapply; your best bet is to keep doing what you have been doing in evangelizing PernixData and supporting the virtualization community.

Quick overview of the PernixPro Program
At PernixData we believe in the power of the community, and one of the pillars of this program is to provide members early access to the beta product, allowing them to chime in and give feedback to the engineers and the product team. Luigi Danakos recently posted an interview with Jane Rimmer about the PernixPro program: Six Questions on the PernixData PernixPro program.

We will be contacting all our new PernixPro members to set up an overview of the program and kick off great discussions around PernixData FVP. For those who didn’t make it this time, keep following us on Twitter @PernixData for updates.


Name Twitter Handle
Ariel Antigua @aantigua
Francesco Bonetti @fbonez
James Bowling @vSential
Kurt Bunker N/A
Brad Christian @bchristian21
Andrew Dauncey @daunce_
Robert Edwards @bobbygedwards
Niels Engelen @nielsengelen
Chris Evans @chrismevans
John Flisher @hsilf
Dave Henry @davemhenry
Dennis Hoegen Dijkhof @vDennisHD
Manfred Hofer @Fred_vBrain
Chris Horn @Horn_Chris
Masaomi Kudo @interto
Jared Lutgen N/A
Frederic Martin @vmdude_fr
Sean Massey @seanpmassey
Sam McGeown @sammcgeown
Tetsuo Miyoshi @miyo4i
Karel Novak @novakkkarel
Matt O’Donnell N/A
Pietro Piutti @stingray92
Michael Poore @mpoore
Tom Queen @linetracer
Christiaan Roeleveld N/A
Byron Schaller @byronschaller
Chris Shaw @TheAlfCabin
Arjan Timmerman @Arjantim
Martin Valencia @ubergiek
Avram Woroch @AvramWoroch