What differentiates the PernixPro program

This weekend a tweet by Michael Stump caught my eye:

After explaining in the limited 140 characters what sets us apart, Michael was impressed:

Let me expand a little more on the intent of the PernixPro program.

Selection Process
We select members on a few criteria: the level of involvement in the virtualization community, expertise, and, when customers apply for the PernixPro program, their interaction with PernixData employees. These three factors are important to us, as multiple groups within PernixData interact with the PernixPros. On a regular basis we request feedback on various points. During the development of FVP 2.0, some members interacted with the UI team to provide feedback. A group of PernixPros was involved in alpha and pre-beta testing, interacting with Product Management and the engineers. We are focused on providing products that align with the requirements and wishes of our customers, and there is nothing better than getting feedback from customers and virtualization experts at an early stage of the development cycle.

Ambition
A lot of community members are passionate about improving the virtualization ecosystem and are eager to share their expertise. Back in the early days of VMware you could interact directly with the engineers, and that made visiting VMworld even more special. There were plenty of times where I interacted with VMware employees only to find out that I was actually talking to the engineer responsible for that product. These moments encouraged me to create a similar experience.

Selective approval
The early exposure to new products requires us to be selective with members; you can imagine that exposing alpha code to competing vendors is not a part of my MBO ;).

PernixPro Update
This week a new group of PernixPros was selected. Congratulations to the new members of the PernixPro program, I’m looking forward to working with you guys.

Daemon Behr (LinkedIn)
Rene Bos @rbos3
Dennis Bray @DennisBray
Marco Broeken @mbroeken
James Burd @TheBurdweiser
Tim Curless @timcurless
Simon Eady @simoneady
Mads Fog Albrechtslund @Hazenet
Jonathan Frappier @jfrappier
Bill Gurling @vDingus
Abhilash HB @abhilashhb
Chris House @chouse
Tim Jabaut @vmcutlip
Mikael Korsgaard Jensen @jekomi
Enrico Laursen @EnricoLaursen
Roger Lund @rogerlund
Alexander Nimmannit @alexanderjn
Robert Novak @gallifreyan
Alain Russell @alainrussell
Michael Ryom @MichaelRyom
Michael Stump @_stump
Ben Thomas @wazoo
Paolo Valsecchi @nolabnoparty
Bart Van Praet (LinkedIn)
Frans van Rooyen @jfvanrooyen
Joey Ware @joey_vm_ware
Geoff Wilmington @vWilmo
Eric Wright @DiscoPosse

We will be contacting all our new PernixPro members to set up an overview of the program and kick off the great discussions around PernixData FVP. All PernixPros, existing and new, will receive an invite to the VMworld PernixData party and the PernixPro breakfast. For those that didn’t make it this time, keep following us on Twitter (@PernixData) for updates.

Pre-order vSphere Pocketbook Blog Edition

I’m proud to present the second edition of the vSphere Design Pocketbook series. The vSphere Design Pocketbook is our print version of PaaS: we like to provide a platform to the members of the community. For the first edition of the book, we asked the community to share their tried and tested insights. The authors needed to dive into the essence of their recommendation: what problem does this recommendation solve?

Although the tweet-sized design consideration pocketbook was a big hit, some readers wanted more background, more information, and a richer description of the problem and the suggested solution. Therefore it made sense to provide a bigger canvas to the contributors, allowing them to submit recommendations, hints and tips without the 200-character limit.

The number of submissions was overwhelming. A lot of people sent in amazing content. Interestingly, many of the submissions focused on creating and following an architecture design framework instead of the typical blog-post approach of solving a particular problem or explaining a new technology.

Many of you were eager to submit. To that end, I received lots of great content, and I want to thank everyone who participated. But in the spirit of creating a manageably sized “pocketbook”, we could not publish everything and some tough choices had to be made. These are the authors featured in the book:

Name (Twitter handle)
Abdullah Abdullah @do0dzZZ
Dee Abson @deeabson
Martijn Baecke @baecke
Ather Beg @AtherBeg
Peter Chang @virtualbacon
Duncan Epping @DuncanYB
Brad Hedlund @bradhedlund
Cormac Hogan @CormacJHogan
Patrick Kremer @KremerPatrick
William Lam @lamw
Todd Mace @mctodd
Yury Magalif @YuryMagalif
Christian Mohn @h0bbel
Josh Odgers @Josh_Odgers
Dave Pasek @david_pasek
Tom Queen @linetracer
Aylin Sali @V4Virtual
Eric Shanks @Eric_Shanks
Phoummala Schmitt @exchangegoddess
Michael Webster @vcdxnz001
Michael Wilmsen @WilmsenIT

Hard copies will be given away at VMworld (Booth 1017), but you can pre-order your electronic copy today.

Life in the Data Center – a story of love, betrayal and virtualization

I’m excited to announce the first ever “collective novel”, in which members of the virtualization community collaborated to create a book with intrigue, mystery, romance, and a whole lot of geeky data center references.

[Book cover: Life in the Data Center]

The concept of the project is that one person writes a section and then passes it along. The writers don’t know who their fellow contributors are. They get an unfinished story in their mailbox and are allowed to take the story in whatever direction it needs to go. The only limitation is the author’s imagination.

For me it was a fun and interesting project. Writing a chapter for a novel is a whole different ballgame than writing technically focused content. As I rarely read novels, it was a challenge to properly describe the situation the protagonist is getting himself into. On top of that, I needed to figure out how to extend and expand the storyline set by the previous authors while still steering the story in a direction I preferred. And to make it more challenging, you do not know what the next author will write, so your intended direction for the storyline may be ignored. All in all a great experience, and I hope we can do a second collective novel. I’m already collecting ideas ☺

I would like to thank Jeff Aaron, who came up with the idea and guided the project perfectly. Once again Jon Atterbury did a tremendous job on the formatting and artwork of the book. And of course I would like to thank the authors for taking time out of their busy schedules to contribute to the book. The authors:

Jeff Aaron (@jeffreysaaron)

Josh Atwell (@Josh_Atwell)

Kendrick Coleman (@KendrickColeman)

Amy Lewis (@commsNinja)

Lauren Malhoit (@malhoit)

Bob Plankers (@plankers)

Satyam Vaghani (@SatyamVaghani)

Chris Wahl (@ChrisWahl)

To make it more interesting for the readers, we deliberately hid which author wrote which chapter, so you can have some fun guessing via a short quiz. Prizes will be given to the people with the best scores.

I’m not entirely sure that this book will be nominated for a Pulitzer, but it is worth a read to see what is in the authors’ crazy heads – and to witness how well they work together when collaborating on a project like this.

Go download the book and take the quiz

Let CloudPhysics help rid yourself of Heartbleed

Unfortunately the OpenSSL Heartbleed bug (CVE-2014-0160) is present in the ESXi and vCenter 5.5 builds. VMware responded by incorporating a patch that fixes the vulnerability in the OpenSSL 1.0.1 library. For more info about the ESXi 5.5 patch read KB 2076665; VMware issued two releases for vCenter 5.5, see KB 2076692.

Unfortunately some NFS environments experienced connection loss after applying the ESXi 5.5 patch; VMware responded by releasing patch 2077360 and, more recently, vCenter 5.5 Update 1b. The coverage of the NFS problems and the number of ESXi and vCenter update releases may have left organizations in the dark about whether they actually patched the Heartbleed vulnerability. CloudPhysics released a free Heartbleed analytics card in their card store that helps identify which hosts in your environment are still unprotected.
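If you want to double-check the ESXi side yourself, the sketch below lists the build number of every host registered in vCenter so you can compare it against the patched build listed in the VMware KB articles. This is a minimal example using pyVmomi (the vSphere Python SDK); the vCenter address, the credentials and the PATCHED_BUILD value are placeholders you would need to fill in yourself from KB 2076665.

```python
# Minimal sketch: list ESXi host build numbers so they can be compared with the
# patched build from the VMware KB. Assumes pyVmomi is installed; the hostname,
# credentials and PATCHED_BUILD below are placeholders, not real values.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

VCENTER = "vcenter.example.local"            # placeholder
USERNAME = "administrator@vsphere.local"     # placeholder
PASSWORD = "changeme"                        # placeholder
PATCHED_BUILD = 0                            # fill in the fixed build number from KB 2076665

# Lab-only shortcut: skip certificate verification. Use a proper SSL context in production.
context = ssl._create_unverified_context()
si = SmartConnect(host=VCENTER, user=USERNAME, pwd=PASSWORD, sslContext=context)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        product = host.summary.config.product        # AboutInfo: fullName, version, build
        build = int(product.build)
        status = "patched" if build >= PATCHED_BUILD else "check me"
        print(f"{host.name}: {product.fullName} (build {build}) -> {status}")
finally:
    Disconnect(si)
```

The CloudPhysics card does this correlation for you, so treat the script purely as a sanity check.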

Check out the recent article by CloudPhysics CTO Irfan Ahmad about their recently released Heartbleed analytics package. I recommend running the card and ridding yourself of this nasty bug.

Stop wasting your Storage Controller CPU cycles

Typically, when dealing with storage performance problems, the first questions asked are: what type of disks? What speed? What protocol? However, your problem might lie at the first port of call of your storage array: the storage controller!

When reviewing the storage controller configurations of the most popular storage arrays, one thing stood out to me: the absence of CPU specs. The storage controllers of a storage array are just plain servers, equipped with a bunch of I/O ports that establish communication with the back-end disks and provide a front-end interface to communicate with the attached hosts. The storage controllers run proprietary software providing data services and specific storage features, and providing data services and running that software requires CPU power! After digging some more, I discovered that most storage controllers are equipped with two CPUs ranging from quad core to eight core. Sure, there are some exceptions, but let’s stick to the most common configurations. This means that the typical enterprise storage array is equipped with 16 to 32 cores in total, as it comes with two storage controllers. 16 to 32 cores, that’s it! What are these storage controller CPUs used for? Today’s storage controller activities and responsibilities:

  • Setting up and maintaining data paths.
  • Mirroring writes for the write-back cache between the storage controllers for redundancy and data availability.
  • Handling data movement and data integrity.
  • Maintaining RAID levels and calculating and writing parity data.
  • Providing data services such as snapshots and replication.
  • Running internal data-saving services such as deduplication and compression.
  • Executing multi-tiering algorithms, promoting and demoting data to the appropriate tier level.
  • Running integrated management software providing management and monitoring functionality for the array.

Historically, arrays were designed to provide centralised data storage to a handful of servers. I/O performance was not a pain point, as most arrays easily serviced the requests each individual server could make. Then virtualisation hit the storage array. Many average I/O consumers were grouped together on a single server, turning that server into, as George Crump would call it, a fire-breathing I/O demon. The mobility of virtual machines required an increase in connectivity, such that it was virtually impossible (no pun intended) to manually balance the I/O load across the available storage controller I/O ports. The need for performance increased, resulting in larger numbers of disks managed by the storage controller, different types of disks and different speeds.

Virtualization-first policies pushed all types of servers and their I/O patterns onto the storage array, introducing the need for new methods of software-defined economics (did someone coin that term?). It became obvious that not every virtual machine requires the fastest resource 24/7, driving interest in multi-tiered solutions. Multi-tiering requires smart algorithms that promote and demote data when it makes sense, providing the best performance to the workload when required while offering the best level of economics to the organisation. Snapshotting, dedup and other internal data-saving services raised the need for CPU cycles even more. With the increase in I/O demand and the introduction of new data services, it’s not uncommon for a virtualised datacenter to have over-utilised storage controllers.

Rethink current performance architecture
Server-side acceleration platforms increase the performance of virtual machines by leveraging faster resources (flash and memory) that are in closer proximity to the application than the storage array datastore. By keeping the data in the server layer, storage acceleration platforms such as PernixData FVP provide additional benefits to the storage area network and the storage array.

Impact of read acceleration on data movement
Hypervisors are seen as I/O blenders, sending the stream of random I/O of many virtual machines to the I/O ports of the storage controllers. These reads and writes must be processed: writes are committed to disk, and data is retrieved from disk to satisfy the read requests. All these operations consume CPU cycles. When writes are accelerated, subsequent reads of that data are serviced from the flash device closest to the application. Typically data is read multiple times, which decreases latency for the application while also relieving the underlying architecture from servicing that load. FVP provides metrics that show how many I/Os are kept off the datastore by servicing the data from flash. The screenshot below was taken after six weeks of accelerating database workloads. More info about this architecture can be found here.

[Screenshot: 8 billion I/Os saved]

The storage array does not have to service those 8 billion I/Os, but not only that: 520.63 TB did not traverse the storage area network and occupy the I/O ports of the storage controllers. That means other workloads, whether virtualised workloads that haven’t been accelerated yet or external workloads using the same array, are no longer affected by that I/O. Less I/O hits the inbound I/O queues on the storage controllers, allowing other I/O to flow more freely into the storage controller; less data has to be retrieved from disk; less I/O travels upstream from disk to I/O ports to begin its journey back from the storage controller all the way up to the application. That saves copious amounts of CPU cycles, allowing data services and other internal processes to take advantage of the available CPU cycles and increasing the responsiveness of the array.
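To put those two figures in perspective, here is a quick back-of-the-envelope calculation using the numbers from the screenshot above (and assuming TB means 10^12 bytes):

```python
# Back-of-the-envelope math based on the figures in the screenshot above.
saved_ios = 8e9              # ~8 billion I/O operations kept off the array
saved_bytes = 520.63e12      # ~520.63 TB that never traversed the SAN (TB = 10^12 bytes)

avg_io_size_kb = saved_bytes / saved_ios / 1024
print(f"Average offloaded I/O size: ~{avg_io_size_kb:.0f} KB")   # roughly 64 KB per I/O
```

Every one of those roughly 64 KB round trips would otherwise have consumed controller CPU cycles and a slot in the inbound I/O queues.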

The screenshot was provided by one of my favourite customers, and we are running a cool contest to see which applications other customers have accelerated and how many I/Os they have saved.

Impact of Write acceleration on storage controller write cache (mirroring)
Almost all current storage arrays contain cache structures to speed up both reads and writes. Speeding up writes provides benefits to both the application and the array itself. Writing to NVRAM, where the write cache typically resides, is much faster than writing to (RAID-configured) disk structures, allowing for faster write acknowledgements. As the acknowledgement has already been provided to the application, the array can “leisurely” structure the writes in the most optimal way to commit the data to the back-end disks.

To prevent a storage controller from being a single point of failure, redundancy is necessary to avoid data loss. Some vendors use journaling and consistency points for redundancy purposes; most vendors mirror writes between the cache areas of both controllers. A mirrored write cache requires coordination between the controllers to ensure data coherency. Typically messaging across the backplane between the controllers is used to ensure correctness. Mirroring data and messaging both require CPU cycles on both controllers.

Unfortunately, even with these NVRAM structures, write problems persist to this day. No matter the size or speed of the NVRAM, it is the back-end disks’ capability to process writes that gets overwhelmed. Increasing cache sizes at the controller layer just delays the point at which write performance problems begin. Typically this occurs when there is a spike of write I/O. Remember, most ESX environments generate a constant flow of I/O, so adding a spike of I/O usually adds insult to injury for the already strained storage controller. Some controllers reserve a static portion of cache for mirrored writes, forcing the controller to flush data to disk when that portion begins to fill up. As the I/O keeps pouring in, the write cache has to hold off completing incoming I/O until the current write data is committed to disk, resulting in high latency for the application. The storage controller CPUs can be overwhelmed as the incoming I/O has to be mirrored between cache structures and coherency has to be guaranteed, wasting precious CPU cycles on (a lot of) messaging between controllers instead of using them for other data services and features.
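To make that forced-flush behaviour a little more tangible, here is a deliberately simplified toy model of my own (with assumed numbers, not any vendor’s actual caching algorithm): a fixed-size write cache drains to the back-end disks at a constant rate, and once a burst of writes fills it, new writes have to wait for destaging, which is exactly where the application sees the latency.

```python
# Toy model of a write cache in front of slower back-end disks.
# Purely illustrative with assumed numbers; real controllers use far more
# sophisticated destaging and mirroring logic.

CACHE_CAPACITY = 1000    # cache slots (arbitrary units of write data)
DRAIN_PER_TICK = 50      # what the back-end disks can destage per time step

def simulate(incoming_writes):
    """Return per-tick cache occupancy and the writes stalled by a full cache."""
    cache = 0
    occupancy, stalled = [], []
    for arriving in incoming_writes:
        cache = max(0, cache - DRAIN_PER_TICK)        # disks destage first
        accepted = min(arriving, CACHE_CAPACITY - cache)
        cache += accepted
        occupancy.append(cache)
        stalled.append(arriving - accepted)           # stalled writes = application latency
    return occupancy, stalled

# A steady background load the cache easily absorbs, followed by a write spike.
steady, spike = [40] * 10, [400] * 10
occupancy, stalled = simulate(steady + spike + steady)
print("max cache occupancy:", max(occupancy), "of", CACHE_CAPACITY)
print("writes stalled during the spike:", sum(stalled))
```

Even in this crude model the steady background load sails through untouched, while the spike saturates the cache within a few ticks and every subsequent write stalls behind the destage rate of the disks.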

[Performance graph: absorbing writes]

FVP in write-back mode acknowledges the I/O once the data is written to the flash resources in the FVP cluster. FVP does not replace the datastore, so writes still have to be written to the storage array. The process of writing data to the array becomes transparent, as the application has already received the acknowledgement from FVP. This allows FVP to shape write patterns in a way that is more suitable for the array to process. Typically FVP writes the data as fast as possible, but when the array is heavily utilised, FVP time-releases the I/Os. This results in a more uniform I/O pattern (see the datastore writes in the performance graph above). By flattening the spike, i.e. writing the same amount of I/O over a longer period of time, the storage controller can handle the incoming stream much better, avoiding forced cache flushes and the CPU bottlenecks that come with them.
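The effect of time-releasing writes can be shown with the same kind of toy arithmetic (again just a sketch with assumed numbers, not FVP’s actual destaging logic): the same total amount of write data, spread over a longer window, produces a peak rate the storage controller can comfortably absorb.

```python
# Illustrative sketch of flattening a write spike: the identical total write volume,
# destaged over a longer window, stays below what the array handles comfortably.
# All numbers are assumptions chosen for illustration.
ARRAY_CAPABILITY = 60            # writes per tick the controller absorbs comfortably

burst = [400] * 10 + [0] * 70    # application behaviour: 4,000 writes in a 10-tick spike
total_writes = sum(burst)
window = len(burst)              # 80 ticks available to destage the same data

flattened_rate = total_writes / window
print("peak rate without shaping:", max(burst), "writes/tick")
print("peak rate with shaping:   ", flattened_rate, "writes/tick")
print("within array capability?  ", flattened_rate <= ARRAY_CAPABILITY)
```

Same data, same destination, but the controller never sees the 400-writes-per-tick wall that would otherwise force a cache flush.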

FVP allows you to accelerate your workloads; the acceleration of reads and writes reduces the amount of I/O hitting the array and changes the workload pattern it has to process. Customers who implemented FVP to accelerate their workloads see significant changes in storage controller utilisation, benefitting the external and non-accelerated workloads in the mix.