Category: Miscellaneous (page 2 of 9)

Insights into CPU and Memory configuration of ESXi Hosts

Recently Satyam Vaghani wrote about PernixData cloud. In short PernixData Cloud is the next logical progression of PernixData Architect and provide visibility and analytics around virtual datacenters, it’s infrastructure and applications.

As a former architect I love it. The most common question asked by customers around the world was how other companies are running and designing their virtual datacenters. Which systems do they use and how do these system perform with similar workload? Many architects struggle with justifying their bill of materials list when designing their virtual infrastructure. Or even worse getting the budget. Who hasn’t heard the reply when suggesting their hardware configuration: “you want to build a Ferrari, a Mercedes is good enough”. With PernixData Cloud you will be able to show trends in the datacenter, popularity of particular hardware and application details. It let you start ahead of the curve, aligned with the current datacenter trends instead of trailing. Of course I can’t go into detail as we are still developing the solution, but I can occasionally provide a glimpse of what we are seeing so far.

For the last couple of days I’ve been using a part of the dataset and queried 8000 hosts on their CPU, memory and ESXi build configuration to get insight in popularity of particular host configurations.

CPU socket configuration
I was curious about the distribution of CPU socket configurations. After analyzing the dataset it is clear that dual socket CPU configurations are the most popular setup. Although single CPU socket configuration are more common than quad CPU socket in the dataset, quad core are more geared towards running real world workload while single CPU configurations are typically test/dev/lab servers. Therefor the focus will primarily on dual CPU socket systems and partially quad CPU sockets systems. The outlier of this dataset is the 8 socket servers. Interesting enough some of these are chuck-full with options. Some of them were equipped with 15 core CPU’s. 120 CPU cores per host, talk about CPU power!


CPU core distribution
What about the CPU core popularity? The most popular configuration is 16 cores per ESXi host, but without the context of CPU sockets one can only guess which CPU configuration is the most popular.

02-Total number of CPU Cores in ESXi hosts

Core distribution of dual CPU socket ESXi hosts
When zooming in to the dataset of dual CPU socket ESXi host, it becomes clear that 8 Core CPU’s are the most popular. I compared it with an earlier dataset and quad and six core systems are slowly reducing popularity. Six core CPU’s were introduced in 2010, assumable most will be up for a refresh in 2016. I intend to track the CPU configurations to provide trend analysis on popular CPU configurations in 2016.

03-Number of cores per CPU in dual CPU socket ESXi hosts

Core count quad socket CPU systems
What about quad socket CPU systems? Which CPU configuration is the most populair? It turns out that CPU’s containing 10 cores are the sweetspot when it comes to configuring a Quad core CPU system.

04-Number of cores per CPU in quad CPU socket ESXi hosts

Memory configuration
Getting insights into memory configuration of the servers provides us a clear picture of the compute power of these systems. What is the most popular memory configuration of dual socket server? As it turns out 256 and 384 GB are the most memory popular configuration. Today’s servers are getting beefy!

05-Memory configuration dual socket ESXi hosts

Zooming into the dataset quering the memory configuration of dual socket 8 core servers, the memory configuration distribution is as follows:

06-Memory in GB in dual socket 8 core server

What about the memory configuration of quad CPU servers?

07-Memory configurations quad CPU socket ESXu hosts

512 GB is the most popular memory configuration for quad CPU socket ESXi host. Assuming the servers are configured properly, this configuration is providing the same amount of memory to each NUMA node of the systems. The most popular NUMA node is configured with 128 GB in both dual and quad CPU socket systems.

ESXi version distribution

I was also curious about the distribution of ESXi versions amongst the dual and quad CPU socket systems. It turns out that 5.1.0 is the most popular ESXi version for dual CPU systems, while most Quad CPU socket machines have ESXi version 5.5 installed

08-ESXi versions

More To Come
Satyam and I hope to publish more results from our dataset in the coming months. The dataset is expanding rapidly, increasing the insights of the datacenters around the globe. And we hope to cover other dimensions like applications and the virtualization layer itself. Please feel free to send me interesting questions you might have for the planet’s datacenters and we’ll see what we can do. Follow me on twitter @frankdenneman

Hammer, MagicBands and challenging status quo

One of my favorite business books is the famous book of Michael Hammer “Reengineering the Corporation”. The theme is the book is to take a hard look at business processes and radically change these old and existing processes.

Hammer states that typically companies sped up their processes by implementing a newer iteration of existing technology. Many processes are dated before the advent of the computers and just by automating the process it can only optimizes performance marginally. Embedding computers in the archaic processes cannot address their fundamental performance challenges.

By understanding what the process is trying to achieve one can break away from the existing design principles of the process. One great example is one of my favorite technologies is the Disney’s MagicBand and the way it’s used to radically change processes.

Photo by Adam Voorhes /

Photo by Adam Voorhes /

Disney’s MagicBand
The MagicBand is what is commonly referred to as a wearable. Inside the bracelet are a RFID chip and a radio. The parks have long range and short-range scanners along with sensors to interact with the MagicBand bracelet. One of the perks of my job is to speak to people who deliver cutting edge technology and as you can imagine I was very thrilled to speak to some of the team members who worked on the MagicBand platform.

In essence the Disney MagicBand replaces every transaction between the customer and cast members or Disney parks and resorts. The MagicBand becomes your key to anything. It allows access to the park, access to the resorts and automatic payment. Its goal is to create a frictionless experience for the customer, increasing the satisfaction, which of course will increase spending.

Instead of tinkering with existing processes Disney overhauled a lot of processes and many more are to follow. A great example is the overhaul of the check-in process and how it’s completely inline with the Hammer doctrine. Instead of buying newer faster desktop computers to speed up the check-in process, customers can go directly to their hotel room, bypassing the check in process all together. Disney’s sends the MagicBand to your home and prior to your visit the resort informs you via email which room is yours for your stay. Just walk up to your room and unlock the door by taping your MagicBand against the sensor on the door.

MagicBands allows Disney to get rid of the ancient turnstiles that is the first port of anxiety of new parents and their strollers, instead of being funnelled into narrow cramped isles the entrance is now in the shape of a inviting V shape form with incredible process speeds. Just hold your MagicBand to the access point and wait to be greeted by an welcoming green glow, from then on its off to your favorite experience.


The MagicBand platform allows for new experiences as well. What about having a more personalized interaction with cast members? What if Cinderella greets your daughter by name and tells her that she knows she is her favorite princess or that she wishes her a happy birthday? Just mind-blowing and an experience she will never forget.

Range scanners, sensors, WIFI, smart phones apps, user profiles and an insane amount of data crunching make this happen. Customers use the smart phone app to access their schedules and their user profiles. Countless short and long-range sensors scattered across the park pick up the signals of the bracelets. These systems are connected to each other, they collect data, and they use the captured data to optimize the experience of the customers. Data on visitors traffic flow, food orders and waiting times can be used to realign internal resources. Ever heard about Internet of things? This is the poster child of Internet of things.

Stop! Hammer Time
Circling back to the opening statement, using newer iteration of existing technology only provide marginally performance increases. Advances like the MagicBand require new technologies and new ways to operationalize these technologies in datacenters. Not every company is of the same size as Disney, but one thing is certain, most companies face the same challenge Disney has. How to reduce cost, increase efficiency and provide a new experience that makes them unique in a highly competitive market?

A lot of brilliant people are trying to solve these problems by creating technical solutions and it’s up to the IT team to understand if these suit their operation models. How can you reengineer your corporation and create a new service offering while your IT is stuck in the past? Stuck using systems that are designed to work in infrastructures dating back to the early ‘70’s? Where they just found out that the market was bigger than 5 computers? Everything has to align, when the business changes it models, the IT team should not take anything for granted, they too need to aim for quantum leaps of performance.

Markets shift rapidly; IT needs to be able to respond almost in a way that anticipates their needs. Maybe even in a way that it doesn’t feel remarkable at all, high service standards are the norm! Within the realm of virtualized datacenters two technology advancements can create an experience that might provide a ubiquitous experience to the business but provide the magic to the IT team; Scale out storage and object based storage.

Scale out storage and object based storage
Proper Scale out storage systems allows you to operationalize new advancement in storage technology as soon as they are available. They allow virtual datacenters to cater to any performance requirement possible any time. While, and this is very important, without impact current workloads that are using the same platform. Many application vendors move away from a monolithic application architecture, why keep holding on to the relics of the past by using a monolithic storage architecture for performance requirements?

Object based storage, such as VMware VVOLs, allows you to fundamentally change how to provide data services to systems (virtual machines). Instead of creating management constructs and aligning data services to these logical layers (LUNs and datastores), data services can be directly applied to the specific machine. Virtual machines become first class citizens on storage systems, allowing IT teams to cater to requirements that both affect the business as well as the IT team requirements.

Challenge status quo
Hammer stated don’t ask, “How can we do what we do faster?” But ask; Why do we do it the way we do?” In essence, challenge status quo if you want to keep on moving forward in a time that introduces new technologies and application landscapes on a daily basis!

DB Deepdive part 6: Query Plans, Intermediate Results, tempdb and Storage Performance

Welcome to part 6 of the Database workload characteristics series. Databases are considered to be one of the biggest I/O consumers in the virtual infrastructure. Database operations and database design are a study upon themselves, but I thought it might be interested to take a small peak underneath the surface of database design land. I turned to our resident Database expert Bala Narasimhan, PernixData’s VP of products to provide some insights about the database designs and their I/O preferences.

Previous instalments of the series:
Part 1 – Database Structures
Part 2 – Data pipelines
Part 3 – Ancillary structures for tuning databases
Part 4 – NoSQL platforms
Part 5 – Query Execution Plans

In a previous article I introduced the database query optimizer and described how it works. I then used a TPC-H like query and the SQL Server database to explain how to understand the storage requirements of a query via the query optimizer.

In today’s article we will deep dive into a specific aspect of query execution that severely impacts storage performance; namely intermediate results processing. For today’s discussion I will use the query optimizer within the PostgreSQL database. The reason I do this is because I want to show you that these problems are not database specific. Instead, they are storage performance problems that all databases run into. In the process I hope to make the point that these storage performance problems are best solved at the infrastructure level as opposed to doing proprietary infrastructure tweaks or rewrites within the database.

After a tour of the PostgreSQL optimizer we will go back to SQL Server and talk about a persistent problem regarding intermediate results processing in SQL Server; namely tempdb. We’ll discuss how users have tried to overcome tempdb performance problems to date and introduce a better way.

What are intermediate results?
Databases perform many different operations such as sorts, aggregations and joins. To the extent possible a database will perform these operations in RAM. Many times the data sets are large enough and the amount of RAM available is limited enough that these operations won’t fully fit in RAM. When this happens these operations will be forced to spill to disk. The data sets that are written to and subsequently read from disk as part of executing operations such as sorts, joins and aggregations are called intermediate results.

In today’s article we will use sorting as an example to drive home the point that storage performance is a key requirement for managing intermediate results processing.

The use case
For today’s example we will use a table called BANK that has two columns ACCTNUM and BALANCE. This table tracks the account numbers in a bank and the balance within each account. The table is created as shown below:

Create Table BANK (AcctNum int, Balance int);

The query we are going to analyze is one that computes the number of accounts that have a given balance and then provides this information in ascending order by balance. This query is written in SQL as follows:

Select count(AcctNum), Balance from BANK GROUP BY Balance ORDER BY Balance;

The ORDER BY clause is what will force a sort operation in this query. Specifically we will be sorting on the Balance column. I used the PostgreSQL database to run this query.

I loaded approximately 230 million rows into the BANK table. I made sure that the cardinality of the Balance column is very high. Below I have a screenshot from the PostgreSQL optimizer for this query. Note that the query will do a disk based merge sort and will consume approximately 4 GB of disk space to do this sort. A good chunk of the query execution time was spent in the sort operation.

part 6

A disk-based sort, and other database operations that generate intermediate results, is characterized by large writes of intermediate results followed by reads of those results for further processing. IOPS is therefore a key requirement.

What is especially excruciating about the sort operation is that it is a materialization point. What this means is that the query cannot make progress until the sort is finished. You’ve essentially bottlenecked the entire query on the sort and the intermediate results it is processing. There is no better validation of the fact that storage performance is a huge impediment for good query times.

What is tempdb?
tempdb is a system database within SQL Server that is used for a number of reasons including the processing of intermediate results. This means that if we run the query above against SQL Server the sorting operation will spill intermediate results into tempdb as part of processing.

It is no surprise then that tempdb performance is a serious consideration in SQL Server environments. You can read more about tempdb here.

How do users manage storage performance for intermediate results including tempdb?

Over the last couple of years I’ve talked to a number of SQL Server users about tempdb performance. This is a sore point as far as SQL Server performance goes. One thing I’ve seen customers do to remediate the tempdb performance problem is to host tempdb alone in arrays that have fast media, such as flash, in them in the form of either hybrid arrays or All Flash Arrays (AFA). The thought process is that while the ‘fast array’ is too expensive to standardize on, it makes sense to carve out tempdb alone from it. In this manner, customers look at the ‘fast array’ as a performance band aid for tempdb issues.

On the surface this makes sense since an AFA or a hybrid array can provide a performance boost for tempdb. Yet it comes with several challenges. Here are a few:

  • You now have to manage tempdb separately from all the other datastores for your SQL Server.
  • You procure the array for tempdb yet you do not leverage any of its data services. You use it as a performance band aid alone. This makes the purchase a lot more expensive than it seems on paper.
  • For queries that don’t leverage tempdb the array is not useful.
  • Performance problems in databases are not limited to tempdb. For example, you may be doing full table scans and these don’t benefit from the array.
  • You cannot leverage innovations in media. You cannot, for example, leverage RAM or PCM or anything else that will come in the future for tempdb.

How can PernixData FVP help?
In my mind PernixData FVP is the ideal solution for storage performance problems related to intermediate results in general and tempdb in particular. Intermediate result processing, including tempdb, shows very good temporal locality and is therefore an ideal candidate for FVP. Below are some other reasons why FVP is ideal for this scenario:

  • PernixData FVP allows you to use all server side media, flash or RAM, for accelerating tempdb and intermediate results processing.
  • You don’t need to configure anything separately for this. Instead you operate at the VM level and accelerate your database VM as a result of which every I/O operation is enhanced including intermediate results processing.
  • As your tempdb requirements change – lets say you need for space for handling it – it’s simply a matter of replacing one flash card with bigger one as far as FVP is concerned. There is no down time and the application is not impacted. This allows you to ride the price/performance curve of flash seamlessly.

Database workload characteristics and their impact on storage architecture design – part 5 – Query Execution Plans

Welcome to part 5 of the Database workload characteristics series. Databases are considered to be one of the biggest I/O consumers in the virtual infrastructure. Database operations and database design are a study upon themselves, but I thought it might be interested to take a small peak underneath the surface of database design land. I turned to our resident Database expert Bala Narasimhan, PernixData’s VP of products to provide some insights about the database designs and their I/O preferences.

Previous instalments of the series:

Part 1 – Database Structures
Part 2 – Data pipelines
Part 3 – Ancillary structures for tuning databases
Part 4 – NoSQL platforms

Databases are a critical application for the enterprise and usually have demanding storage performance requirements. In this blog post I will describe how to understand the storage performance requirements of a database at the query level using database tools. I’ll then explain why PernixData FVP helps not only to solve the database storage performance problem but also the database manageability problem that manifests itself when storage performance becomes a bottleneck. Throughout the discussion I will use SQL Server as an example database although the principles apply across the board.

Query Execution Plans
When writing code in a language such as C++ one describes the algorithm one wants to execute. For example, implementing a sorting algorithm in C++ means describing the control flow involved in that particular implementation of sorting. This will be different in a bubble sort implementation versus a merge sort implementation and the onus is on the programmer to implement the control flow for each sort algorithm correctly.

In contrast, SQL is a declarative language. SQL statements simply describe what the end user wants to do. The control flow is something the database decides. For example, when joining two tables the database decides whether to execute a hash join, a merge join or a nested loop join. The user doesn’t decide this. The user simply executes a SQL statement that performs a join of two tables without any mention of the actual join algorithm to use.

The component within the database that comes up with the plan on how to execute the SQL statement is usually called the query optimizer. The query optimizer searches the entire space of possible execution plans for a given SQL statement and tries to pick the optimal one. As you can imagine this problem of picking the most optimal plan out of all possible plans can be computationally intensive.

SQL’s declarative nature can be sub-optimal for query performance because the query optimizer might not always pick the best possible query plan. This is usually because it doesn’t have full information regarding a number of critical components such as the kind of infrastructure in place, the load on the system when the SQL statement is run or the properties of the data. . One example of where this can manifest is called Join Ordering. Suppose you run a SQL query that joins three tables T1, T2, and T3. What order will you join these tables in? Will you join T1 and T2 first or will you join T1 and T3 first? Maybe you should join T2 and T3 first instead. Picking the wrong order can be hugely detrimental for query performance. This means that database users and DBAs usually end up tuning databases extensively. In turn this adds both an operational and a cost overhead.

Query Optimization in Action
Let’s take a concrete example to better understand query optimization. Below is a SQL statement from a TPC-H like benchmark.

select top 20 c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue, c_acctbal, n_name, c_address, c_phone, c_comment from customer, orders, lineitem, nation where c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate >= ':1' and o_orderdate < dateadd(mm,3,cast(':1'as datetime)) and l_returnflag = 'R' and c_nationkey = n_nationkey group by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment order by revenue;

The SQL statement finds the top 20 customers, in terms of their effect on lost revenue for a given quarter, who have returned parts they bought.
Before you run this query against your database you can find out what query plan the optimizer is going to choose and how much it is going to cost you. Figure 1 depicts the query plan for this SQL statement from SQL Server 2014 [You can learn how to generate a query plan for any SQL statement on SQL Server at


You should read the query plan from right to left. The direction of the arrow depicts the flow of control as the query executes. Each node in the plan is an operation that the database will perform in order to execute the query. You’ll notice how this query starts off with two Scans. These are I/O operations (scans) from the tables involved in the query. These scans are I/O intensive and are usually throughput bound. In data warehousing environments block sizes could be pretty large as well.

A SAN will have serious performance problems with these scans. If the data is not laid out properly on disk, you may end up with a large number of random I/O. You will also get inconsistent performance depending on what else is going on in the SAN when these scans are happening. The controller will also limit overall performance.

The query begins by performing scans on the lineitem table and the orders table. Note that the database is telling what percentage of time it thinks it will spend in each operation within the statement. In our example, the database thinks that it will spend about 84% of the total execution time on the Clustered Index Scan on lineitem and 5% on the other. In other words, 89% of the execution time of this SQL statement is spent in I/O operations! It is no wonder then that users are wary of virtualizing databases such as these.

You can get even more granular information from the query optimizer. In SQL Server Management Studio, if you hover your mouse over a particular operation a yellow pop up box will appear showing very interesting statistics. Below is an example of data I got from SQL Server 2014 when I hovered over the Clustered Index Scan on the lineitem able that is highlighted in Figure 1.


Notice how Estimated I/O cost dominates over Estimated CPU cost. This again is an indication of how I/O bound this SQL statement is. You can learn more about the fields in the figure above here.

An Operational Overhead
There is a lot one can learn about one’s infrastructure needs by understanding the query execution plans that a database generates. A typical next step after understanding the query execution plans is to tune the query or database for better performance. For example, one may build new indexes or completely rewrite a query for better performance. One may decide that certain tables are frequently hit and should be stored on faster storage or pinned in RAM. Or, one may decide to simply do a complete infrastructure rehaul.

All of these result in operational overheads for the enterprise. For starters, this model assumes someone is constantly evaluating queries, tuning the database and making sure performance isn’t impacted. Secondly, this model assumes a static environment. It assumes that the database schema is fixed, it assumes that all the queries that will be run are known before hand and that someone is always at hand to study the query and tune the database. That’s a lot of rigidity in this day and age where flexibility and agility are key requirements for the business to stay ahead.

A solution to database performance needs without the operational overhead
What if we could build out a storage performance platform that satisfies the performance requirements of the database irrespective of whether query plans are optimal, whether the schema design is appropriate or whether queries are ad-hoc or not? One imagines such a storage performance platform will completely take away the sometimes excessive tuning required to achieve acceptable query performance. The platform results in an environment where SQL is executed as needed by the business and the storage performance platform provides the required performance to meet the business SLA irrespective of query plans.

This is exactly what PernixData FVP is designed to do. PernixData FVP decouples storage performance from storage capacity by building a server side performance tier using server side flash or RAM. What this means is that all the active I/O coming from the database, both reads and writes, whether sequential or random, and irrespective of block size is satisfied at the server layer by FVP right next to the database. You are longer limited by how data is laid out on the SAN, or the controller within the SAN or what else is running on the SAN when the SQL is executed.

This means that even if the query optimizer generates a sub optimal query plan resulting in excessive I/O we are still okay because all of that I/O will be served from server side RAM or flash instead of network attached storage. In a future blog post we will look at a query that generates large intermediate results and explain why a server side performance platform such as FVP can make a huge difference.

Post originally appeared on

Don’t backup. Go forward with Rubrik

Screen Shot 2015-03-24 at 16.51.59
Rubrik has set out to build a time machine for cloud infrastructures. I like the message as it shows that they are focused on bringing simplicity to the enterprise backup world. Last week I had the opportunity to catch up with them and they had some great news to share as they were planning to come out of stealth this week. And that day is today.

Rubrik platform
This time machine is delivered on a 2U commodity appliance that runs the Rubrik software. By installing this appliance you greatly reduce the number of machines that are necessary to provide backup and restore services today. Reducing the number of machines simplifies the infrastructure for architects and support, while the UI of Rubrik simplifies the day-to-day operations of the administrators.

User Interface
No agents are needed in the virtual datacenters to discover the workload and the user interface is centered on policy driven SLAs. Unfortunately I can’t show the user-interface, but trust me this is something you longed for a long time. Due to the pedigree of the co-founders it comes as no surprise that the Rubrik platform is fully programmable with REST API’s.

Typically moving to a new backup system introduces risk and cost. Learning curves are high, misconfigured backup configurations possibly risking data loss. Policy driven and the ability to use REST APIs ensure that the platform easily integrates in every environment. The policies are so easy to use that no training is necessary; this reduces the impact of transition to a new backup system. The low learning curve means that no countless hours are lost by figuring out how to safely backup your data, while the REST APIs allow advanced tech crews to integrate Rubrik in their highly automated service offerings.

One thing that made me very happy to see is the Rubriks’ ability to “cloud-out” your data. Rubrik provides a gateway to AWS allowing you to send “aged data” to the cloud in a very secure way. This feature benefits the complexity reduction of local architecture. Instead of having to incorporate a tape library, you now only need an Internet connection. Having worked with a big tape libraries myself for years I know this will not only bring a lot of datacenter space back and reduce your energy bill, you WILL have way less heat to cool.

As the team understand the concept of distributed architectures thoroughly (more about that in the next paragraph) it doesn’t come as a surprise that it scales very well. The architecture can scale to 1000s of nodes. What’s interesting is that it can mount the snapshots directly on the Rubrik platform allowing virtual machines to run directly on the appliance. Think about the possibilities for development. Snapshot your current production workload and test your new code instantly without any impact on active services.

Rubrik starts off by supporting VMware vSphere and it makes sense to focus on the biggest market out there as a startup. But support for other hypervisors and cloud infrastructures (to cloud-out data) will follow.

I expect Rubrik to become a success, the product aligns with the todays enterprise datacenter requirements and the pedigree of the team is amazing. As mentioned before the co-founders have a very rich background in distributed systems, Arvind Jain was the founding Engineer of Riverbed and was a Distinguished Engineer at Google before co-founding the other three members. Interesting enough (for me at least) there are stong ties with PernixData. Prior to Rubrik, Bipul Sinha was a partner at LightSpeed before founding Rubrik. I had the great pleasure of talking to Bipul often, as he is the initial investor of PernixData. Funny enough I can recall a conversation where Bipul asked me on my view of the backup world. I believe boring and totally not sexy was my initial reply. Guess he is setting out to change that fast! The CTO of Rubrik, Arvind Nithrakashyap, worked at Oracle where he co-founded Oracle Exadata. The other Co-founder of Exadata is PernixData CEO Poojan Kumar. Last but certainly not least Soham Mazumdar, who worked at Google on the search engine and founded Tagtile.

As of today you can sign-up for the early access program. Go visit the website to read more and follow them on twitter.

Exciting times ahead for Rubrik! Don’t Backup. Go Forward!

Older posts Newer posts

© 2018

Theme by Anders NorenUp ↑