Tag: vSphere

Virtually Speaking Podcast about Technical Writing

Last week Duncan and I were guests on the ever popular Virtually Speaking Podcast. In this show we discussed the difference in technical writing, i.e., writing a blog post versus writing a book. We spoke a lot about the challenges of writing a book and the importance of a supporting cast. We received a lot of great feedback on social media, and Pete told me the episode was downloaded more than a 1000 times in the first 24 hours. I think this is especially impressive as he published the podcast on a Saturday Afternoon. Due to this popularity, I thought it might be cool to share the episode in case you missed the announcement.

Pete and John shared the links to our VMworld sessions on this page. During the show, I mentioned the VMworld session of Katarina Wagnerova and Mark Brookfield. If you go to VMworld, I would recommend attending this session. It’s always interesting to hear people talk about how they designed an environment and dealt with problems in a very isolated place on earth.

Enjoy listening to the show.

To which host-level latency statistic is the SIOC congestion threshold related?

Today someone asked if the congestion threshold of SIOC is related to which host latency threshold? Is it the Device average (DAVG), Kernel Average (KAVG) or Guest Average (GAVG)?

Well actually it’s none of the above. DAVG, KAVG and GAVG are metrics in a host-local centralized scheduler that has complete control over all the requests to the storage system. SIOC main purpose is to manage shared storage resources across ESXi hosts, providing allocation of I/O resources independent of the placement of virtual machines accessing the shared datastore. And because it needs to regulate and prioritize access to shared storage that spans multiple ESXi hosts, the congestion threshold is not measured against a host-side latency metric. But to which metric is it compared? In essence the congestion threshold is compared with the weighted average of D/AVG per host, the weight is the number of IOPS on that host. Let’s expand on this a bit further.

Average I/O latency
To have an indication of the load of the datastore on the array, SIOC uses the average I/O latency detected by each host connected to that datastore. Average latency across hosts is used to cope with the variety of workloads, the characteristic of the active workloads, such as read versus writes, I/O size and degree of sequential I/Os in addition to array behavior such as block location, caching policies and I/O scheduling.

To calculate and normalize the average latency across hosts, each host writes its average device latency and number of I/Os for that datastore in a file called IORMSTATS.SF stored on the same datastore.

A common misconception about SIOC is that it’s compute cluster based. The process of determining the datastore-wide average latency really reveals the key denominator – hosts connected to the datastore – . All hosts connected to the datastore write to the IORMSTATS.SF file, regardless of cluster membership. Other than enabling SIOC, vCenter is not necessary for normal operations. Each connected host reads the IORMSTATS.SF file each 4 seconds and locally computes the datastore-wide average to use for managing the I/O stream. Therefor cluster membership is irrelevant.

Datastore wide normalized I/O latency
Back to the process of computing the datastore wide normalized I/O latency. The average device latencies of each host are normalized by SIOC based on the I/O request size. As mentioned before, not all storage related workloads are the same. Workloads issuing I/Os with a large request size result in longer device latencies due to way storage arrays process these workloads. For example, when using a larger I/O request size such as 256KB, the transfer might be broken up by the storage subsystem into multiple 64KB blocks. This operation can lead to a decline of transfer rate and throughput levels, increasing latency. This allows SIOC to differentiate high device latency from actual I/O congestion at the device itself.

Number of I/O requests complete per second
At this point SIOC has normalized the average latency across hosts based on I/O size, next step is to determine the aggregate number of IOPS accessing the datastore. As each host reports the number of I/O requests complete per second, this metric is used to compare and prioritize the workloads.

I hope this mini-deepdive into the congestion thresholds explains why the congestion threshold could never be solely related to a single host-side metric . Because the datastore-wide average latency is a normalized value, the latency observed of the datastore per individual host may be different than the latency SIOC reports per datastore.


VMworld 2009 sessions

After skipping both VMworld events in 2008 I’m attending the VMworld event in San Francisco. This is the first event after VMware released vSphere and I hope to see much in-depth information about the OS and it’s new features.  Duncan Epping  and Eric Sloof  posted info about interesting sessions, so I started browsing the session catalog as well.
The following sessions seems to be very interesting; Continue reading

© 2018 frankdenneman.nl

Theme by Anders NorenUp ↑