Increasing the queue depth?

When it comes to IO performance in the virtual infrastructure, one of the most recommended “tweaks” is changing the Queue Depth (QD). But most forget that the QD parameter is just a small part of the IO path. The IO path consists of layers of hardware and software components, and each of these components can have a huge impact on IO performance. The best results are achieved when the whole system is analysed, not just the ESX host alone.
 
To be honest, I believe that most environments will profit more from a balanced storage design than from adjusting the default values. But if the workload is balanced between the storage controllers and IO queuing still occurs, adjusting some parameters might increase IO performance.
Merely increasing the parameters can cause high latency, up to the point of major slowdowns. Some factors need to be taken into consideration.
 
LUN queue depth
The LUN queue depth determines how many commands the HBA is willing to accept and process per LUN. If a single virtual machine is issuing IO, the QD setting applies, but when multiple VMs simultaneously issue IOs to the LUN, the Disk.SchedNumReqOutstanding (DSNRO) value becomes the leading parameter.
 
Increasing the QD value without changing the Disk.SchedNumReqOutstanding setting will only be beneficial when one VM is issuing commands. It is considered best practice to use the same value for the QD and DSNRO parameters!
Read Duncan's excellent article about the DSNRO setting.
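
For reference, checking and changing DSNRO can be done from the service console; a minimal sketch, assuming ESX 3.5/vSphere 4 (verify the procedure for your own version before changing anything):

    # show the current Disk.SchedNumReqOutstanding value
    esxcfg-advcfg -g /Disk/SchedNumReqOutstanding
    # set DSNRO to 64, for example to match a LUN queue depth of 64
    esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding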

 
Qlogic Execution Throttle
Qlogic has a firmware setting called "Execution Throttle" which specifies the maximum number of simultaneous commands the adapter will send. The default value is 16; increasing the value above 64 has little to no effect, because the maximum parallel execution of SCSI operations is 64.
(Page 170 of ESX 3.5 VMware SAN System Design and Deployment Guide)
 
If the QD is increased, the execution throttle and the DSNRO must be set to similar values, but to calculate the proper QD, the fan-in ratio of the storage port needs to be calculated.
 
Target Port Queue Depth
A queue exists on the storage array controller port as well; this is called the Target Port Queue Depth. Modern midrange storage arrays, like most EMC and HP arrays, can handle around 2048 outstanding IOs. 2048 IOs sounds like a lot, but most of the time multiple servers communicate with the storage controller at the same time. Because a port can only service one request at a time, additional requests are placed in the queue, and when the storage controller port receives more than 2048 IO requests, the queue gets flooded. When the queue depth is reached (this status is called QFULL), the storage controller issues an IO throttling command to the host to suspend further requests until space in the queue becomes available. The ESX host accepts the IO throttling command and decreases the LUN queue depth to the minimum value, which is 1!
 
The VMkernel checks every 2 seconds whether the QFULL condition is resolved. If it is, the VMkernel slowly increases the LUN queue depth back to its normal value; this can take up to 60 seconds.
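
Before (and after) changing any of these values it pays to verify that queuing is actually occurring. A rough sketch, assuming the esxtop disk view of ESX 3.5/vSphere 4 (column names may differ slightly per version):

    # start esxtop on the service console and press 'd' for the disk view
    esxtop
    # watch ACTV (active commands), QUED (commands queued by the VMkernel)
    # and KAVG/cmd (kernel latency); a consistently non-zero QUED combined
    # with rising KAVG/cmd indicates IOs waiting in a queue on the host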
 
Calculating the queue depth/Execution Throttle value
To prevent flooding the target port queue, the number of host paths multiplied by the queue depth multiplied by the number of LUNs presented through the host port must not exceed the target port queue depth. In short: T >= P * Q * L

T = Target Port Queue Depth
P = Paths connected to the target port
Q = Queue depth
L = Number of LUNs presented to the host through this port

[Figure: Location of TPQL]

Despite having four paths to the LUN, ESX can only utilize one (active) path for sending IO. As a result, when calculating the appropriate queue depth, you use only the active path for "Paths connected to the target port (P)" in the calculation, i.e. P=1.
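
To check which path is currently active for a LUN (and therefore which target port its queue counts against), the multipath listing on the service console can be used; a small sketch, assuming ESX 3.x/4.0:

    # list all LUNs with their paths; the path flagged 'active' carries the IO
    esxcfg-mpath -l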

But in a virtual infrastructure environment, multiple ESX hosts communicate with the storage port; therefore the QD should be calculated with the following formula:
 
T >= ESX Host 1 (P * Q * L) + ESX Host 2 (P * Q * L) + ... + ESX Host n (P * Q * L)
 
For example, an 8-host ESX cluster connects to 15 LUNs (L) presented by an EVA8000 (4 target ports). An ESX server issues IO through one active path (P), so P=1 and L=15.
 
The execution throttle/queue depth can be set to 136.5: Q = T / (P * L) = 2048 / (1 * 15) = 136.5
But with this setting one ESX host can fill the entire target port queue by itself, while the environment consists of 8 ESX hosts: 136.5 / 8 = 17.06 per host
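
The same arithmetic can be redone for your own numbers on any machine with bc installed (a minimal sketch using the figures from the example above):

    # per-host queue depth = T / (number of hosts * P * L)
    echo "scale=2; 2048 / (8 * 1 * 15)" | bc    # 17.06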
 
In this situation all the ESX hosts communicate with all the LUNs through one port, which rarely happens if a proper load-balancing design is applied. Most arrays have two controllers and every controller has at least two ports. In the case of a controller failure, at least two ports remain available to accept IO requests.
 
It is possible to calculate the queue depth conservatively to ensure a minimal decrease in performance when losing a controller during a failure, but this will lead to underutilizing the storage array during normal operation, which will hopefully be 99.999% of the time. It is better to calculate a value which utilizes the array properly without flooding the target port queue.
 
If you assume that multiple ports are available and that all LUNs are balanced across the available ports on the controllers, this effectively quadruples the target port queue depth and therefore increases the execution throttle value in the example above to 68. Besides the fact that you cannot increase this value above 64, it is wise to keep the value somewhat below the maximum; this creates a safety buffer.
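
Continuing the sketch above with four target ports instead of one:

    # (4 * T) / (number of hosts * P * L)
    echo "scale=2; (4 * 2048) / (8 * 1 * 15)" | bc    # 68.26, roughly the 68 mentioned above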

What’s the Best Setting for Queue Depth?
The examples mentioned are pure worst-case-scenario stuff; most of the time it is highly unlikely that all hosts perform at their maximum level at the same time. Changing the defaults can improve throughput, but most of the time it is just a shot in the dark. Although you configure your ESX hosts with the same values, not every load on the ESX servers is the same. Every environment is different, and so the optimal queue depths differ. You need to test and analyse your own environment. Please do not increase the QD without analysing the environment; that can be more harmful than useful.

Responses

  1. Paul says:

    Hi Frank

    Could you explain this formula a bit more?

    The execution throttle can be set to 136,5=> T=2048 (1 * Q * 15) = 136,5

    Where does the 136,5 come from? I understand this would be the Qlogic execution throttle value, but I don't get what the value means.

    thanks

  2. Frank Denneman says:

    Hi Paul,

    I changed the text; I forgot to add the queue depth to that sentence. So substitute execution throttle for queue depth.

    The target port can accept 2048 outstanding IOs in its queue before shutting down communications; the ESX host in the example has 15 LUNs presented over that storage controller. IO is issued over one active path per LUN.
    Now you can calculate a queue depth that will give you the best performance without flooding the queue.
    If you set the queue depth to 136 (2048/15, rounded down), the ESX host is able to issue 2040 IOs.

    But (hopefully) you do not have just one ESX host, so you will need to divide the target port queue by the number of hosts....

    Does the rest of the post make any sense?

  3. duncan74 says:

    On my blog someone suggested to keep the "DSNRO" lower than the QD. This way it won't be possible for just one VM to fill up the entire queue for a host.

    Great article again by the way, keep them coming!

  4. wharlie says:

    Advice please:
    I am an ESX admin, not a storage admin.
    The institution where I work is using 1 LUN per VM (sometimes more than one, e.g. a LUN per VMDK).
    Their environment currently consists of 3 hosts running 86 VMs connected to 100 LUNs. Most LUNs have 2 paths. The SAN is active/passive.
    According to your formula this would equal 2048/100/3*2=13.65.
    What settings should I be using for QD and DSNRO? And should I be telling the storage admins to stop assigning 1 LUN per VM? What are the implications of continuing down this path?

  5. PiroNet says:

    "When the queue depth is reached, (QFULL) the storage controller issues an IO throttling command to the host..."

    How do you see that at the host level?
    Log-wise, what should I grep for?

    Thx,

  6. Frank Denneman says:

    Good question!
    To my knowledge the QFULL condition is handled by the Qlogic driver itself; the QFULL status is not returned to the OS. But some storage devices return BUSY rather than QFULL, and BUSY errors are logged in /var/log/vmkernel.
    But not every BUSY error is a QFULL error!
    A KB article exists with SCSI sense codes: KB here

    If you have suspicious error codes in the logs, you can check them with the vmproffesional.com SCSI Error Decoder
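
    As a starting point (a rough sketch only; the exact log format differs per ESX version and driver), something like this surfaces the BUSY messages mentioned above:

        # look for SCSI BUSY conditions reported in the VMkernel log
        grep -i "busy" /var/log/vmkernel | less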

  7. Frank Denneman says:

    One VM per LUN is a really nice situation.
    The VM has its own path and does not need to compete with other VMs sharing the LUN.
    If more than one VM issues IO to the LUN, the VMkernel controls the IO with some sort of "fairness" scheme; that means if two VMs issue IOs to the LUN, the VMkernel decides which VM can write the IO. This has to do with sector proximity and the number of consecutive sequential IOs allowed from one VM. You just gave me an idea for a new article 🙂

    Did you ever calculate the storage utilization rate with your VM-to-LUN ratio?
    How much free space do you have on every LUN?

    In ESX you will always communicate over one active path.
    The DSNRO setting does not apply in your situation, because you are hosting one VM per LUN.
    The QD applies.

    Please contact your SAN vendor to ask for the target port queue depth before setting the QD.

    How many storage controllers does your SAN have?

  8. Paul says:

    Hi Frank, yes it makes perfect sense now, just what I was looking for. Cheers!

    Paul.

  9. Hi Frank,

    Great post!

    your formula implies a 100% virtual situation.
    What about those physical guests connected to the same storage controllers?
    When determining your QD settings you have to take them into account also.

    -Arnim

  10. Frank Denneman says:

    Thx for the compliment,

    Yes, the formula implies a 100% virtual situation, just to keep it "simple".
    When other systems connect to the storage arrays, you will need to take them into account as well.

    Microsoft published two excellent documents; both are must-reads for Windows admins:
    Disk Subsystem Performance Analysis for Windows
    and
    Performance Tuning Guidelines for Windows Server 2003

    In the Disk Subsystem doc, the NumberOfRequests setting is explained:

    NumberOfRequests
    Both SCSIport and Storport miniport drivers can use a registry parameter to designate how much concurrency is allowed on a device by device basis.
    The default is 16, which is much too small for a storage subsystem of any decent size unless quite a number of physical disks are being presented to the operating system by the controller.
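
    As an illustration only (a hedged sketch, not taken from the Microsoft documents above; the service key name depends on your miniport driver, and ql2300 is used here purely as a hypothetical example), the Storport flavour of this parameter lives under a registry key along these lines:

        rem set NumberOfRequests to 64 for a (hypothetical) ql2300 Storport miniport
        reg add "HKLM\SYSTEM\CurrentControlSet\Services\ql2300\Parameters\Device" /v NumberOfRequests /t REG_DWORD /d 64 /f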

    Up until today I have worked on virtual infrastructures which contained only Windows systems, so I do not know the default Linux and Solaris queue depths yet.
    But I'm working on a new virtual infrastructure for a new client of mine, which will host Windows, Linux and Solaris VMs, so I will post those settings soon.

  11. jame says:

    As you write: you use the same value for Queue Depth, DSNRO and Execution Throttle. And you say that the max Execution Throttle is 64.
    So you never set the Queue Depth higher than 64?

  12. aru says:

    Please tell me how to configure the adapter queue depth value (AQLEN) in ESX 4.

  13. Frank Denneman says:

    Hi Aru,

    Changing the queue depth in ESX 4 is quite similar to changing the queue depth in ESX 3.5. For the example I used the popular Qlogic qla2300_707 driver:

    1. Open a connection to the service console.
    2. Issue the command: vicfg-module -s ql2xmaxqdepth=64 qla2300_707
    3. Reboot the system.

    The ql2xmaxqdepth=64 option sets the queue depth to 64.

    The vSphere SAN guide also lists how to change the QD of an Emulex HBA; see page 70/71. http://www.vmware.com/pdf/vsphere4/r40/vsp_40_san_cfg.pdf
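
    As a sanity check after the reboot (hedged; option syntax can differ between versions), something like the following should confirm the setting took effect:

        # show the options currently configured for the qla2300_707 module
        vicfg-module -g qla2300_707
        # or check the DQLEN column in esxtop's disk device view (press 'u')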

    Good luck!

  15. vinod shettigar says:

    How do you change the queue depth in Windows Server 2008 R2?

  16. AB says:

    Hi Frank,

    Good post.

    I have a genuine and simple question; I hope everybody would like to know when to alter the QD from the default one. Suppose my environment is running well, with some slowness at month end (billing application servers and all), and I don't have any QD errors with iSCSI code 28 or 40 reported in my vmkernel log. How should I check whether there is any possibility of high I/O and whether I should go for it?

    Thanks.
