frankdenneman.nl


Restart vCenter results in DRS load balancing

April 8, 2011 by frankdenneman

Recently I had to troubleshoot an environment that appeared to have a DRS load-balancing problem. Every time a host was brought out of maintenance mode, DRS did not migrate virtual machines to the empty host. Virtual machines were eventually migrated to it, but only after a couple of hours had passed. After a restart of vCenter, however, DRS immediately started migrating virtual machines to the empty host.
Restarting vCenter removes the cached historical information about the impact of vMotion. This information is part of the cost-benefit-risk analysis, which DRS uses to determine the return on investment of a migration. By comparing the cost, benefit and risk of each migration, DRS tries to avoid migrations that offer insufficient improvement to the load balance of the cluster.
When the historical information is removed, a big part of the cost segment is lost. This leads to a more positive ROI calculation, which in turn results in a more “aggressive” load-balancing operation.
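
To make the reasoning concrete, here is a deliberately simplified toy model in Python. It is not the actual DRS algorithm: the class, the field names, the numbers and the simple "benefit minus cost" rule are all assumptions made for illustration. It only shows why dropping a history-based cost term can turn a rejected migration into an accepted one.

```python
from dataclasses import dataclass


@dataclass
class MigrationCandidate:
    """Hypothetical inputs for one candidate migration (arbitrary units)."""
    benefit: float          # expected improvement of the cluster load balance
    base_cost: float        # cost that can be estimated without any history
    historical_cost: float  # cost derived from cached impact of earlier vMotions


def net_gain(candidate: MigrationCandidate, history_available: bool = True) -> float:
    """Return benefit minus cost; the history term counts only if it is cached."""
    cost = candidate.base_cost
    if history_available:
        cost += candidate.historical_cost
    return candidate.benefit - cost


def should_migrate(candidate: MigrationCandidate,
                   history_available: bool = True,
                   threshold: float = 0.0) -> bool:
    """Recommend the migration only when the net gain exceeds the threshold."""
    return net_gain(candidate, history_available) > threshold


# Illustrative numbers only, not measured values.
candidate = MigrationCandidate(benefit=10.0, base_cost=4.0, historical_cost=8.0)

print(should_migrate(candidate, history_available=True))   # False: 10 - (4 + 8) = -2
print(should_migrate(candidate, history_available=False))  # True:  10 - 4 = 6
```

With the cached history in place, the cost outweighs the benefit and the move is skipped. Without it, the same candidate looks profitable, which matches the burst of migrations seen right after the restart.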

Filed Under: DRS

Comments

  1. Cody Bunch says

    April 8, 2011 at 3:36 pm

    Interesting to note. Is there a less disruptive way than restarting the entire vCenter service to effect the cost-benefit cache clearing?

  2. Duncan says

    April 8, 2011 at 3:38 pm

    Good stuff, Frank.

  3. Sketch says

    April 8, 2011 at 4:09 pm

    So was this due to a restart of the application, or a restart of the underlying vCenter service?

  4. Doug says

    April 8, 2011 at 4:57 pm

    Wow. I never thought about that. Do you think it would make sense to have vCenter cache that information more persistently in the future? At least when vCenter gets shut down cleanly?

  5. Michael says

    April 8, 2011 at 5:04 pm

    I have seen this behavior since the VirtualCenter 2.5 days. Thanks for the explanation.
    However, I would like DRS to start load balancing the hosts 10-15 minutes after the vCenter service starts, not immediately. Many times during troubleshooting or upgrades, when several restarts of the vCenter service are required, I find myself unable to connect to vCenter because DRS is performing multiple vMotions at once.
    In this situation, a “postponed DRS start” feature would be nice and would save a lot of time.
    Michael.

  6. NiTRo says

    April 8, 2011 at 5:05 pm

    Thanks Frank, it explains a lot of things indeed

  7. Heino says

    April 8, 2011 at 7:16 pm

    Great post and yeah, that explains why it doesn’t happen faster.

  8. Jeff says

    April 8, 2011 at 10:28 pm

    That explains it then! I have run into this for years, it seems, and I just accepted the vMotion of tons of VMs after a vCenter restart. Thanks Frank, I can sleep better tonight :).

  9. Manish Patel says

    April 9, 2011 at 4:12 am

    How about artificially creating a load on the cluster, by shutting down some non-critical VMs or through some other non-destructive activity, to see whether DRS is triggered again?
    This needs to be tested for sure. I will take some time for it once I get back to work on Tuesday and keep you/Duncan posted.

  10. Kevin Foster says

    April 11, 2011 at 2:57 pm

    So I am not crazy! Thanks for the explanation!

  11. Daniel says

    April 16, 2011 at 10:49 am

    I have accepted this behavior as part of a less aggressive DRS setting, but it always seemed weird that one host out of four would be left completely empty for a good while, since losing a loaded host to a fault will have a larger impact on the entire cluster.

  12. Nathan Bahls says

    June 30, 2011 at 3:20 pm

    Frank,
    Many thanks for the explanation. Are you aware of any advanced configuration options with which we can alter this behavior post vCenter Server service restart?
    Many thanks!

  13. NiTRo says

    December 26, 2011 at 1:14 am

    According to the vCenter 4.1 U2 release notes, it’s resolved: http://www.vmware.com/support/vsphere4/doc/vsp_vc41_u2_rel_notes.html
    When DRS is set to automatic, restarting the VMware VirtualCenter Server service might generate a large number of vMotion tasks leading to unnecessary movement of virtual machines. The vMotion tasks are queued and make the management of virtual machines difficult until the tasks are completed.
    This issue is resolved in this release.

  14. Brandon says

    January 9, 2012 at 6:42 pm

    Frank, just curious if anyone figured out a way to force DRS to flush its cache without restarting vCenter? I have some smaller clusters which seem to have given up attempting to truly balance the load, even though one of the resources is definitely out of balance.
    For now restarting vCenter is doable, but at some point it may become a nightmare. And with the issue being fixed in 4.1 U2, I wonder if that will even be a valid method to refresh the cache.

Copyright © 2025