In vSphere 4.1 a new network Load Based Teaming (LBT) algorithm is available on the distributed virtual switch dvPort groups. The option “Route based on physical NIC load” takes the virtual machine network I/O load into account and tries to avoid congestion by dynamically reassigning and balancing the virtual switch port to physical NIC mappings.
The three existing load-balancing policies, Port-ID, MAC-based and IP-hash, use a static mapping between virtual switch ports and the connected uplinks. The VMkernel assigns a virtual switch port during the power-on of a virtual machine, and this virtual switch port is then assigned to a physical NIC using either a round-robin or a hashing algorithm. None of these algorithms takes the overall utilization of the pNIC into account. This can lead to a scenario where several virtual machines mapped to the same physical adapter saturate that NIC and fight for bandwidth while the other adapters are underutilized. LBT solves this by remapping virtual switch ports to a different physical NIC when congestion is detected.
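To make the imbalance concrete, here is a minimal sketch of a static mapping. The modulo rule, the uplink names and the per-port load figures are all assumptions made for illustration. This is not the VMkernel's actual placement code; it only shows that a mapping which ignores load can pile the busy ports onto one uplink.

```python
# Illustrative only: a static port-to-uplink mapping that ignores load.
# The modulo rule and all numbers below are assumptions, not VMware's code.

def static_uplink_for_port(port_id, uplinks):
    """Pick an uplink from the port ID alone; load is never consulted."""
    return uplinks[port_id % len(uplinks)]

def load_per_uplink(port_load_mbps, uplinks):
    """Sum the traffic of every port onto the uplink it is statically bound to."""
    totals = {uplink: 0.0 for uplink in uplinks}
    for port_id, mbps in port_load_mbps.items():
        totals[static_uplink_for_port(port_id, uplinks)] += mbps
    return totals

uplinks = ["vmnic0", "vmnic1"]                        # hypothetical uplink names
port_load = {0: 450.0, 1: 20.0, 2: 480.0, 3: 10.0}    # hypothetical Mbit/s per port
totals = load_per_uplink(port_load, uplinks)
print(totals)
```

With these made-up numbers the two busy virtual machines both land on vmnic0 while vmnic1 is nearly idle, which is exactly the situation LBT is meant to correct.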
After the initial virtual switch port to physical port assignment is completed, Load Based Teaming checks the load on the dvUplinks at a 30-second interval and dynamically reassigns port bindings based on the current network load and the level of saturation of the dvUplinks. The VMkernel considers the network I/O load congested if transmit (Tx) or receive (Rx) traffic exceeds a mean utilization of 75% over a 30-second period. (The mean is the sum of the observations divided by the number of observations.)
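The congestion test described above can be sketched as follows. The 75% threshold and the 30-second window come from the text. The one-sample-per-second granularity, the function names and the "move to the least-loaded uplink" rule are assumptions for illustration, not the actual VMkernel logic.

```python
from statistics import mean

CONGESTION_THRESHOLD = 0.75   # 75% mean utilization, as stated in the text
WINDOW_SAMPLES = 30           # assumption: one utilization sample per second

def is_congested(tx_samples, rx_samples, threshold=CONGESTION_THRESHOLD):
    """An uplink is congested if mean Tx OR mean Rx utilization exceeds the threshold."""
    return mean(tx_samples) > threshold or mean(rx_samples) > threshold

def least_loaded_uplink(mean_utilization):
    """Hypothetical target choice: the uplink with the lowest mean utilization."""
    return min(mean_utilization, key=mean_utilization.get)

# Sustained load: 20 s at 90% plus 10 s at 60% gives a mean of 80%.
sustained_tx = [0.9] * 20 + [0.6] * 10
# Short burst: 5 s at 100% plus 25 s at 10% gives a mean of 25%.
burst_tx = [1.0] * 5 + [0.1] * 25
idle_rx = [0.2] * WINDOW_SAMPLES

print(is_congested(sustained_tx, idle_rx))   # sustained load trips the check
print(is_congested(burst_tx, idle_rx))       # a brief burst does not
print(least_loaded_uplink({"dvUplink1": 0.80, "dvUplink2": 0.15}))
```

Because the test uses a mean over the whole window, a brief burst to line rate does not trigger a remap; only sustained load does.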
An interval of 30 seconds is used to avoid MAC address flapping issues with the physical switches. Even with this interval, it is recommended to enable PortFast (trunk fast) on the physical switches. All switches must be part of the same layer 2 domain.
During a test of the Cisco Nexus 1000V, the customer deleted the VSM first, without removing the DVS using commands from within the VSM, and ended up with an orphaned DVS. One can delete the DVS directly from the vCenter database, but there are a bunch of rows in multiple tables that need to be deleted. This is risky and may leave the database in an inconsistent state if an error is made while deleting any of the rows. Luckily there is a more elegant way to remove an orphaned DVS without hacking and possibly breaking the vCenter DB.
A little background first:
When the Cisco Nexus 1000V VSM is installed, the VSM uses an extension key for identification. During the configuration process the VSM spawns a DVS and configures it with the same extension key. Because the extension keys match (the extension session), the VSM essentially owns the DVS.
And only the VSM with the same extension-key as the DVS can delete the DVS.
So to be able to delete a DVS, a VSM must exist registered with the same extension key.
If you deleted the VSM and are stuck with an orphaned DVS, the first thing to do is to install and configure a new VSM. Use a different switch name than the first (deleted) VSM. The new VSM will spawn a new DVS matching the switch name configured within the VSM.
The first step is to remove the newly spawned DVS, and to do this the proper way, using commands from within the VSM virtual machine.
My background is Fibre Channel, and in early 2009 I implemented a large iSCSI environment. The “other” storage protocol supported by VMware, NFS, is rather unknown to me. And to be honest, I really tried to keep away from it as much as possible, thinking it was not a proper enterprise-worthy solution. That changed this month, as I was asked to perform a design review of an environment which relies completely on NFS storage. This customer decided to use IP-Hash as the load-balancing policy for their NFS vSwitch, but what impact does this have on the NFS environment?
In my first post I had a question about the path data travels when sent to a “standby” virtual connect module.
To quote my own question:
“What will happen if the VMkernel decides to use that nic to send IO?
Is the FlexNIC aware of the standby status of its “native” uplink? Will it send data to the uplink of the VC module it’s connected to, or will it send data to the active uplink?
How is this done? Will it send the IO through the midplane or CX-4 cable to the VC module with the active uplink? And if this occurs what will be the added latency of this behavior?
HP describes the standby status as blocked; what does this mean? Will Virtual Connect discard IO sent to the standby uplink, will it not accept IO, and how will it indicate this?”