Re: impact of large pages on consolidation ratios

Gabe wrote an article about the impact of large pages on the consolidation ratio. I want to make something clear before the wrong conclusions are drawn.

Large pages will be broken down into small pages if memory pressure occurs in the system. If no memory pressure is detected on the host, i.e. demand is lower than the available memory, the ESX host will try to leverage large pages to get the best performance.

Just calculate how many page translations the Translation Lookaside Buffer (TLB) has to cover for a 2GB virtual machine: with small pages that is 2048MB/4KB = 524,288 pages, with large pages 2048MB/2MB = 1,024 pages. Every one of these pages needs its own translation. And this is only for one virtual machine; imagine 50 VMs running on the host.
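
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the page-count calculation; the helper name `pages_needed` and the 50-VM figure reused from the text are illustrative, and it assumes the usual x86 page sizes of 4KB and 2MB.

```python
KB = 1024
MB = 1024 * KB

def pages_needed(vm_memory_bytes, page_size_bytes):
    """Number of pages required to back the VM's memory, rounded up."""
    return -(-vm_memory_bytes // page_size_bytes)

vm_memory = 2048 * MB                           # the 2GB virtual machine from the text
small = pages_needed(vm_memory, 4 * KB)         # small (4KB) pages
large = pages_needed(vm_memory, 2 * MB)         # large (2MB) pages

print("small pages per VM:", small)
print("large pages per VM:", large)
print("reduction factor:  ", small // large)
print("small pages, 50 VMs:", 50 * small)
print("large pages, 50 VMs:", 50 * large)
```

Each large page replaces 512 small pages, so the number of translations to track shrinks by the same factor.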

Like ballooning and compression, if there is no need to over-manage memory, then ESX will not do it, as it generates unnecessary load.

Using large pages shows a different memory usage level, but there is nothing to worry about. If memory demand exceeds the available memory, the VMkernel will resort to share-before-swap and compress-before-swap, resulting in collapsed pages and reduced memory pressure.
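
To illustrate what "collapsed pages" means, here is a toy model in Python, not the VMkernel's actual implementation. It assumes a 2MB large page is split into 4KB small pages, and that pages with identical content (matched by hash, then confirmed by a full comparison) can be backed by a single copy. The function names and the sample page contents are made up for the example.

```python
import hashlib

SMALL_PAGE = 4 * 1024
LARGE_PAGE = 2 * 1024 * 1024

def split_large_page(large_page):
    """Break one large page into its constituent small pages."""
    return [large_page[i:i + SMALL_PAGE]
            for i in range(0, len(large_page), SMALL_PAGE)]

def backing_pages_after_sharing(small_pages):
    """Count the distinct pages left once identical pages are collapsed."""
    seen = {}  # hash digest -> list of distinct page contents with that digest
    for page in small_pages:
        digest = hashlib.sha1(page).digest()
        candidates = seen.setdefault(digest, [])
        # A hash match is only a hint; confirm with a full content comparison.
        if not any(page == existing for existing in candidates):
            candidates.append(page)
    return sum(len(c) for c in seen.values())

# Hypothetical large page: one small page of data, the rest zero-filled.
large_page = b"\x01" * SMALL_PAGE + bytes(LARGE_PAGE - SMALL_PAGE)
small_pages = split_large_page(large_page)
backing = backing_pages_after_sharing(small_pages)

print("small pages before sharing:", len(small_pages))
print("backing pages after sharing:", backing)
```

In this model the sharing opportunity only becomes visible once the large page has been broken down, which is why the memory usage level looks different while large pages are in use.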

Comments

  1. says

    Fully agree here Frank. The impact is somewhat exaggerated; I will do a post of my own at some point this week.

  2. Matt Lynn says

    Frank –

    Thanks for the post. I’m still trying to understand where zero pages fit into this. Recent posts here & elsewhere have indicated that ‘zero pages are recognized and collapsed (via TPS) immediately’.

    Does this happen on allocation? Or just when memory pressure is being felt and TPS kicks in (‘immediate recognition’ perhaps being based on hints)?

    I.e., is there still a chance that a large non-shared page could exist entirely populated with zeros? If so, when would it get collapsed/TPS’d — when memory pressure kicks in?

    Or, is the zero page a special TPS case that is recognized even when there is no pressure?

    Thanks for any clarifications.

  3. YP Chien says

    Thanks for the post and great information.
    I would like to share the following two findings and please do correct me if I am wrong:
    We have done extensive memory allocation and memory reclamation tests on ESX 4.1 and found the following:
    1. On ESX servers with Intel Nehalem processors or later (with Extended Page Tables – EPT virtualization support), TPS does not kick in, and we can’t seem to verify that it does even when ESX is under memory pressure (e.g. when ballooning and kernel swapping take effect).
    2. Our tests further showed that all Windows operating systems zero all their memory pages during the boot process. ESX recognizes this zeroing mechanism and doesn’t actually “allocate” physical memory to these pages. Since these pages were zeroed out by the guest OS anyway, ESX pools all of them under the “shared” pages (SHRD) and “shared saved” (SHDSVD) statistics. These statistics only indicate that the memory pages should have been allocated to a specific VM but were not, as the VM did not need them yet. As such, we found that the impact on Windows VM boot time is almost negligible, and we could boot lots of VMs at the same time without causing a boot storm.