Prevent knockouts of monster VMs with this patch and heap configuration

You have probably never heard of the VMFS heap size used by VMware vSphere. That likely means you have not run into issues yet. However, when using VMware vSphere with relatively large virtual disks located on VMFS volumes, you can run into some serious issues!

This posting gives the reader some insight into how to avoid those issues by adjusting settings and applying a recently published VMware patch.

An increasing number of organizations are using hosts with serious amounts of physical memory and compute power to be able to run many VMs per host. VMs with large amounts of memory and many virtual CPUs are called monster VMs.

[Image: vmware_monster_vm]

However, those monsters can easily be knocked out when they run a couple of large VMDKs. For instance, you might run into issues when running multiple virtual machines such as Exchange servers, file servers or database servers with large VMDKs on the same ESX host.

In a default installation, each VMware vSphere 4.1 and 5.x host can address a maximum of 8 TB of open VMDK files on VMFS-5 volumes. So if the sum of all active VMDKs on a given ESX host exceeds 8 TB, you will run into issues; see later in this posting for what kind of issues.

This is caused by the size of the VMFS heap in the vmkernel.

The maximum heap size in ESX 4.1/ESX 5.x is 128 MB in a default installation. The maximum heap size has been further increased in ESXi 5.0 patch ESXi500-201303001 to 640 MB, which should allow for 60 TB of open virtual disk capacity on a single ESX/ESXi host. VMware released this patch on March 28, 2013.

What is VMFS heap size?

VMFS heap is the part of the host's physical memory that the kernel reserves for file handling on VMFS volumes. The heap memory contains pointers to data blocks of VMDK files on VMFS volumes. I could not find more detailed information on the heap internals.

Heap size is set in the advanced settings of the ESX host. The default setting of VMFS3.MaxHeapSizeMB in vSphere 4.1 and 5.x is such that on a given ESX host a maximum of 8 TB of VMDK files located on VMFS-5 formatted volumes can be opened. This restriction does not apply to RDMs or NFS.

Reported issues when running out of heap size

Various issues are reported when the heap size is too low, depending on the action performed. “Cannot allocate memory” seems to be the most common error when the heap is full.

  1. Errors when performing a vMotion: A general system error occurred: Source detected that destination failed to resume.
  2. Virtual machines do not start. Errors shown (for example in vmkernel.log) are:
    The VM failed to resume on the destination during early power on.
    Reason: 0 (Cannot allocate memory).
    Cannot open the disk '<<Location of the .vmdk>>' or one of the snapshot disks it depends on.
    vSphere HA unsuccessfully failed over this virtual machine. vSphere HA will retry if the maximum number of attempts has not been exceeded. Reason: Cannot allocate memory.
    WARNING: Heap: 2525: Heap vmfs3 already at its maximum size. Cannot expand.
    WARNING: Heap: 2900: Heap_Align(vmfs3, 2099200/2099200 bytes, 8 align) failed. caller: 0x4180368c0b90
    An unexpected error was received from the ESX host while powering on VM vm-xxx. Reason: (Cannot allocate memory)
  3. Write errors reported in Windows guests when copying large amounts of data from and to large VMDK files
  4. Errors running Microsoft Exchange Server Jetstress
  5. Windows reporting lack of storage capacity.
  6. Issues reported in vSphere Client when performing storage migrations (storage vMotion).
    A storage vMotion of a large VMDK fails at 10%. A storage vMotion of a smaller VMDK completes with no issues.
    Relocate virtual machine <virtual machine name>. A general system error occurred: Storage VMotion failed to copy one or more of the VM ‘s disks.  Please consult the VM’s log for more details, looking for lines starting with “SVMotion”.
    “Failed to create one or more destination disks. Canceling Storage vMotion. Storage vMotion failed to create the destination disk /vmfs/volumes/…. (Cannot allocate memory).”
  7. Issues while performing a backup. Backup applications mount VMDK files. If during a backup job large VMDKs are added one at a time, the heap size maximum could be reached.

In /var/log/vmfs/volumes/DatastoreName/VirtualMachineName/vmware.log you will see “Cannot allocate memory” errors listed.

Heap size and maximum active VMDK files

The default heap size in ESXi/ESX 3.5/4.0 for VMFS-3 is set to 16 MB. This allows for a maximum of 4 TB of open virtual disk capacity on a single ESXi/ESX host if a block size of 1 MB is set. If a block size of 8 MB is set, the maximum is 32 TB.

The default heap size has been increased in ESXi/ESX 4.1 and ESXi 5.x to 80 MB, which allows for 8 TB to 10 TB of open virtual disk capacity on a single ESXi/ESX host.

In ESXi 5.x, the maximum heap size is 256 MB without the hotfix. This allows 25 TB to 30 TB of open files/VMDKs to be addressed by a single ESXi host.

The default heap size has been further increased in ESXi 5.0 Patch, ESXi500-201303001 to 640MB, which should allow for 60TB of open virtual disk capacity on a single ESX/ESXi host.

To make an adjusted heap size active, the ESX host needs a reboot!
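The figures above follow a rough rule of thumb: in ESXi 4.1/5.x, every megabyte of VMFS heap buys roughly 100 GB of open VMDK capacity. The sketch below is my own back-of-the-envelope calculation derived from the numbers in this posting, not an official VMware formula; the documented figures (8 TB, 25–30 TB, 60 TB) line up with it to within rounding.

```shell
# Approximate open-VMDK capacity per heap size, assuming ~100 GB per MB
# of heap (an approximation inferred from the figures in this posting).
for heap_mb in 80 256 640; do
  echo "${heap_mb} MB heap -> ~$((heap_mb / 10)) TB open VMDK capacity"
done
```

Running this prints roughly 8 TB for the 80 MB default, 25 TB for the 256 MB maximum, and 64 TB for the patched 640 MB maximum (VMware quotes 60 TB for the latter, so treat the ratio as indicative only).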

How to check active heap usage

The usage of the heap memory can be checked from the CLI. Start vsish, then cat the stats node of the vmfs3 heap (press the Tab key after typing vmfs to auto-complete the heap name, then press Enter):

~ # vsish
/> cat /system/heaps/vmfs<press Tab to auto-complete>/stats

For example the output might look like this.

~ # vsish
/> cat /system/heaps/vmfs3-0x410016400000/stats
Heap stats {
   Name:vmfs3
   dynamically growable:1
   physical contiguity:MM_PhysContigType: 1 -> Any Physical Contiguity
   lower memory PA limit:0
   upper memory PA limit:-1
   may use reserved memory:0
   memory pool:19
   # of ranges allocated:2
   dlmalloc overhead:1008
   current heap size:12587984
   initial heap size:2097152
   current bytes allocated:10620048
   current bytes available:1967936
   current bytes releasable:4000
   percent free of current size:15
   percent releasable of current size:0
   maximum heap size:83886080
   maximum bytes available:73266032
   percent free of max size:87
   lowest percent free of max size ever encountered:84
}

“Percent free of max size” (87 in the example) tells you the actual free heap percentage. “Lowest percent free of max size ever encountered” (84 here) shows the worst case seen so far. If this value is down to around 20, it is time to increase the heap size and/or apply the VMware hotfix ESXi500-201303001.
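That check can be scripted. The snippet below is a minimal sketch of my own (not a VMware tool): it feeds saved vsish stats output through awk and warns when the lowest free percentage ever observed drops below the ~20% threshold mentioned above. On a live host you could pipe the real output of the vsish cat command instead of the here-doc shown here.

```shell
# Warn when the worst-case free heap percentage drops below 20.
# The here-doc stands in for saved vsish output; replace it with the
# live stats on an ESXi host.
awk -F: '/lowest percent free of max size/ {
  v = $2 + 0
  if (v < 20) print "WARNING: heap nearly full (" v "% lowest free) - increase VMFS3.MaxHeapSizeMB"
  else        print "heap headroom OK (" v "% lowest free)"
}' <<'EOF'
   maximum heap size:83886080
   percent free of max size:87
   lowest percent free of max size ever encountered:84
EOF
# prints: heap headroom OK (84% lowest free)
```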

How to adjust the heap size

Solving or preventing issues like those reported above is simple: adjust the VMFS3.MaxHeapSizeMB setting on each ESX host, reboot the host, and you are done. For ESX(i) 4.x and 5.x the default value is 80, meaning 80 MB of host memory is reserved for heap. Increase it to the maximum setting of 256 MB. This will ‘cost’ 176 MB of host memory. This is a no-brainer: 176 MB of memory costs nothing compared to the possible issues caused by the default value of VMFS3.MaxHeapSizeMB.

To adjust the value:

  1. Log into vCenter Server or the ESXi/ESX host using the vSphere Client or VMware Infrastructure (VI) Client. If connecting to vCenter Server, select the ESXi/ESX host from the inventory.
  2. Click the Configuration tab.
  3. Click Advanced Settings.
  4. Select VMFS3.
  5. Update the field in VMFS3.MaxHeapSizeMB.

Although the setting is named VMFS3.MaxHeapSizeMB, it applies to VMFS-5 volumes as well!
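The same change can be made from the ESXi shell instead of the vSphere Client. The commands below are a sketch for ESXi 5.x; verify the option path on your own build before scripting this across hosts.

```shell
# Read the current value of the heap size setting:
esxcfg-advcfg -g /VMFS3/MaxHeapSizeMB

# Set it to the 5.x maximum of 256 MB:
esxcfg-advcfg -s 256 /VMFS3/MaxHeapSizeMB

# Alternatively, via esxcli on ESXi 5.x:
esxcli system settings advanced set -o /VMFS3/MaxHeapSizeMB -i 256

# A reboot of the host is still required for the new heap size to take effect.
```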

[Image: vmfs-heap advanced setting]

VMware has a KB article titled An ESX/ESXi host reports VMFS heap warnings when hosting virtual machines that collectively use 4 TB or 20 TB of virtual disk storage (1004424).

See this VMware KB article for more information on adjusting the heap size.

The future of VMFS and heap

VMware should increase both the maximum size of a VMDK file and the maximum amount of active virtual disk capacity per host. Competitor Hyper-V has a maximum of 64 TB per VHDX virtual disk file. I do not know whether there is a maximum amount of active virtual disks per Hyper-V host.

This posting by VMware employee and storage expert Cormac Hogan hints at improvements to the maximum VMDK size and to the maximum amount of open VMDK files per host.

[Image: vmfs-changes]

Additional information

http://blogs.vmware.com/vsphere/2012/08/vmfs-heap-considerations.html

http://bizsupport1.austin.hp.com/bizsupport/TechSupport/Document.jsp?prodSeriesId=3690376&prodTypeId=18964&objectID=c03216142

http://www.boche.net/blog/index.php/2012/09/

http://vmnick.com/2012/12/20/vmfs-5-heap-size/

http://longwhiteclouds.com/2012/09/17/the-case-for-larger-than-2tb-virtual-disks-and-the-gotcha-with-vmfs/

http://virtualkenneth.com/2011/02/15/vmfs3-heap-size-maxheapsizemb/


About Marcel van den Berg
I am a technical consultant with a strong focus on server virtualization, desktop virtualization, cloud computing and business continuity/disaster recovery.
