What is Elastic Memory for Java?

The Java virtual machine (JVM) manages memory automatically for applications, storing objects in heap memory pre-allocated to it by the operating system and freeing the memory when objects are no longer needed. It takes a lazy approach, however, allowing unreferenced objects to remain in heap memory until it needs to free up space for additional objects. This lazy approach works fine when Java runs directly on physical hardware, but it can lead to performance problems when you try to leverage certain key benefits of virtualization. For Java applications running on ESXi that require consistent, fast response times, the traditional best practice has been to reserve 100% of the virtual machine memory and to make the Java heap as small as possible to avoid wasting memory. As a result, Java workloads could not benefit as effectively as other workloads from ESXi’s industry-leading memory reclamation technology.
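
The sketch below illustrates this lazy behavior; the class name and allocation sizes are invented for illustration only. Objects that have become unreachable continue to count against used heap memory until a collection actually runs, so the "before" figure typically remains high even though the application no longer needs the space.

    public class LazyHeapDemo {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();

            // Allocate roughly 64 MB of objects, then drop every reference to them.
            byte[][] garbage = new byte[64][];
            for (int i = 0; i < garbage.length; i++) {
                garbage[i] = new byte[1024 * 1024];
            }
            garbage = null; // the objects are now unreachable...

            // ...but the JVM does not give the space back right away; used heap
            // typically stays high until a collection is actually needed.
            System.out.printf("Used heap before GC: %d MB%n",
                    (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024));

            System.gc(); // only a hint; normally the JVM collects when space runs short
            System.out.printf("Used heap after GC:  %d MB%n",
                    (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024));
        }
    }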

Elastic Memory for Java (EM4J) manages a memory balloon that sits directly in the Java heap and works with new memory reclamation capabilities introduced in ESXi 5.0. EM4J works with the hypervisor to communicate system-wide memory pressure directly into the Java heap, forcing Java to clean up proactively and return memory at the most appropriate time: when it is least active. You no longer have to be so conservative with your heap sizing, because unused heap memory is no longer wasted on uncollected garbage objects. And you no longer have to give Java 100% of the memory it might need at peak; EM4J ensures that memory is used more efficiently, without risking sudden and unpredictable performance problems.

EM4J is an add-on to vFabric tc Server. With EM4J and tc Server, you can run more Java applications on your ESXi servers with predictable performance, squeezing the most value out of your hardware investment.

How ESXi and EM4J Improve Memory Utilization

A chief benefit of virtualization is the ability to maximize physical host resources by adding virtual machines to consume unused computing capacity. Recouping unused CPU cycles lowers the costs of hardware, service, management, maintenance, and energy. Available physical memory can limit the number of virtual machines you can deploy on a host, even when CPU is still underutilized. To get the most out of your hardware, you can either add more physical memory or use the existing memory more efficiently.

ESXi ensures the efficient use of physical memory by employing shared memory pages, ballooning, memory compression, and, as a last resort, disk swapping. These techniques allow ESXi to reduce the amount of physical memory consumed by the virtual machines. Disk swapping, where memory pages are written to and from disk, is the least efficient because disk I/O is much slower than memory I/O. With Java, disk swapping is especially disruptive: garbage collection touches pages throughout the heap, so when it occurs, page swapping activity can spike, causing performance degradation and failures.

Transparent Page Sharing

ESXi constantly scans memory pages in virtual machines for duplicate contents and can reduce memory overhead by eliminating the duplication. The indirection between virtual memory pages in the virtual machines and physical memory pages on the host makes this possible. Duplicate pages are mapped to the same physical page; for example, zero-filled pages and the static code of virtual machines running the same OS and applications are often identical and can be shared. When a virtual machine writes to a shared page, it is transparently given a unique copy, which breaks the sharing for that page. The pages reclaimed by transparent page sharing can significantly improve your consolidation ratio.
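
As a rough illustration of the idea, the following Java sketch models content-based sharing with copy-on-write. The class, method names, and hash-based lookup are assumptions made for illustration; real transparent page sharing runs inside the hypervisor on machine pages and verifies full page contents rather than trusting a hash.

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;

    // Simplified model of content-based page sharing with copy-on-write.
    public class PageSharingSketch {
        // One shared physical copy per distinct page content (keyed by content hash).
        private final Map<Integer, byte[]> sharedPages = new HashMap<>();
        // Each guest virtual page maps to a (possibly shared) physical page.
        private final Map<Long, byte[]> pageTable = new HashMap<>();

        // Map a guest page; pages with identical contents collapse to one copy.
        void mapPage(long guestPageNumber, byte[] contents) {
            int key = Arrays.hashCode(contents);
            byte[] shared = sharedPages.computeIfAbsent(key, k -> contents.clone());
            pageTable.put(guestPageNumber, shared);
        }

        // On write, give the guest a private copy so the shared page stays intact.
        void writeByte(long guestPageNumber, int offset, byte value) {
            byte[] privateCopy = pageTable.get(guestPageNumber).clone();
            privateCopy[offset] = value;   // breaks the sharing for this page only
            pageTable.put(guestPageNumber, privateCopy);
        }
    }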

Memory Compression

ESXi employs memory compression to avoid swapping when memory becomes tight. To free memory, ESXi attempts to compress memory pages and save them in a compression cache. Compressing and decompressing pages consumes CPU cycles, but it is still far more efficient than disk I/O. Memory compression is the last opportunity to reclaim memory before disk swapping.
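
The following sketch conveys the trade-off: a page is kept in an in-memory compression cache only if it shrinks enough to be worth holding, and otherwise the caller would fall back to swapping it to disk. The class name, 50% threshold, and cache layout are illustrative assumptions, not ESXi internals.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.zip.Deflater;

    // Rough model of compressing a memory page into a cache instead of swapping it.
    public class CompressionCacheSketch {
        private final Map<Long, byte[]> compressionCache = new HashMap<>();

        // Returns true if the page compressed well enough to keep in memory.
        boolean tryCompress(long pageNumber, byte[] page) {
            Deflater deflater = new Deflater(Deflater.BEST_SPEED);
            deflater.setInput(page);
            deflater.finish();
            byte[] buffer = new byte[page.length];
            int compressedSize = deflater.deflate(buffer);
            deflater.end();

            // Only worthwhile if the page shrinks substantially (here, to half or less).
            if (compressedSize > 0 && compressedSize <= page.length / 2) {
                byte[] stored = new byte[compressedSize];
                System.arraycopy(buffer, 0, stored, 0, compressedSize);
                compressionCache.put(pageNumber, stored);
                return true;
            }
            return false; // caller falls back to swapping the page to disk
        }
    }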

VMware Tools Balloon

There is an important separation of concerns between ESXi and the virtual machines running on it: the memory management techniques employed within a virtual machine are completely opaque to the hypervisor. The hypervisor therefore has no knowledge of how memory allocated to a virtual machine is being used, and when it swaps memory to disk or compresses memory, it may act on memory that is still in use. A far more efficient mechanism for reclaiming memory from a virtual machine is the VMware Tools balloon driver, which runs as a process within the virtual machine, allocates and pins unused memory, and communicates the pages back to the hypervisor. The memory owned by the balloon driver can then be temporarily decoupled from the virtual machine and used elsewhere.

ESXi gives the balloon driver a target size, and the balloon driver attempts to fulfill the request. Under the control of the ESXi hypervisor, the balloon in each virtual machine on the host expands or shrinks depending on the shifting requirements of the virtual machines. ESXi calculates balloon targets based on virtual machine activity: a less active virtual machine gets a higher balloon target, and the reclaimed memory moves to the more active virtual machines.
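
Conceptually, the balloon driver runs a simple control loop toward the hypervisor's target, as in the hypothetical sketch below. The class and method names are invented; the real driver is part of VMware Tools, pins guest physical pages, and reports their page numbers to the hypervisor over a private channel rather than simply holding Java arrays.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Simplified model of a guest balloon driver tracking a hypervisor-set target.
    public class BalloonDriverSketch {
        private static final int PAGE_SIZE = 4096;
        private final Deque<byte[]> balloonPages = new ArrayDeque<>();

        // Called whenever the hypervisor publishes a new balloon target (in pages).
        void onNewTarget(int targetPages) {
            while (balloonPages.size() < targetPages) {
                // Inflate: claim a page so the guest OS cannot use it for caching;
                // the hypervisor can then reuse the backing physical page elsewhere.
                balloonPages.push(new byte[PAGE_SIZE]);
            }
            while (balloonPages.size() > targetPages) {
                // Deflate: release a page back to the guest OS when the target drops.
                balloonPages.pop();
            }
        }
    }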

A balloon in the guest operating system forces the OS to use memory more conservatively. Operating systems use every bit of available memory to improve system performance, for example by caching inactive data. However, caching inactive data in one virtual machine while another virtual machine lacks memory for immediate needs is a poor use of memory in a virtualized environment. Expanding the balloon in a virtual machine prevents the OS from using memory for low-priority caching and allows the hypervisor to distribute physical memory resources where they are most needed.

EM4J Balloon

When a Java application starts, the operating system allocates memory to the JVM, and the JVM then manages that memory itself. On a virtual machine deployed to serve an enterprise Java application, the Java heap typically occupies the greatest portion of the allocated memory. The separation of concerns between the operating system and the JVM is very similar to that between the ESXi hypervisor and its virtual machines. The JVM manages its object heap as a single block of memory that is entirely opaque to the OS. When Java objects become garbage, there is no way for the OS to reclaim that memory, and if the OS cannot reclaim the memory, then neither can the VMware Tools balloon. For this reason, when the Java heap occupies the majority of a virtual machine's available memory, a balloon operating within the Java heap itself is a much more efficient way for the hypervisor to reclaim memory.

When EM4J is enabled in the virtual machine and in a JVM running within it, ESXi hypervisor ballooning requests are served from the JVM heap instead of from the OS memory pool. Just as the guest balloon allocates memory from the OS memory pool, EM4J allocates memory from the Java heap by creating and storing a few large Java objects. To avoid fragmentation, EM4J does not pin memory in the Java heap. The ESXi hypervisor sets the target size of the balloon, and EM4J attempts to satisfy the request.
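
A conceptual sketch of such a heap-resident balloon follows. It is not the EM4J implementation: the class, chunk size, and resize method are assumptions made for illustration, and EM4J additionally allows its balloon objects to be reclaimed by garbage collection when the application needs the space.

    import java.util.ArrayList;
    import java.util.List;

    // Simplified model of a balloon that lives inside the Java heap.
    public class HeapBalloonSketch {
        private static final int CHUNK_BYTES = 4 * 1024 * 1024; // illustrative chunk size
        private final List<byte[]> balloon = new ArrayList<>();

        // Called when a new balloon target is received from the hypervisor.
        synchronized void resizeTo(long targetBytes) {
            long current = (long) balloon.size() * CHUNK_BYTES;
            while (current < targetBytes) {
                balloon.add(new byte[CHUNK_BYTES]); // inflate: occupy heap space
                current += CHUNK_BYTES;
            }
            while (current > targetBytes && !balloon.isEmpty()) {
                balloon.remove(balloon.size() - 1); // deflate: let GC reclaim the chunk
                current -= CHUNK_BYTES;
            }
        }
    }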

Like the VMware Tools guest balloon, the EM4J balloon encourages the JVM to use the available memory more conservatively. When the EM4J balloon inflates, it can force the JVM to clean up garbage and return memory to the hypervisor, typically during periods of low activity. When the JVM needs the memory, typically during an increase in activity, the next garbage collection clears out some of the balloon, and that memory is once again available to the JVM. This is the "elastic" nature of EM4J memory management: the amount of ballooned heap memory grows or shrinks according to the relative requirements of all the virtual machines on the host.

A welcome side effect of EM4J is that you can size the Java heap less conservatively without wasting memory. Traditionally, to choose the optimum Java heap size, you determined the application's peak requirement: allocating less could mean unacceptable performance, errors, or crashes, while allocating more potentially wasted memory. With EM4J you can size the heap to comfortably accommodate the peak requirement, and excess memory is ballooned away and redistributed where it is needed.
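
For example, assuming a hypothetical application with a measured peak heap requirement of about 2 GB, standard JVM heap options might look like the following; the values and application name are illustrative only.

    # Traditional sizing: measure the peak and fit the heap tightly to it
    java -Xms2g -Xmx2g -jar app.jar

    # With EM4J: size the heap with comfortable headroom; unused heap is ballooned away
    java -Xms3g -Xmx3g -jar app.jar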