Avoid Bothersome Garbage Collection Pauses

Many engineers complain that the non-deterministic behavior of the garbage collector prevents them from utilizing the Java environment for mission-critical applications, especially distributed message-driven displays (GUIs) where user responsiveness is critical. We agree that garbage collection does occur at the worst times: for example, when a user clicks a mouse or a new message enters the system requiring immediate processing. These events must be handled without the delay of in-progress garbage collection. How do we prevent these garbage collection pauses that interfere with the responsiveness of an application ("bothersome pauses")?

We have discovered a very effective technique to prevent bothersome garbage collection pauses and build responsive Java applications. This technique or pattern is especially effective for a distributed message-driven display system with soft real-time constraints. This article details the pattern in three simple steps and provides evidence of its effectiveness.

Pattern to Control Garbage Collection Pauses
The Java environment provides so many benefits to the software community - platform independence, industry momentum, a plethora of resources (online tutorials, code, interest groups, etc.), object-oriented utilities and interfaces (collections, network I/O, Swing display, etc.) that can be plugged in and out - that once you have experienced working with Java it's hard to go back to traditional languages. Unfortunately, in some mission-critical applications, like message-driven GUIs that must be very responsive to user events, the requirements force you to take that step backward. There's no room for multi-second garbage collection pauses. (The garbage collector collects all the "unreachable" references in an application so the space consumed by them can be reused. It's a low-priority thread that usually only takes priority over other threads when the VM is running out of memory.) Do we really have to lose all the benefits of Java? First, let's consider the requirements.

A system engineer should consider imposing requirements for garbage collection like the following list taken from a telecom industry example (see References).
1.  GC sequential overhead on a system may not be more than 10% to ensure scalability and optimal use of system resources for maximum throughput.
2.  Any single GC pause during the entire application run may be no more than 200ms to meet the latency requirements as set by the protocol between the client and the server, and to ensure good response times by the server.

Armed with these requirements, the system engineer has defined the worst-case behavior in a manner that can be tested.

The next question is: How do we meet these requirements? Alka Gupta and Michael Doyle make excellent suggestions in their article (see References). Their approach is to tune the parameters on the Java Virtual Machine (JVM). We take a slightly different approach, reserving JVM parameter tuning as a final refinement technique.

Why not tell the garbage collector what and when to collect?

In other words, control garbage collection via the software architecture. Make the job of the garbage collector easy! This technique can be described as a three-step pattern. The first step of the pattern is described below as "Nullify Objects." The second step involves forcing garbage collection to occur as delineated in "Forcing Garbage Collection." The final step involves either placing persistent data out of the reach of the collector or into a data pool so that an application will continue to perform well in the long run.

Step 1: Nullify Objects
Memory leaks strike fear into the hearts of programmers! Not only do they degrade performance, they eventually terminate the application. Yet memory leaks prove very subtle and difficult to debug. The JVM performs garbage collection in the background, freeing the coder from such details, but traps still exist. The biggest danger is placing an object into a collection and forgetting to remove it. The memory used by that object will never be reclaimed.

A programmer can prevent this type of memory leak by setting the object reference and all underlying object references ("deep" objects) to null when the object is no longer needed. Setting an object reference to "null" tells the garbage collector that at least this one reference to the object is no longer needed. Once all references to an object are cleared, the garbage collector is free to reclaim that space. Giving the collector such "hints" makes its job easier and faster. Moreover, a smaller memory footprint also makes an application run faster.

Knowing when to set an object reference to null requires a complete understanding of the problem space. For instance, if the remote receiver allocates the memory space for a message, the rest of the application must know when to release the space back for reuse. Study the domain. Once an object or "subobject" is no longer needed, tell the garbage collector.

Thus, the first step of the pattern is to set objects to null once you're sure they're no longer needed. We call this step "nullify" and include it in the definition of the classes of frequently used objects.

The following code snippet shows a method that "nullifies" a track object. Class members whose classes contain only primitives (no further class objects) are set to null directly, as in lines 3-5. Class members that themselves contain class objects call their own nullify method before the reference is cleared, as in lines 9-10.

1  public void nullify () {
2
3      this.threatId = null;
4      this.elPosition = null;
5      this.kinematics = null;
6
7      if (this.iff != null)
8      {
9          this.iff.nullify();
10         this.iff = null;
11     }
12 }

The track nullify is called from the thread that has completed processing the message. In other words, once the message has been stored or processed, that thread tells the JVM it no longer needs that object. Also, if the object was placed in some Collection (like an ArrayList), it's removed from the Collection and set to null.
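
For illustration, a minimal sketch of such a call site is shown below. It assumes the hypothetical Track class with the nullify method shown earlier and a hypothetical handler that temporarily holds incoming messages in an ArrayList:

import java.util.ArrayList;

public class MessageHandler {
    private ArrayList pendingTracks = new ArrayList();  // temporarily holds incoming messages
    private Track currentTrack;                         // the message being processed

    // Called by the thread that has finished storing or processing the message.
    public void releaseCurrentTrack() {
        if (currentTrack != null) {
            pendingTracks.remove(currentTrack); // ensure no Collection still references it
            currentTrack.nullify();             // clear the deep object references
            currentTrack = null;                // clear this reference as well
        }
    }
}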

By setting objects to null in this manner, the garbage collector and thus the JVM can run more efficiently. Train yourself to program with "nullify" methods and their invocation in mind.

Step 2: "Force" Garbage Collection
The second step of the pattern is to control when garbage collection occurs. The garbage collector, GC, runs at Java priority 1 (the lowest priority). The virtual machine, VM, runs at Java priority 10 (the highest priority). Most books recommend against assigning Java priorities 1 and 10 to application threads. In most cases, the GC runs during idle times, generally when the VM is waiting for user input or when the VM has run out of memory. In the latter case, the GC interrupts high-priority processing in the application.

Some programmers like to use the "-Xincgc" directive on the Java command line. This tells the JVM to perform garbage collection in increments when it desires. Again, the timing of the garbage collection may be inopportune. Instead, we suggest that the garbage collector perform a full garbage collection as soon as it can in either or both of two ways:
1.  Request garbage collection to happen as soon as possible: This method proves useful when the programmer knows he or she has a "break" to garbage collect. For example, after a large image is loaded into memory and scaled, the memory footprint is large. Forcing a garbage collection to occur at that point is wise. Another good area may be after a large message has been processed in the application and is no longer needed.
2.  Schedule garbage collection to occur at a fixed rate: This method is optimal when the programmer does not have a specific moment at which he or she knows the application can pause briefly and garbage collect. Normally, most applications are written in this manner.

Listing 1 introduces a class named "BetterControlOfGC". It's a utility class that provides the methods described earlier. There are two public methods: "suggestGCNow()" and "scheduleRegularGC(milliseconds)" that respectively correspond to the steps described earlier. Line 7 suggests to the VM to garbage collect the unreachable objects as soon as possible. The documentation makes it clear that the garbage collection may not occur instantaneously, but experience has shown that it will be performed as soon as the VM is able to accomplish the task. Invoking the method on line 25 causes garbage collection to occur at a fixed rate as determined by the parameter to the method.

In scheduling the GC to occur at a fixed rate, a garbage collection stimulator task, GCStimulatorTask, is utilized. The task extends java.util.TimerTask in line 10 and is scheduled on the single java.util.Timer thread available beginning with the Java 1.3 environment; no new thread is created. Similarly, to keep the processing lean, the GC stimulator follows the Singleton pattern as shown by lines 18-23 and line 27. There can be only one stimulator per application, where an application is any code running on an instance of the JVM.
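
Listing 1 itself is not reproduced here. Purely as an illustration (so its line numbering will not match the references above), a minimal sketch of a utility class with the two public methods described might look like this:

import java.util.Timer;
import java.util.TimerTask;

// Illustrative sketch only - not the article's Listing 1.
public final class BetterControlOfGC {

    // Single daemon timer thread (available since Java 1.3) shared by the application.
    private static final Timer timer = new Timer(true);
    private static GCStimulatorTask stimulator;  // Singleton: one stimulator per JVM

    private BetterControlOfGC() { }

    // Suggest to the VM that it collect unreachable objects as soon as it is able.
    public static void suggestGCNow() {
        System.gc();
    }

    // Schedule garbage collection to be stimulated at a fixed rate (in milliseconds).
    public static synchronized void scheduleRegularGC(long milliseconds) {
        if (stimulator == null) {
            stimulator = new GCStimulatorTask();
            timer.scheduleAtFixedRate(stimulator, milliseconds, milliseconds);
        }
    }

    // Task that runs on the shared timer thread and requests a collection.
    private static final class GCStimulatorTask extends TimerTask {
        public void run() {
            suggestGCNow();
        }
    }
}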

We suggest that you set the interval at which the garbage collector runs from a Java property file. Thus you can tune the application without having to recompile the code. Write some simple code to read a property file that's either a parameter on the command line or a resource bundle in the class path. Place the command parameter "-verbose:gc" on your executable command line and measure the time it takes to garbage collect. Tune this number until you achieve the results you want. If the budget allows, experiment with other virtual machines and/or hardware.
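
As a rough sketch of this idea (the file name and property key below are hypothetical, and it reuses the scheduleRegularGC method sketched above):

import java.io.FileInputStream;
import java.util.Properties;

public class GCTuning {
    public static void main(String[] args) throws Exception {
        // Load the property file named on the command line, e.g. "gc.properties".
        Properties props = new Properties();
        FileInputStream in = new FileInputStream(args[0]);
        props.load(in);
        in.close();

        // Read the interval and schedule regular collections; tune the value
        // in the property file without recompiling.
        long intervalMs = Long.parseLong(props.getProperty("gc.interval.ms", "500"));
        BetterControlOfGC.scheduleRegularGC(intervalMs);
    }
}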

Step 3: Store Persistent Objects into Persistent Data Areas or Store Long-Lived Objects in Pools
Using persistent data areas is purely optional. It supports the underlying premise of this article: in order to bound the disruption of the garbage collector in your application, make its job easy. If you know that an object or collection of objects will live for the duration of your application, let the collector know. It would be nice if the Java environment provided some sort of flag that could be placed on objects upon their creation to tell the garbage collector "keep out." However, there is currently no such means. (The Real-Time Specification for Java describes an area of memory called "Immortal Memory" where objects live for the duration of the application and garbage collection should not run.) You may try using a database; however, this may slow down your application even more. Another solution currently under the Java Community Process is JSR 107. JCache provides a standard set of APIs and semantics that allow a programmer to cache frequently used data objects for the local JVM or across JVMs. This API is still under review and may not be available yet. However, we believe it holds much promise for the Java developer community. Keep this avenue open and in mind for future architectures. What can we do now?

The pooling of objects is not new to real-time programmers. The concept is to create all your expected data objects before you begin processing, then all your data can be placed into structures without the expense of instance creation during processing time. This has the advantage of keeping your memory footprint stable. It has the disadvantage of requiring a "deep copy" method to be written to store the data into the pool. (If you simply set an object to another, you're changing the object reference and not reusing the same space.) The nanosecond expense of the deep copy is far less than that of the object instance creation.
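
The sketch below illustrates the idea under some assumptions: a hypothetical Track class that provides the nullify method shown earlier plus a hypothetical copyFrom deep-copy method, and a pool sized up front for the expected load:

import java.util.ArrayList;
import java.util.List;

public class TrackPool {
    private final List pool = new ArrayList();

    // Create all the expected data objects before processing begins,
    // so the memory footprint stays stable during processing.
    public TrackPool(int expectedTracks) {
        for (int i = 0; i < expectedTracks; i++) {
            pool.add(new Track());
        }
    }

    // Deep-copy the incoming data into the pooled instance so the same
    // space is reused, then release the short-lived message object.
    public Track store(int index, Track incoming) {
        Track pooled = (Track) pool.get(index);
        pooled.copyFrom(incoming);  // copies field values, not the reference
        incoming.nullify();         // the incoming object dies young
        return pooled;
    }
}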

If the data pooling technique is combined with the proper use of the "nullify" technique, garbage collection becomes optimized. The reasons are fairly straightforward:
1.  Since the object is set to null immediately after the deep copy, it lives only in the young generation portion of the memory. It does not progress into the older generations of memory and thus takes less of the garbage collector's cycle time.
2.  Since the object is nullified immediately and no other reference to it exists in some other collection object in the application, the job of the garbage collector is easier. In other words, the garbage collector does not have to keep track of an object that exists in a collection.

When using data pools, it's wise to use the parameters "-XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=0 -XX:SurvivorRatio=128" on the command line. These tell the JVM to move objects on the first sweep from the new generation to the old. It commands the JVM to use the concurrent mark sweep algorithm on the old generation that proves more efficient since it works "concurrently" for a multi-processor platform. For single processor machines, try the "-Xincgc" option. We've seen those long garbage collector pauses, which occur after hours of execution, disappear using this technique and these parameters. Performing well in the long run is the true benefit of this last step.
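
For example, a complete command line combining these options with the garbage collection logging flag used later in our tests might look like the following (the main class name is hypothetical):

java -verbose:gc -XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=0 -XX:SurvivorRatio=128 MyTacticalDisplay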

Performance Results
Typically, most engineers want proof before changing their approach to designing and coding. Why not? Since we're now suggesting that even Java programmers should be concerned about resource allocation, it better be worth it! Once upon a time, assembly language and C programmers spent time tweaking memory and register usage to improve performance. This step was necessary. Now, as higher-level object-oriented programmers we may disdain this thought. This pattern has dared to imply that such considerations, although not as low level as registers and memory addresses (instead at the object level), are still necessary for high-performance coding. Can it be true?

The underlying premise is that if you know how your engine works, you can drive it better to obtain optimal performance and endurance. This is as true for my 1985 300TD (Mercedes, five cylinder, turbo diesel station wagon) with 265,000 miles as for my Java code running on a HotSpot VM. For instance, knowing that a diesel's optimal performance is when the engine is warm since it relies on compression for power, I let my car warm up before I "push it." Similarly, I don't overload the vehicle with the tons of stuff I could place in the tailgate. HotSpot fits the analogy. Performance improves after the VM "warms up" and compiles the HotSpot code into the native language. I also keep my memory footprint lean and light. The comparison breaks down after a while, but the basic truth does not change. You can use a system best when you understand how it works.

Our challenge to you is to take statistics before and after implementing this pattern on just a small portion of your code. Please recognize that the gain will be best exemplified when your application is scaled upward. In other words, the heavier the load on the system, the better the results.

The following statistics were taken after the pattern was applied. They are charted as:
1.  Limited nullify method invocation is used where only the incoming messages are not "nullified." (The remainder of the application from which the statistics were taken was left intact with a very lean memory usage.) There is no forced garbage collection.
2.  Nullify method invocation and forced garbage collection is utilized.

The test environment is a Microsoft Windows 2000 X86 Family 15 Model 2 Stepping 4 Genuine Intel ~1794MHz laptop running the BEA WebLogic Server 7.0 with Service Pack 7.1 with a physical memory size of 523,704KB. The Java Message Server (JMS server), a track generator, and a tactical display are all running on the same laptop over the local developer network (MAGIC). The server makes no optimizations, even though each application resides locally. The JVMs are treated as if they were distributed across the network. They're running on the J2SE 1.4.1 release.

The test target application is a Java Swing Tactical Display with full panning, zooming, and track-hooking capabilities. It receives bundles of tracks via the Java Message Service that are displayed at their proper location on the given image. Each track is approximately 88 bytes and the overall container size is about 70 bytes. This byte measurement does not include all the additional class information that's also sent during serialization. The container is the message that holds an array of tracks that contains information such as time and number of tracks. For our tests, the tracks are sent at a 1Hz rate. Twenty sets of data are captured.

To illustrate the test environment, a screen capture of a 5,000 track load (4,999 tracks plus the ship) is shown in Figure 1. The background shows tracks rendered with the Military Standard 2525B symbology over an image of the Middle East. The small window titled "Track Generator Desktop" is a minimized window showing the parameters of the test set through the track generator application. Notice that 45 messages had been sent at the time of the screen capture. Directly beneath this window sits the Windows Task Manager. Note that the CPU utilization is at 83%. At first this doesn't seem that bad. But at that rate, there isn't much room for the user to begin zooming, panning, hooking tracks, and so on. The final command window to the right is that of the tactical display application. The parameter "-verbose:gc" is placed on the Java command line (e.g., java -verbose:gc MyMainApplication). The VM is performing the listed garbage collection at its own rate, not by command of the application.

The final test of 10,000 tracks performed extremely poorly. The system does not scale; the CPU is pegged. At this point most engineers may jeer at Java again. Let's take another look after implementing the pattern.

After implementation, where the nullify methods are invoked properly and garbage collection is requested at a periodic interval (2Hz), dramatic improvements are realized. The last test of 10,000 tracks proves that the processor still has plenty of room to do more work. In other words, the pattern scales very well.

Performance Summary
The pattern to help control garbage collection pauses most definitely improves the overall performance of the application. Notice how well the pattern scales under the heavier track loads in the performance bar chart in Figure 2. The darker middle bar shows the processor utilization at each level of the message (track) load. As the message traffic increases, the processor utilization grows more slowly than without the pattern. The last light-colored bar shows the improved performance. The main strength of the pattern is how well it scales under heavy message loads.

There is another subtle strength to the pattern. This one is difficult to measure since it requires very long-lived tests. If Step 3 is faithfully followed, those horribly long garbage collection pauses that occur after hours of running disappear. This is a key benefit to the pattern since most of our applications are designed to run "forever."

We're confident that many other Java applications would benefit from implementing this very simple pattern.

The steps to control garbage collection pauses are:
1.  Set all objects that are no longer in use to null and make sure they're not left within some collection. "Nullify" objects.
2.  Force garbage collection to occur both:
  • After some major memory-intense operation (e.g., scaling an image)
  • At a periodic rate that provides the best performance for your application
3.  Save long-lived data in a persistent data area if feasible or in a pool of data and use the appropriate garbage collector algorithm.

By following these three simple steps, you'll avoid those bothersome garbage collection pauses and enjoy all the benefits of the Java environment. It's time the Java environment was fully utilized in mission-critical display systems.

References

  • Gupta, A., and Doyle, M. "Turbo-Charging the Java HotSpot Virtual Machine, v1.4.x to Improve the Performance and Scalability of Application Servers": http://developer.java.sun.com/developer/technicalArticles/Programming/turbo/
  • JSR 1, Real-Time Specification for Java: http://jcp.org/en/jsr/detail?id=1
  • Java HotSpot VM options: http://java.sun.com/docs/hotspot/VMOptions.html
  • JSR 107, Java Specification Request for JCache: http://jcp.org/en/jsr/detail?id=107

About the Author

Lillian Andres loves the challenge of not only architecting mission-critical systems but also implementing them with the latest technologies. Her passion is performance, and she is often teased by colleagues about which stress test she is running in the background at her desk. Lillian brings 20+ years of experience to Lockheed-Martin as a Lead Member of the Engineering Staff.

Most Recent Comments
Gunther 07/09/03 06:49:00 AM EDT

I have the impression that nullify is not so well explained in the article. Especially its use in the code snippet (where the object in the attribute iff is first nullified and then iff is set to null) is confusing: the object itself should not be nullified if iff is set to null. Java would have found that this object can be garbage collected if it is not referred from somewhere else.

The nullify is actually useful and important, especially in the case where you know that an object's attributes will be set before further usage of the object. In that case the current values of the attributes do not matter, and should not contain unnecessary references. E.g., a non-used object in a pool. When it is taken from the pool, you first have to set its attributes properly. Then you can use it. When putting it back in the pool, you should release all references that it has ("nullify it") so that it does not keep unnecessary references.

Dmitry 07/09/03 03:36:00 AM EDT

This technique effectively nullifies the advantage of automatic garbage collection. new/"nullify" is pretty much like new/delete in C++ or malloc/free in C.

A more viable alternative would be for the JVM vendor to provide an alternative, possibly non-generational GC having the full-collection cycle optimized to the maximum extent possible, and run it automatically from time to time.

J. David Beutel 07/08/03 09:15:00 PM EDT

I'm skeptical that this nullify technique provides a significant performance improvement, especially one large enough to justify its additional complexity. Most of the documented improvement could be attributed to the other two techniques: caching and frequently scheduled GC. I'd like to see a benchmark isolating the nullify technique. If it is effective, then I suspect it would be better to optimize it in the JVM.

Adrian Adrigan 07/25/03 07:48:00 AM EDT

From studying the article, I think you all may be missing the point. The article hones in on performance, not your everyday Java programming. It makes sense to me that if you make the job of the gc easy by telling it what to reclaim that you do get better performance. This works on my non-display applications as well. Try it. Sometimes the simplest solutions really are the best.

J. David Beutel 07/18/03 02:20:00 AM EDT

In agreement with Gunther, yes, it's important to avoid memory leaks, especially when caching. Java can't do everything for us. So sometimes we have to set references to null. But the problem with the nullify technique presented by this article is that it recursively nullifies all objects as soon as they're not in use. That's a lot more complicated than necessary, especially when sharing an object from a cache. How does the nullifier know that nothing else is using that sub-object? Such reference counting is really a job for the garbage collector. (So is caching and pooling, if it's done only for the sake of local memory management, as in this article.)