Better Software, August 2007, Featured Article
Your Mom Doesn’t Work Here
by Alan Berg
Java is a very popular object-oriented programming language that is used to build enterprise-level applications. Unlike C and C++, Java performs memory management for the programmer. While most memory-management operations are performed deep under the hood of the Java Virtual Machine, my experience shows that understanding Java's memory management is a critical aspect of the success of many large-scale systems. To gain an understanding of how Java memory management works, this article presents a simple example that illustrates the complexity of Java's memory-management functions.
About Java Memory Management
Since Java removed the chore of managing memory and the disastrous effects of invalid memory pointers and dangling references, why do we need to understand the gruesome details of Java memory management? For large-scale applications, understanding and adjusting memory-related options may mean the difference between a wonderful user experience and recurring system slowdowns and failures.
One of Java's claims to fame is the ability to "write once, run anywhere." To accomplish this, Java is compiled into byte code that is interpreted by the Java Virtual Machine (JVM). It is the JVM that performs (and hides) memory-management activities. Managing memory is primarily about cleaning up after the programmer: recovering memory associated with objects that are no longer used. The garbage collector performs this clean-up process. The collector verifies that the program can no longer reference an object and then removes the object, freeing its memory for other uses.
Garbage collection uses different algorithms, from marking in-use objects and sweeping away unused ones to copying live objects from one storage area to another depending on the object's age. Modern JVMs often implement a number of algorithms that you can choose or combine to optimize garbage collection in different run-time scenarios. The Sun JVM 1.4 provides four collection algorithms. By default, JVM 1.4 uses a serial garbage collector. However, the designers optimized the serial collector for small-scale applications, not large-scale ones. In version 1.5, the designers removed this potential weakness: the performance of large-scale applications is not negatively affected by minor collections. On the other hand, a major garbage collection pauses all the threads of a running application to perform what is called a mark-and-sweep collection. This action may take a significant amount of time, from milliseconds to seconds. If the virtual machine stops the application for this length of time, then the user may notice. Database caching may become confused, and other non-trivial problems may result, including frustrated users waiting with blank screens.
At peak loads, a vicious cycle may occur. As the system creates more objects, at some point the garbage collector stops the application in order to perform a full collection. When the application resumes, it finds a pent-up demand that pushes the short-term load to much higher levels than anticipated. In response, the application creates more objects, thus triggering another full garbage collection.
So what triggers a minor or major garbage collection? To answer this question, let's consider an object's lifetime and its effect on collecting objects in the heap space.
Figure 1: A plot of the lifetime of objects vs. number for a typical application
Figure 1 is an idealized distribution that visually describes how object lifetimes have an early peak. In most applications, many objects have a high infant mortality rate—that is, they are created, used, and then deleted (or abandoned) quickly.
At the risk of oversimplification, the JVM splits the heap space into two main areas—one intended for short-life objects and the other for long-life objects. The JVM further divides the short-term space into Eden space, where the JVM first creates objects, and survivor space, where it moves older objects that are still in use. When Eden is full, the JVM culls that space and copies surviving objects into the survivor space. This process is a minor collection. Since many short-term objects are already dead, they are not copied in the process but are deleted and their memory reclaimed. If, in a typical situation, 98 percent of objects are short lived, then minor collecting is very efficient. On occasion, the JVM copies longer-lived objects into the second heap space, called the tenured space. If that space needs to be cleaned and compressed, then that event triggers a major collection during which the application stops. This creates the potential of a significant performance and stability impact. Figure 2 illustrates the various spaces.
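The spaces described above can also be listed from inside a running JVM. The following is a minimal sketch, assuming the standard java.lang.management API that ships with Java 1.5; the class name HeapPools and the output format are made up for illustration. Pool names are chosen by the JVM and depend on the collector in use, so the names printed on your machine may differ from the ones used in this article.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the article's example code.
public class HeapPools {

    // Builds one line per memory pool that the running JVM reports.
    static List<String> describePools() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long usedKb = (usage == null) ? -1 : usage.getUsed() / 1024;
            String kind = (pool.getType() == MemoryType.HEAP) ? "heap" : "non-heap";
            lines.add(kind + "  " + pool.getName() + "  used=" + usedKb + "K");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

On a Sun JVM running the serial collector, the heap entries typically correspond to the Eden, survivor, and tenured spaces discussed here.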
Figure 2: A representation of the heap and non-heap spaces
By increasing the size of the heap space, you can affect how often garbage collection takes place and how long it takes. You also can vary the type and behavior of the garbage collector itself.
In practice, the best method for understanding the effect of changing the size of the heap is to stress test using a tool such as Jmeter (see the StickyNotes for a link) and observe the actual results. Theoretical calculations are unlikely to predict the actual effects accurately.
Get Things Going
In this section, I will present an example class and how to run the class from the command line with specific options that will enable garbage collection logging to a file. In addition, I'll explain the use of Java Management Extensions in combination with the Jconsole tool included in the standard Java Development Kit.
You might never guess by its name, but Java Management Extensions (JMX) is a full-fledged framework for managing applications. It is part of the standard JVM 1.5. In this situation, the important feature of this tool is that you can get almost real-time data from the JVM and from within a Java application (if written with JMX in mind) by connecting to a specific port and using the right communication protocol. Jconsole then builds highly intuitive graphs based on the data returned by JMX. For full details of the potential of JMX, visit its homepage (see the StickyNotes for a link).
When enabled by setting a command line option, -Dcom.sun.management.jmxremote, Jconsole can gather JVM information and then plot and update a graphical display. For this example class to work, you must have Java 1.5 installed and the location of the Java bin directory set in the PATH variable. To verify this, execute the command java -version. You should see output similar to:
java version "1.5.0_11"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_11-b03)
Java HotSpot(TM) Client VM (build 1.5.0_11-b03, mixed mode, sharing)
Listing 1: A Java class that has a main method that loops forever or until its memory is used up
To compile the code shown in Listing 1, cut and paste the code into a text file called EatMemoryExample.java and compile locally via:
javac -g EatMemoryExample.java
or, if your path has not been set, a command similar to:
"c:\Program Files\Java\jdk1.5.0_11\bin\javac.exe" -g EatMemoryExample.java
Note: The -g option tells the compiler to add debug symbols to the compiled class.
You now should have the compiled class in the same directory.
So what does this code do? Once the main method is called, a vector is created that can store other objects. Next, the class iterates through an infinite loop creating new objects that it continually attaches to the vector. The vector gets larger and larger until an out-of-memory error occurs.
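Listing 1 itself is not reproduced in this copy of the article. The following is a minimal sketch of a class that matches the description above; it is not the author's original code. The getTime() method and the output format follow the loop shown later in the article. The 1 KB ballast array, the leakOne helper, and the variable names are assumptions made for illustration.

```java
import java.util.Vector;

public class EatMemoryExample {

    // Ballast so that each instance occupies a noticeable amount of heap (the size is an assumption).
    private final byte[] ballast = new byte[1024];
    private final long created = System.currentTimeMillis();

    public long getTime() {
        return created;
    }

    // Creates one object and attaches it to the vector, so the collector can never reclaim it.
    static void leakOne(Vector<EatMemoryExample> hoard) {
        hoard.add(new EatMemoryExample());
    }

    public static void main(String[] args) {
        Vector<EatMemoryExample> hoard = new Vector<EatMemoryExample>();
        while (true) { // ends only when the JVM throws java.lang.OutOfMemoryError
            leakOne(hoard);
            System.out.println(hoard.lastElement().getTime() + ": Free memory: "
                    + Runtime.getRuntime().freeMemory());
        }
    }
}
```

Because the vector is reachable from main for the whole run, every object added to it stays reachable too, which is what makes this loop behave like a leak rather than like ordinary short-lived allocation.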
To run the class file, type:
java -Xmx32M -Xloggc:/tmp/gc.txt -XX:+PrintGCDetails \
  -Dcom.sun.management.jmxremote EatMemoryExample
Notice that -Xmx32M sets the maximum heap space to thirty-two megabytes. -Xloggc points to a file, and -XX:+PrintGCDetails forces the JVM to print extra details.
Note: These options work for the Sun JVM but may be different for other JVM implementations.
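To confirm that options such as -Xmx32M actually reached the JVM, a program can ask the runtime directly. This is a minimal sketch using Runtime.maxMemory() and the Java 1.5 RuntimeMXBean; the class name ShowJvmOptions and its two helper methods are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class, not part of the article's example code.
public class ShowJvmOptions {

    // Maximum heap the JVM will attempt to use, in whole megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // Options passed to the JVM itself (for example -Xmx32M), excluding program arguments.
    static List<String> jvmOptions() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMb());
        for (String option : jvmOptions()) {
            System.out.println("JVM option: " + option);
        }
    }
}
```

Run it with the same options as the example class. The reported maximum can be slightly lower than the -Xmx value, because some JVMs exclude part of the young generation from the figure.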
During the process of running, the JVM generates a garbage-collection log file with entries similar to those in figure 3.
Figure 3: Sample garbage-collection log file
The first part of the entry is the time in seconds since the JVM started, and the remainder gives the size of the various spaces before and after garbage collection and the time the collection took. For the gruesome details of garbage collecting in Java 1.5, the Sun Developer Network article "Tuning Garbage Collection with the 5.0 Java Virtual Machine" is an excellent source but a difficult read (see the StickyNotes for a link).
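The log records each collection as it happens; running totals are also available from inside the JVM. The following is a minimal sketch, assuming the standard java.lang.management API that ships with Java 1.5. The class name GcTotals and the output format are made up for illustration, and the collector names printed depend on the JVM and the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the article's example code.
public class GcTotals {

    // Builds one line per collector with its cumulative count and elapsed time.
    static List<String> summarize() {
        List<String> lines = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values are totals since JVM start; -1 means undefined for this collector.
            lines.add(gc.getName() + ": collections=" + gc.getCollectionCount()
                    + ", total time=" + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : summarize()) {
            System.out.println(line);
        }
    }
}
```

On a Sun JVM with the serial collector, you would typically see one entry for the young generation (minor collections) and one for the tenured generation (major collections).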
Visual display of garbage collection logs is helpful for understanding the data and for problem solving. I prefer the excellent visual log processor GCViewer from Tagtraum (see the StickyNotes for a link), whose output is shown in figure 4.
On my Ubuntu Linux box, running the example class from the command line and then running Jconsole opens Jconsole with a dialog with the process ID of the class as a choice for connection. Click OK to connect the graphical tool to the JVM and start the collection of information with a default frequency of five seconds.
Selecting the memory tab highlights a plot of heap memory usage against time. Notice how the usage is increasing. In the top left corner is a selection box that allows you to drill into different spaces of memory. On the right side is a button that can force a full garbage collection for the target application. If necessary, you can configure the JMX engine to be more secure by using additional command line options—for example, making information read-only through a certain port via SSL and client certificate authentication, but I have omitted this for the sake of clarity. Suffice it to say, JMX is not a security risk if properly configured.
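The heap figures that the memory tab plots come from the JVM's platform MBeans, which a program can also read locally. This is a minimal sketch, assuming the Java 1.5 MemoryMXBean; the class name HeapSnapshot is hypothetical. MemoryMXBean.gc() is documented as effectively equivalent to System.gc(), so it is a request to collect, and how much memory is freed depends on the JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, not part of the article's example code.
public class HeapSnapshot {

    // Current heap usage (init, used, committed, max) as reported by the JVM.
    static MemoryUsage heapUsage() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println("Before: " + heapUsage());
        memory.gc(); // asks the JVM to run the garbage collector
        System.out.println("After:  " + heapUsage());
    }
}
```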
The bottom left text in figure 4 gives a quick and relevant summary of the historical costs of garbage collecting, and the right side gives a graphical display of the usage of each allotted memory space. Jconsole displays the chart in near real-time.
Figure 4: The memory tab within Jconsole with no memory options set
Our example class slowly consumes memory as the vector is maintained through all the iterations of the while statement. The class eventually will fail with the following error message from the JVM:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
The JVM kindly tells us we have run out of heap space. To accelerate the process, you can comment out the System.out statement in the code and recompile. The iterations then will occur more rapidly and the program will fail sooner.
Short-lived objects—those that exist for only one iteration of a loop—are culled early during a minor collection and should not affect the overall memory consumption over the course of program execution. We can see this if we replace the previous while statement with this one:
while (true) {
    // A new object each iteration; nothing keeps a reference to it afterwards
    EatMemoryExample timer = new EatMemoryExample();
    System.out.println(timer.getTime() + ": Free memory: "
            + Runtime.getRuntime().freeMemory());
}
Notice how we are creating objects that we never use after the current iteration. After recompiling and watching from Jconsole, we can see in almost real-time the constant battle and equilibrium found by the creation of objects and their removal. Figure 5 shows a shallower version of the characteristic peaks and troughs associated with garbage collection, as we are generating minor collections that occur frequently but with relatively little expense.
Figure 5: The constant battle between object thrashing and cleaning up by the garbage collector
The Real World
The JVM places objects that live long enough in tenured space. Major garbage collection occurs when tenured space fills up. Therefore, the lifetime of objects can have an impact not only on retaining memory but also on the time and frequency of garbage collecting (see the StickyNotes for more information on memory settings). Figure 6 displays a sadly realistic real-world pattern involving memory leakage over time. Roughly speaking, the baseline is associated with the initial long-term cost of objects that live as long as the application, for example "singletons" that manage a specific part of the business process or the objects placed in the application context of a Web application.
Figure 6: An idealized picture of memory usage over time for a real-world Java application
A second observed pattern is the peaks and valleys of the garbage collector fighting the generation of relatively short-term objects such as HTTP requests and longer-term objects such as HTTP sessions that do expire at some point.
The third and final trend shows the objects that do not die. The vector in the code example is our artificial attempt to simulate a memory leak. The vector holds objects for all time and slowly increases the number of objects it retains and thus the memory it uses. The causes of memory leaks vary, from failing to close connections to a database, LDAP, or network correctly to a poorly designed caching algorithm. The system tester must carefully observe the symptoms to discover the root cause. For example, using the netstat command may show a build-up of connections, or the ps command may show an increase in the number of processes and the processes' age. My favorite command, tail -f, allows you to watch live log files for Tomcat servers. Studying the log files allows you to get to know the personality of the many memory-related issues you may have. If you are lucky, you may see that a particular type of event triggers a full garbage collection. If the event happens two or three times with the same process, then the pattern becomes clear.
Memory leaks may come from bugs in the JVM or the use of native libraries, which have their own leaks. If you see this upwardly walking pattern, then providing more memory (which by today's standards is cheap) will give your JVM extra time between failures, such as users waiting on blank screens because of long garbage collections, or the application simply not having enough memory to function. A positive side effect of throwing memory at the problem is that this sometimes helps the root cause to emerge more clearly.
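Of the leak causes mentioned above, a cache that only ever grows is one of the easiest to illustrate in code. The following is a minimal sketch of one common remedy, a size-bounded cache built on java.util.LinkedHashMap; it is an illustration, not a technique taken from the applications described in this article. The class name BoundedCache and the limit of 100 entries are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class: a map that evicts its least recently used entry once it is full.
public class BoundedCache<K, V> extends LinkedHashMap<K, V> {

    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    public BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // true = keep entries in access order, least recently used first
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<Integer, byte[]> cache = new BoundedCache<Integer, byte[]>(100);
        for (int i = 0; i < 10000; i++) {
            cache.put(i, new byte[1024]);
        }
        System.out.println("Entries retained: " + cache.size()); // prints "Entries retained: 100"
    }
}
```

Unlike the vector in the example class, this structure releases its reference to old entries, so the garbage collector is free to reclaim them.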
Summary
Java memory management is all about the culling of objects that the application no longer references. The filling of tenured space forces major garbage collections that have the potential to adversely affect the smooth running of heavily loaded applications. Tools such as Jconsole, MC4J, and GCViewer give you insight into the allocation and de-allocation of memory over time. Stress testing with Jmeter accelerates failures and brings problems to the surface, especially if you deliberately starve your application of memory via the JAVA_OPTS environment variable.
I have only scratched the surface on the subject of JVM tuning, in which you have substantial control over the details. For example, changing maximum and minimum sizes or the ratio of different survivor spaces may mean better tuning and higher performance from the application. Nevertheless, because of the extra complexity and the associated rework issues as the system matures, beware of prematurely tuning Java applications unless required. If by necessity you must tune early, place your application under consistent load and then tune.
I wish you the best of luck in your garbage collection.
Alan Berg (BSc.; MSc.; PGCE) has been a lead developer at the Central Computer Services at the University of Amsterdam for the past eight years. In his spare time, he writes computer articles. He remains agile by playing computer games with his kids who (sadly) consistently beat him.