The new Cleaner programming interface in OpenJDK 9 addresses a longstanding performance issue when working with short-lived native resources in Java.
Back in 2009, while working on SQLite bindings, I encountered a performance issue related to Java finalization: Objects with finalizers are never short-lived because they are enqueued into a finalization queue. Even if the programmer deallocates the underlying native resource by calling an explicit close method, the JVM will still record the fact that the finalize method must be called eventually, even though that method will have nothing left to do because the native resources have already been deallocated.
The new java.lang.ref.Cleaner class in OpenJDK 9 addresses this problem. It splits the original handle object (with the explicit close method and the finalize method) into three parts:
- The handle object, which retains its original externally visible programming interface (including the close method).
- An internal state object, which contains all the information to identify the native resources and to eventually deallocate them. The internal state object is not referenced externally; all access is mediated by the handle object.
- The cleaner infrastructure: A global Cleaner object shared by all instances, and a Cleanable object specific to each handle which is used to implement explicit deallocation via close.
For our purposes, the key part is the Cleanable object. When its clean method is called, the state object is asked to deallocate its resources. At the same time, the cleaner infrastructure removes the associated handle object from its internal tracking: once the resources are gone, nothing remains to be deallocated when the handle object is no longer referenced.
To see how this works out in practice, I implemented a minimal example for testing and benchmarking purposes. The native resource is emulated with a volatile long. In practice, the long value would be a native pointer. I added the volatile keyword as an optimization barrier, to discourage HotSpot from optimizing away everything.
First, there is the test driver:
import java.lang.ref.Cleaner;
import java.lang.ref.Reference;
import java.util.function.Supplier;

public final class Finalization {
    public static void main(String[] args) {
        Supplier<Implementation> source;
        switch (args[0]) {
        case "plain":
            source = () -> new PlainHandle();
            break;
        case "finalize":
            source = () -> new FinalizeHandle();
            break;
        case "Cleaner":
            source = () -> new CleanerHandle();
            break;
        default:
            throw new IllegalArgumentException(args[0]);
        }
        int count = Integer.parseInt(args[1]);
        for (int i = 0; i < count; ++i) {
            run(source);
        }
    }

    private static void run(Supplier<Implementation> source) {
        try (Implementation handle = source.get()) {
            handle.use();
        }
    }
}

interface Implementation extends AutoCloseable {
    long use();
    void close();
}
It allows us to run three different implementations: one without any finalization (plain), traditional finalize-based finalization (finalize), and the new approach (Cleaner). There is also an iteration count, to see how the JVM deals with many object allocations in quick succession.
The traditional implementation looks like this:
final class FinalizeHandle implements Implementation {
    private volatile long handle;

    FinalizeHandle() {
        // Native resource allocation happens here.
        handle = 1;
    }

    @Override
    public long use() {
        try {
            return handle;
        } finally {
            Reference.reachabilityFence(this);
        }
    }

    @Override
    protected synchronized void finalize() {
        if (handle >= 0)
            // Native resource deallocation happens here.
            handle = -1;
    }

    @Override
    public synchronized void close() {
        if (handle >= 0)
            // Native resource deallocation happens here.
            handle = -1;
    }
}
The reachabilityFence call ensures that the object is not finalized while the native resource is accessed. For the close method, concurrent deallocation by the finalize method is prevented by regular Java synchronization. A handle value of -1 indicates that the native object has been deallocated. There is no synchronization between use and close; this is the user's responsibility. (In practice, the use method would check that the resource has not already been deallocated using the close method.)
The Cleaner-based approach is a little bit more involved:
final class CleanerHandle implements Implementation {
    private static final Cleaner cleaner = Cleaner.create();

    private final State state;
    private final Cleaner.Cleanable cleanable;

    private static class State implements Runnable {
        volatile long handle;

        State() {
            // Native resource allocation happens here.
            handle = 1;
        }

        public void run() {
            // Native resource deallocation happens here.
            handle = -1;
        }
    }

    CleanerHandle() {
        state = new State();
        cleanable = cleaner.register(this, state);
    }

    @Override
    public long use() {
        try {
            return state.handle;
        } finally {
            Reference.reachabilityFence(this);
        }
    }

    @Override
    public void close() {
        cleanable.clean();
    }
}
The native resource is now encapsulated by the new State class, which performs the native resource deallocation in its run method. Objects are registered for cleanup with a global Cleaner instance (which can be shared among multiple classes). The close method is just a wrapper for the clean method; it no longer checks for previous deallocation. The Cleaner infrastructure ensures that the actual deallocation is only performed once (by calling the run method of the State object).
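The once-only behavior can be observed directly with a small sketch like the one below. CleanOnceDemo and countRuns are hypothetical names introduced for illustration; the counter stands in for the native deallocation. The reachabilityFence call keeps the registered object alive so that only the explicit clean calls can trigger the action.

```java
import java.lang.ref.Cleaner;
import java.lang.ref.Reference;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: counts how often a cleaning action actually runs.
final class CleanOnceDemo {
    private static final Cleaner CLEANER = Cleaner.create();

    static int countRuns() {
        AtomicInteger runs = new AtomicInteger();
        Object owner = new Object();
        // The action must not reference the owner, or it would never
        // become unreachable; here it only touches the counter.
        Cleaner.Cleanable cleanable = CLEANER.register(owner, runs::incrementAndGet);
        cleanable.clean();  // runs the action on the calling thread
        cleanable.clean();  // no effect: the action has already run
        Reference.reachabilityFence(owner);
        return runs.get();
    }
}
```

Calling CleanOnceDemo.countRuns() should return 1, which is why the close method in CleanerHandle can be called repeatedly without its own guard.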
For completeness, here is the native resource wrapper without any finalization infrastructure:
final class PlainHandle implements Implementation {
    private volatile long handle;

    PlainHandle() {
        // Native resource allocation happens here.
        handle = 1;
    }

    @Override
    public long use() {
        return handle;
    }

    @Override
    public void close() {
        if (handle >= 0)
            // Native resource deallocation happens here.
            handle = -1;
    }
}
The Cleaner-based approach results in substantially reduced RSS consumption due to the smaller heap size, and the test completes much faster:
$ \time java Finalization finalize 10000000
19.14user 0.52system 0:06.27elapsed 313%CPU (0avgtext+0avgdata 735228maxresident)k
0inputs+64outputs (0major+181482minor)pagefaults 0swaps
$ \time java Finalization Cleaner 10000000
1.12user 0.04system 0:00.99elapsed 118%CPU (0avgtext+0avgdata 177312maxresident)k
0inputs+64outputs (0major+40801minor)pagefaults 0swaps
Of course, the finalization-less approach is faster still and consumes even fewer resources:
$ \time java Finalization plain 100000000
0.90user 0.01system 0:00.85elapsed 106%CPU (0avgtext+0avgdata 33260maxresident)k
0inputs+64outputs (0major+4325minor)pagefaults 0swaps
But for the most part, the Cleaner benchmark variant is no longer about finalization; it is mostly a garbage collector benchmark. (The plain variant is purely a collector stress test.) The RSS consumption is mostly a result of the G1 ergonomics trying to size the heap to reduce collection overhead, and the Cleaner benchmark simply allocates many more objects than the plain benchmark. Unlike the finalize benchmark, it will not run into OutOfMemoryError exceptions with much smaller heap sizes.
Of course, this means that there is still some value in avoiding the new form of finalization where possible.
With the Cleaner approach, the two-step resource allocation (native allocation in the State constructor followed by registration for cleanup with the register method) leaves room for a resource leak: register could fail and throw an OutOfMemoryError exception. With the finalize-based approach, this could not happen because the object was implicitly registered for finalization. Recommended practices for dealing with this issue are still emerging. Two approaches seem particularly attractive:
- Write an exception handler around the register method call and free the native resource directly if an exception is thrown before rethrowing it.
- Allocate the state object and register it for cleanup, and then allocate the native resources and update the state object.
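Both approaches can be sketched as follows. This is only an illustration under assumptions: GuardedHandle and LateAllocHandle are hypothetical names, the native resource is again emulated with a volatile long, and the State classes are variants of the one shown earlier rather than a recommended final form.

```java
import java.lang.ref.Cleaner;
import java.lang.ref.Reference;

// Approach 1 (sketch): free the resource directly if registration fails.
final class GuardedHandle implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();
    private final State state;
    private final Cleaner.Cleanable cleanable;

    private static final class State implements Runnable {
        volatile long handle = 1;  // native allocation would happen here
        @Override public void run() { handle = -1; }  // native deallocation
    }

    GuardedHandle() {
        state = new State();
        try {
            cleanable = CLEANER.register(this, state);
        } catch (RuntimeException | Error e) {
            state.run();  // registration failed: deallocate, then rethrow
            throw e;
        }
    }

    long use() {
        try { return state.handle; } finally { Reference.reachabilityFence(this); }
    }

    @Override public void close() { cleanable.clean(); }
}

// Approach 2 (sketch): register an empty state first, allocate afterwards.
final class LateAllocHandle implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();
    private final State state;
    private final Cleaner.Cleanable cleanable;

    private static final class State implements Runnable {
        volatile long handle = -1;  // nothing allocated yet
        @Override public void run() {
            if (handle >= 0) {
                handle = -1;  // native deallocation would happen here
            }
        }
    }

    LateAllocHandle() {
        state = new State();
        cleanable = CLEANER.register(this, state);
        // If register throws, no native resource exists yet, so nothing leaks.
        state.handle = 1;  // native allocation would happen here
    }

    long use() {
        try { return state.handle; } finally { Reference.reachabilityFence(this); }
    }

    @Override public void close() { cleanable.clean(); }
}
```

The second variant requires the run method to tolerate a state object that never received a native resource, which is why it checks the handle value before deallocating.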
2017-11-01: published