Java finalization revisited

The new Cleaner programming interface in OpenJDK 9 addresses a longstanding performance issue when working with short-lived native resources in Java.

Back in 2009, while working on SQLite bindings, I encountered a performance issue related to Java finalization: Objects with finalizers are never short-lived because they are enqueued into a finalization queue. Even if the programmer deallocates the underlying native resource by calling an explicit close method, the JVM still records that the finalize method must be called eventually, although that call will have nothing left to do because the native resources have already been deallocated.

The new java.lang.ref.Cleaner class in OpenJDK 9 addresses this problem. It splits the original handle object (with the explicit close method and the finalize method) into three parts:

  1. The handle object, which retains its original externally visible programming interface (including the close method).

  2. An internal state object, which contains all the information to identify the native resources and to eventually deallocate them. The internal state object is not referenced externally; all access is mediated by the handle object.

  3. The cleaner infrastructure: A global Cleaner object shared by all instances, and a Cleanable object specific to each handle which is used to implement explicit deallocation via close.

For our purposes, the key part is the Cleanable object. When its clean method is called, the state object is asked to deallocate its resources. At the same time, the cleaner infrastructure removes the associated handle object from its internal tracking: once the resources are gone, nothing remains to be done when the handle object is no longer referenced.

An example

To see how this works out in practice, I implemented a minimal example for testing and benchmarking purposes. The native resource is emulated with a volatile long. In practice, the long value would be a native pointer. I added the volatile keyword as an optimization barrier, to discourage HotSpot from optimizing away everything.

First, there is the test driver:

import java.lang.ref.Cleaner;
import java.lang.ref.Reference;
import java.util.function.Supplier;

public final class Finalization {
    public static void main(String[] args) {

        Supplier<Implementation> source;
        switch (args[0]) {
            case "plain":
                source = () -> new PlainHandle();
                break;
            case "finalize":
                source = () -> new FinalizeHandle();
                break;
            case "Cleaner":
                source = () -> new CleanerHandle();
                break;
            default:
                throw new IllegalArgumentException(args[0]);
        }

        int count = Integer.parseInt(args[1]);
        for (int i = 0; i < count; ++i) {
            run(source);
        }
    }

    private static void run(Supplier<Implementation> source) {
        try (Implementation handle = source.get()) {
            handle.use();
        }
    }
}

interface Implementation extends AutoCloseable {
    long use();
    void close();
}

It allows us to run three different implementations: One without any finalization (plain), traditional finalize-based finalization (finalize), and the new approach (Cleaner). There is also an iteration count, to see how the JVM deals with many object allocations in quick succession.

The traditional implementation looks like this:

final class FinalizeHandle implements Implementation {
    private volatile long handle;

    FinalizeHandle() {
        // Native resource allocation happens here.
        handle = 1;
    }

    @Override
    public long use() {
        try {
            return handle;
        } finally {
            Reference.reachabilityFence(this);
        }
    }

    @Override
    protected synchronized void finalize() {
        if (handle >= 0)
            // Native resource deallocation happens here.
            handle = -1;
    }

    @Override
    public synchronized void close() {
        if (handle >= 0)
            // Native resource deallocation happens here.
            handle = -1;
    }
}

The reachabilityFence call ensures that the object is not finalized while the native resource is being accessed. In the close method, regular Java synchronization prevents concurrent deallocation by the finalize method. A handle value of -1 indicates that the native object has been deallocated. There is no synchronization between use and close; this is the user's responsibility. (In practice, the use method would check that the resource has not already been deallocated by the close method.)
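The check mentioned in the parenthetical remark could look roughly like the following sketch. The class name CheckedHandle and the choice of IllegalStateException are assumptions made for illustration, and the finalize method is left out to keep the sketch short. The check only detects use after close on the same thread; it does not add synchronization between use and close.

```java
import java.lang.ref.Reference;

// Hypothetical variant of the handle class; not part of the benchmark.
final class CheckedHandle implements AutoCloseable {
    private volatile long handle;

    CheckedHandle() {
        // Native resource allocation would happen here.
        handle = 1;
    }

    public long use() {
        try {
            long h = handle;
            if (h < 0) {
                // Assumption: a closed handle is reported as a usage error.
                throw new IllegalStateException("handle already closed");
            }
            return h;
        } finally {
            Reference.reachabilityFence(this);
        }
    }

    @Override
    public synchronized void close() {
        if (handle >= 0) {
            // Native resource deallocation would happen here.
            handle = -1;
        }
    }
}
```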

The Cleaner-based approach is a little bit more involved:

final class CleanerHandle implements Implementation {
    private final static Cleaner cleaner = Cleaner.create();
    private final State state;
    private final Cleaner.Cleanable cleanable;

    private static class State implements Runnable {
        volatile long handle;

        State() {
            // Native resource allocation happens here.
            handle = 1;
        }

        public void run() {
            // Native resource deallocation happens here.
            handle = -1;
        }
    }

    CleanerHandle() {
        state = new State();
        cleanable = cleaner.register(this, state);
    }

    @Override
    public long use() {
        try {
            return state.handle;
        } finally {
            Reference.reachabilityFence(this);
        }
    }

    @Override
    public void close() {
        cleanable.clean();
    }
}

The native resource is now encapsulated by the new State class, which performs the native resource deallocation in its run method. Objects are registered for cleanup with a global Cleaner instance (which can be shared among multiple classes). The close method is just a wrapper for the clean method; it no longer checks for previous deallocation. The Cleaner infrastructure ensures that the actual deallocation is only performed once (by calling the run method of the State object).
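The run-once behavior can be observed with a small stand-alone sketch. The class CleanOnceDemo and its method cleanRepeatedly are hypothetical names introduced here; a counter stands in for the State object so that the number of invocations of the cleaning action becomes visible.

```java
import java.lang.ref.Cleaner;
import java.lang.ref.Reference;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of the benchmark.
final class CleanOnceDemo {
    private static final Cleaner CLEANER = Cleaner.create();

    // Calls clean() the given number of times and returns how often
    // the registered cleaning action actually ran.
    static int cleanRepeatedly(int calls) {
        AtomicInteger runs = new AtomicInteger();
        Object owner = new Object();
        // The action must not reference the owner, or the owner would
        // never become unreachable.
        Cleaner.Cleanable cleanable =
            CLEANER.register(owner, runs::incrementAndGet);
        for (int i = 0; i < calls; ++i) {
            cleanable.clean();
        }
        int result = runs.get();
        // Keep the owner reachable until the count has been read, so the
        // cleaner thread cannot run the action concurrently.
        Reference.reachabilityFence(owner);
        return result;
    }
}
```

Calling cleanRepeatedly with any positive argument should report a single run of the action, which is why close can forward to clean without its own bookkeeping.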

For completeness, here is the native resource wrapper without any finalization infrastructure:

final class PlainHandle implements Implementation {
    private volatile long handle;

    PlainHandle() {
        // Native resource allocation happens here.
        handle = 1;
    }

    @Override
    public long use() {
        return handle;
    }

    @Override
    public void close() {
        if (handle >= 0)
            // Native resource deallocation happens here.
            handle = -1;
    }
}

Performance results

The Cleaner-based approach results in substantially reduced RSS consumption due to the smaller heap size, and the test completes much faster:

$ \time java Finalization finalize 10000000
19.14user 0.52system 0:06.27elapsed 313%CPU (0avgtext+0avgdata 735228maxresident)k
0inputs+64outputs (0major+181482minor)pagefaults 0swaps
$ \time java Finalization Cleaner 10000000
1.12user 0.04system 0:00.99elapsed 118%CPU (0avgtext+0avgdata 177312maxresident)k
0inputs+64outputs (0major+40801minor)pagefaults 0swaps

Of course, the finalization-less approach is faster still and consumes even fewer resources:

$ \time java Finalization plain 100000000
0.90user 0.01system 0:00.85elapsed 106%CPU (0avgtext+0avgdata 33260maxresident)k
0inputs+64outputs (0major+4325minor)pagefaults 0swaps

But for the most part, the Cleaner benchmark variant is no longer about finalization; it is mostly a garbage collector benchmark. (The plain variant is purely a collector stress test.) The RSS consumption is mostly a result of the G1 ergonomics trying to size the heap to reduce collection overhead, and the Cleaner benchmark simply allocates many more objects than the plain benchmark. Unlike the finalize benchmark, it will not run into OutOfMemoryError exceptions with much smaller heap sizes.

Of course, this means that there is still some value in avoiding the new form of finalization where possible.

A remaining complication

With the Cleaner approach, the two-step resource allocation (native allocation in the State constructor followed by registration for cleanup with the register method) leaves room for a resource leak: register could fail and throw an OutOfMemoryError after the native resource has already been allocated. With the finalize-based approach, this could not happen because the object was implicitly registered for finalization. Recommended practices for dealing with this issue are still emerging. <http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-October/049641.html> Two approaches seem particularly attractive:
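As an illustration only, and without claiming that it matches either approach from the linked discussion, one conceivable way to close this window is to register the state object before the native resource exists. In the sketch below, the class name RegisterFirstHandle, the allocate method, and the convention that a handle value of 0 means "nothing allocated yet" are all assumptions of this example.

```java
import java.lang.ref.Cleaner;
import java.lang.ref.Reference;

// Hypothetical variant of CleanerHandle; not part of the benchmark.
final class RegisterFirstHandle implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();
    private final State state;
    private final Cleaner.Cleanable cleanable;

    private static final class State implements Runnable {
        // Assumption: 0 means that no native resource was allocated yet.
        volatile long handle;

        void allocate() {
            // Native resource allocation would happen here.
            handle = 1;
        }

        @Override
        public void run() {
            if (handle > 0) {
                // Native resource deallocation would happen here.
            }
            handle = -1;
        }
    }

    RegisterFirstHandle() {
        state = new State();
        // If register throws, no native resource exists yet, so
        // nothing can leak.
        cleanable = CLEANER.register(this, state);
        state.allocate();
    }

    public long use() {
        try {
            return state.handle;
        } finally {
            Reference.reachabilityFence(this);
        }
    }

    @Override
    public void close() {
        cleanable.clean();
    }
}
```

The price of this ordering is that the cleaning action has to cope with a state object whose native resource was never allocated, which is what the handle > 0 check in run is for.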


Florian Weimer