Clojure can leak memory when used in a servlet container


When used within a servlet container (Jetty/Tomcat/JBossAS/Immutant/etc), the thread locals Var.dvals (used to store dynamic bindings) and LockingTransaction.transaction (used to store the currently active transaction(s)) prevent all of the classes loaded by an application's clojure runtime from being garbage collected, resulting in a memory leak.

Cause: The issue comes from threads living beyond the lifetime of a deployment - servlet containers use thread pools that are shared across all applications within the container. Currently, the dvals and transaction thread locals are not discarded when they are no longer needed, causing their contents to retain a hard reference to their classloaders, which, in turn, causes all of the classes loaded under the application's classloader to be retained until the thread exits (which is generally at JVM shutdown).

Solution: I've attached a patch that does the following:

  • Var.dvals is initialized to a canonical TOP Frame

  • Var.dvals is now removed when the thread bindings are popped to the TOP

  • The outer transaction in LockingTransaction.transaction now removes the thread local when it is finished

There is still the opportunity for memory leaks if agents or futures are used, and the executors used for them are not shutdown when the app is undeployed. That's a solvable problem, but should probably be solved by the containers themselves (and/or the war generation tools) instead of in clojure itself.

This patch has a small performance impact: its use of a try/finally around running transactions to remove the outer transaction adds 4-6 microseconds to each transaction call on my hardware.

Providing an automated test for this patch is difficult - I've tested it locally with repeated deployments to a container while monitoring GC and permgen. All of clojure's tests pass with it applied.

The above is a condensation of:

Patch: threadlocal-removal-tcrawley-2013-11-24.diff

Screened by: Alex Miller - the new patch (since prior screening) has no changes in the LockingTransaction code but has been updated in Var to address the regression logged in CLJ-1299.












Fix versions