Reducer (and folder) instances hold onto the head of seqs

Description

Problem Statement
A shared function #'clojure.core.reducers/reducer holds on to the head of a reducible collection, causing it to blow up when the collection is a lazy sequence.

Reproduction steps:
Compare the following calls:

The second call should fail on a normal or small heap.

(If reducers are faster than seqs, increase the range.)

Cause: #'reducer closes over a collection when in order to reify CollReduce, and the closed-over is never cleared. When code attempts to reduce over this anonymous transformed collection, it will realize the tail while the head is stored in the closed-over.

Patch: clj-1250-2.patch

Approach:

Clear the reference to 'this' on the stack just before a tail call occurs

Removes calls to emitClearLocals(), which is a no-op.

When the context is RETURN (indicating a tail call), and the operation
is an InvokeExpr, StaticMethodExpr, or InstanceMethodExpr, clear the
reference to 'this' which is in slot 0 of the locals.

Edge-case: Inside the body of a try block, we cannot clear 'this' at the tail
position as we might need to keep refs around for use in the catch or finally
clauses. Introduces another truthy dynamic binding var to track position being
inside a try block.

Adds two helpers to emitClearThis and inTailCall.

Advantages: Fixes this case with no user code changes. Enables GC to do reclaim closed-overs references earlier.
Disadvantages: A compiler change.

Screened by: Alex Miller

Alternate Approaches:

1) Reimplement the #'reducer (and #'folder) transformation fns similar to the manner that Christophe proposes here:

Advantages: Relatively non-invasive change.
Disadvantages: Not evident code. Additional protocol dispatch, though only incurred once

2) Alternate approach

from Christophe Grand:
Another way would be to enhance the local clearing mechanism to also clear "this" but it's complex since it may be needed to keep a reference to "this" for caches long after obvious references to "this" are needed.

Advantages: Fine-grained
Disadvantages: Complex, invasive, and the compiler is hard to hack on.

Mitigations
Avoid reducing on lazy-seqs and instead operate on vectors / maps, or custom reifiers of CollReduce or CollFold. This could be easier with some implementations of common collection functions being available (like iterate and partition).

See https://groups.google.com/d/msg/clojure-dev/t6NhGnYNH1A/2lXghJS5HywJ for previous discussion.

Environment

None

Activity

Show:
Ghadi Shayban
August 4, 2015, 11:18 PM

Reproduced on 1.8.0-alpha4. Could not reproduce with the previously committed patch applied directly to 1.7.

Cause: Tail clearing is insensitive to direct linking

In a direct linking static method (through invokeStatic) there's no "this" to clear.

Patch addresses the root cause

Alex Miller
August 5, 2015, 3:02 PM

Ok, backing out a little, here's another example that still fails with the patch:

1.7:
queue write :a
queue write :b
queue write :c
queue write :done
[1]
nil

1.8:
queue write :a
[1]
queue write :b
queue write :c
queue write :done
NullPointerException

Alex Miller
August 5, 2015, 4:58 PM

Moved latest case to

Ghadi Shayban
August 6, 2015, 2:44 AM

Both this regression and the loop regression are fixed on CLJ-1793.

Alex Miller
August 7, 2015, 2:56 PM

The patch for this ticket has been reverted in master after 1.8.0-alpha4. Further work on re-implementation is moved to CLJ-1793.

Completed

Assignee

Unassigned

Reporter

Christophe Grand

Approval

Vetted

Patch

Code and Test

Fix versions

Affects versions

Priority

Major
Configure