clojure.core/sort is not thread-safe on Java collections with backing arrays


If a (mutable) Java collection that exposes it's backing array is passed to c.c/sort in multiple threads, the collection will be concurrently modified in multiple threads.

Approach: Convert coll to a seq before converting it to an array, thus preserving the original collection.

Patch: 0001-CLJ-1763-make-sort-thread-safe.patch

Alternate approaches:

1. Document in sort that, like Java arrays, Java collections backed by arrays are modified in-place.
2. Change RT.toArray() to defensively copy the array returned from a (non-IPersistentCollection) Java collection. This has a number of potential ramifications as this method is called from several paths.
3. For non-Clojure collections, could also use Collections.sort() instead of dumping to array and using Arrays.sort().




Alex Miller
June 19, 2015, 4:55 PM

The docstring says "If coll is a Java array, it will be modified. To avoid this, sort a copy of the array." which also seems like solid advice in this case.

Creating a sequence view of the input collection would significantly alter the performance characteristics.

Alex Miller
June 19, 2015, 4:59 PM

I guess what I'm saying is, we should not make the performance worse for persistent collections in order to make it safer for arbitrary Java collections. But it may still be useful to make it safer without affecting persistent collection performance and/or updating the docstring.

Nicola Mometto
June 19, 2015, 5:02 PM

Alex, no additional sequence is being created, the seq call was already there

Alex Miller
June 19, 2015, 5:53 PM

Well, that's kind of true. The former use did not force realization of the whole seq, just the first element. That said, from a quick test the extra cost seems small on a set (vector seqs are actually faster due to their structure).

Stuart Halloway
July 30, 2015, 3:11 PM

I think this should be a docstring change, if anything at all.




Nicola Mometto