Serializing+deserializing lists breaks their hash

Description

clojure.lang.PersistentList implements Serializable but a serialization roundtrip results in a hash of 0. This appears to be caused by ASeq marking its _hash property as transient; other collection base classes don't do this. I don't know if there is a rationale for the distinction, but the result is that serializing, then deserializing, a list results in its _hash property being 0 instead of either its previous, calculated value, or -1, which would trigger recalculation.

This means that any code that relies on a list's hash value can break in subtle ways.

Examples:

Approach: Use 0 as the "uncomputed" hash value, not -1. The auto initialized value on serialization construction will then automatically trigger a re-computation.

Alternate approaches:

  • Implement a readObject method that sets the _hash property to -1. This is somewhat complicated in the face of subclasses, etc.

Note: Also need to consider other classes that use -1 instead of 0 as the uncomputed hash value: APersistentMap, APersistentSet, APersistentVector, PersistentQueue - however I believe in those cases they are not transient and thus may avoid the issue. Perhaps they should be transient though.

Patch: clj-1766-2.patch
Screened by: Alex Miller

Environment

Tested on OS X 10.10.3, Clojure 1.6.0, Java 1.8.0_45-b14, but don't think this is a factor.

Assignee

Unassigned

Reporter

Christopher Vermilion

Labels

Approval

Ok

Patch

Code and Test

Fix versions

Affects versions

Priority

Major
Configure