Potential speed improvements for key lookup of large collections as array-map keys

Description

Demonstration of performance difference of looking up one of these
types of large collections in array-maps vs. hash-maps:

  • set

  • map

  • PersistentQueue

  • primitive vector created via vector-of

hash-maps lookup of these types takes advantage of an identical? (Java ==) fast path for key equality check, but for array-maps they do not.

There is already such an identical fast path in the equiv check for APersistentVector, so there is nothing to improve there in this regard.

 

The difference between looking up these values in a hash-map vs. array-map is that hash-map uses the clojure.lang.Util.equiv(Object, Object) method for key equality checks, which has a fast path check for identical key objects, whereas array-map uses clojure.lang.Util.EquivPred(Object) to determine a method to use for equiv checking while iterating through the array-map keys, and it selects clojure.lang.Util.equivColl(Object), which ends up calling one of:

+ clojure.lang.APersistentMap.equiv(Object) for maps

+ clojure.lang.APersistentSet.equiv(Object) for sets

+ clojure.lang.PersistentQueue.equiv(Object) for PersistentQueue

+ clojure.core.Vec.equiv(Object) for Vec

None of those methods have a fast path check for identical objects.

I realize there are method size limits on what the JVM JITs will in-line, and that is a bigger performance factor in many cases than the use cases that adding an identical fast path check might bring on average to Clojure applications. I can check, but at least for clojure.lang.APersistentMap.equiv(Object), it seems likely that it is already over the default limit for maximum byte code size of an inline-able method.

Environment

None

Activity

Show:
Andy Fingerhut
July 20, 2019, 12:35 PM

Attachment clj-2527-identical-equiv-check-v1.patch is one way to speed up the slow cases described in the issue. I can add another comment later with details on how many byte codes in size the affected methods are before and after that patch.

Andy Fingerhut
July 20, 2019, 6:43 PM

For the patch in attachment clj-2527-identical-equiv-check-v1.patch, below are the method sizes, in bytes of JVM byte code, using no.disassemble of the entire class to disassemble and show the byte code of the entire class. The numbers reported are only for the size of the equiv method.

Unmodified Clojure 1.10.1 method sizes, vs. with modifications:

clojure.lang.APersistentSet/equiv 69 -> 76
clojure.lang.APersistentMap/equiv 125 -> 132
clojure.lang.PersistentQueue/equiv 74 -> 81
clojure.core.Vec/equiv 223 -> 238

Assignee

Unassigned

Reporter

Andy Fingerhut

Labels

None

Approval

None

Patch

None

Affects versions

Priority

Minor