Unroll assoc and assoc! for small numbers of arguments

Description

Whilst doing performance work recently, I discovered that unrolling to single assoc calls were significantly faster than using multiple keys (~10% for my particular application). Zachary Tellman then pointed out that clojure.core doesn't unroll assoc at all, even for the case of relatively low numbers of keys.

We already unroll other performance critical functions that call things via `apply`, e.g. `update` https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L5914, but `assoc` (which is, I think in the critical path for quite a bunch of applications and libraries), would likely benefit from this.

I have not yet developed patches for this, but I did some standalone benchmarking work:

https://github.com/yeller/unrolling-assoc-benchmarks

benchmark results:

code: https://github.com/yeller/unrolling-assoc-benchmarks/blob/master/src/bench_assoc_unrolling.clj

 

1

2

3

4

empty array map (not unrolled)

23ns

93ns

156ns

224ns

empty array map (unrolled assoc)

N/A

51ns

80ns

110ns

 

20 element persistent hashmap (not unrolled)

190ns

313ns

551ns

651ns

20 element persistent hashmap (unrolled assoc)

N/A

250ns

433ns

524ns

 

record (not unrolled)

12ns

72ns

105ns

182ns

record (unrolled assoc)

N/A

21ns

28ns

41ns

Each measurement was made in a separate JVM, to avoid JIT path dependence.

Benchmarks were ran on a commodity server (8 cpus, 32gb ram), with ubuntu 12.04 and a recent release of Java 8. Attached are `cpuinfo`, `uname` and `java -version` output.

Relatively standard JVM production flags were enabled, and care was taken to disable leiningen's startup time optimizations (which disable many of the JIT optimizations).

Benchmarks can be run by cloning the repository, and running `script/bench`

There's one outstanding question for this patch: How far should we unroll these calls? `update` (which is unrolled in the 1.7 alphas) is unrolled to 3 arguments. Adding more unrolling isn't difficult, but it does impact the readability of assoc.

Patch: CLJ-1656-v5.patch

Environment

None

Activity

Show:
Michael Blume
April 30, 2015, 2:01 PM

There, code and test.

This also checks that assoc! is passed an even number of kvs in the varargs case, which is the behavior of assoc. The test verifies that both assoc and assoc! throw for odd arg count.

Alex Miller
April 30, 2015, 3:10 PM

The existing arglist is fine - it just overrides the generated one for doc purposes.

Did you try any of the RT.assocN() stuff?

I guess another question I have is whether people actually do this enough that it matters?

Michael Blume
September 29, 2016, 6:14 AM

Updated patch to apply to master

Alex Miller
August 14, 2018, 8:27 AM

This still needs assessment on frequency of different counts. Based on some very quick checking, passing two kvs is common enough to matter. Passing three is far less common and so on. But I'd love to see some rough idea of frequency from running some stuff. I think the current unrolling is too much (2-3 kvs is probably sufficient).

Michiel Borkent
December 22, 2020, 3:20 AM
Edited

To gather some statistics of how many k/v pairs people typically use with assoc, I made this script:

https://gist.github.com/borkdude/e6f0b12f9352f3375e5f3277d2aba6c9

The output from my entire m2 directory:

([1 20884] [2 3036] [3 1083] [4 327] [5 126] [6 64] [7 33] [10 23] [8 11] [9 6] [14 4] [0 1] [12 1] [16 1])

Stats from 's code base:

([1 32258] [2 4848] [3 1286] [4 535] [5 354] [6 118] [7 100] [9 32] [11 22] [12 14] [18 10] [8 8] [29 6] [16 6] [19 2] [10 2])

Assignee

Unassigned

Reporter

Tom Crayford

Labels

Approval

Triaged

Patch

Code and Test

Affects versions

Priority

Major