Optimize code in `read-object` and `read-array`

Description

Approach:

Introduce a function, next-token, that skips whitespace and returns the next significant token. Use it wherever we need to skip a run of whitespace to reach the next token.

The implementation of read-object then follows the BNF grammar for JSON objects more closely:

  1. read-key, which reads up to and including the :

  2. -read, which reads the value

  3. Depending on the next-token, either continue reading the next key-value pair (if it’s a comma) or return the object (if it’s a }).
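The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function bodies are assumptions, read-value stands in for data.json's -read, and read-quoted and read-str are hypothetical helpers added so the example runs on its own. It only handles strings without escapes, non-negative integers, and nested objects.

```clojure
(import '(java.io PushbackReader StringReader))

(declare read-value)

(defn next-token
  "Skips JSON whitespace (space, tab, LF, CR) and returns the next
  character as an integer code, or -1 at end of input."
  [^PushbackReader rdr]
  (loop [c (.read rdr)]
    (if (or (== c 32) (== c 9) (== c 10) (== c 13))
      (recur (.read rdr))
      c)))

(defn read-quoted
  "Hypothetical helper: reads up to the closing quote.
  Simplified for the sketch: no escape handling."
  [^PushbackReader rdr]
  (let [sb (StringBuilder.)]
    (loop [c (.read rdr)]
      (cond
        (neg? c)  (throw (ex-info "EOF inside string" {}))
        (== c 34) (str sb)
        :else     (do (.append sb (char c))
                      (recur (.read rdr)))))))

(defn read-key
  "Step 1: reads a key up to and including the colon.
  Returns nil when the next token is } instead of a key."
  [^PushbackReader rdr]
  (let [c (next-token rdr)]
    (cond
      (== c 34)  (let [k (read-quoted rdr)]
                   (if (== (next-token rdr) 58)
                     k
                     (throw (ex-info "Expected ':' after key" {:key k}))))
      (== c 125) nil
      :else      (throw (ex-info "Expected key or '}'" {:char c})))))

(defn read-object
  "Assumes the opening { was already consumed. Builds the map in a
  transient and only calls persistent! when the object is complete."
  [^PushbackReader rdr]
  (loop [m (transient {}) first? true]
    (if-let [k (read-key rdr)]
      (let [m (assoc! m k (read-value rdr)) ; step 2: read the value
            c (next-token rdr)]             ; step 3: comma or }
        (cond
          (== c 44)  (recur m false)
          (== c 125) (persistent! m)
          :else      (throw (ex-info "Expected ',' or '}'" {:char c}))))
      (if first?
        (persistent! m)                     ; empty object
        (throw (ex-info "Trailing comma in object" {}))))))

(defn read-value
  "Stand-in for -read: dispatches on the next significant token."
  [^PushbackReader rdr]
  (let [c (next-token rdr)]
    (cond
      (== c 34)    (read-quoted rdr)
      (== c 123)   (read-object rdr)
      (<= 48 c 57) (loop [n (- c 48)]
                     (let [d (.read rdr)]
                       (if (<= 48 d 57)
                         (recur (+ (* 10 n) (- d 48)))
                         ;; push back the non-digit so the caller sees it
                         (do (when-not (neg? d) (.unread rdr d))
                             n))))
      :else        (throw (ex-info "Unsupported token in this sketch" {:char c})))))

(defn read-str
  "Hypothetical entry point for trying the sketch at a REPL."
  [s]
  (read-value (PushbackReader. (StringReader. s))))

(read-str " { \"a\" : 1 , \"b\" : { \"c\" : \"d\" } } ")
```

Keeping all whitespace skipping inside next-token means read-key and read-object never look at whitespace themselves, which is what lets the loop mirror the grammar.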

Benchmarks:

Current master (e0200bf5391cfb654ff25975e709c0de5ba7d407)

10b            100b        1k            10k            100k
544.271713 ns  2.48000 µs  14.393780 µs  163.630036 µs  1.619043 ms

With 0002-DJSON-36-Optimize-code-in-read-object-and-read-array.patch applied

10b            100b         1k            10k            100k
505.456237 ns  1.930512 µs  11.021817 µs  134.057042 µs  1.32651 ms

0002-DJSON-36-Optimize-code-in-read-object-and-read-array.patch was tested against the generative test in DJSON-45.
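To make the two benchmark runs in the description easier to compare, the relative change per input size can be computed as below. This is only a small arithmetic sketch: the numbers are the mean times quoted in the description, converted to nanoseconds, and description-means-ns and pct-reduction are names made up for this example.

```clojure
;; Mean execution times from the description, normalized to nanoseconds.
;; Each row is [size before after], where "before" is current master and
;; "after" is with the 0002 patch applied.
(def description-means-ns
  [["10b"  544.271713  505.456237]
   ["100b" 2480.0      1930.512]
   ["1k"   14393.780   11021.817]
   ["10k"  163630.036  134057.042]
   ["100k" 1619043.0   1326510.0]])

(defn pct-reduction
  "Percentage by which the mean time per call dropped."
  [before after]
  (* 100.0 (/ (- before after) before)))

(doseq [[size before after] description-means-ns]
  (println (format "%-5s %.1f%% less time per call"
                   size (pct-reduction before after))))
```

On these figures the reduction is roughly 7% for the 10b input and between roughly 18% and 23% for the larger inputs. Each figure comes from a single benchmark run, so some of the difference may be noise.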

Patch:

0002-DJSON-36-Optimize-code-in-read-object-and-read-array.patch

Environment

None

Activity

Alex Miller
April 26, 2021, 5:21 PM

Reviewed transient code in 0002 patch, passing new manual and generative tests, applied.

Alex Miller
April 19, 2021, 1:05 PM

Reverted, per - not properly handling transients

Alex Miller
April 16, 2021, 7:02 PM

Applied

Erik Assum
March 27, 2021, 6:26 PM

19:18 $ git checkout --force master
Already on 'master'
Your branch is ahead of 'origin/master' by 23 commits.
(use "git push" to publish your local commits)
✔ ~/Documents/github.com/data.json [master ↑·23|…45⚑ 1]
19:19 $ clj -Sdeps '{:deps {org.clojure/clojure {:mvn/version "1.10.3"} criterium/criterium {:mvn/version "0.4.4"}}}'
Clojure 1.10.3
(require '[criterium.core :refer :all] '[clojure.data.json :as json])
nil
(defn json-data [size] (slurp (str "dev-resources/json" size ".json")))
#'user/json-data
(doseq [size ["10b" "100b" "1k" "10k" "100k"]] (let [json (json-data size)] (quick-bench (json/read-str json))))
Evaluation count : 1045122 in 6 samples of 174187 calls.
Execution time mean : 623.832443 ns
Execution time std-deviation : 60.964359 ns
Execution time lower quantile : 569.581008 ns ( 2.5%)
Execution time upper quantile : 702.481526 ns (97.5%)
Overhead used : 7.470016 ns
Evaluation count : 192738 in 6 samples of 32123 calls.
Execution time mean : 3.397444 µs
Execution time std-deviation : 470.556976 ns
Execution time lower quantile : 3.069843 µs ( 2.5%)
Execution time upper quantile : 4.014398 µs (97.5%)
Overhead used : 7.470016 ns
Evaluation count : 34296 in 6 samples of 5716 calls.
Execution time mean : 18.333675 µs
Execution time std-deviation : 872.870742 ns
Execution time lower quantile : 17.385690 µs ( 2.5%)
Execution time upper quantile : 19.456226 µs (97.5%)
Overhead used : 7.470016 ns
Evaluation count : 3006 in 6 samples of 501 calls.
Execution time mean : 223.689654 µs
Execution time std-deviation : 17.658602 µs
Execution time lower quantile : 203.939916 µs ( 2.5%)
Execution time upper quantile : 245.589067 µs (97.5%)
Overhead used : 7.470016 ns
Evaluation count : 306 in 6 samples of 51 calls.
Execution time mean : 2.284957 ms
Execution time std-deviation : 255.016202 µs
Execution time lower quantile : 1.986014 ms ( 2.5%)
Execution time upper quantile : 2.565211 ms (97.5%)
Overhead used : 7.470016 ns
nil
^Der=>
✔ ~/Documents/github.com/data.json [master ↑·23|…45⚑ 1]
19:22 $ patch -p1 < 0001-DJSON-36-Optimize-code-in-read-object-and-read-array.patch
patching file src/main/clojure/clojure/data/json.clj
Hunk #1 succeeded at 50 (offset -3 lines).
Hunk #2 succeeded at 143 (offset -2 lines).
patching file src/test/clojure-perf/clojure/data/json_perf_test.clj
✔ ~/Documents/github.com/data.json [master ↑·23|✚ 2…45⚑ 1]
19:22 $ clj -Sdeps '{:deps {org.clojure/clojure {:mvn/version "1.10.3"} criterium/criterium {:mvn/version "0.4.4"}}}'
Clojure 1.10.3
(require '[criterium.core :refer :all] '[clojure.data.json :as json])
nil
(defn json-data [size] (slurp (str "dev-resources/json" size ".json")))
#'user/json-data
(doseq [size ["10b" "100b" "1k" "10k" "100k"]] (let [json (json-data size)] (quick-bench (json/read-str json))))
Evaluation count : 1165326 in 6 samples of 194221 calls.
Execution time mean : 588.565661 ns
Execution time std-deviation : 74.005433 ns
Execution time lower quantile : 486.608292 ns ( 2.5%)
Execution time upper quantile : 652.890659 ns (97.5%)
Overhead used : 7.470613 ns
Evaluation count : 263508 in 6 samples of 43918 calls.
Execution time mean : 2.488069 µs
Execution time std-deviation : 384.623842 ns
Execution time lower quantile : 2.200909 µs ( 2.5%)
Execution time upper quantile : 3.086553 µs (97.5%)
Overhead used : 7.470613 ns
Evaluation count : 38898 in 6 samples of 6483 calls.
Execution time mean : 17.106896 µs
Execution time std-deviation : 1.737168 µs
Execution time lower quantile : 15.074744 µs ( 2.5%)
Execution time upper quantile : 19.774719 µs (97.5%)
Overhead used : 7.470613 ns

Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 30.5274 % Variance is moderately inflated by outliers
Evaluation count : 3204 in 6 samples of 534 calls.
Execution time mean : 193.544785 µs
Execution time std-deviation : 6.106605 µs
Execution time lower quantile : 186.547197 µs ( 2.5%)
Execution time upper quantile : 200.099338 µs (97.5%)
Overhead used : 7.470613 ns
Evaluation count : 336 in 6 samples of 56 calls.
Execution time mean : 2.144619 ms
Execution time std-deviation : 302.511493 µs
Execution time lower quantile : 1.812900 ms ( 2.5%)
Execution time upper quantile : 2.432809 ms (97.5%)
Overhead used : 7.470613 ns

 

        10b            100b         1k            10k            100k
before  623.832443 ns  3.397444 µs  18.333675 µs  223.689654 µs  2.284957 ms
after   588.565661 ns  2.488069 µs  17.106896 µs  193.544785 µs  2.144619 ms

Alex Miller
March 19, 2021, 6:39 AM

This seems to make things slower afaict, maybe worth re-benchmarking from new clean slate. I did not really look at the contents of the patch much.

Fixed

Assignee

Unassigned

Reporter

Erik Assum