Uploaded image for project: 'Clojure'
  1. CLJ-1872

empty? is broken for transient collections

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects versions: Release 1.7
    • Fix versions: Release 1.11
    • Labels:
    • Approval:
      Vetted
    • Patch:
      Code and Test

      Description

      Couldn't find whether it was brought up earlier, but it seems that empty? predicate is broken for transient collections

      user=> (empty? (transient []))
      IllegalArgumentException Don't know how to create ISeq from: clojure.lang.PersistentVector$TransientVector  clojure.lang.RT.seqFrom (RT.java:528)
      
      user=> (empty? (transient {}))
      IllegalArgumentException Don't know how to create ISeq from: clojure.lang.PersistentArrayMap$TransientArrayMap  clojure.lang.RT.seqFrom (RT.java:528)
      
      user=> (empty? (transient #{}))
      IllegalArgumentException Don't know how to create ISeq from: clojure.lang.PersistentHashSet$TransientHashSet  clojure.lang.RT.seqFrom (RT.java:528)
      

      The workaround is to use (zero? (count (transient ...))) check instead.

      Cause: empty? is based on seqability, which transients don't implement.

      Proposed Add a branch to empty? for counted? colls. Transients implement Counted so gain support via this branch. Other colls that are counted are faster. Seq branch continues to work for seqs.

      Perf test:

      (def p [])
      (def p1 [1])
      (def t (transient []))
      (def t1 (transient [1]))
      
      ;; take last time of all these
      (dotimes [i 20] (time (dotimes [_ 10000] (empty? p))))
      (dotimes [i 20] (time (dotimes [_ 10000] (empty? p1))))
      (dotimes [i 20] (time (dotimes [_ 10000] (empty? t))))
      (dotimes [i 20] (time (dotimes [_ 10000] (empty? t1))))
      

      Results:

      coll before after result
      p 0.72 ms 0.08 ms much faster when empty
      p1 0.15 ms 0.13 ms slightly faster when not empty
      t error 0.19 ms no longer errors
      t1 error 0.20 ms no longer errors

      Not sure if doc string should be tweaked to be more generic, particularly the "same as (not (seq coll))" which is now only true for Seqable but not Counted. I think the advise to use (seq coll) for seq checks is still good there.

      I did a skim for other types that are Counted but not seqs/Seqable and didn't find much other than internal things like ChunkBuffer. Many are both and would thus use the counted path instead (all the persistent colls for example and any kind of IndexedSeq).

      I guess another option would be just to fully switch empty? to be about (zero? (bounded-count 1 coll)) and lean on count's polymorphism completely.

      Patch: clj-1872.patch

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              alex+import import
            • Votes:
              2 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated: