Efficient shortcut for (first (filter pred coll)) idiom


It's a common task to look up for an item in a collection based on predicate. Currently Clojure has no direct support for that in clojure.core. Instead, our options are:

1. (first (filter pred coll)) will create intermediate lazy sequence and might evaluate pred up to 31 extra times in case of chunked sequence

2. (some #(when (pred %) %) coll) will short-circuit on first match, but won't catch false value in something like (some #(when false? %) [true false true])

Additionally, both of these workarounds a) obscure the purpose of the code, and b) do not handle custom not-found values.

Attached is a patch that makes use of efficiency of reduce-able collections, handles edge cases like looking for false? or nil?, and supports optional not-found value.


Patch: clj-2056-clojure-core-seek-2.patch

Prescreening notes: I think the general approach is good. Is it necessary to support nil? and false? preds? Or would a transduce formulation like the one in comments be sufficient.




Alex Miller
November 12, 2016, 1:54 AM

Just as an interesting aside, the new halt-when transducer could actually be used to create something like this too (if you set aside the desire to support nil? and false? preds).

Patch has some trailing whitespace in the test code - could you clean that up?

Nikita Prokopov
November 12, 2016, 8:46 PM

Attaching patch with trailing whitespace cleaned

Nikita Prokopov
November 12, 2016, 8:46 PM

Thanks Alex! Attached new patch with whitespace cleaned

January 31, 2017, 5:34 AM

Comment made by: dergutemoritz

I had a similar train of thought today and arrived at the idea of adding transducer support to first, e.g. (first (filter pred) coll). This could potentially be as efficient as the special-purpose seek function proposed above but would of course not support the not-found value. However, it seems like a natural extension to first and could be useful for other purposes, too.

Alex Miller
May 13, 2017, 7:34 AM

Upon review, we've decided that we do not wish to include this. Use of linear search (and in particular nested linear search) leads to poor performance - often it's better to use other kinds of data structures and that's why this functionality has not been included in the past.





Nikita Prokopov






Code and Test