Set literal duplicate check occurs too early.


I cannot use literal syntax to create a set/map with unique members/keys if the elements are generated with an identical form. Examples of such legal forms: (rand), (read), (clojure.core.async/<!!), etc. I will use (rand) in these examples.

It appears that the input is being checked for duplicates before the arguments to the collection constructors are evaluated. However, this doesn't prevent the need to run the check again later.

Note that duplicates are still (correctly) detected, after evaluation, even if duplicates do not appear as literals in the source:

The first duplicate check therefore seems to be both redundant and incorrect.

Note that this eager duplicate-checking seems to have higher precedence even than the syntax-quote reader macro.

This is odd – since syntax-quote should not realize a collection at all at read time:

For Lists/Vectors/Sets/Maps, syntax-quote establishes a template of the corresponding data structure. Within the template, unqualified forms behave as if recursively syntax-quoted, but forms can be exempted from such recursive quoting by qualifying them with unquote or unquote-splicing, in which case they will be treated as expressions and be replaced in the template by their value, or sequence of values, respectively. (

Definitions aside, based on the apparent expansion of syntax-quote, I would expect the previous to have worked correctly.

If I fake the expected macroexpansion by manually substituting the desired inputs, I get the expected results:

It seems to me that there is a superfluous duplicate check being run before the set/map reader macros evaluate their arguments. This check should seemingly be removed. Even if the check did not catch some false-positive duplicates (as it does), it would be unnecessary since the apparent second post-evaluation check would catch all true duplicates.

All that said, it's unclear that this check should happen at all. If I try to create sets/map with duplicate members/keys, I don't get an error. The duplicates are silently removed or superseded.

It seems it would be most consistent for literals constructed by the reader syntax to do the same.

I can see the argument that a literal representation is not a 'request to construct' but rather an attempt to simulate the printed representation of a literal data object. From that perspective, disallowing 'illegal' printed representations seems reasonable. Unfortunately, the possibility of evaluated forms inside literal vectors, sets, and maps (since lists are evaluated at read time) already breaks this theory. That is, the printed representation of such collections is not an accurately readable form, so read-time duplicate checking still cannot prevent seeming inconsistencies in print/read representations:

Given that the problem cannot be completely avoided at all, it seems simplest and most consistent to treat reader literal constructors like their run-time counterparts, as syntax quote would in the absence of the spurious duplicate check.




Nicola Mometto
October 9, 2014, 2:09 PM
Alex Miller
October 9, 2014, 2:04 PM

Also see

Your pinned fields
Click on the next to a field label to start pinning.








Affects versions