cljs.reader/read-string throws errors when reading keywords that begin with integers

Description

(cljs.reader/read-string ":1") throws an error: TypeError: 'null' is not an object (evaluating 'a[(0)]')
at read_keyword (:474)

Environment

Using ClojureScript 0.0-2371. The error occurs in both PhantomJS and Chrome.

Activity

import
November 5, 2014, 10:10 PM

Comment made by: josf

read-keyword, in reader.cljs, matches the token against symbol-pattern, which rejects names that begin with a digit. Symbols can't begin with a digit, but a keyword begins with ":", so in a keyword like :1 the "1" is actually the second character.
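The mismatch described above can be sketched with a small regex check. This is only an illustration: the pattern below is an assumption about the shape of symbol-pattern, not a copy guaranteed to match your ClojureScript version, and the name symbol-pattern-sketch is hypothetical. It also assumes that the token handed to the pattern is the text after the leading colon.

```clojure
;; Assumption: a pattern of the same general shape as symbol-pattern in
;; reader.cljs. Check the reader source for the exact definition.
(def symbol-pattern-sketch #"^[:]?([^0-9/].*/)?([^0-9/][^/]*)$")

;; Assumption: for the input ":foo" the reader tests only "foo",
;; because the leading colon has already been consumed.
(some? (re-matches symbol-pattern-sketch "foo"))    ; plain name matches
(some? (re-matches symbol-pattern-sketch "ns/foo")) ; namespaced name matches
(nil?  (re-matches symbol-pattern-sketch "1"))      ; leading digit: no match
```

If the reader then indexes into that nil match result without checking it, the outcome would be consistent with the TypeError reported in the description.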

Francis Avila
November 6, 2014, 2:01 AM

Your reasoning that in a keyword the number is the second character is precisely one of the unclear points about keyword and symbol parsing. Some implementations say yes, and some do not, and there is no "spec" unambiguous enough to decide the issue.

This issue is a duplicate of CLJS-677. The comments there go into much greater detail.

import
November 6, 2014, 8:52 AM

Comment made by: josf

I'll leave it at that then.

I would like to add that the current state of affairs is rather confusing, because keywords like :1 seem to work fine in ClojureScript, except when deserializing with cljs.reader/read-string (from localStorage, for example), which fails without a clear explanation.

Francis Avila
November 6, 2014, 6:19 PM

Agreed, the current state of affairs is not good, but the proper fix would be:

  1. Produce a rigorous formal specification of the reader syntax for Clojure (and variants/subsets for edn, ClojureScript, ClojureCLR), including consideration of Unicode characters, etc.

  2. Unify all reader implementations around these specifications (across multiple projects).

  3. Deal with code breakage in upstream libraries.

Understandably, the core developers would probably think this is a very large effort with a lot of disruption for very little gain. I advise just avoiding edge cases in the spec, like :1, :supposedly:ok:keyword, non-ASCII characters, etc., in both code and edn.

import
November 7, 2014, 10:15 AM

Comment made by: josf

One last thing about the confusion this causes. The problem I see is that {:1 "one"} compiles without any problem in ClojureScript, (keyword? :1) returns true, and (keyword "1") returns :1. The only time the problem comes up is when using reader/read-string. It seems to me that this should be coherent at least within ClojureScript, even if there are discrepancies with the other implementations.

And when using :keywordize-keys true with externally supplied data, it is hard to be sure that some of the JSON keys won't begin with a digit. (This is how I stumbled onto this.)
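One possible way to sidestep the problem with external data is to keywordize only those keys that are clearly safe to read back, and leave the rest as strings. The following is a minimal sketch, not anything provided by ClojureScript: reader-safe-name? and keywordize-safe-keys are hypothetical helpers, the "safe" rule (ASCII letter first, then letters, digits, - or _) is a deliberately conservative assumption, and only flat maps with string keys are handled.

```clojure
;; Hypothetical helper: conservative test for names that should round-trip
;; through pr-str and read-string as keywords.
(defn reader-safe-name?
  "True when s is a string that starts with an ASCII letter and contains
  only ASCII letters, digits, - or _."
  [s]
  (boolean (and (string? s)
                (re-matches #"[A-Za-z][A-Za-z0-9_-]*" s))))

;; Hypothetical helper: keywordize only the safe keys of a flat map.
(defn keywordize-safe-keys
  "Returns m with each safe string key turned into a keyword.
  Keys that fail reader-safe-name? are kept unchanged."
  [m]
  (into {}
        (map (fn [[k v]]
               [(if (reader-safe-name? k) (keyword k) k) v])
             m)))

(keywordize-safe-keys {"name" "x" "1" "one"})
;; => {:name "x", "1" "one"}
```

The trade-off is that callers must be prepared for a mix of keyword and string keys, but the resulting map no longer depends on how a given reader treats keywords such as :1.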

Duplicate

Assignee

Unassigned

Reporter

import

Labels

Approval

None

Patch

None

Priority

Minor