Add named loop + recur-to facility for nested loops

Description

Copied from the proposal email to the Clojure/dev Google group:

https://groups.google.com/d/msg/clojure-dev/zlMGmv60MVA/beyIRTrhAgAJ

Hi,

Currently loop/recur is limited to "single-layered" loops: loop forms can occur inside other loop forms, but there is no facility for "recuring to an outer loop".

A few years ago I posted a proposal to add support for nested loops to Clojure with a proof-of-concept patch to ClojureScript with syntax and semantics that I think suffice to make nested loops feel natural while remaining a natural extension of the core loop/recur model, with the same explicit tail recursion benefits:

https://groups.google.com/d/msg/clojure-dev/imtbY1uqpIc/8DWLw8Ymf4IJ
https://dev.clojure.org/display/design/Named+loops+with+recur-to
https://github.com/michalmarczyk/clojurescript/tree/recur-to
https://github.com/michalmarczyk/clojurescript/commit/feba0a078138da08d584a67e671415fc403fa093

I have now implemented a complete patch enabling the proposed feature in Clojure (the first link is to a branch based on current master, that is, the "prepare for next development iteration" commit after 1.9.0-alpha17; the second is to the current tip of this branch for future reference):

https://github.com/michalmarczyk/clojure/tree/recur-to

https://github.com/michalmarczyk/clojure/commit/212ea06d21d3b336ac35480c99170e81defaf956

I also opened a ticket in JIRA so as to have a place to post the above in the form of a patch file:

https://dev.clojure.org/jira/browse/CLJ-2235

The remainder of this email sets out the proposal in more detail, states its key properties in a somewhat rigorous form, briefly summarizes the implementation approach and discusses certain design choices made in the patch.

Overview
========

The idea is that one could write e.g.

and, provided that each recur-to form is in tail position with respect to all its enclosing loop forms up to and including its target and the number of arguments passed to each recur-to form matches the number of loop locals of the target loop (plus one for the leading loop name argument), this should compile and behave much like nested loops in Java.

The proposed syntax is modelled on Scheme's named lets, although semantically
those are quite different - this proposal is strictly limited to expanding the loop/recur model to nested loops in a natural way. Of course named fn forms ought also to be valid recur-to targets.

Key properties of named loops and recur-to
==========================================

With the above-linked patch in place, the following rules are enforced at compilation time:

1. Each recur-to form must be in tail position with respect to all its enclosing loop forms, whether named or not, up to and including its target (which may be a named loop or fn form).

2. It is an error to specify a recur-to target which does not occur among the names of the recur-to form's enclosing loop or fn forms with respect to which it is in tail position.

3. It is not possible to recur-to across try.

4. The number of arguments passed to recur-to beyond the initial target/label argument must match the number of formal parameters of the target loop or fn form.

5. Shadowing loop names is permissible; recur-to can only target the innermost loop of the given name among those with respect to which it is in tail position. Loop locals introduced by a shadowed named loop remain visible within the shadowing loop (unless they are themselves shadowed by identically named locals).

NB. loop names are not locals. In particular, they neither shadow nor are shadowed by locals of the same name. This point merits a longer discussion; see the section on design choices at the end of this email.

The innermost loop or fn form can always be targeted using plain recur, whether it is named or not. Additionally (recur-to nil ...) is equivalent to (recur ...) (even when the innermost loop or fn form is actually named), and (loop nil [...] ...) is equivalent to (loop [...]).

Summary of the implementation approach
======================================

The patch modifies the handling of loop labels in the compiler and implements the few necessary tweaks to the loop macro.

It also introduces an optional name argument to the loop* special form. (It is optional primarily so as to avoid breaking any non-core macros that emit loop* directly.)

Finally, it renames the recur special form to recur*; recur and recur-to become macros defined in clojure.core. See the section on design choices below for alternative approaches.

Design choices
==============

1. During development, purely as a matter of convenience at that stage, I had a separate loop-as macro that accepted a name argument. I thought it reasonable to add the naming feature directly to loop, particularly since fn already takes an optional name. Still, loop-as is a valid alternative design.

2. Should it be desirable to avoid renaming the existing recur special form to recur* and reimplementing recur as a macro, a new recur-to special form could be added. (Alternatively, one could keep recur as it is while adding recur-to as a macro delegating to a new recur* special form.)

3. Should it be desirable to preserve the option of treating loop names as locals in the future, it would probably be preferable to make them shadow and be shadowed by locals now, as otherwise elevating them to the status of locals at a later point would be a breaking change. To give an example of why such a future change might be useful, if tail call elimination support arrives in a future JDK spec, one might consider whether it'd be useful to adopt a Scheme-like approach with the loop name treated as a local bound to a function with a single arglist corresponding to the loop locals of the named loop; the closure allocations this would entail could perhaps be optimized away if the local is never referenced.

It bears pointing out that if TCE support does materialize, it will enable a range of alternative designs. For example, Scheme-style named lets could then be introduced as a feature of the let macro. Thus it seems to me that it is reasonable to restrict loop/recur/recur-to to label+goto-style looping only, even in a hypothetical future with VM TCE support, and that there is no reason to afford local-like treatment to loop names; hence the current no-shadowing-interactions approach of the patch.

Cheers,
Michał

Environment

None

Activity

Show:
Michał Marczyk
September 18, 2017, 4:02 AM

Here's a first cut at a "keyword loop names" patch:

0002-CLJ-2235-use-keywords-as-loop-names.patch

This is to be applied on top of the squashed "separate special forms" patch:

0001-CLJ-2235-implement-named-loop-recur-to-separate-special-form.patch

Also attaching squashed patch for convenience:

0001-CLJ-2235-recur-to-keyword-loop-names.patch

Here's an example of the new scheme:

Note that recur-to still uses symbols where the target is an fn form.

Also note that the approach taken in this patch has the side effect that a loop named :foo won't shadow an fn-introduced recur target named foo. If we wanted eventually to introduce non-fn-introduced recur target using symbols as names (recur-to-targetable Scheme-style let/loop forms as discussed in the previous comment), that would definitely be the way to go. If not, perhaps it's still less arbitrary than declaring that :foo shadows foo in this context?

Michał Marczyk
May 31, 2020, 6:31 PM

I've just attached a handful of new files with a version of the patch rebased against the current master:

0001-CLJ-2235-implement-named-loop-recur-to.patch
0002-CLJ-2235-support-recur-to-targeting-defn-introduced-.patch
0003-CLJ-2235-enforce-the-tail-position-requirement-for-r.patch
0004-CLJ-2235-avoid-emitting-nil-loop-names.patch

Also attaching a zip with all of the above in case it's more convenient to manage the files this way:

CLJ-2235-2020-05-31-1/CLJ-2235-2020-05-31-1.zip

This version uses symbols as loop names as that's my current aesthetic preference, though of course it would be no problem to use keywords as demonstrated in the previous version of the patch.

NB. this version correctly supports recur-to with defn targets without making defn-introduced functions "named" in the sense of having (fn fn-name […] …) expansions. The source of defn in core.clj has this comment attached:

{noformat}
;;todo - restore propagation of fn name
;;must figure out how to convey primitive hints to self calls first
{noformat}

It's been there since November 2010, so I'm not sure if the current plan is still to reintroduce name propagation – clearly there'd be some trade-offs there – and thus I thought it best not to change this behaviour in this patch.

As it nevertheless seems that the top of a defn-introduced function should be a valid recur-to target, this patch has defn attach a piece of custom metadata (:loop-name) to the fn symbol to convey the top-level loop name to where it's needed without changing the current behaviour of defn w.r.t. (the lack of) fn name propagation to the emitted fn form.

Were fn name propagation to be reintroduced, this special arrangement could be simplified away.

Michał Marczyk
May 31, 2020, 6:37 PM

Looking at the list of attachments, I wonder if it'll be more convenient for the individual filenames to be prefixed with the upload date… So here's the same four files as in the previous upload, renamed for more obvious grouping:

2020-05-31-0001-CLJ-2235-implement-named-loop-recur-to.patch
2020-05-31-0002-CLJ-2235-support-recur-to-targeting-defn-introduced-.patch
2020-05-31-0003-CLJ-2235-enforce-the-tail-position-requirement-for-r.patch
2020-05-31-0004-CLJ-2235-avoid-emitting-nil-loop-names.patch

Michał Marczyk
May 31, 2020, 6:45 PM

Finally,

CLJ-2235-implement-named-loop-recur-to.patch

is a squashed vesion of the last round of patches.

Michał Marczyk
May 31, 2020, 6:53 PM
Edited

Ah, I should add that I’ve pushed the above revisions – unsquashed and squashed – to my GitHub fork of Clojure:

Unsquashed: https://github.com/michalmarczyk/clojure/tree/wip/recur-to-5

Squashed: https://github.com/michalmarczyk/clojure/tree/wip/recur-to-6

Assignee

Unassigned

Reporter

Michał Marczyk

Labels

Approval

Triaged

Patch

None

Affects versions

Priority

Major