transduce does not respect the init arity of transducers
Note: I initially raised this issue for discussion on the mailing list
transduce and other transducible processes currently ignore the 'init' arity of transducers. The currently implementation of transduce takes the 'init' from the reducing function before being transformed by the transducer, rather the reducing function after being transformed.
The current implementation of transduce is equivalent to the following (simplified for exposition purposes):
Current implementation of transduce
The arity 3 case uses (f) to construct the seed value of the reduction. The arity 4 case uses the explicitly provided seed, init.
I would like to propose an alternate implementation of transduce, one which makes use of the transducer when seeding the reduction.
Proposed implementation of transduce
Now, the arity 3 case uses (xform f) to construct the seed value of the reduction. The arity 4 case combines both f and init into a new reducing function that is given to xform. Both of these ensure that the init arity of the transducer is used.
As into is implemented in terms of transduce, it is also taken care of. However, sequence is separate, and would also have to be tweaked to respect the init arity.
Comment made by: sfnelson
I would appreciate a response to Daniel's question. Why is an init-arity required if it's never called?
Comment made by: dwhjames
Ok, as you can guess I’m a little perplexed by that design choice, but I’ll accept it.
I’d appreciate any further insight you can offer on why this design choice has been taken.
Is the init arity simply a case of compatibility, despite it not being used? Is this a case of attempting to prevent the transducer writer from erroneously corrupting a transducible process? Is init-with actually actually considered to be an invalid transducer, and thus the only way to implement something equivalent would be as a stateful transducer?
Rich asked me to decline the ticket because the init arity of the xform should not be involved in the reducing function accumulation.
Comment made by: dwhjames
Ok, at the risk of belaboring the point (I have enough self-awareness to realized that I am probably about to do exactly that…) I feel that you are still missing something here.
Permit me to try one more time to explain my position.
the map transducer
It defines all three arities, init, step, and completion. It doesn’t have anything to do in init arity, and so the only thing it can do is — “call the init arity on the nested transform rf, which will eventually call out to the transducing process.” (taken from your update to http://clojure.org/transducers)
Saying that transducers should not be involved in the accumulating process has the right spirit, but you are missing something. It is involved, but in a strictly constrained way. The transducer’s responsibility is to carefully thread the accumulator value around. Sure, it should not know what the value is, or what type it has, but it is still there. Every arity of map has access to it! In the init arity, map delegates to rf to construct it. In the completion arity, map has the result, but the only valid thing it can do with it is to pass it on to rf. Again, in the step arity, map has the result, and again the only legitimate thing it can do with it is to thread to through to rf.
Now consider the identity transducer:
the identity transducer
This is a transducer in its purest form. All it has to do is correctly thread the accumulation value around. It doesn’t and shouldn’t know any details of what that value is, nonetheless, it still has the responsibility of threading that value correctly.
In each arity the identity transducer does the ‘trivial’ thing. In my post to the mailing list, I illustrated three example of transducers that do something beyond the trivial thing in each of the three arities. (I’ll copy them here for completeness.)
non trivial threading of the accumulator in the _init_ arity
non trivial threading of the accumulator in the _completion_ arity
non trivial threading of the accumulator in the _step_ arity
I would consider all of these to be perfectly valid transducers. However, unless I’ve misunderstood, you appear to be taking issue with init-with. If so, I’m very curious as to why!
a closer look at the _init_ arity of init-with
Rather than just delegating to (rf), it threads that value immediately into rf with (rf (rf) x). So I don’t agree at all that any of these, init-with, complete-with, or dupl, are “entangled” with the accumulation value or the accumulation process. They are completely oblivious to both its value and its type!
So, returning to transduce,
the first case of an alternate transduce
A valid transducer is one that threads the accumlation value correctly. Therefore, ((xform f)) is (f) threaded through xform. All the transducers in clojure.core have the trivial ( (rf)), so ((xform f)) built from these core transducers degenerates into (identity (f)).
However, as transduce, into, and sequence never even invoke the init arity, it begs the question, why even require that transducers have that arity in the first place? Personally, I think that init arity is great as it enables a transducer such as init-with (while remaining stateless), but that requires transducible processes to actually make use of the init arity! Hence why I raised this issue.
It seems troubling to me that complete-with works perfectly fine in the current framework, yet init-with, its dual, does not.
I recognize that the various discussions around ‘typing transducers’ have made various approximations at elucidating the properties of transducers, but I feel strongly that the discussions around rank-2 polymorphism have some bearing on exactly this issue. In fact, it says rather a lot about correctly threading the accumulation value throught transducers without ever “entangling” it in the precise accumulation process of where a transducer is being used.
And on this, it appears that Rich Hickey agrees: — “The rank-2 type in particular captures an important property.” (http://conscientiousprogrammer.com/blog/2014/08/07/understanding-cloure-transducers-through-types/#comment-1533318972) Maybe I’ve got him all wrong, but as of right now I’m pretty convinced I don’t. Still, I’m willing to be convinced otherwise
By "entangling" I mean that in your alternate transduce you invoke the xform to obtain the initial value: ((xform f)) instead of (f). Transducers should not know about or be involved in the accumulating process.
The transducers page is in error and I will correct it (I wrote it; the error is mine).