Deleting dupes without sorting just requires a dictionary.  If you've seen this before, don't push it through.  It is stateful, but at least it is O(n).

On Tue, Oct 8, 2019 at 3:59 PM Linus Björnstam <xxxxxx@veryfast.biz> wrote:
Resending this, because I had some problems with my email aliases, so the things I wrote never made it back to the mailing lists:



On Tue, 10 Sep 2019, at 14:14, Duy Nguyen wrote:
> OK some more. This time I really do read it :)
>
> add1 is used in examples but not really defined. Common sense probably
> works. But for specifications like these I think it's better to avoid
> that. In some other examples like 'rany', "add1" is basically spelled
> out as a lambda instead, which works too.

Thanks. I always thought it was standard since I have it in chez. You learn something every day :D


>
> For rany part, the second example returns "5" on Guile, Chibi and
> Gauche, not #t. Perhaps it's the problem with 'rany' implementation?
> Or perhaps this function could be described in more detail if "5" is
> expected.

This is as it should. I must update the tests. I want them to work like any and every in srfi-1.


>
> Maybe I suggest rename delete-neighbor-dupes with tunique? It seems to
> do the same thing UNIX "uniq" (or C++ std::unique) does. The name is a
> bit shorter but does not really lose any meaning.
>
> Which makes me think if we should have tsort, since the equivalent of
> tdelete-duplicates in UNIX/C++ is basically "sort | uniq". Then we
> don't need tdelete-duplicates because you can easily compose. I guess
> not because most sort algorithm needs to read the whole (or a big
> chunk) of the input to do its job, except merge sort...

Yeah. This would be stateful to the extreme and would sort of violate "don't build intermediate collections". It is a nice idea though :D
>
> The stateful transducers do make me wonder if we could avoid mutation
> by making the transducer return the result _and_ a (optionally new)
> transducer though. That way the state could be carried along with the
> new transducer, kinda like fold. The overhead is likely higher than
> stateful implementation.

I tried the following for pure/visible state transducers: I tried having each step in the reduction cons it's state in a list and then having a non-tco tail-call to the reducer step for the reducers that wanted to keep state. That way I could experiment with both mutable but visible state and pure transducers. The implementation was OKish, but the GC preassure became very high in the pure case (a new N-element list for each reducing step) or boxing mutable variable overhead (either a lot when done automatically as in chez or guile, or just a little bit too much when using boxes manually).

The Atria transducers for C++ are pure, but that is due to some C++ voodoo (and clever use of move semantics) that I wont be able to implement portably (if at all) in scheme.

All in all, the benefits of purity are overshadowed by the speed costs. I have refrained from mentioning _how_ the transducers keep state, which means you can implement pure transducers and still follow the SRFI (but also introduce portability issues).

>
> That's it! Nice SRFI.

Thanks for your inputs and corrections! I will sit down this week and put everything in.

> --
> Duy
>

--
  Linus Björnstam