Re: [Chicken-users] Which eggs to migrate from Chicken 3 first? Alex Shinn 07 May 2009 06:32 UTC

Hi,

Phil Bewig <xxxxxx@gmail.com> writes:

> Streams are not lists.  Scheme ensures there are
> substantial disadvantages to using streams rather than
> lists, due to the underlying promises that require
> numerous type conversions, so streams should be used only
> when the sequence of elements is truly infinite (such as
> mathematical series) or when there is some clear advantage
> of laziness (such as reducing the number of passes through
> a large data set).

I also find that, in general, any data structure built on
I/O works well as a stream.  For the current NLP
app I'm working on, I need to build a parse graph from a
port.  Slurping up the whole input at once could require too
much memory, and would also prevent the parser from acting as a
proper (buffered) Unix filter.  On the other hand, since the
algorithm for determining how much text I need to work with
is dynamic, I can't just read chunks at a time (the basic
unit I want to work on is a sentence, but I don't know what
a sentence is until the parse is finished).  So I build the
graph as a lazy stream of nodes, and the algorithm
transparently expands only as much input as needed.
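
A minimal sketch of the general idea (not the actual parser), assuming
Chicken 4 with the srfi-41 egg installed.  The helper name
port->char-stream is hypothetical; SRFI-41's own port->stream covers
the same character-level case.

```scheme
(use srfi-41)  ; Chicken 4 with the srfi-41 egg; Chicken 5 would use (import srfi-41)

;; Hypothetical helper: a lazy stream of the characters on PORT.
;; define-stream delays the body, so nothing is read from the port
;; until an element of the stream is actually forced.
(define-stream (port->char-stream port)
  (let ((c (read-char port)))
    (if (eof-object? c)
        stream-null
        (stream-cons c (port->char-stream port)))))

;; Usage: only a prefix of the input is consumed.
(define s (port->char-stream (open-input-string "One. Two. Three.")))
(stream->list (stream-take 4 s))  ; => (#\O #\n #\e #\.)
```

A consumer that decides dynamically how far to look ahead just keeps
taking stream-cdr; the port is read on demand and earlier elements stay
memoized.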

From what I've seen Alejandro also uses streams primarily
with I/O - it's a very natural combination.

> Writing a library that duplicates SRFI-1 for streams seems
> to me to be folly.

Well, if you have streams at all, even if they are only
suited to a special class of algorithms, it makes sense to
provide a complete library of basic operations.  Otherwise
people will continually reinvent the same utilities, and
sometimes get them wrong.

In fact, it is specifically desirable to provide an exact
match of the SRFI-1 API for easy conversions and
comparisons.  In any module system with import prefixing
(including Chicken 4), you can write everything with a
stream- prefix and swap the backend from streams to lists
with:

  (import srfi-41)                   ; streams
  (import (prefix srfi-1 stream-))   ; lists, under the same names

Going the other direction (writing for SRFI-1 but then
switching to streams) is only a little more verbose,
requiring renaming of the individual identifiers you use.
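
As a rough sketch of that direction, assuming Chicken 4 with the
srfi-41 egg: the procedure small-evens is a made-up example written
against the SRFI-1 names filter and take-while, and the import renames
the SRFI-41 equivalents to those names.

```scheme
(require-library srfi-41)  ; Chicken 4: load the egg without importing it

;; Import only what is used, renaming the stream versions to the
;; SRFI-1 names the code below was written against.
(import (rename (only srfi-41
                      stream-filter stream-take-while
                      stream-from stream->list)
                (stream-filter filter)
                (stream-take-while take-while)))

;; Hypothetical code written with SRFI-1 names; with the import above
;; it operates on streams instead of lists.
(define (small-evens xs)
  (take-while (lambda (x) (< x 10)) (filter even? xs)))

(stream->list (small-evens (stream-from 0)))  ; => (0 2 4 6 8)
```

Names that also exist in the core language (map, length, append) would
additionally need to be excluded from the base import, which is part of
why this direction is more verbose.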

> Some certainly don't belong in a general-purpose library
> -- if you need symbol->stream to convert the name of a
> symbol into a stream of characters, you can write it as
> part of your program.

Sure:

   (define (symbol->stream sym)
     (string->stream (symbol->string sym)))

That's trivial, and borderline enough that it probably
isn't needed in the library.  string->stream or equivalent
functionality should be included, though, because the most
efficient implementation of this may vary wildly depending
on the Scheme implementation.

However, the name may be unintuitive if you're not coming
from a "streams as I/O" perspective.  It may be both simpler
to specify and easier to understand if you replace most of
the foo->stream procedures with:

  (write-to-character-stream object)
  (display-to-character-stream object)

> Many -- such as stream-butlast -- make sense only for
> lists (which are materialized in their entirety) and not
> for streams (which may never be materialized).

I think it's a little unfair to pick on stream-butlast when
SRFI-41 includes stream-length, stream->list,
stream-reverse, etc.  As you yourself say, not all streams
are infinite, and for finite streams these can be useful.
Otherwise you'll repeatedly find people who, when working
entirely with streams (for type signature compatibility, and
because all of their utilities are designed for streams, not
lists), write things like

  (list->stream (butlast (stream->list stream)))

when they really do need all but the last element of a
stream they know to be finite.

[I would argue the name and API should be changed to
stream-drop-right to match SRFI-1, though.]
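
A possible definition, as a sketch only: it assumes Chicken 4 with the
srfi-41 egg, and takes its argument order from SRFI-1's
(drop-right lis k).  stream-drop-right is not part of SRFI-41 itself.

```scheme
(use srfi-41)  ; Chicken 4 with the srfi-41 egg; Chicken 5 would use (import srfi-41)

;; Hypothetical stream-drop-right: all but the last K elements of STRM.
;; LEAD runs K elements ahead of LAG; once LEAD is exhausted, whatever
;; is left of LAG is exactly the K elements to drop.
(define (stream-drop-right strm k)
  (stream-let loop ((lag strm)
                    (lead (stream-drop k strm)))  ; SRFI-41 order: count first
    (if (stream-pair? lead)
        (stream-cons (stream-car lag)
                     (loop (stream-cdr lag) (stream-cdr lead)))
        stream-null)))

(stream->list (stream-drop-right (stream 1 2 3 4) 1))  ; => (1 2 3)
```

Unlike the round trip through stream->list, this stays lazy: each
result element forces only K elements of lookahead, so the stream is
never materialized in its entirety.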

Now, if you want to argue that the SRFI-1 API is too large,
that's another story :)

--
Alex