On Wed, May 6, 2009 at 4:56 PM, Alex Shinn <xxxxxx@gmail.com> wrote:
Hi,

Phil Bewig <xxxxxx@gmail.com> writes:

> Streams are not lists.  Scheme ensures there are
> substantial disadvantages to using streams rather than
> lists, due to the underlying promises that require
> numerous type conversions, so streams should be used only
> when the sequence of elements is truly infinite (such as
> mathematical series) or when there is some clear advantage
> of laziness (such as reducing the number of passes through
> a large data set).

I also find that any data structure in general which is
built on I/O works well as a stream.  For the current NLP
app I'm working on, I need to build a parse graph from a
port.  Slurping up the whole input at once could require too
much memory, and also would prevent the parser acting as a
proper (buffered) Unix filter.  On the other hand, since the
algorithm for determining how much text I need to work with
is dynamic, I can't just read chunks at a time (the basic
unit I want to work on is a sentence, but I don't know what
a sentence is until the parse is finished).  So I build the
graph as a lazy stream of nodes, and the algorithm
transparently expands only as much input as needed.

>From what I've seen Alejandro also uses streams primarily
with I/O - it's a very natural combination.

You are correct that streams and parsing go well together.  One of the examples I wrote for SRFI-41 was a parsing library based on the calculator Wadler wrote for his 'list of successes' paper, but I dropped it because it was too long, and illustrated the list of successes technique with the eight-queens problem.

Parsing also works well without streams, as decades of lex/yacc users can attest.

> Writing a library that duplicates SRFI-1 for streams seems
> to me to be folly.

Well, if you have streams at all, even if they are only
suited to a special class of algorithms, it makes sense to
provide a complete library of basic operations.  Otherwise
people will continuously reinvent the same utilities, and
sometimes get them wrong.

In fact, it is specifically desirable to provide an exact
match of the SRFI-1 API for easy conversions and
comparisons.  In any module system with import prefixing
(including Chicken 4), you can write everything with a
stream- prefix and swap the backend from streams to lists
with:

 (import srfi-41)
 (import (prefix stream-) srfi-1)

Going the other direction (writing for SRFI-1 but then
switching to streams) is only a little more verbose,
requiring renaming of the individual identifiers you use.

I disagree.  Streams and lists are not substitutable.  Haskell conflates the two because everything in Haskell is lazy, so there is no difference.  But that isn't true in Scheme.

> Some certainly don't belong in a general-purpose library
> -- if you need symbol->stream to convert the name of a
> symbol into a stream of characters, you can write it as
> part of your program.

Sure:

  (define (symbol->stream sym)
    (string->stream (symbol->string sym)))

That's just trivial and probably borderline enough that it
isn't needed in the library.  string->stream or equivalent
functionality should be included, though, because the most
efficient implementation of this may vary wildly depending
on the Scheme implementation.

However, the name may be unintuitive if you're not coming
from a "streams as I/O" perspective.  It may be both simpler
to specify and easier to understand if you replace most of
the foo->stream procedures with:

 (write-to-character-stream object)
 (display-to-character-stream object)

When would it make sense to convert the name of a symbol to a stream of characters?

> Many -- such as stream-butlast -- make sense only for
> lists (which are materialized in their entirety) and not
> for streams (which may never be materialized).

I think it's a little unfair to pick on stream-butlast when
SRFI-41 includes stream-length, stream->list,
stream-reverse, etc.  As you yourself say, not all streams
are infinite, and for finite streams these can be useful.
Otherwise you'll repeatedly find people who when working
entirely with streams (for type signature compatibility, and
because all of their utilities are designed for streams, not
lists), write things like

 (list->stream (butlast (stream->list stream)))

when they really do need all but the last element of a
stream they know to be finite.

[I would argue the name and API should be changed to
stream-drop-right to match SRFI-1, though.]

Stream-length and stream-reverse are certainly candidates to be dropped from a stream library.  I'm not entirely sure how I feel about them.

You and Alejandro seem to be suggesting that you would use streams in preference to lists in a variety of programs.  Take as given that streams will be much slower than lists, at least in current Scheme implementations.  I can't imagine using streams for much more than toy programs or academic exercises.  I would always find a better way.


Now, if you want to argue that the SRFI-1 API is too large,
that's another story :)

It is.  I never use SRFI-1.  My personal standard library has take, drop, take-while, drop-while, range, iterate, filter, zip, a simple list comprehension macro, and pattern matching.


--
Alex

I continue to maintain that it makes no sense to use SRFI-1 as a checklist for implementing a streams extension library.  I hate to keep repeating myself, but streams are not lists.  The two data types are used in two entirely different sets of circumstances, with different usage patterns.

All that said, if you want to write a streams-ext library that mimics SRFI-1, feel free.  Just be sure it is based on SRFI-41, not SRFI-40.  If you are concerned that there is some conflict between the old streams-ext and the streams-derived of SRFI-41, and don't mind using the same name for two different things, you can build streams-ext on top of streams-primitive and provide exactly what you want.

Phil