Re: URI/URL handling - Simplelists

Re: URI/URL handling Lassi Kortela 07 Apr 2019 15:52 UTC

>> And the separators ";&" would be given separately to each call of the query
>> parameter list getter. I.e.:
>
> I don't like this; it would require parsing and re-parsing every time
> you access the query alist.

The result from the last call could be cached inside the URI object. It
wouldn't be immutable by a strict definition, but I thought this wouldn't
matter since the public interface looks like it is.
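
As a rough sketch of the caching idea (in Python, for illustration only):
the class name `CachingUri`, the method `query_alist` and the
`parse_count` counter are all hypothetical and do not come from any
existing library. It assumes a non-empty separator string and only
remembers the most recent parse.

```python
import re
from urllib.parse import unquote


class CachingUri:
    """Hypothetical URI object that caches its last query parse."""

    def __init__(self, raw_query):
        self._raw_query = raw_query
        self._cache_key = None    # separators used by the last call
        self._cache_value = None  # pairs produced by the last call
        self.parse_count = 0      # only here to make the caching observable

    def query_alist(self, separators=";&"):
        # Same separators as last time: reuse the cached result.
        if self._cache_key == separators:
            return self._cache_value
        self.parse_count += 1
        pattern = "[" + re.escape(separators) + "]"
        pairs = []
        for part in re.split(pattern, self._raw_query):
            if not part:
                continue  # skip empty pieces such as a trailing separator
            name, _, value = part.partition("=")
            pairs.append((unquote(name), unquote(value)))
        # A tuple keeps callers from mutating the cached value.
        self._cache_key = separators
        self._cache_value = tuple(pairs)
        return self._cache_value
```

From the outside the object still behaves as if it were immutable; only
the hidden cache fields change between calls.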

But your comment got me thinking: is it good to permit arbitrary ways to
parse the same URI object after all? Should the API be such that once
you parse a given URI object one way, it is "locked" into that way
and you can no longer parse it another way? This would enforce a
consistent interpretation of each URI while still permitting URIs
originating from different places to have different interpretations.
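
To make the "locking" idea concrete, here is a minimal Python sketch.
`LockingUri` and `query_alist` are made-up names, and raising an error
on a conflicting parse is just one possible behaviour, assumed here for
illustration.

```python
from urllib.parse import unquote


class LockingUri:
    """Hypothetical URI object locked into its first query interpretation."""

    def __init__(self, raw_query):
        self._raw_query = raw_query
        self._separators = None  # unset until the first parse

    def query_alist(self, separators):
        if self._separators is None:
            # The first parse decides the interpretation.
            self._separators = frozenset(separators)
        elif frozenset(separators) != self._separators:
            raise ValueError("URI is already locked into another interpretation")
        # Split the raw query on any of the locked-in separator characters.
        parts, current = [], ""
        for ch in self._raw_query:
            if ch in self._separators:
                parts.append(current)
                current = ""
            else:
                current += ch
        parts.append(current)
        return [
            (unquote(name), unquote(value))
            for name, _, value in (p.partition("=") for p in parts if p)
        ]
```

Two `LockingUri` objects built from the same string can still be locked
into different interpretations, which matches the "different places,
different interpretations" requirement.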

> Maybe we can make it another parameter for
> the constructor so it's stored inside the uri object?  That way, we can
> split it when parsing and combine it when constructing the underlying
> raw string.

Do you mean an approach where the above "URI interpretation" is supplied
in the constructor and cannot be changed afterwards? That could be
simplest. Another approach is to also let people specify it in update
procedures (but it would be an error to specify a different
interpretation if the URI has already been locked into another). Your
constructor-only approach would be simpler so maybe it's better.
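
A sketch of the constructor-only variant, again in Python and purely
illustrative: `FixedUri`, `parse` and `raw_string` are hypothetical
names, fragments and the other URI components are ignored, and using
the first separator when writing the string back out is an assumption.

```python
import re
from dataclasses import dataclass
from urllib.parse import quote, unquote


@dataclass(frozen=True)
class FixedUri:
    """Hypothetical immutable URI whose separators are fixed at construction."""

    path: str
    query: tuple            # tuple of decoded (name, value) pairs
    separators: str = ";&"  # the first one is used when writing

    @classmethod
    def parse(cls, raw, separators=";&"):
        # Split once, at construction time, using the given separators.
        path, _, raw_query = raw.partition("?")
        parts = re.split("[" + re.escape(separators) + "]", raw_query)
        pairs = tuple(
            (unquote(name), unquote(value))
            for name, _, value in (p.partition("=") for p in parts if p)
        )
        return cls(path, pairs, separators)

    def raw_string(self):
        # Combine with the stored separators; anything that is not an
        # unreserved character gets percent-encoded.
        if not self.query:
            return self.path
        joined = self.separators[0].join(
            quote(name, safe="") + "=" + quote(value, safe="")
            for name, value in self.query
        )
        return self.path + "?" + joined
```

Because the dataclass is frozen, the interpretation cannot change after
construction, so there is nothing to re-parse and no lock to check.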

> Slash is the one thing you can unambiguously encode (because you're
> already separating components on slashes, slashes _must_ be encoded as
> %2F in path components), but giving an error seems rather unfriendly.

It may be formally unambiguous, but it is very confusing to humans who
are not experts at URI syntax :) I don't have a strong opinion on
whether or not it should be permitted.

> The way we handle it in uri-common is to simply encode as much as
> possible.  When that's not desired, you must update the raw path,
> instead.

That seems like a good general principle.
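
For reference, a small Python sketch of what "encode as much as
possible" could look like for path components, including the %2F case
discussed above. This is not uri-common's actual API; `encode_path` and
`decode_path` are hypothetical helpers built on Python's
`urllib.parse`.

```python
from urllib.parse import quote, unquote


def encode_path(segments):
    # safe="" percent-encodes everything except unreserved characters,
    # so a "/" inside a segment becomes %2F and cannot be mistaken for
    # a segment boundary.
    return "/".join(quote(segment, safe="") for segment in segments)


def decode_path(raw_path):
    # Split on literal slashes first, then decode each segment, so an
    # encoded %2F stays inside its segment.
    return [unquote(segment) for segment in raw_path.split("/")]
```

With these helpers, `encode_path(["", "files", "a/b.txt"])` gives
`/files/a%2Fb.txt`, and `decode_path` turns that back into the same
three segments. A caller who wants less aggressive encoding would have
to set the raw path directly.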