

Re: URI/URL handling Peter Bex 07 Apr 2019 16:03 UTC
On Sun, Apr 07, 2019 at 06:52:17PM +0300, Lassi Kortela wrote:
> > > And the separators ";&" would be given separately to each call of the query
> > > parameter list getter. I.e.:
> >
> > I don't like this; it would require parsing and re-parsing every time
> > you access the query alist.
>
> The result from the last call could be cached inside the URI object. It
> wouldn't be immutable by a strict definition but I thought this wouldn't
> matter since the public interface looks like it is.

But then you'd need to compare the separator string with the one that was
passed in last time whenever you want to convert.

> But your comment got me thinking, is it good to permit arbitrary ways to
> parse the same URI object after all? Should the API be such that once you
> parse a given URI object one way, then it is "locked" into that way and you
> can no longer parse it another way. This would enforce a consistent
> interpretation of each URI while still permitting URIs originating from
> different places to have different interpretations.

I think locking it in is fine.  Perhaps you could still update the
separator which would cause the raw query to be re-parsed in the new
object.

> > Maybe we can make it another parameter for
> > the constructor so it's stored inside the uri object?  That way, we can
> > split it when parsing and combine it when constructing the underlying
> > raw string.
>
> Do you mean an approach where the above "URI interpretation" is supplied in
> the constructor and cannot be changed afterwards? That could be simplest.

Yeah.
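
To make the constructor-parameter idea concrete, here is a minimal sketch
in Python. It is not the API discussed in this thread: the names `Uri`,
`raw_query`, `separators`, `with_separators` and `_split_query` are all
hypothetical, and the query handling is deliberately simplified. The
separators are fixed when the object is built, the raw query is split
exactly once, and "updating" the separators yields a new object that
re-parses the same raw query.

```python
from dataclasses import dataclass, field
from urllib.parse import unquote


def _split_query(raw_query, separators):
    """Split a raw query string on any of the given separator characters."""
    if not raw_query:
        return ()
    parts = [raw_query]
    for sep in separators:
        # Split every piece found so far on the next separator character.
        parts = [piece for part in parts for piece in part.split(sep)]
    pairs = []
    for part in parts:
        if not part:
            continue  # skip empty pieces such as the one in "a=1&&b=2"
        name, _, value = part.partition("=")
        pairs.append((unquote(name), unquote(value)))
    return tuple(pairs)


@dataclass(frozen=True)
class Uri:
    """Hypothetical URI object holding only the query part, for illustration."""
    raw_query: str
    separators: str = ";&"
    query: tuple = field(init=False)

    def __post_init__(self):
        # Parse once, at construction time; the interpretation is now locked in.
        parsed = _split_query(self.raw_query, self.separators)
        object.__setattr__(self, "query", parsed)

    def with_separators(self, separators):
        # "Updating" the separators builds a new object from the same raw query.
        return Uri(self.raw_query, separators)


uri = Uri("a=1;b=2&c=3")
ampersand_only = uri.with_separators("&")
print(uri.query)
print(ampersand_only.query)
```

With this shape there is nothing to cache or compare on access: the
alist is computed once, and a different interpretation is simply a
different object.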

> > Slash is the one thing you can unambiguously encode (because you're
> > already separating components on slashes, slashes _must_ be encoded as
> > %2F in path components), but giving an error seems rather unfriendly.
>
> It may be formally unambiguous, but it is very confusing to humans who are
> not experts at URI syntax :) I don't have a strong opinion on whether or not
> it should be permitted.

It may be confusing, but it's terribly wrong to *not* encode slashes,
because writing and then re-parsing the URI would change its meaning,
which could cause all sorts of security issues.
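
The round-trip problem can be shown with a small sketch using Python's
standard `urllib.parse` module. The helpers `build_path` and
`parse_path` are hypothetical names made up for this illustration, and
the path handling is simplified (absolute paths only).

```python
from urllib.parse import quote, unquote


def build_path(segments, encode_slash=True):
    """Join path segments into a raw path, percent-encoding each segment."""
    # With safe="" a literal "/" inside a segment is written as %2F.
    safe = "" if encode_slash else "/"
    return "/" + "/".join(quote(segment, safe=safe) for segment in segments)


def parse_path(raw_path):
    """Split a raw absolute path on "/" first, then decode each segment."""
    return [unquote(segment) for segment in raw_path.split("/")[1:]]


segments = ["files", "a/b.txt"]  # the second segment contains a literal slash

encoded = build_path(segments)
unencoded = build_path(segments, encode_slash=False)

print(encoded, parse_path(encoded))
print(unencoded, parse_path(unencoded))
```

With the slash encoded, parsing the written path gives back the original
two segments. Without encoding, the same input comes back as three
segments, so the URI now refers to something else.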

Cheers,
Peter