[Q] Isn't 'file://foo/bar' asUrl supposed to give a relative FileUrl?

Lex Spoon lex at cc.gatech.edu
Tue Aug 19 03:55:25 UTC 2003


Michael Rueger <michael at squeakland.org> wrote:
> Lex Spoon wrote:
> 
> > Michael Rueger <michael at squeakland.org> wrote:
> 
> 
> > But, I also think this kind of thing should be handled in the parser,
> > not the protocol handler.  I don't even understand how the protocol
> > handler *could* handle it.  If you have a parsed URL, then you have a
> > scheme already and so it's too late to make a decision about what to do
> > with 'www.google.com' or 'foo.txt'.  If there is going to be smart
> > parsing, then it seems like the parser needs to do it.
> 
> That is exactly the point. The parser doesn't need to do anything with 
> www.google.com. According to the RFC it is a perfectly valid *relative* 
> URI. If you want e.g. Scamper to do something with it, then Scamper 
> could complement the http://, as Scamper can relatively safely assume 
> that you mean a http url.

I think we agree in essence.  There are two kinds of parsers in
principle.  The thing is, I only implemented one.  :)  It would be nice
if Scamper could ask for a smart parse and, oh, the SqueakMap client
could use strict parses.  If you only have one parser, though, it seems
reasonable to choose the tolerant version.

Please note that I took care that the smartness does *not* get in the way
of anything that otherwise has a correct meaning.  That includes your
'www.google.com' example, if I understand what you are after.  For
example:

	'www.google.com' asUrlRelativeTo: 'http://www.w3c.org/' asUrl
		==>  'http://www.w3c.org/www.google.com'

Note that Scamper uses asUrl, not asUrlRelativeTo:, and thus it will get
the http:// prepended.  But as you say, this behavior is fairly safe.
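
To make the smart/strict distinction concrete, here is a minimal workspace
sketch.  It is not the code in the Url classes; the block names (hasScheme,
smartParse, strictParse) are made up for illustration, and it works on plain
strings rather than Url objects.  It also treats any colon as a scheme
separator, which a real parser would have to check more carefully.

```smalltalk
"Hypothetical sketch: the block names are illustrative, not Squeak API."
| hasScheme smartParse strictParse |

"Crude test: any colon counts as a scheme separator."
hasScheme := [:text | (text indexOf: $:) > 0].

"Tolerant parse: guess http:// when no scheme is given."
smartParse := [:text |
	(hasScheme value: text)
		ifTrue: [text]
		ifFalse: ['http://', text]].

"Strict parse: leave the string alone, so it stays a relative reference."
strictParse := [:text | text].

smartParse value: 'www.google.com'.    "'http://www.google.com'"
strictParse value: 'www.google.com'.   "'www.google.com'"
smartParse value: 'file:/data/test'.   "unchanged, it already has a scheme"
```

A client like Scamper would pick the first block, and a client like the
SqueakMap one would pick the second.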


Let's look at this example, which is more challenging:

> >>none of the discussed problems are an issue.
> >>E.g., 'g' asUrl -> http://g/  is one of these smart things that are just 
> >>plain incorrect.
> 
> > What do you mean by incorrect?  This behavior is correct according to
> > the letter of the RFC's, because the input is invalid to begin with. 
> 
> See above. The input is perfectly valid for a relative URI. This above 
> behavior is one of the "smart" actions that actually get in the way. If 
> I want to specify a relative URI that I at some later point want to 
> resolve against a file, ftp, or http base address, the above doesn't let 
> me do this. I basically have to concatenate the Strings myself, as
> 
> 'g/' asUrl asUrlRelativeTo: 'file:/data/test' asUrl
> yields
> 
> http://g/
> 
> as a result. It should be
> 
> 'file:/data/test/g/'
> 

To get your desired result, simply don't send the first asUrl:

	 'g/' asUrlRelativeTo: 'file:/data/test' asUrl


When you send asUrl to 'g/', you get an absolute URL back, and an
absolute URL resolved relative to anything else is just itself again.  

Thus Squeak's behavior seems correct.  Theoretically it could be clearer
by making asUrl be called asAbsoluteUrl or asUrlRelativeToNothing, but
asUrl seems pretty clear once you get the idea that all URL objects are
in fact absolute.


Possibly, you could get into having some sort of RelativeURL object, and
add a method like asRelativeUrl with no argument.  This idea doesn't
seem to work out very well, however.  In particular, you don't even know
the *syntax* of a relative URL until you know what it is relative to;
thus, you can't do much of anything with it.  I guess you could have
RelativeHTTPUrl, where you assume it will be used relative to an HTTP
URL, but this sounds like a highly special-purpose usage.  Most usages
can simply pass around the string of a relative URL.


>  Please take a look at the 
> URI package and then think of cases where you want to register new 
> schemes and protocol handlers, a thing that the current url 
> implementation just doesn't support because it makes assumptions about 
> the meaning of the URI *strings*.
> 

I don't understand.  While there is currently no scheme registry in
Squeak's URL hierarchy, there is nothing stopping one from being
written.  It could simply be a dictionary mapping things like 'file' to
things like FileUrl, plus methods to add and remove entries from the
dictionary.
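
Roughly, and only as a sketch of the idea: the names registry and schemeOf
below are hypothetical, nothing like this exists in the image today, and I am
assuming HttpUrl as the class name for the http entry.

```smalltalk
"Hypothetical sketch of a scheme registry; not existing Squeak API."
| registry schemeOf |
registry := Dictionary new.

"Add entries: scheme name -> class that handles it."
registry at: 'file' put: FileUrl.
registry at: 'http' put: HttpUrl.    "class name assumed"

"Remove an entry; do nothing if the scheme was never registered."
registry removeKey: 'http' ifAbsent: [nil].

"Crude scheme extraction: everything before the first colon, lowercased."
schemeOf := [:text | (text copyUpTo: $:) asLowercase].

"Look up the class for a URL string; answers nil for an unregistered scheme."
registry at: (schemeOf value: 'file:/data/test') ifAbsent: [nil].
```

In a real version the dictionary would presumably live in a class variable
with class-side methods around it, and the parser would consult it after
splitting off the scheme.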

Such a registry might or might not be useful; it depends on whether
there would be packages other than the URL package which define new
kinds of URLs.  Would there be?  I know of none so far, and generally
defining new URL schemes is frowned upon.  But it's a small thing; a
registry can be added very easily if that's desired.


Lex


