[Seaside] Scamper vs. non-Squeak browser

Julian Fitzell julian at beta4.com
Tue May 13 11:02:37 CEST 2003


I've pulled out the relevant parts of RFC 1738.  The @, ;, and =
characters are among those a scheme may reserve for specific uses.  The
excerpt below does not say what those uses are for the HTTP scheme.
They should *not*, however, be encoded when they are being used for
their reserved purposes in the URL.  We'll have to see whether we think
our usage is allowed by the spec, but to say that those characters are
unsafe in a URL is completely incorrect.  They should never be escaped
when used as prescribed in the scheme, and should always be escaped when
used in any other fashion.  Is Scamper rewriting the URLs that it is
being redirected to?
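
To make the distinction concrete, here is a rough sketch in Python
(not Squeak, and not what Seaside or Scamper actually do).  It uses the
standard library's urllib.parse.quote; the path is made up around the
'k=3;xyzzy' example from this thread, and treating ';', '=' and '@' as
"used for their reserved purpose" is an assumption for illustration.

```python
from urllib.parse import quote

# Hypothetical Seaside-style path built around the 'k=3;xyzzy' example.
path = "/seaside/rate/k=3;xyzzy"

# Blanket escaping: only "/" is left alone, so ";" and "=" are rewritten
# and no longer act as delimiters.
blanket = quote(path, safe="/")

# Scheme-aware escaping: ";", "=" and "@" are assumed to be serving their
# reserved purpose here, so they are left unencoded.
scheme_aware = quote(path, safe="/;=@")

# The same characters appearing as plain data inside a value are escaped.
value = quote("a=b;c", safe="")

print(blanket)       # /seaside/rate/k%3D3%3Bxyzzy
print(scheme_aware)  # /seaside/rate/k=3;xyzzy
print(value)         # a%3Db%3Bc
```

If a client applies the first form to a URL it was redirected to, the
server sees a different URL than the one it handed out, which would be
consistent with the symptom described, though I have not confirmed that
this is what Scamper does.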

Anyway, the following should give us points for further discussion if we
want to change the URL format again...

Julian

====snip from RFC 1738=======
    Unsafe:

    Characters can be unsafe for a number of reasons.  The space
    character is unsafe because significant spaces may disappear and
    insignificant spaces may be introduced when URLs are transcribed or
    typeset or subjected to the treatment of word-processing programs.
    The characters "<" and ">" are unsafe because they are used as the
    delimiters around URLs in free text; the quote mark (""") is used to
    delimit URLs in some systems.  The character "#" is unsafe and should
    always be encoded because it is used in World Wide Web and in other
    systems to delimit a URL from a fragment/anchor identifier that might
    follow it.  The character "%" is unsafe because it is used for
    encodings of other characters.  Other characters are unsafe because
    gateways and other transport agents are known to sometimes modify
    such characters. These characters are "{", "}", "|", "\", "^", "~",
    "[", "]", and "`".

    All unsafe characters must always be encoded within a URL. For
    example, the character "#" must be encoded within URLs even in
    systems that do not normally deal with fragment or anchor
    identifiers, so that if the URL is copied into another system that
    does use them, it will not be necessary to change the URL encoding.

    Reserved:

    Many URL schemes reserve certain characters for a special meaning:
    their appearance in the scheme-specific part of the URL has a
    designated semantics. If the character corresponding to an octet is
    reserved in a scheme, the octet must be encoded.  The characters ";",
    "/", "?", ":", "@", "=" and "&" are the characters which may be
    reserved for special meaning within a scheme. No other characters may
    be reserved within a scheme.

    Usually a URL has the same interpretation when an octet is
    represented by a character and when it encoded. However, this is not
    true for reserved characters: encoding a character reserved for a
    particular scheme may change the semantics of a URL.

    Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
    reserved characters used for their reserved purposes may be used
    unencoded within a URL.

    On the other hand, characters that are not required to be encoded
    (including alphanumerics) may be encoded within the scheme-specific
    part of a URL, as long as they are not being used for a reserved
    purpose.
=====end snip======
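
As a rough illustration of the three categories in the excerpt, the
following Python sketch classifies characters and encodes a string
accordingly.  The names classify, encode and reserved_in_use are
invented for this example, and encoding non-ASCII characters as UTF-8
octets is my assumption, not something RFC 1738 specifies.

```python
import string

# Character sets taken from the RFC 1738 excerpt above.
UNSAFE = set(' <>"#%{}|\\^~[]`')
RESERVED = set(";/?:@=&")
UNRESERVED = set(string.ascii_letters + string.digits + "$-_.+!*'(),")


def classify(ch):
    """Return the excerpt's category for a single character."""
    if ch in UNRESERVED:
        return "unreserved"
    if ch in RESERVED:
        return "reserved"
    if ch in UNSAFE:
        return "unsafe"
    return "other"  # control and non-ASCII characters, also always encoded


def encode(text, reserved_in_use=""):
    """Percent-encode text, leaving alone unreserved characters and any
    reserved characters the caller says are serving their reserved purpose."""
    keep = UNRESERVED | (RESERVED & set(reserved_in_use))
    out = []
    for ch in text:
        if ch in keep:
            out.append(ch)
        else:
            # Assumption: UTF-8 octets for anything outside ASCII.
            out.append("".join("%%%02X" % b for b in ch.encode("utf-8")))
    return "".join(out)


print(classify("@"), classify("#"), classify("a"))
print(encode("k=3;xyzzy", reserved_in_use=";="))
print(encode("k=3;xyzzy"))
```

The reserved_in_use argument is the whole point: the same ';' or '=' is
either kept or escaped depending on whether the scheme is using it as a
delimiter, which is a decision only the code building the URL can make.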

Jim Menard wrote:

> Avi,
> 
> On Monday, May 12, 2003, at 11:00  PM, Avi Bryant wrote:
> 
>>
>> On Mon, 12 May 2003, Jim Menard wrote:
>>
>>> When I try to use Seaside from within Scamper, it displays the error
>>> message
>>>
>>>     error occured retrieving http://localhost:9090/seaside/rate:
>>> Could not resolve the server named:
>>
>>
>> I played around with this a bit, and the problem is definitely with
>> Scamper - or rather, with the HTTPSocket class that it uses.  This is a
>> really fragile HTTP client, as I've found when trying to use it before -
>> in this case, it's not properly supporting relative paths in redirects.
>> That particular problem looks like it can be fixed by changing the '../'
>> string literal in HTTPSocket class>>expandUrl:ip:port: to '/' instead
>> (I'm completely perplexed by the use of '../' in this method), but after doing
>> this I immediately ran into another problem, that seems like it might be
>> related to Scamper mistreating the '@' sign in Seaside urls.  It's
>> possible that someone could also fix this one fairly easily, but in
>> general I'm not optimistic about Scamper being useful for Seaside work
>> until someone revamps the HTTP client implementation (which I understand
>> Stephen is trying to lay some groundwork for anyway).
> 
> 
> Thanks for the fix.
> 
> The problem with '@' (or '=' or ';') in the URL is that they are not 
> legal URL characters. I only see the latter two in my Seaside URLs. They 
> get correctly encoded using String>>encodeForHTTP into '%3D' and '%3B' 
> respectively. See Character>>isSafeForHTTP for the definition of legal 
> characters.
> 
> I'd like to suggest that Seaside change the way that it encodes its 
> URLs. There are a few different possible approaches: run them through 
> String>>encodeForHTTP, turn the 'k=3;xyzzy' part into parameters like 
> '?k=3&id=xyzzy', or use a different encoding scheme that sticks to legal 
> characters.
> 
> I will play around with these and let the group know what I come up with.
> 
> Jim
> -- 
> Jim Menard, jimm at io.com, http://www.io.com/~jimm/
> ---- BEGIN META GEEK CODE ----
> gc
> ---- END META GEEK CODE ----
> 



