[Seaside] Scamper vs. non-Squeak browser
Julian Fitzell
julian at beta4.com
Tue May 13 11:02:37 CEST 2003
I've pulled out the relevant parts of RFC 1738. The @, ;, and =
characters are reserved for specific uses in each scheme. The RFC does
not define their uses in the HTTP scheme. They should *not*, however, be
encoded if they are being used for their reserved purposes in the URL.
We'll have to see whether we think our usage is allowed by the spec, but
to say that those characters are unsafe in a URL is completely
incorrect. They should never be escaped when used as prescribed in the
scheme, and should always be escaped when used in any other fashion. Is
Scamper rewriting the URLs that it is being redirected to?
Anyway, the following should give us points for further discussion if we
want to change the URL format again...
Julian
====snip from RFC 1738=======
Unsafe:
Characters can be unsafe for a number of reasons. The space
character is unsafe because significant spaces may disappear and
insignificant spaces may be introduced when URLs are transcribed or
typeset or subjected to the treatment of word-processing programs.
The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems. The character "#" is unsafe and should
always be encoded because it is used in World Wide Web and in other
systems to delimit a URL from a fragment/anchor identifier that might
follow it. The character "%" is unsafe because it is used for
encodings of other characters. Other characters are unsafe because
gateways and other transport agents are known to sometimes modify
such characters. These characters are "{", "}", "|", "\", "^", "~",
"[", "]", and "`".
All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.
Reserved:
Many URL schemes reserve certain characters for a special meaning:
their appearance in the scheme-specific part of the URL has a
designated semantics. If the character corresponding to an octet is
reserved in a scheme, the octet must be encoded. The characters ";",
"/", "?", ":", "@", "=" and "&" are the characters which may be
reserved for special meaning within a scheme. No other characters may
be reserved within a scheme.
Usually a URL has the same interpretation when an octet is
represented by a character and when it encoded. However, this is not
true for reserved characters: encoding a character reserved for a
particular scheme may change the semantics of a URL.
Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.
On the other hand, characters that are not required to be encoded
(including alphanumerics) may be encoded within the scheme-specific
part of a URL, as long as they are not being used for a reserved
purpose.
=====end snip======
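To make the rule in the snip concrete, here is a rough sketch in Python (not Squeak, and not what String>>encodeForHTTP actually does). The function name encode_rfc1738 and its keep_reserved parameter are made up for illustration, and encoding non-ASCII text as UTF-8 octets is my assumption, since the RFC only talks about octets. It leaves alphanumerics and "$-_.+!*'()," alone, escapes everything else, and lets the caller say which reserved characters are being used for their reserved purpose:

```python
import string

# Characters RFC 1738 allows unencoded anywhere in a URL.
ALWAYS_SAFE = set(string.ascii_letters + string.digits + "$-_.+!*'(),")

# Characters a scheme may reserve for a special meaning (RFC 1738, 2.2).
RESERVED = set(";/?:@=&")


def encode_rfc1738(text, keep_reserved=""):
    """Percent-encode text, leaving only safe characters untouched.

    keep_reserved (hypothetical parameter) lists the reserved characters
    that are being used for their reserved purpose and so must stay as-is.
    Anything in keep_reserved that is not a reserved character is ignored.
    """
    allowed = ALWAYS_SAFE | (set(keep_reserved) & RESERVED)
    out = []
    for ch in text:
        if ch in allowed:
            out.append(ch)
        else:
            # Assumption: non-ASCII characters are encoded as UTF-8 octets.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


# Used as plain data, '=' and ';' get escaped:
print(encode_rfc1738("k=3;xyzzy"))                      # k%3D3%3Bxyzzy
# Used for their reserved purpose, they stay as they are:
print(encode_rfc1738("k=3;xyzzy", keep_reserved="=;"))  # k=3;xyzzy
# Unsafe characters are always escaped:
print(encode_rfc1738("a#b~c"))                          # a%23b%7Ec
```

The point of the second call is that the same string can be correct unencoded or encoded depending on whether the scheme assigns those characters a meaning, which is the question we would need to settle for the Seaside URLs.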
Jim Menard wrote:
> Avi,
>
> On Monday, May 12, 2003, at 11:00 PM, Avi Bryant wrote:
>
>>
>> On Mon, 12 May 2003, Jim Menard wrote:
>>
>>> When I try to use Seaside from within Scamper, it displays the error
>>> message
>>>
>>> error occured retrieving http://localhost:9090/seaside/rate:
>>> Could not
>>> resolve the server named:
>>
>>
>> I played around with this a bit, and the problem is definitely with
>> Scamper - or rather, with the HTTPSocket class that it uses. This is a
>> really fragile HTTP client, as I've found when trying to use it before -
>> in this case, it's not properly supporting relative paths in redirects.
>> That particular problem looks like it can be fixed by changing the '../'
>> string literal in HTTPSocket class>>expandUrl:ip:port: to '/' instead
>> (I'm
>> completely perplexed by the use of '../' in this method), but after doing
>> this I immediately ran into another problem, that seems like it might be
>> related to Scamper mistreating the '@' sign in Seaside urls. It's
>> possible that someone could also fix this one fairly easily, but in
>> general I'm not optimistic about Scamper being useful for Seaside work
>> until someone revamps the HTTP client implementation (which I understand
>> Stephen is trying to lay some groundwork for anyway).
>
>
> Thanks for the fix.
>
> The problem with '@' (or '=' or ';') in the URL is that they are not
> legal URL characters. I only see the latter two in my Seaside URLs. They
> get correctly encoded using String>>encodeForHTTP into '%3D' and '%3B'
> respectively. See Character>>isSafeForHTTP for the definition of legal
> characters.
>
> I'd like to suggest that Seaside change the way that it encodes its
> URLs. There are a few different possible approaches: run them through
> String>>encodeForHTTP, turn the 'k=3;xyzzy' part into parameters like
> '?k=3&id=xyzzy', or use a different encoding scheme that sticks to legal
> characters.
>
> I will play around with these and let the group know what I come up with.
>
> Jim
> --
> Jim Menard, jimm at io.com, http://www.io.com/~jimm/
> ---- BEGIN META GEEK CODE ----
> gc
> ---- END META GEEK CODE ----
>
> _______________________________________________
> Seaside mailing list
> Seaside at lists.squeakfoundation.org
> http://lists.squeakfoundation.org/listinfo/seaside