Parsing Numbers
Andreas Raab
andreas.raab at gmx.de
Sat Sep 17 22:32:57 UTC 2005
Tom Phoenix wrote:
> Rule 1: In writing numerals in a radix past base ten, capital letters
> must be used to represent the extra digits. Under this rule, 16r1E4 is
> the hexadecimal number 1E4 (484), but 16r1e4 is 65536.
The fact that "16r1E4 ~= 16r1e4" is deeply disturbing if you ever have
to copy hex constants from somewhere else. This is just what I did and I
was staring for an hour at that code wondering why the heck it would
compute total nonsense. The fact that this was crypto code and that
crypto code often involves lots of magic hex constants makes it even
more disturbing (just think about what havoc a wrongly spelled hex
constant might wreak on you).
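To make the surprise concrete, here is a minimal Python sketch of how a reader following Rule 1 might behave. It is not Squeak's actual parser; the function name parse_rule1 is hypothetical, and it assumes the exponent is written in decimal, is non-negative, and scales by the radix (which is what the 65536 in the quoted rule implies).

```python
def parse_rule1(literal: str) -> int:
    """Parse '<radix>r<digits>[e<exponent>]' under Rule 1 (hypothetical helper)."""
    radix_text, body = literal.split("r", 1)
    radix = int(radix_text)
    # Under Rule 1 a lowercase 'e' is never a digit, so the first one
    # always starts an exponent.
    mantissa_text, _, exponent_text = body.partition("e")
    if not mantissa_text or any(c.islower() for c in mantissa_text):
        raise ValueError(f"extra digits must be upper case: {literal!r}")
    mantissa = int(mantissa_text, radix)
    # Assumption: the exponent is a decimal, non-negative integer.
    exponent = int(exponent_text) if exponent_text else 0
    if exponent < 0:
        raise ValueError(f"negative exponent not handled in this sketch: {literal!r}")
    return mantissa * radix ** exponent


print(parse_rule1("16r1E4"))  # the digits 1, E, 4
print(parse_rule1("16r1e4"))  # the digit 1 with exponent 4
```

The two calls differ only in the case of one letter and return 484 and 65536 respectively, which is exactly the trap when pasting a lower-case hex constant.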
> Rule 2: In writing numerals in a radix past base ten, bare exponents
> are disallowed. Under this rule, 16r1E4 and 16r1e4 are equal to 484.
> (Under this rule, 16r1e+4 may still be used to denote 65536, if we
> wish to allow such a thing.)
That would be somewhat better but note that the problem here is that
"16r1e+4" can be easily interpreted as "16r1e + 4".
> Is there any alternative rule that's any better than either of these?
How about: In order for consistency to prevail, an (upper or lower case)
character that _could_ be interpreted as a digit under the current base
_will_ be interpreted as such. This would mean 16r1e4 = 16r1E4 and if
you do need an exponent you have to compute it, say "16r1e4 * 10e4".
Which I will admit doesn't look exactly great either but given that I've
yet to find code which has used bases greater than ten with exponents I
feel pretty safe that this won't be a major issue.
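The proposed rule can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not Squeak's implementation: the name parse_digits_first is hypothetical, only a lower-case "e" is treated as an exponent marker, and the exponent is assumed to be a decimal, non-negative integer that scales by the radix.

```python
import string

# Digit characters for radixes up to 36, indexed by their value.
DIGITS = string.digits + string.ascii_lowercase


def parse_digits_first(literal: str) -> int:
    """Parse '<radix>r<body>' so that any character that could be a digit is one."""
    radix_text, body = literal.split("r", 1)
    radix = int(radix_text)
    value = 0
    pos = 0
    # Consume every character (either case) that is a valid digit in this radix.
    while pos < len(body):
        digit = DIGITS.find(body[pos].lower())
        if digit < 0 or digit >= radix:
            break
        value = value * radix + digit
        pos += 1
    if pos == 0:
        raise ValueError(f"no digits in {literal!r}")
    rest = body[pos:]
    if not rest:
        return value
    # An 'e' only reaches this point when it is NOT a digit in the radix.
    if rest[0] != "e":
        raise ValueError(f"unexpected character in {literal!r}")
    exponent = int(rest[1:])  # assumption: decimal exponent
    if exponent < 0:
        raise ValueError(f"negative exponent not handled in this sketch: {literal!r}")
    return value * radix ** exponent


print(parse_digits_first("16r1e4"))  # same value as 16r1E4
print(parse_digits_first("10r1e4"))  # 'e' is not a base-ten digit, so it is an exponent
```

With this reading 16r1e4 and 16r1E4 both give 484, while 10r1e4 still gives 10000 because "e" cannot be a digit in base ten. The cost is that radixes of 15 and above have no literal exponent syntax at all.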
I guess the fundamental question here is: Is it more important to have
an easy way to write upper or lower case hex constants or to be able to
use lower case "e" (and possibly other characters) to denote
exponentiation for bases greater than ten?
My (obvious) preference is the first - I have never had the need to
write exponents for bases greater than ten and I don't expect to see that
need in the future. By contrast, I use (and copy!) hex constants
all the time and I am used to reading them in mixed upper and lower case.
> As a side issue, we should decide whether 'd' and 'q' can stand in for
> 'e'. I'm sure that allowing that would help somebody, and it's not
> likely to cause many problems, once we decide which of the above rules
> to use.
Like I was saying before, where exactly is this used? Who would be
helped by it and why would we care given that Squeak doesn't have the
distinction between single, double, and quad precision?
Cheers,
- Andreas