[squeak-dev] OpenCL
Josh Gargus
schwa at fastmail.us
Sat Jan 10 20:30:13 UTC 2009
Reading the CUDA 2.0 programming guide, I saw an interesting difference
between the error-bounds for single- and double-precision floating point:
Double-precision
  Operation   Max error (ULPs)
  x+y         0 (IEEE-754 round-to-nearest-even)
  x*y         0 (IEEE-754 round-to-nearest-even)
  x/y         0 (IEEE-754 round-to-nearest-even)
  1/x         0 (IEEE-754 round-to-nearest-even)
  sqrt(x)     0 (IEEE-754 round-to-nearest-even)

Single-precision
  Operation   Max error (ULPs)
  x+y         0 (IEEE-754 round-to-nearest-even)
  x*y         0 (IEEE-754 round-to-nearest-even)
  x/y         2
  1/x         1
  sqrt(x)     3
So, there might be some hope, at least for double-precision ops.
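
To make the ULP bounds above concrete, here is a minimal sketch of how one might measure the distance in ULPs between two single-precision results. The helper names (float32_bits, bits_to_float32, ulp_distance32, bit_identical32) are made up for illustration and are not part of CUDA, OpenCL, or Croquet. It only shows that a result within a 2-ULP bound can still fail a bit-identity check.

```python
import struct

def float32_bits(x):
    """Round x to single precision and return its raw 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float32(bits):
    """Interpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def ordered(bits):
    """Map a float32 bit pattern to an integer that grows with the value."""
    if bits & 0x80000000:              # sign bit set: negative value
        return -(bits & 0x7FFFFFFF)
    return bits

def ulp_distance32(a, b):
    """Number of representable float32 steps between a and b."""
    return abs(ordered(float32_bits(a)) - ordered(float32_bits(b)))

def bit_identical32(a, b):
    """True only if both values have the same float32 bit pattern."""
    return float32_bits(a) == float32_bits(b)

# Hypothetical scenario: one device returns the correctly rounded 1/3,
# another returns a value 2 ULPs away, which a 2-ULP bound still allows.
exact = bits_to_float32(float32_bits(1.0 / 3.0))
within_bound = bits_to_float32(float32_bits(exact) + 2)

print(ulp_distance32(exact, within_bound))   # 2
print(bit_identical32(exact, within_bound))  # False
```

A 0-ULP bound (correct rounding) is the only case where two conforming devices are forced to agree bit for bit on the same inputs.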
Currently, AFAIK, the latest NVIDIA GPUs are the only ones with
double-precision FP support. But, things are changing rapidly in this
area: in about a year, Intel will release Larrabee, which will "fully
support IEEE standards for single and double precision floating-point
arithmetic". Hopefully this forces the other vendors to follow suit,
and future OpenCL revisions reflect this.
Cheers,
Josh
Josh Gargus wrote:
> Bert Freudenberg wrote:
>> On 10.01.2009, at 11:01, Josh Gargus wrote:
>>> As noted by John, Croquet uses fdlibm for bit-identical floating
>>> point math. Does anyone have a feeling for how difficult (or
>>> impossible) it will be to achieve identical computation on
>>> OpenCL-compliant devices?
>>
>> The numerical behavior of compliant OpenCL implementations is covered
>> in section 7 of the OpenCL spec. In particular, table 7.1 gives the
>> error bounds for the various operations. If I interpret that
>> correctly, very few functions are guaranteed to behave bit-identically.
>
> Oops, I was skimming by the time I read that part of the spec. I saw
> that the transcendental functions need not return identical results
> (which was why I mentioned porting fdlibm), but I missed that even x/y
> isn't precisely specified.
>
>>
>>> For example, how difficult would it be to port, say, fdlibm, so
>>> that transcendentals use the exact same code? Any other show-stoppers
>>> that might not occur to the naive mind :-) ?
>>
>> Well, OpenCL only requires single-precision; double-precision support
>> is optional, whereas fdlibm is double-precision only.
>
> I didn't know that. Maybe that's what the "d" stands for in "fdlibm".
>
>> I don't know how compliant current implementations actually are.
>
> I think that there's only one implementation right now, and only
> available to paid-up Apple developers. Anyway, I'm more interested in
> what the spec says than current conformance... implementations will
> gradually become more compliant.
>
> Thanks,
> Josh
>
>
>> - Bert -
>>
>>
>>
>
>
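
As a rough illustration of the bit-identical requirement discussed in the quoted thread, one way to compare results from two machines or devices is to hash the raw bytes of the computed doubles and compare digests. This is only a sketch under assumptions: digest_doubles is a made-up helper, and nothing here reflects how Croquet actually verifies its replicas.

```python
import hashlib
import struct

def digest_doubles(values):
    """Hash the exact IEEE-754 bit patterns of a sequence of doubles."""
    h = hashlib.sha256()
    for v in values:
        # "<d" packs the 8-byte little-endian representation, so any
        # difference in the last bit changes the digest.
        h.update(struct.pack("<d", v))
    return h.hexdigest()

# Two results that look equal when printed to a few digits but differ
# in their bit patterns produce different digests.
replica_a = [0.1 + 0.2]
replica_b = [0.3]

print(digest_doubles(replica_a) == digest_doubles(replica_a))  # True
print(digest_doubles(replica_a) == digest_doubles(replica_b))  # False
```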