Hello gods of FFI and of the VM,
I have been blocked for a while in the development of Smallapack (a Smalltalk interface to LAPACK) on Squeak.
I have isolated a very strange behaviour in a small test case (see below):
I want to call DLANGE, a LAPACK FORTRAN routine that computes the norm of a matrix.
And on the seventh call, I always get a strange result (the result is 0.0 but it answers false to = 0.0).
This fails on Unix with image 3.9alpha7029. The VM is:
squeak -version
3.7-7 #1 Sat Mar 19 13:12:20 PST 2005 gcc 3.3.5
Squeak3.7 of '4 September 2004' [latest update: #5989]
Linux squeak.hpl.hp.com 2.4.27-1-386 #1 Fri Sep 3 06:24:46 UTC 2004 i686 GNU/Linux
default plugin location: /usr/local/lib/squeak/3.7-7/*.so
The same code seems to work on Windows (I tried it today)...
Beware, I presume this can corrupt your image. Maybe I am doing something wrong; could someone please explain it to me?
Nicolas
----------------------------------------------------------------------------------------------------
My definition of dlange2.c (translated with f2c then modified to simply answer 0.0) is:
/* #include "f2c.h" */
typedef double doublereal;
typedef long integer;
typedef long ftnlen;

/*< DOUBLE PRECISION FUNCTION DLANGE( NORM, M, N, A, LDA, WORK ) >*/
doublereal dlange2_(char *norm, integer *m, integer *n, doublereal *a,
                    integer *lda, doublereal *work, ftnlen norm_len)
{
    return 0.0;
} /* dlange2_ */
you just compile with:
gcc -c dlange2.c; ld -shared -o libdlange2.so dlange2.o
and call from Squeak with:
ExternalLibrary subclass: #DLANGE2Library
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Smallapack-Test-DLANGE'

DLANGE2Library class>>moduleName
    ^'dlange2'

DLANGE2Library>>dlange2Withnorm: norm m: m n: n a: a lda: lda work: work length: lengthOfnorm
    <cdecl: double 'dlange2_'( char * long * long * double * long * double * long )>
    ^self externalCallFailed

DLANGE2Library class>>testDLANGE2
    "DLANGE2Library testDLANGE2"
    | a m n lda norm cm cn clda work |
    m := lda := 3.
    n := 4.
    a := ExternalData fromHandle: (ByteArray new: m*n*8) type: ExternalType double.
    "AS FORTRAN IS PASSING POINTERS, DO ALLOCATE ExternalData"
    cm := ExternalData fromHandle: ((ByteArray new: 4) signedLongAt: 1 put: m; yourself) type: ExternalType long.
    cn := ExternalData fromHandle: ((ByteArray new: 4) signedLongAt: 1 put: n; yourself) type: ExternalType long.
    clda := ExternalData fromHandle: ((ByteArray new: 4) signedLongAt: 1 put: lda; yourself) type: ExternalType long.
    norm := 'M'.
    work := nil.
    ^(1 to: 10) collect: [:i |
        (self new dlange2Withnorm: norm m: cm n: cn a: a lda: clda work: work length: 1) = 0.0]

"you always obtain false from the seventh entry on..."
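One way to tell whether the stub or the FFI call path is at fault is to call the same routine from plain C and look at what comes back. The following is only a sketch under assumptions: it repeats the stub from the post inline, assumes an 8-byte IEEE 754 double, and the helpers double_bits and count_exact_zero_results are hypothetical names that do not appear in the original code.

```c
#include <stdint.h>
#include <string.h>

typedef double doublereal;
typedef long integer;
typedef long ftnlen;

/* Same stub as in the post: ignores its arguments and returns 0.0. */
doublereal dlange2_(char *norm, integer *m, integer *n, doublereal *a,
                    integer *lda, doublereal *work, ftnlen norm_len)
{
    (void)norm; (void)m; (void)n; (void)a;
    (void)lda; (void)work; (void)norm_len;
    return 0.0;
}

/* Hypothetical helper: raw bit pattern of a double (assumes 8-byte doubles).
   Useful for telling +0.0, -0.0, NaN and tiny values apart. */
uint64_t double_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Hypothetical helper: call the stub `calls` times with the 3x4 setup used
   in the Smalltalk test and count how many results compare equal to 0.0. */
int count_exact_zero_results(int calls)
{
    doublereal a[12] = {0};
    integer m = 3, n = 4, lda = 3;
    char norm[] = "M";
    int hits = 0;

    for (int i = 0; i < calls; i++) {
        doublereal r = dlange2_(norm, &m, &n, a, &lda, NULL, 1);
        if (r == 0.0)
            hits++;
    }
    return hits;
}
```

If count_exact_zero_results(10) answers 10 while the Squeak test still answers false from the seventh call on, the stub itself is unlikely to be the culprit, and the call path between the image and the library deserves a closer look.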
Nobody alive on the thread? The bug is now at http://bugs.impara.de/view.php?id=3929
Nicolas
Hi Nicolas,
I just tested this:
^(1 to: 10) collect: [:i |
    r := (self new dlange2Withnorm: norm m: cm n: cn a: a lda: clda work: work length: 1) = 0.0.
    Transcript show: r asString.
    r ].
And it works... but if I remove the "Transcript show" I obtain: #(true true false false false false false false false false)
vincent.
2006/6/28, nicolas cellier ncellier@ifrance.com:
> Nobody alive on the thread? The bug is now at http://bugs.impara.de/view.php?id=3929
> Nicolas
Hi Nicolas,
> And on the seventh call, I always get a strange result (result is 0.0 but will answer false to = 0.0).
I don't know a whole lot about FFI, but I know that generally accepted practice (at least in C which is the language I'm most familiar with) is that you NEVER test for equality when using floating point numbers.
You always use some small epsilon and test that the number you've got is within epsilon of the comparison number.
This is because due to roundoff and other such effects, you can wind up with really tiny numbers, like 1 x 10^-53 which is essentially zero, but is not equal to zero.
This might not even be relevant to your discussion, but seeing the equality test on 0.0 raised a red flag for me.
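The epsilon test described above could be sketched in C roughly as follows. The helper name nearly_equal and the tolerance passed to it are assumptions made for illustration; a suitable epsilon depends on the magnitude of the values being compared.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true when a and b differ by at most eps.
   A NaN argument makes the comparison false, so NaN is never
   considered "nearly equal" to anything. */
bool nearly_equal(double a, double b, double eps)
{
    return fabs(a - b) <= eps;
}
```

With this helper, nearly_equal(x, 0.0, 1e-12) accepts a value such as 1e-53 that an exact x == 0.0 test rejects. An absolute tolerance like this is only appropriate when the expected values are near zero; for large magnitudes a relative tolerance is usually a better fit.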
Hi all,
I suspect a "synchronisation" problem. I remember a similar problem with C++ (in another life): the value is read before it has been written. (With a Transcript show, or a debug screen, execution is slower.)
2006/6/29, Dave Hylands dhylands@gmail.com:
-- Dave Hylands Vancouver, BC, Canada http://www.DaveHylands.com/
squeak-dev@lists.squeakfoundation.org