Hi All, anyone know the x86/IA32 really well? If so, read on. Otherwise save yourself the yawn.
I just tried to save an instruction in Cog's generated bitShift: primitive. It seems to me that SARL (shift arithmetic right long) should set the sign flag based on the result; in fact, it says as much in the manual. I quote from IA-32 Intel(R) Architecture Software Developer's Manual, Volume 2B: Instruction Set Reference, N-Z, p. 4-192:
Flags Affected
The CF flag contains the value of the last bit shifted out of the destination operand; it is undefined for SHL and SHR instructions where the count is greater than or equal to the size (in bits) of the destination operand. The OF flag is affected only for 1-bit shifts (see "Description" above); otherwise, it is undefined. The SF, ZF, and PF flags are set according to the result. If the count is 0, the flags are not affected. For a non-zero count, the AF flag is undefined.
(my emphasis added). But neither the Bochs simulator nor my Intel Core Duo set the flags when doing sarl $1, %eax when %eax contains -1. Have I misread, or is the manual wrong?
TIA
Eliot
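For readers following along, the quoted "Flags Affected" paragraph can be read as a small software model. The sketch below is only an illustration of that paragraph for 32-bit SAR with a count of 1..31; the names sar_flags and sar32 are made up here and are not part of Cog, Bochs, or any Intel-provided API. OF and AF are left out because the manual defines them only partially.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the flags the quoted paragraph fully defines for SAR. */
struct sar_flags {
    int cf; /* last bit shifted out of the operand */
    int sf; /* sign bit of the result */
    int zf; /* 1 when the result is zero */
    int pf; /* 1 when the low byte of the result has an even number of 1 bits */
};

/* Software model of a 32-bit SAR; returns the result as a raw bit pattern. */
static uint32_t sar32(int32_t value, unsigned count, struct sar_flags *f)
{
    assert(count >= 1 && count <= 31);

    uint32_t bits = (uint32_t)value;
    /* Replicate the sign bit into the vacated high positions. Done on
       unsigned values so the model does not depend on how the C compiler
       implements >> for negative signed integers. */
    uint32_t fill = (bits & 0x80000000u) ? ~(0xFFFFFFFFu >> count) : 0u;
    uint32_t result = (bits >> count) | fill;

    f->cf = (int)((bits >> (count - 1)) & 1u);
    f->sf = (int)((result >> 31) & 1u);
    f->zf = (result == 0u);

    /* Count the 1 bits in the low byte to derive the parity flag. */
    unsigned low = result & 0xFFu;
    int ones = 0;
    while (low != 0u) {
        ones += (int)(low & 1u);
        low >>= 1;
    }
    f->pf = (ones % 2 == 0);

    return result;
}
```

Under this model, shifting -1 right by one leaves the value at -1 (all bits set) with CF and SF both set and ZF clear, which is what the manual's wording predicts for the case in question.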
Hi Eliot,
Eliot Miranda wrote:
Hi All, anyone know the x86/IA32 really well? If so, read on. Otherwise save yourself the yawn.
[...]
(my emphasis added). But neither the Bochs simulator nor my Intel Core Duo set the flags when doing sarl $1, %eax when %eax contains -1. Have I misread, or is the manual wrong?
I cannot confirm this. Using this simple C program:
#include <stdio.h>
int calc(int i) { return i >> 1; }
int main() { printf("%i\n", calc(-1)); return 0; }
my GCC 4.3.2 generates a sarl %eax instruction, as the assembler output shows. Debugging it with KDbg shows a change of the flags after the instruction. In fact, CF and SF are set as (more or less) expected. I also have an Intel Core 2 Duo.
Regards, Martin
Hi Martin, can you send me the assembly, or show me the opcodes? When I try this it doesn't work, so I must be doing something differently. And hi! Robert Hirschfeld mentioned you when he and I met last week.
On Wed, Jan 21, 2009 at 11:06 AM, Martin Beck <martin.beck@hpi.uni-potsdam.de> wrote:
[...]
Eliot - I know you've already moved past this problem, but in the future, gcc -S foo.c will create foo.s with the assembly generated by gcc.
David
On Jan 21, 2009, at 2:08 PM, Eliot Miranda wrote:
[...]
On Thu, Jan 22, 2009 at 12:53 PM, David Farber dfarber@numenor.com wrote:
Eliot - I know you've already moved past this problem, but in the future, gcc -S foo.c will create foo.s with the assembly generated by gcc.
Um, I know :) Trouble is gcc also optimizes so it may not always generate the code you expect. For example,
int issignedshift(int v) { return (v >> 1) < 0 ? 1 : 0; }
will, with -O4, generate
        movl 4(%esp), %eax
        sarl $31, %eax
        ret
because it works out this is the quickest way to generate a 1 if v is negative and doesn't generate a compare at all.
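As a side note on shifting by 31 to test the sign: the two right-shift flavours leave different values behind. A logical shift (shrl $31) moves the sign bit into bit 0 and yields 1 or 0, while an arithmetic shift (sarl $31) replicates the sign bit and yields -1 or 0. A minimal sketch in portable C, where the helper names sign_bit and sign_mask are invented for illustration and are not anything gcc emits:

```c
#include <stdint.h>

/* Logical shift by 31: the sign bit lands in bit 0, so the result is
   1 for negative inputs and 0 otherwise (the value shrl $31 leaves). */
static int sign_bit(int32_t v)
{
    return (int)((uint32_t)v >> 31);
}

/* Arithmetic shift by 31: every bit becomes a copy of the sign bit, so
   the result is -1 for negative inputs and 0 otherwise (the value
   sarl $31 leaves). Built from the unsigned shift to stay portable,
   because >> on a negative signed int is implementation-defined in C. */
static int32_t sign_mask(int32_t v)
{
    return -(int32_t)((uint32_t)v >> 31);
}
```

Which of the two an optimizing compiler picks depends on the value the source expression asks for, so it is worth checking the -S output against the intended result rather than assuming.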
BTW, I've been abusing gcc's -S output for a long time. Back in the 80's I used to generate direct-threaded-code VMs using gcc where I would edit the -S output with sed to produce the opcodes for the threaded code machine stripped of the prolog and epilog gcc would produce. I've also produced JIT-compiled BitBlt by similar means with a number of different compilers. -S has been my friend for many years.
Cheers! Eliot
2009/1/22 Eliot Miranda eliot.miranda@gmail.com:
[...]
-S has been my friend for many years.
I dream of having
listing := Object compile: 'yourself ^self' options: '-S'
:)
On Jan 22, 2009, at 1:59 PM, Eliot Miranda wrote:
[...]
Ok, ok. It's just that when I looked at the assembler output for Martin's example, it looked like it covered the case you were fighting. (I didn't step through it with a debugger.)
        .text
.globl _calc
_calc:
        pushl %ebp
        movl %esp, %ebp
        subl $8, %esp
        movl 8(%ebp), %eax
        sarl %eax
        leave
        ret
Then you said "can you send me the assembly? Or show me the opcodes?" instead of something like "What gcc version/flags are you using?"
A thousand apologies for having impugned your knowledge of gcc.
I will now go away before you taunt me a second time.
:)
David
On 22-Jan-09, at 12:59 PM, Eliot Miranda wrote:
BTW, I've been abusing gcc's -S output for a long time.
-fverbose-asm
is also helpful... well, unless you'd rather map registers to local variables in your head because you know it should work that way.
--
John M. McIntosh johnmci@smalltalkconsulting.com
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
Eliot Miranda <eliot.miranda <at> gmail.com> writes:
Hi All,
anyone know the x86/IA32 really well? If so, read on. Otherwise save
yourself the yawn.
I just tried to save an instruction in Cog's generated bitShift: primitive.
It seems to me that SARL (shift arithmetic right long) should set the sign flag based on the result, in fact it says as much in the manual; I quote from IA-32 Intel® Architecture Software Developer's Manual Volume 2B: Instruction Set Reference, N-Z p 4-192
Hi Eliot, I guess you are addressing the SmallInteger case; otherwise I would understand "optimize" to mean using some MMX 64- or 128-bit arithmetic (like PSRLLQ). If relevant, check my trivial optimizations for large ints at http://bugs.squeak.org/view.php?id=7109
Nicolas
On Wed, Jan 21, 2009 at 12:38 PM, Nicolas Cellier ncellier@ifrance.com wrote:
Hi Eliot, I guess you are addressing the SmallInteger case; otherwise I would understand "optimize" to mean using some MMX 64- or 128-bit arithmetic (like PSRLLQ). If relevant, check my trivial optimizations for large ints at http://bugs.squeak.org/view.php?id=7109
Hi Nicolas,
Yes, I'm doing SmallInteger, and I'm also trying to keep the JIT very simple initially, so there will be no MMX registers or instructions in the stage-one JIT until I do floating point. I'll take a look at these. Thanks.
Eliot Miranda wrote:
[...]
(my emphasis added). But neither the Bochs simulator nor my Intel Core Duo set the flags when doing sarl $1, %eax when %eax contains -1. Have I misread, or is the manual wrong?
Interesting. FWIW, the AMD64 arch manual (vol 3, p.220) also says that an SAR will affect the SF flag.
Regards,
-Martin
Apologies; my bad. I'd used the wrong branch: jump greater (if 0 > v) is not the same as jump negative (if v is negative). I live and learn. Sorry for the noise.
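The difference between the two branches comes down to which flags each one tests. On x86, JG is taken when ZF = 0 and SF = OF, whereas JS is taken whenever SF = 1, so after a shift the two can disagree even though the flags were set correctly. A rough sketch of the two conditions in C, assuming the flag state left by sarl $1, %eax with %eax = -1 (the struct and function names are hypothetical and this is not Cog's code):

```c
#include <stdbool.h>

/* Hypothetical container for the three flags the two branches look at. */
struct branch_flags {
    bool zf; /* zero flag */
    bool sf; /* sign flag */
    bool of; /* overflow flag */
};

/* JG (jump if greater, signed): taken when ZF = 0 and SF = OF. */
static bool jg_taken(struct branch_flags f)
{
    return !f.zf && f.sf == f.of;
}

/* JS (jump if sign): taken when SF = 1. */
static bool js_taken(struct branch_flags f)
{
    return f.sf;
}

/* Flags after sarl $1, %eax with %eax = -1: the result is still -1, so
   ZF = 0 and SF = 1; a 1-bit SAR clears OF. */
static struct branch_flags flags_after_sar1_of_minus_one(void)
{
    struct branch_flags f = { false, true, false };
    return f;
}
```

With these flags jg_taken is false and js_taken is true, so a branch that is meant to fire on a negative result needs the sign-based jump.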
On Wed, Jan 21, 2009 at 10:21 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
[...]
squeak-dev@lists.squeakfoundation.org