Hi Philippe,<br><br><div class="gmail_quote">On Mon, Mar 16, 2009 at 10:52 PM, Philippe Marschall <span dir="ltr"><<a href="mailto:philippe.marschall@gmail.com">philippe.marschall@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div><div></div><div class="h5">2009/3/16 Eliot Miranda <<a href="mailto:eliot.miranda@gmail.com">eliot.miranda@gmail.com</a>>:<br>
><br>
><br>
> On Mon, Mar 16, 2009 at 2:15 PM, Philippe Marschall<br>
> <<a href="mailto:philippe.marschall@gmail.com">philippe.marschall@gmail.com</a>> wrote:<br>
>><br>
>> 2009/3/16 Eliot Miranda <<a href="mailto:eliot.miranda@gmail.com">eliot.miranda@gmail.com</a>>:<br>
>> ><br>
>> ><br>
>> > On Sun, Mar 15, 2009 at 1:57 PM, Nicolas Cellier<br>
>> > <<a href="mailto:nicolas.cellier.aka.nice@gmail.com">nicolas.cellier.aka.nice@gmail.com</a>> wrote:<br>
>> >><br>
>> >> 2009/3/15 Hans-Martin Mosner <<a href="mailto:hmm@heeg.de">hmm@heeg.de</a>><br>
>> >>><br>
>> >>> nicolas cellier schrieb:<br>
>> >>> > Hans,<br>
>> >>> > Tagging/untagging could be very fast! See my other post<br>
>> >>> ><br>
>> >>> > 1) UnTagging a double= No op<br>
>> >>> > 2) Tagging a double= an isnan test (so as to have a representable nan<br>
>> >>> > in Smalltalk)<br>
>> >>> > 3) This trick does not add any extra cost to tagging/untagging of<br>
>> >>> > other oops<br>
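A minimal C sketch of the tagging scheme described above (purely illustrative; the names and constants are hypothetical, not taken from any actual VM):

```c
#include <stdint.h>
#include <string.h>
#include <math.h>

/* Hypothetical NaN-tagging sketch: a 64-bit oop is either a raw
   IEEE-754 double, or a pointer hidden in the quiet-NaN payload
   space. All names and constants here are illustrative. */

typedef uint64_t oop;

/* sign bit + all-ones exponent + quiet bit: the tagged-pointer space */
#define TAG_MASK 0xfff8000000000000ULL

static oop tag_double(double d) {
    /* Tagging a double costs only an isnan test: every NaN is
       canonicalized so it cannot collide with a tagged pointer. */
    oop bits;
    if (isnan(d)) d = (double)NAN; /* the one representable nan */
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

static double untag_double(oop o) {
    /* Untagging a double is a no-op reinterpretation of the bits. */
    double d;
    memcpy(&d, &o, sizeof d);
    return d;
}

static oop tag_pointer(void *p) {
    /* Pointers (up to 51 significant bits) ride in the NaN payload. */
    return TAG_MASK | (oop)(uintptr_t)p;
}

static void *untag_pointer(oop o) {
    return (void *)(uintptr_t)(o & ~TAG_MASK);
}

static int is_double(oop o) {
    /* Anything outside the tagged-pointer space is a plain double. */
    return (o & TAG_MASK) != TAG_MASK;
}
```

Note that tagging/untagging of non-double oops is untouched, which is point 3 above.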
>> >>> That's true for a 64-bit processor, and on such hardware I see the<br>
>> >>> advantages of this scheme.<br>
>> >>> For 32-bit hardware, it won't work.<br>
>> >>> Hopefully we'll all have suitable hardware in the near future...<br>
>> >>> But for example, I'm running 32-bit linux here on my 64-bit AMD<br>
>> >>> processor just because the WLAN card I'm using only has a 32-bit<br>
>> >>> Windows<br>
>> >>> driver, and ndiswrapper on 64-bit linux would require a 64-bit driver<br>
>> >>> to<br>
>> >>> work correctly (which is somewhat stupid IMHO but I'm not going to<br>
>> >>> hack<br>
>> >>> ndiswrapper).<br>
>> >>> In the real world, there are tons of silly constraints like this which<br>
>> >>> still prevent people from fully using 64-bit hardware.<br>
>> >>><br>
>> >>> Cheers,<br>
>> >>> Hans-Martin<br>
>> >><br>
>> >> Of course, most of the nice properties come from the 64-bit<br>
>> >> addressing...<br>
>> >> Hey, wait, I don't even have a 64-bit processor in my house!<br>
>> >> For the fun I imagine we could emulate by spanning each oop over two<br>
>> >> int32<br>
>> >> typedef struct {int32 high,low;} oop;<br>
>> >> I would expect a slower VM by roughly a factor 2 - except for double<br>
>> >> arithmetic...<br>
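As a purely illustrative sketch of why such a split representation costs roughly a factor of two (hypothetical names; not from any actual VM), even the simplest primitive operations need two 32-bit operations instead of one:

```c
#include <stdint.h>

/* Hypothetical split-oop emulation on 32-bit hardware: each 64-bit
   oop is carried as two 32-bit halves, so every primitive operation
   must touch both words. */
typedef struct { uint32_t high, low; } oop64;

/* Identity comparison needs two compares instead of one. */
static int oop_equal(oop64 a, oop64 b) {
    return a.high == b.high && a.low == b.low;
}

/* 64-bit addition must propagate the carry out of the low word
   into the high word by hand. */
static oop64 oop_add(oop64 a, oop64 b) {
    oop64 r;
    r.low  = a.low + b.low;
    r.high = a.high + b.high + (r.low < a.low); /* carry detection */
    return r;
}
```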
>> ><br>
>> > In theory, but only for memory-limited symbolic applications. If you<br>
>> > have<br>
>> > an application that fits entirely in cache then I would expect parity.<br>
>> > The<br>
>> > argument for symbolic applications is that a 64-bit symbolic app has to<br>
>> > move<br>
>> > twice the data as a 32-bit symbolic app because each symbolic object is<br>
>> > twice the size.<br>
>><br>
>> Couldn't you compress the oops? AFAIK HotSpot was the last remaining<br>
>> JVM to get this.<br>
><br>
> I don't see the point. Memory is cheap, getting cheaper.<br>
<br>
</div></div>But memory access isn't.<br>
<div class="im"><br>
> 64-bits means<br>
> extremely cheap address space. Why slow down the critical path to save<br>
> space?<br>
<br>
</div>Because it's faster (you have to move less data around) and<br>
gets you closer to 32-bit speed.<br>
<br>
<a href="http://wikis.sun.com/display/HotSpotInternals/CompressedOops" target="_blank">http://wikis.sun.com/display/HotSpotInternals/CompressedOops</a><br>
<a href="http://blog.juma.me.uk/2008/10/14/32-bit-or-64-bit-jvm-how-about-a-hybrid/#comments" target="_blank">http://blog.juma.me.uk/2008/10/14/32-bit-or-64-bit-jvm-how-about-a-hybrid/#comments</a><br>
<a href="http://www.lowtek.ca/roo/2008/java-performance-in-64bit-land/" target="_blank">http://www.lowtek.ca/roo/2008/java-performance-in-64bit-land/</a><br>
<a href="http://www.devwebsphere.com/devwebsphere/2008/10/websphere-nd-70.html" target="_blank">http://www.devwebsphere.com/devwebsphere/2008/10/websphere-nd-70.html</a><br>
<a href="http://webspherecommunity.blogspot.com/2008/10/64-bit-performance-thoughputmemory.html" target="_blank">http://webspherecommunity.blogspot.com/2008/10/64-bit-performance-thoughputmemory.html</a><br></blockquote>
<div><br></div><div>OK, and this is a reasonable stop-gap until machines catch up with the potential of the 64-bit address space. It reminds me of segmented approaches to 16-bit limits on PDP-11s, 8086s et al. Basically these guys are scaling 32-bit oops by 8, allowing a maximum heap size of 32GB and 4G small objects. There are other approaches, like using an indirection table for intra-segment object references and using 32-bit oops within a segment, which would fit well with a Train algorithm.</div>
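The scale-by-8 decode/encode can be sketched in a few lines of C (illustrative names, not HotSpot's actual code): a 32-bit oop is an 8-byte-scaled offset from the heap base, so 2^32 * 8 bytes = 32GB of heap is reachable through 32-bit references, with up to 4G distinct objects.

```c
#include <stdint.h>

/* Sketch of compressed oops: a 32-bit "narrow" oop is an offset
   from the heap base, scaled by the 8-byte minimum object
   alignment. Names are illustrative, not HotSpot's actual code. */

static uint8_t *heap_base; /* set once at VM startup */

static void *decompress_oop(uint32_t narrow) {
    /* shift by 3 == scale by 8: the decode on every heap load */
    return heap_base + ((uint64_t)narrow << 3);
}

static uint32_t compress_oop(void *p) {
    /* the encode on every heap store */
    return (uint32_t)(((uint8_t *)p - heap_base) >> 3);
}
```

The shift-and-add on every reference load is exactly the "slow down the critical path to save space" trade-off being discussed.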
<div><br></div><div>My gut feeling is that these stop-gaps are a temporary thing. After all, if speed were so compelling we'd see lots of small 16-bit apps in places like Windows, where there used to be good support for 16-bit code until quite recently. But in fact 16-bit apps have died the death and we favour the regularity of 32-bit code. Somewhat analogously, Smalltalk trades performance for regularity. So I don't find these approaches particularly compelling. In any case they require engineering teams that can afford to support multiple memory models in the VM, something I'm not going to assume in Cog :)</div>
<div><br></div><div>Thanks for the links.</div><div><br></div><div>Best</div><div>Eliot<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<br>
Cheers<br>
<font color="#888888">Philippe<br>
<br>
</font></blockquote></div><br>