<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On 18.03.2009 at 04:47, Eliot Miranda wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><br><br><div class="gmail_quote">On Tue, Mar 17, 2009 at 2:45 PM, Claus Kick <span dir="ltr">&lt;<a href="mailto:claus_kick@web.de">claus_kick@web.de</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"> Eliot Miranda wrote:<br> *snip*<div class="im"><br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <br> ...and SPARC is one of the worst 64-bit implementations out there.<br> &nbsp;Question, how much bigger is a 64-bit literal load instruction vs a 32-bit<br> literal load in x86/x86-64 and SPARC32/SPARC64?<br> </blockquote> <br></div> Interesting, though off-topic, tidbit; therefore an OT question: is that aimed at SPARC as an architecture (for lack of a better word), or do you have a specific implementation in mind? (*curious*)</blockquote><div><br></div><div> I know nothing about SPARC internals and so cannot suggest an implementation.</div><div><br></div><div>Part of my complaint is the name, Scalable Processor ARChitecture. &nbsp;The current SPARC requires 6 (read it and weep, _6_) 32-bit instructions to synthesize an arbitrary 64-bit literal. &nbsp;It hasn't scaled to 64-</div></div></blockquote>As far as I know, the "scalable" in Scalable Processor ARChitecture was meant to refer to the ability to add processors. Sun has built servers with up to 112 processors with nearly linear performance gains.</div><div>So in this regard SPARC is scalable.</div><div><br><blockquote type="cite"><div class="gmail_quote"><div>bits; consequently there is a range of addressing models in 64-bit SPARC compilers: 20-something bits, 40-something bits (I forget the details) and 64 bits. 
&nbsp;By contrast there are 10-byte instructions that do 64-bit literal loads in x86-64. &nbsp;So a 200% overhead vs a 25% overhead.</div> <div><br></div></div></blockquote>I assume you have run into the RISC vs. CISC paradigms here. It's the heart of the RISC idea to have just a few, but fast, addressing modes and instructions.</div><div>SPARC was faster than x86 until around the introduction of the UltraSPARC-III and the Pentium 4. Since then, Intel and AMD have surpassed SPARC with their x86 ISA.</div><div><br><blockquote type="cite"><div class="gmail_quote"><div>One can try to use the branch and link instruction to&nbsp;jump over the literal,&nbsp;grab the pc and indirect through it, but IIRC that's a slow 5 word sequence that can't be used in leaf routines. &nbsp;But this is off the top of my head so don't quote me.</div> <div><br></div><div>I would have thought that somehow one could define a three word instruction saying "load the next two words into a register and skip them", or, if the anachronism of the delay slot must still be respected, a 4 word instruction saying "load the two words after the following instruction into a register and skip them, executing the instruction in the delay slot".</div> </div><br><div><br></div> <br></blockquote></div>Regards<div>Andreas</div></body></html>