[Vm-dev] PICs (was: RISC-V J Extension)
Das.Linux at gmx.de
Wed Jul 25 06:34:56 UTC 2018
> On 25.07.2018, at 01:18, Jecel Assumpcao Jr. <jecel at merlintec.com> wrote:
> While it is bad form to move a private discussion to (or back to) a
> public forum, some of these links might be interesting to people here
> and I have been unable to send emails to Tobias after my initial reply.
> An attempt on Wednesday and on Friday made mcrelay.correio.biz complain
> that mx00.emig.gmx.net refused to talk to it and an attempt from my old
> 1991 email account on Monday complained about the email address though
> it was ok as far as I can tell.
bummer. Sorry for my ISP…
Then let's continue here.
> Tobias wrote:
>> Jecel wrote:
>>> [new direction: emulate bytecodes and RISC-V]
>> That's an interesting take.
>> I can only watch from afar, but it's all interesting. (for example that guy
>> who does RISC-V cpu in TTL chips: https://www.youtube.com/channel/UCBcljXmuXPok9kT_VGA3adg )
> It is an interesting project. I was annoyed by his claim to have the
> first homebrew TTL 32 bit processor since in the late 1990s a group of
> students at the MIT processor design course implemented the Beta
> processor in TTLs instead of using FPGAs like all other groups (before
> or since). Sadly, all information about this has been eliminated from
> the web and can't even be found in archive.org.
Yea but as far as I can see that person is fine with being corrected, so maybe someone should tell him? :)
> I tried to get the local universities to teach RISC-V to their students
> instead of their own educational RISC processors but they are too
> emotionally attached to their designs.
>> Sounds reasonable. Let's have them know dynamic languages are also still there ;)
>> (I mean, you're very familiar with both Smalltalk and Self...)
> Mario Wolczko has been involved in Java since the late 1990s but was
> part of the Self group before that and had created the Mushroom
> Smalltalk computer before that.
> Boris Shingarov is currently involved with Java but has given a lot of
> talks about Smalltalk VMs and was involved in Squeak back in the OS/2
> With me, that was 3 out of 6 people at the meeting representing the
> Smalltalk viewpoint. We shall see if that will have any practical
That sounds like great input to that project.
>> The TLB is somewhat maintained by the CPU to manage the translation of virtual addresses to physical ones.
>> I can imagine something similar, like a branch that, upon return, updates a field
>> in a PIC buffer, such that the next time the branch is only taken if a register (e.g., class of the object)
>> is different or so.
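A minimal software sketch of that idea, for readers who have not met PICs before. Everything here is hypothetical (`SendSite`, the toy `Circle` and `Square` classes); it models no real VM or hardware. The `entries` table plays the role of the PIC buffer, and the slow path (the "branch") is only taken when the receiver's class has not been seen at this site yet.

```python
class SendSite:
    """One call site with its own polymorphic inline cache (toy model)."""

    def __init__(self, selector):
        self.selector = selector
        self.entries = {}   # receiver class -> resolved method
        self.misses = 0     # how often the slow path was taken

    def send(self, receiver, *args):
        cls = type(receiver)
        method = self.entries.get(cls)
        if method is None:
            # Slow path: do the full lookup, then record the result in the PIC
            # so the next send with this class skips the lookup.
            self.misses += 1
            method = getattr(cls, self.selector)
            self.entries[cls] = method
        return method(receiver, *args)


class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3 * self.r * self.r   # crude pi, keeps the numbers integral


class Square:
    def __init__(self, s):
        self.s = s

    def area(self):
        return self.s * self.s


site = SendSite("area")
results = [site.send(shape) for shape in (Circle(1), Square(2), Circle(2), Square(3))]
```

After these four sends the slow path has run only twice, once per distinct receiver class.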
> Ok, Mario actually mentioned that with today's advanced branch
> prediction hardware we might want to re-evaluate PICs. In this case you
> wouldn't be using the TLBs but the BTB (Branch Target Buffer) hardware.
> Mario might have actually been thinking about Urs Hölzle's ECOOP 95
> paper, which was a slightly different subject.
> They were looking at the different kinds of software implementation of
> method dispatch (not only PICs) and the effects of processors executing
> more and more instructions per clock cycle. That might make a scheme
> that is bad for a simple RISC (due to many tests, for example) actually
> work well on an advanced out-of-order processor (due to the tests being
> "free" since they execute in parallel with the main code). They didn't
> look at branch prediction hardware, but it certainly would have a huge
> impact. Several of the later papers focused on branch prediction:
Today I learned about BTB…
>>> For SiliconSqueak I actually had two different PIC instructions. They
>>> modified how the instruction cache works. Normally the instruction cache
>>> is accessed by hashing the 32 bit value of the PC except for the lowest
>>> bits which select a byte in the cache line, but after a PIC instruction
>>> the hash used a 64 bit value that combined the PC (all bits) and the
>>> pointer to the receiver's class. The resulting cache line was fetched
>>> and instructions executed in sequence even though the PC didn't change.
>>> Any branch or call instruction would restart normal execution at the new
>> Sounds neat!
>>> So a PIC entry takes up exactly one cache line. A PIC can have as many
>>> entries as needed and the instruction takes the same time to execute no
>>> matter how many entries there are (not taking into account cache
>> Wow, that's incredible.
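A rough software model of the lookup described above may help. It is only a sketch under assumptions: the 16-byte line size, the 32-bit class pointer, the dictionary standing in for the cache, and all names (`ToyICache`, `install_pic_entry`, `fetch_pic`) are invented for illustration and are not taken from SiliconSqueak.

```python
LINE_BYTES = 16    # assumed cache line size, not stated in the thread
OFFSET_BITS = 4    # log2(LINE_BYTES)


class ToyICache:
    """Dictionary stand-in for an instruction cache (hypothetical model)."""

    def __init__(self):
        self.lines = {}   # key -> one cache line (list of instructions)

    @staticmethod
    def normal_key(pc):
        # Normal fetch: drop the low bits that select a byte within the line.
        return ("pc", pc >> OFFSET_BITS)

    @staticmethod
    def pic_key(pc, class_ptr):
        # After a PIC instruction: all PC bits combined with the class pointer
        # into one 64-bit value (32-bit class pointer is an assumption).
        return ("pic", ((pc & 0xFFFFFFFF) << 32) | (class_ptr & 0xFFFFFFFF))

    def install_pic_entry(self, pc, class_ptr, line):
        self.lines[self.pic_key(pc, class_ptr)] = line

    def fetch_pic(self, pc, class_ptr):
        # A single lookup, no matter how many entries this PIC has.
        # None models a cache miss.
        return self.lines.get(self.pic_key(pc, class_ptr))


cache = ToyICache()
SEND_PC = 0x1004
POINT_CLASS, RECT_CLASS = 0x8000, 0x8040
cache.install_pic_entry(SEND_PC, POINT_CLASS, ["call Point>>area"])
cache.install_pic_entry(SEND_PC, RECT_CLASS, ["call Rect>>area"])
hit = cache.fetch_pic(SEND_PC, RECT_CLASS)
miss = cache.fetch_pic(SEND_PC, 0x8080)
```

Each (PC, class) pair maps to exactly one line, which is why an entry costs one cache line and the lookup time does not grow with the number of entries.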
>>> The second PIC instruction works exactly like the first but it supplies
>>> a different value to be used in place of the current PC. That allows
>>> different call sites to share PIC entries if needed, though that might
>>> be more complicated than it is worth.
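The difference between the two instructions can be sketched as a difference in how the lookup key is formed. This is an illustrative guess at the shape of it, not the actual encoding: `pic_key`, `shared_id`, and the addresses are all hypothetical.

```python
def pic_key(pc, class_ptr, shared_id=None):
    """Key for a PIC entry lookup (hypothetical encoding).

    First instruction: the site is identified by the PC.
    Second instruction: a supplied value is used in place of the PC.
    """
    site = pc if shared_id is None else shared_id
    return ((site & 0xFFFFFFFF) << 32) | (class_ptr & 0xFFFFFFFF)


STRING_CLASS = 0x9000
SITE_A, SITE_B = 0x2000, 0x3000
SHARED = 0x0042

# Per-site PICs: the same class at two call sites gives two distinct entries.
private_a = pic_key(SITE_A, STRING_CLASS)
private_b = pic_key(SITE_B, STRING_CLASS)

# Shared PIC: both sites supply the same value, so they hit the same entry.
shared_a = pic_key(SITE_A, STRING_CLASS, shared_id=SHARED)
shared_b = pic_key(SITE_B, STRING_CLASS, shared_id=SHARED)
```

With the substitute value, both call sites resolve to one entry, so one cache line serves both.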
>> Maybe. What I like about PICs per send site is that you can essentially use them
>> as a data source for dynamic feedback (what "types" were actually seen at this send site?)
>> and one probably would need some instructions to fetch those infos from the PIC.
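To make the feedback idea concrete, here is a small sketch of a per-send-site PIC that can be queried for the types it has seen. It is a toy under assumptions: `ProfilingPIC`, `seen_types`, `shape`, and the megamorphic threshold of 4 are invented here; a real VM would read this out of the PIC's machine-level entries.

```python
from collections import Counter


class ProfilingPIC:
    """Per-send-site record of receiver types (hypothetical model)."""

    def __init__(self):
        self.hits = Counter()   # receiver class name -> number of sends seen

    def record(self, receiver):
        self.hits[type(receiver).__name__] += 1

    def seen_types(self):
        # Most frequent first; this is what an optimizer would want to fetch.
        return [name for name, _ in self.hits.most_common()]

    def shape(self, mega_threshold=4):
        # The threshold is an arbitrary assumption for this sketch.
        n = len(self.hits)
        if n == 0:
            return "unlinked"
        if n == 1:
            return "monomorphic"
        return "polymorphic" if n <= mega_threshold else "megamorphic"


pic = ProfilingPIC()
for value in (1, 2, 3, "x", 4):
    pic.record(value)

types_seen = pic.seen_types()
site_shape = pic.shape()
```

An optimizing compiler could use `types_seen` to decide which receiver types are worth inlining for at this site.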
> One of the papers in that list is the 1997 technical report "The Space
> Overhead of Customization". One of the reasons that Java won over Self
> was that its simple interpreter ran on 8MB machines that most of Sun's
> customers had while Self needed 24MB workstations which were rare (but
> would be very common just two years later). Part of that was due to
> compiling a new version of native code for every different type of
> receiver even if the different versions didn't really help.
> My idea of allowing PICs to be optionally shared was that this would
> allow customization to be limited in certain cases to save memory. It
> would cause a loss of information about types seen at a call site, but
> that doesn't always have a great impact on performance.
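The space trade-off can be illustrated with a toy count of how many native-code versions get compiled with and without customization. This is only a sketch: `compiled_versions` and the send trace are made up for illustration and say nothing about real Self or Java numbers.

```python
def compiled_versions(sends, customize):
    """Count distinct native-code versions a toy compiler would generate.

    sends: iterable of (method_name, receiver_class_name) pairs.
    customize=True compiles one version per (method, receiver class);
    customize=False compiles one version per method, shared by all receivers.
    """
    if customize:
        return len({(method, cls) for method, cls in sends})
    return len({method for method, _ in sends})


# Hypothetical trace of sends observed while a program runs.
sends = [
    ("printOn:", "Point"), ("printOn:", "Rect"), ("printOn:", "Circle"),
    ("hash", "Point"), ("hash", "Rect"),
    ("printOn:", "Point"),
]

customized = compiled_versions(sends, customize=True)
shared = compiled_versions(sends, customize=False)
```

In this trace, customization produces five versions where sharing needs two, at the price of no longer knowing which receiver class each version was compiled for.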
That makes a lot of sense. Maybe there's a way to have both variants…