[squeak-dev] Prepare for Thousands of Cores --- oh my Chip

Sun Jul 6 16:50:02 UTC 2008

On Jul 5, 2008, at 6:40 PM, Peter William Lount wrote:

> Todd Blanchard wrote:
>> This was pretty much the messages from Apple at WWDC recently as  
>> well.
>> Their next OS version has several technologies based around this  
>> idea.
>> The shift is upon us.
>>
>
> Yeah, Apple is talking about two different approaches - program  
> parallelism with multi-cores and data parallelism with GPGPUs from  
> the likes of NVidia and AMD-ATI or possibly P.A.Semi (just a wild  
> guess on P.A.Semi as their chips could be made with many many cores  
> soon).
>
> And NO, Smalltalk hasn't caught up yet. Just half a year ago in this
> very forum thread people were arguing against generic, full
> multi-threading of Smalltalk virtual machines. Cincom is against it.
> Instantiations has been quiet and likely won't do much.

And in my opinion, the people who were arguing against it won the
argument.  Concerns were raised about the cache-thrashing that could
result, and relevant empirical research was linked that seemed to
validate these concerns.

> Only a few brave intrepid explorers get it and now we have  
> experiments like HydraVM for croquet/squeak.

Perhaps I misunderstood what you meant in the previous part of the  
paragraph.  Hydra is explicitly one-thread-per-image for 1) simplicity  
of implementation, 2) simplicity of use and 3) because many-threads- 
per-image hasn't been shown to be even theoretically desirable.

> Most Smalltalks and Smalltalkers are deeply stuck in the past of one
> native thread. Most in fact are not good at multi-threading with
> Smalltalk non-native threads!!! It's difficult to learn and get
> right, which is one motivator behind those wanting to take the easy
> road - one native thread per image,

Right, *one* motivator.

> but that's the wrong route (in my view and obviously in others' view
> as well) because it isn't general purpose enough. It involves hard  
> work. No way around it.

If you want to open up this discussion again, please bring some new  
facts.  Why would cache-thrashing not be an issue when running 64  
cores on a single image?  I'm willing to be convinced, but I haven't  
seen even a sketch of a design that would avoid this.

>
>
> Igor, how will we gain access to writing for chips like NVidia when  
> they keep it all secret?

Keep what secret?  Both AMD and NVIDIA have exposed low-level
instruction sets for their processors.  AMD's is called CTM, and I
can't remember the name of NVIDIA's.  These instruction sets are at  
approximately the level of x86 assembly (i.e. low-level, but still  
portable across different GPU models).

> Use C with CUDA?

One approach is to use CUDA just like Croquet uses OpenGL.  What's the  
difference?

Cheers,
Josh

> Or hijack OpenCL (to be part of LLVM and clang frontend if I'm not  
> mistaken) when Apple gets it working?
>
> Cheers,
>
> peter
>
>
