[squeak-dev] GPGPU

Josh Gargus josh at schwa.ca
Fri Oct 30 06:15:55 UTC 2009


On Oct 28, 2009, at 1:23 PM, Casimiro de Almeida Barreto wrote:

> On 28-10-2009 15:24, Josh Gargus wrote:
>> I agree with Casimiro's response... GPUs aren't suitable for running
>> Smalltalk code.  Larrabee might be interesting, since it will have 16
>> or more x86 processors, but it's difficult to see how to utilize the
>> powerful vector processor attached to each x86.
> Here I see two opportunities. The first would be to follow the
> advice of Mr. Ingalls and start to develop a generic VM and related
> classes to deal with parallel processing (something I think is long
> overdue, since multicore processors have been around for such a long
> time). IMHO, not dealing with SMP processing prevents dealing with
> NUMA processing, where the advantages of Smalltalk should be
> astounding.
>
> The second is to provide Squeak with solid intrinsic vector
> processing capabilities, which would reopen the field of high
> performance applications in science and engineering, and also more
> mundane applications like the game industry.
>>
>> Your question was more specifically about running something like  
>> Slang
>> on it.  It's important to remember that Slang isn't Smalltalk, it's C
>> with Smalltalk syntax (i.e. all Slang language constructs are
>> implemented by a simple 1-1 mapping onto the corresponding C language
>> feature).  So yes, it would be possible to run something like Slang  
>> on
>> a GPU.  Presumably, you would want to take the integration one step
>> farther than with Slang, and automatically compile the generated
>> OpenCL or CUDA code instead of dumping it to an external file.
>>
>> Instead of thinking of running Smalltalk on the GPU, I would think
>> about writing a DSL (domain-specific language) for a particular class
>> of problems that can be solved well on the GPU.  Then I would think
>> about how to integrate this DSL nicely into Smalltalk.
>
> That's sort of my idea :)
>
> I'm not considering CUDA at the moment because it would be more
> specific to NVIDIA's architecture. Currently the GPU market is shared
> mostly between NVIDIA and AMD/ATI, and AMD says they won't support
> CUDA on their GPUs (see
> http://www.amdzone.com/index.php/news/video-cards/11775-no-cuda-on-radeon
> as an example). It's a pity, since last year it was reported that
> RADEON compatibility in CUDA was almost complete. Besides, there are
> licensing issues

Again, I'm not sure what issues you are referring to.  Are you talking  
about practical issues that would prevent people from deploying Squeak  
GPGPU code?  If so, I don't think that there are any issues.  Unlike,  
say, Microsoft, the GPU vendors have much less incentive to lock you  
into their platforms; they just want to sell more GPUs, and they won't  
do that by introducing gratuitous licensing roadblocks.

Or perhaps you're more motivated by FSF-esque notions of software  
freedom?  It's true that there is a free OpenGL implementation and no  
free OpenCL implementation, yet.  However, the specification is open,  
and it's a matter of time before a free implementation is available.   
For example, my understanding is that Tungsten Graphics' Gallium3D  
framework is designed to support OpenCL as well as OpenGL.


> and I just don't want to have "wrappers".

Could you elaborate a bit about the "solid intrinsic vector processing  
capabilities" that you are thinking of, and in particular how they go  
beyond being mere wrappers?

It seems like a layered approach is the way to go.  Assuming that  
OpenCL is the target, the lowest layer (and the first useful artifact)  
would be a wrapper for the OpenCL function calls and a model for  
managing memory outside of Squeak's object-memory (perhaps using  
Aliens).
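
To make that concrete, here is a rough sketch of what the lowest layer
might look like.  The class and selector names (OpenCLContext,
OpenCLBuffer, the prim* selectors) are purely illustrative, not an
existing package; each wrapper method would map 1-1 onto an OpenCL C
entry point, in the same spirit as Slang's 1-1 mapping onto C:

```smalltalk
"Hypothetical lowest-layer wrapper.  An Alien holds the cl_mem
 handle, so the allocation lives outside Squeak's object-memory
 and is never moved or scanned by the garbage collector."
Object subclass: #OpenCLContext
	instanceVariableNames: 'handle'

OpenCLContext >> createBufferOfBytes: byteCount
	"Thin veneer over clCreateBuffer; primCreateBuffer:size:
	 would be the primitive that actually calls into libOpenCL."
	| mem |
	mem := self primCreateBuffer: handle size: byteCount.
	^ OpenCLBuffer onAlien: mem context: self
```

Even something this thin would already be useful on its own, since it
lets image-level code allocate and manage device memory.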

The next layer would be a more natural integration of OpenCL's  
sequencing/synchronization primitives (such as "events") into Squeak.
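
One way that integration might look (again, the selectors here are
invented for illustration) is to surface an OpenCL event's completion
as a Squeak Semaphore, so a Smalltalk process can block on GPU work
exactly as it blocks on any other signal, without spinning:

```smalltalk
"Hypothetical second layer: primOnComplete: would register a
 completion callback (clSetEventCallback) on the primitive side,
 which signals the semaphore when the GPU finishes the command."
OpenCLEvent >> wait
	| sema |
	sema := Semaphore new.
	self primOnComplete: [sema signal].
	sema wait
```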

After that, the sky's the limit... here's where Squeak could really  
shine.

Do you agree with this characterization?


> It's obvious that I know many of the problems dealt with by CUDA and
> OpenCL: the variable number and size of pipelines, problems with
> numeric representation and FP precision, etc... etc... etc...


Yeah, sorry... I feel like I made bad assumptions in my initial  
response.


> And I know it would be much easier just to write some wrappers or,
> easier yet, to develop things in C/C++ and glue them with the FFI.
> But then, what would be the gain to Squeak & the Smalltalk community?


That's the spirit!  :-D

Cheers,
Josh



>>
>> Sean McDirmid has done something like this with C#, LINQ, HLSL, and
>> Direct3D (http://bling.codeplex.com/).  He's not doing GPGPU per se,
>> but the point is how seamless his integration with C# is.
>>
>> Cheers,
>> Josh
>>
> Best regards,
>
> CdAB
>
>
