Hi,
Does anyone know of a working C++ parser written in Smalltalk? Just the front end would be fine, it doesn't have to generate any code. I just want to have a Smalltalk/Squeak program read the darn stuff so I can play with the information.
If not what would you suggest as an approach for having one with a minimum of effort?
Thanks in advance,
Peter
On Mon, Jun 30, 2008 at 11:10 AM, Peter William Lount peter@smalltalk.org wrote:
I would try to plug an already existing C++ compiler into Squeak.
Damien Cassou wrote:
I would try to plug an already existing C++ compiler into Squeak.
Hi,
That would be a prudent approach, saving a considerable amount of development time.
Does anyone know how to get GNU C++ and MS Visual Studio 9.0 C/C++ to emit their C++ parse trees in XML or another easy-to-parse format for Squeak?
Cheers,
Peter
Hi Peter,
On Mon, Jun 30, 2008 at 1:42 PM, Peter William Lount peter@smalltalk.org wrote:
Does anyone know how to get GNU C++ and MS Visual Studio 9.0 C/C++ to emit their C++ parse trees in XML or another easy-to-parse format for Squeak?
perhaps this helps: http://www.gccxml.org/
Best,
Michael
Hi,
Cool. That looks sweet. I noticed that it covers:
Supported Compilers
GCC-XML can simulate any of the following compilers:
GCC: versions 4.2, 4.1, 4.0, 3.4, 3.3, 3.2, 2.95.x
Visual C++: versions 8, 7.1, 7.0, and 6 (sp5)
Borland, Intel, SGI: formerly supported but no longer tested
Not VC9.0 but 8 might be close enough... or is it?
Of course one could run the source for gcc-xml, or gcc itself, through gcc-xml and then have an XML representation of a C++ program that parses C++ and compiles programs!
The next step.
C/C++ to Smalltalk translator anyone?
Peter
Hi Peter,
On Mon, Jun 30, 2008 at 2:12 PM, Peter William Lount peter@smalltalk.org wrote:
C/C++ to Smalltalk translator anyone?
multiple inheritance, anyone? ;-)
Have fun,
Michael
hi,
Taking back the base technologies into the fold of Smalltalk Style Systems.
Smalltalk and C were both designed with operating systems in mind.
Unix had the prevailing popular OS design.
Smalltalk Style Systems have the prevailing superior Messaging-Objects design.
Absorb and transform ALL C/C++ code into something new, jettisoning the current bases in C/C++ through transformations into Smalltalk Style Systems.
Be license-aware and appropriate, of course.
Just a thought.
Now to try some cool visualizations of large C++ ick. Shivers.
Cheers,
Peter
Hi Peter,
we did an MSIL-to-C++ backend in our LSW DotNet Reflection-Browser some years back. It avoided the need for a C++ parser because we decompiled MSIL.
Just curious what it is for? We introduced a Smalltalk decompiler backend just to increase the readability of .NET code for Smalltalkers.
Frank
_____
From: Peter William Lount Sent: Monday, June 30, 2008 2:40 PM To: The general-purpose Squeak developers list Subject: Re: [squeak-dev] C++ parser in Smalltalk?
Hi Frank,
Your project seems interesting. I'd like to know more. Any links? Papers?
I need to learn a lot of ick stuff really fast and unfortunately that stuff is really icky, yup the ick is a number of very large monster C/C++ systems with tons of core assembly thrown in for extra fun. Some custom visualizations like I've done for learning monster Smalltalk systems will save a lot of time.
I'm into accelerated learning of detailed systems using the visual cortex of our brains since the human visual system has massive bandwidth that word based and auditory thought channels lack, although I suppose one could convert large systems into a symphony. Anyway visual representations of large systems can help in quickly learning about how they are constructed and identify where one needs to focus extra attention.
On a recent large Smalltalk project the visual map required about eight feet by three feet just to map out the connections between the larger object assemblies. It helped provide an overview of the system. Programmers who'd been working with the system for years had no idea that it was shaped that way.
There is a video from a few years back on Channel 9 over at ick, Microsoft, where they tell of a very large map of their 5,000+ DLLs for XP. They built it by reading the raw DLLs and determining the links between them all. They consider their OS a fractured system living in these DLLs, which we know as DLL Hell. It's a Hell for them too! Ah, the fun of eating your own technology. They found redundant code (sometimes 12+ copies of the same function, which leads to all sorts of fun fixing bugs and providing security patches) and were better able to reduce their icky factor a little, making XP more stable and less tangled than their prior systems.
Aside from the visualization aspect I'd like to compute the System Brittleness Factor (LSBF) for each system to see how rigid or flexible the code base is. This helps identify where it can be improved and where code can be shrunk by increasing flexibility through merging of methods/classes that really are similar. As we know, C/C++ code is more "rigid" due to its use of typed variables, which very strictly limits the object flow paths through the program. Even with C++ templates, which enable a measure of polymorphism for C++ programs, the rigidity can be measured. Typically the code needed for a system expands when typed variables are added. This is a problem for many reasons, including comprehension, due to the increased brain bandwidth required to simply read the ick.
Also for the other reasons I stated in the earlier emails: "All your languages and systems belong to us [Smalltalk Style Systems]." The sCurge of C based systems has been with us way too long; it's time to take back the night and the day. ;-) Gotta have fun...
Check out the awesome work of LLVM: http://www.LLVM.org. Runtime, dynamically recompiled-on-the-fly C-based systems are on the way, financed in part by Apple. Liberation from GCC is on the horizon. Imagine a Squeak that can recompile its VM on the fly and then "hop" over to the new one, dropping the old version from memory! We do this all the time in Smalltalk; it'll be nice for C to finally catch up after four decades! It's also nice to see a vendor like Apple attempting to bring this capability to their C-based operating systems and applications technologies.
All the best,
Peter
[ | peter at smalltalk dot org ]
ps. Ick is a technical term referring to the ick factor of a system. Ick is the opposite of elegance, beauty, and simplicity. I work to identify ick and remove it from systems when possible, or simply to fix the ick so that it doesn't stop a system from working.
2008/6/30 Peter William Lount peter@smalltalk.org:
Check out the awesome work of LLVM. http://www.LLVM.org. Runtime Dynamic Recompiled on the Fly C based systems on the way and in part financed by Apple. Liberation from GCC is on the horizon. Imagine a Squeak that can recompile it's VM on the fly and then "hop" over to the new one dropping the old version from memory!!! We do this all the time in Smalltalk, it'll be nice for C to finally catch up after four decades! It's also nice to see a vendor like Apple attempting to bring this capability to their C based operating systems and applications technologies.
If you are able to compile things at run time, then why compile C at all? See Exupery & friends.
speed
On Mon, Jun 30, 2008 at 4:10 PM, Igor Stasenko siguctua@gmail.com wrote:
If you are able to compile things at run time, then why compile C at all? See Exupery & friends.
Hi,
If you are able to compile things at run time, then why compile C at all? See Exupery & friends. - Igor
Exupery seems very interesting.
There is lots of C/C++ code out there in use in many projects.
I can't always control what technologies clients use or which is of interest to reuse.
Take a code base such as OpenBSD or FreeBSD or NetBSD which is almost entirely C/C++ based and evolve it to the next level.
Moose seems interesting. Thanks for that link. Very interesting indeed.
Speed is always fun - fast programs and fast cars that is.
Cheers,
Peter
[ | peter at smalltalk dot org ] value
2008/6/30 Peter William Lount peter@smalltalk.org:
Speed is always fun - fast programs and fast cars that is.
Moreover, if you are looking for speed, just take a look at Huemul Smalltalk :) It generates native code using Exupery, bypassing bytecode entirely. Moreover, I made a compiler which can translate Smalltalk code to low-level native code, without even needing external tools or writing primitives in C. Guess how much faster that could be compared to a bytecode-driven VM written and compiled by a C compiler. In the end it would be possible to implement a self-sustained system without the need to write a single line of C.
I could never get Huemul to work :(
On Mon, Jun 30, 2008 at 11:21 PM, Igor Stasenko siguctua@gmail.com wrote:
Igor Stasenko writes:
2008/6/30 Peter William Lount peter@smalltalk.org:
Hi, Speed is always fun - fast programs and fast cars that is.
Moreover, if you looking for speed, just take a look at Huemul Smalltalk :)
You could also look at Exupery itself; I think Exupery is about as fast as Huemul. Huemul was much faster until it got a few extra language features that cost performance. Exupery's been through the same loop: fast, then add features, then add more optimisations to regain the speed.
Bryce
P.S. That's one reason I don't like the idea of a write protect bit in the object header. It adds yet another thing to check for every single write into an object. Small costs add up on common basic operations.
Write protection could be implemented using similar tricks to the write barrier. Then send optimisation will help reduce the costs when it's used. When it's not used, there's no cost.
Bryce
On Tue, Jul 1, 2008 at 1:47 PM, bryce@kampjes.demon.co.uk wrote:
Igor Stasenko writes:
2008/6/30 Peter William Lount peter@smalltalk.org:
Hi, Speed is always fun - fast programs and fast cars that is.
Moreover, if you looking for speed, just take a look at Huemul Smalltalk
:)
You could also look at Exupery itself, I think Exupery is about as fast as Huemul. Huemul was much faster until it got a few extra language features that cost performance. Exupery's been through the same loop, fast, add features, then add more optimisations to regain the speed.
Bryce
P.S. That's one reason I don't like the idea of a write protect bit in the object header. It adds yet another thing to check for every single write into an object. Small costs add up on common basic operations.
Actually one can be clever about this. Yes, one has to check on inst var assignment. But for at:put: one can fold the check into other activities. For example, in my VisualWorks implementation the write-protect bit was put very close to, and more significant than, the size field in the object header.
An at:put: has to extract the size of the array for the bounds check. The size field might indicate an overflow size (for large arrays the size doesn't fit in the header's size field and requires an additional word in front of the header to store the actual size; the overflow is indicated by the size field being at its maximum value or something similar).
So the at:put: code masks off the size field and the write-protect bit together, so that when the check is made for an overflow size a write-protected object appears to have an overflow size. The check for write-protect is then done only in the arm that fetches the overflow size. This makes the test free for most array accesses, since most arrays are small enough not to need an overflow size (at least in VW).
Write protection could be implemented using similar tricks to the
write barrier. Then send optimisation will help reduce the costs when it's used. When it's not used, there's no cost.
I don't understand this. Can you explain the write-barrier tricks and the send optimization that eliminates them?
I think per-object write-protection is very useful: for read-only literals, OODBs, proxies (distributed objects), debugging, etc. Amongst Smalltalks I think VisualAge had it first, and I did it for VW round about 2002. I did it again for Squeak at Cadence. In both the VW and Squeak cases the performance degradation was less than 5% for standard benchmarks. It's cheap enough not to be noticed, and there's lots more fat in the Squeak VM one can cut to more than regain the performance.
So unlike, say, named primitives for the core primitives, this is something I am in favour of. It is a cost well worth paying for the added functionality.
2008/7/2 Eliot Miranda eliot.miranda@gmail.com:
Well, I don't think that write-protect (AKA the immutable bit) is of great importance. There is a simple and foolproof concept, used in E: do not expose critical resources outside your model. Then, since you can't have a reference to the object(s) you might want to modify, you can't do any harm.
Then in 99% of cases the check is redundant.
Eliot Miranda writes:
P.S. That's one reason I don't like the idea of a write protect bit in the object header. It adds yet another thing to check for every single write into an object. Small costs add up on common basic operations.
Actually one can be clever about this. yes one has to check for inst var assignment. But for at:put: one can fold the check into other activities. For example, in my VisualWorks implementation the write-protect bit was put very close to and more significant than the size field in the object header.
<snip, neat optimisation>
Write protection could be implemented using similar tricks to the write barrier. Then send optimisation will help reduce the costs when it's used. When it's not used, there's no cost.
I don't understand this. Can you explain the write-barrier tricks and the send optimization that eliminates them?
Automatically create a new hidden subclass of the class that acts appropriately for every write. The write protection can then be encoded in the class. The only overhead is that required by dynamic dispatch which we're already paying for.
I think per-object write-protection is very useful. Its very useful for read-only literals, OODBs, proxies (distributed objects), debugging, etc. Amongst Smalltalks I think VisualAge had it first and I did it for VW round about 2002. I did it again for Squeak at Cadence. In both the VW and Squeak cases the performance degradation was less than 5% for standard benchmarks. Its cheap enough not to be noticed and there's lots more fat in the Squeak VM one can cut to more than regain performance.
So unlike, say, named primitives for the core primitives, this is something I am in favour of. It is a cost well worth paying for the added functionality.
And when we're twice as fast as VisualWorks is now it'll be a 10% overhead. Twice as fast as VisualWorks is the original goal for Exupery.
Immutability or change tracking can be provided more efficiently purely inside the image when it's needed.
Bryce
2008/7/4 bryce@kampjes.demon.co.uk:
Automatically create a new hidden subclass of the class that acts appropriately for every write. The write protection can then be encoded in the class. The only overhead is that required by dynamic dispatch which we're already paying for.
Yes, it looks more like a capability-based approach. It is better in a way: we can introduce a wide range of capabilities in the future while keeping object formats intact. The question, however, is how to introduce such a model without compatibility conflicts. Some magic with mixins/traits comes to mind: being able to define a behavior which can be turned on/off/switched for a particular object. Really, why should we pay with state (flags) when it is actually a matter of different capabilities, which should be reflected by behavior?
On Thu, Jul 3, 2008 at 2:56 PM, bryce@kampjes.demon.co.uk wrote:
Automatically create a new hidden subclass of the class that acts appropriately for every write. The write protection can then be encoded in the class. The only overhead is that required by dynamic dispatch which we're already paying for.
Ah, ok. This doesn't work in general, e.g. one can't turn on immutability in the middle of a method that assigns to inst vars. The method in the subclass would need to be different, and mapping pcs at runtime is, uh, decidedly nontrivial. It works for access to arrays but not for objects in general.
I think per-object write-protection is very useful. It's very useful for read-only literals, OODBs, proxies (distributed objects), debugging, etc.
Amongst Smalltalks I think VisualAge had it first and I did it for VW round about 2002. I did it again for Squeak at Cadence. In both the VW and Squeak cases the performance degradation was less than 5% for standard benchmarks. It's cheap enough not to be noticed and there's lots more fat in the Squeak VM one can cut to more than regain performance.
So unlike, say, named primitives for the core primitives, this is something I am in favour of. It is a cost well worth paying for the added functionality.
And when we're twice as fast as VisualWorks is now it'll be a 10% overhead. Twice as fast as VisualWorks is the original goal for Exupery.
Where are you on that?
Immutability or change tracking can be provided more efficiently purely inside the image when it's needed.
Bryce
Eliot Miranda writes:
On Thu, Jul 3, 2008 at 2:56 PM, bryce@kampjes.demon.co.uk wrote:
Eliot Miranda writes:
Write protection could be implemented using similar tricks to the write barrier. Then send optimisation will help reduce the costs when it's used. When it's not used, there's no cost.
I don't understand this. Can you explain the write-barrier tricks and the send optimization that eliminates them?
Automatically create a new hidden subclass of the class that acts appropriately for every write. The write protection can then be encoded in the class. The only overhead is that required by dynamic dispatch which we're already paying for.
Ah, ok. This doesn't work in general. E.g. one can't turn on immutability in the middle of a method that assigns to inst vars. The method in the subclass would need to be different, and mapping pcs at runtime is, uh, decidedly nontrivial. It works for access to arrays but not to objects in general.
A more complicated variant would be to use compiler created accessors for all variable access then rely on inlining to remove the overhead.
The catch here is this would force some de-optimisation when switching write protection on and off. Worst case, de-optimisation of inlined code can require a full object memory scan to find all the contexts that require deoptimisation.
Fast deoptimisation is one of the strongest reasons I can see for a context cache in a system that does inlining. For pure speed, the value of the context cache will be reduced because inlining should remove the most frequent sends.
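The accessor variant above can be sketched the same way, again in Python with invented names: every write is routed through a compiler-generated setter, so toggling protection just swaps the setter. The catch Bryce mentions is visible in the comments: any call site that had inlined the old setter body would need de-optimising when the swap happens.

```python
# Sketch of write protection via compiler-generated accessors.
# Invented names; in a real VM the setter would be inlined at call
# sites, and swapping it would force de-optimisation of those sites.

class Slot:
    """A field whose writes go through a swappable setter."""
    def __init__(self, value):
        self._value = value
        self._write = self._direct_write     # the inlinable fast path

    def _direct_write(self, value):
        self._value = value

    def _protected_write(self, value):
        raise ValueError("slot is write-protected")

    def set(self, value):
        self._write(value)                   # one indirection per write

    def get(self):
        return self._value

    def protect(self):
        # In a JIT this rebinding is what would trigger de-optimisation
        # of callers that inlined _direct_write.
        self._write = self._protected_write

    def unprotect(self):
        self._write = self._direct_write
```

With no inlining, the only overhead is the indirect call in `set`; with inlining, the overhead vanishes but `protect`/`unprotect` become de-optimisation points, which is exactly the trade-off discussed above.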
I think per-object write-protection is very useful. It's very useful for read-only literals, OODBs, proxies (distributed objects), debugging, etc.
Amongst Smalltalks I think VisualAge had it first and I did it for VW round about 2002. I did it again for Squeak at Cadence. In both the VW and Squeak cases the performance degradation was less than 5% for standard benchmarks. It's cheap enough not to be noticed and there's lots more fat in the Squeak VM one can cut to more than regain performance.
So unlike, say, named primitives for the core primitives, this is something I am in favour of. It is a cost well worth paying for the added functionality.
And when we're twice as fast as VisualWorks is now it'll be a 10% overhead. Twice as fast as VisualWorks is the original goal for Exupery.
Where are you on that?
Progress is good though will be a little slow over the summer.
I'm working towards the 1.0 release. That's going well. The last release looks reasonably reliable and the current release compiles much quicker than previously. I've fixed the performance of the register allocator so that it doesn't blow out like it used to. Now it always takes about 50% of the compile time. Cascade support has also been added so all major language features are now supported. There's a bug in the current development version that'll need fixing before the next minor release.
The work towards 1.0 now involves adding primitives which the interpreter inlines, and tuning. The current engine should be able to provide a nice performance improvement for Squeak. 1.0 will be worse than VisualWorks for overall performance, as send performance is worse, though still twice as good as Squeak's interpreter's. 1.0 should be a little faster than VisualWorks for bytecode performance, though I'd guess it's a bit slower now because I haven't done any bytecode tuning in the last few years.
The current releases are good enough to play with. The next one will be much nicer to play with than the current released version due to faster compilation.
Compilation is still much slower than it needs to be; so far I've favoured simplicity, debuggability, and testability over compile time performance. For instance, every compiler stage copies its input to create the output, even for stages that only do a few optimisations. Aside from the register allocator, all optimisations are simple linear-time tree traversals.
After 1.0, the plan is to add full dynamic message inlining in 2.0, then an SSA optimiser in 3.0. Exupery's goals are similar to AoSTa's; the major differences are that the code generator is in the image and that Exupery doesn't stop execution to optimise. It compiles in the background, then registers the compiled method, which will be used by later calls. Exupery relies on the interpreter to execute code that isn't used frequently enough to be worth compiling or that isn't currently compiled.
Bryce
P.S. Being able to control precisely what's in the code cache makes debugging crashes much easier. A common trick when debugging Exupery bugs is to recompile everything that was compiled when it crashed then try to reproduce the crash. Once it's reproduced, it's possible to do binary chop of the compiled methods to get down to the few that are required to reproduce the problem.
That would be much more difficult to do with an HPS-style system, which compiles the method before execution. Of course HPS's style has its own advantages.
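The binary chop Bryce describes is essentially a bisection over the set of compiled methods. A toy version in Python (the method names and the `crashes` predicate are invented stand-ins for "recompile this subset and try to reproduce"):

```python
# Toy bisection over compiled methods, in the spirit of the Exupery
# debugging trick described above. Assumes a single compiled method
# is enough to reproduce the crash.

def find_culprit(methods, crashes):
    """Narrow a crashing set of compiled methods down to one method."""
    assert crashes(methods), "the full set must reproduce the crash"
    while len(methods) > 1:
        half = len(methods) // 2
        left, right = methods[:half], methods[half:]
        # Keep whichever half still reproduces the crash.
        methods = left if crashes(left) else right
    return methods[0]

# Example: pretend compiling 'Dictionary>>at:put:' causes the crash.
compiled = ["OrderedCollection>>add:", "Dictionary>>at:put:",
            "String>>hash", "Interval>>do:"]
bad = "Dictionary>>at:put:"
culprit = find_culprit(compiled, lambda subset: bad in subset)
```

When the crash needs a combination of methods rather than a single one, plain halving isn't enough and something like delta debugging over subsets would be needed; the sketch covers only the single-culprit case.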
Hi Peter,
Sorry for the late & nil answer - I am real busy right now - will write you later.
I would prefer to use Michael Haupt's proposal - using GCCXML - but you have to convert it to a DLL & use it via FFI.
Frank
On Tue, Jul 1, 2008 at 12:12 AM, Peter William Lount peter@smalltalk.org wrote:
C/C++ to Smalltalk translator anyone?
If I were doing this, I'd investigate making a back-end for GCC that generates Smalltalk bytecodes. Then we could compile many C, C++, Fortran and Java programs to the Squeak VM :-).
Gulik.
I would love to help you with this, give me anything you want me to TRY to do! I will try to be as helpful as possible.
On Mon, Jun 30, 2008 at 9:07 PM, Michael van der Gulik mikevdg@gmail.com wrote:
On Tue, Jul 1, 2008 at 12:12 AM, Peter William Lount peter@smalltalk.org wrote:
C/C++ to Smalltalk translator anyone?
If I were doing this, I'd investigate making a back-end for GCC that generates Smalltalk bytecodes. Then we could compile many C, C++, Fortran and Java programs to the Squeak VM :-).
Gulik.
-- http://people.squeakfoundation.org/person/mikevdg http://gulik.pbwiki.com/
Have you looked at Moose and iPlasma (http://moose.unibe.ch/docs/faq/importWithiPlasma)? I only have a vague notion that this may be applicable.
-david
On Mon, Jun 30, 2008 at 4:10 AM, Peter William Lount peter@smalltalk.org wrote:
Hi,
Does anyone know of a working C++ parser written in Smalltalk? Just the front end would be fine, it doesn't have to generate any code. I just want to have a Smalltalk/Squeak program read the darn stuff so I can play with the information.
If not what would you suggest as an approach for having one with a minimum of effort?
Thanks in advance,
Peter
squeak-dev@lists.squeakfoundation.org