[squeak-dev] hydra mac vm update

John M McIntosh johnmci at smalltalkconsulting.com
Thu May 8 03:20:18 UTC 2008


On May 7, 2008, at 5:54 PM, Igor Stasenko wrote:

> John, can you commit your update to the SVN repository, so I can see
> what's going on and maybe be helpful?
> I don't have a Mac, so the only thing I can help with is code
> analysis :)

Likely tomorrow once I sort thru what I did.

However, this code is flawed: we end up with ioEnqueueEventInto looping
forever given just the right interactions between the main VM and the
secondary VM.

Perhaps someone can suggest something better, so I'll leave it as an
exercise for the reader to figure out why it doesn't work as desired.


Note that I just coded up an ioMutexLock(eventQueueLock), where
eventQueueLock is a non-recursive mutex, to work around the issue by fully
guarding all the code in ioEnqueueEventInto and ioDequeueEventFrom.
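
For illustration, a minimal sketch of that workaround, assuming a plain
pthread mutex stands in for the VM's ioMutexLock/eventQueueLock (the wrapper
names here are mine, not the actual Hydra entry points):

#include <pthread.h>

/* Sketch only: one non-recursive mutex guarding both queue operations. */
static pthread_mutex_t eventQueueLock = PTHREAD_MUTEX_INITIALIZER;

void guardedEnqueueEvent(struct vmEvent * event, struct vmEventQueue * queue)
{
	pthread_mutex_lock(&eventQueueLock);   /* stands in for ioMutexLock(eventQueueLock) */
	ioEnqueueEventInto(event, queue);
	pthread_mutex_unlock(&eventQueueLock);
}

struct vmEvent * guardedDequeueEvent(struct vmEventQueue * queue)
{
	struct vmEvent * event;
	pthread_mutex_lock(&eventQueueLock);
	event = ioDequeueEventFrom(queue);
	pthread_mutex_unlock(&eventQueueLock);
	return event;
}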

//  The OSAtomicCompareAndSwap() operations compare oldValue to *theValue,
//  and set *theValue to newValue if the comparison is equal.  The comparison
//  and assignment occur as one atomic operation.

bool    OSAtomicCompareAndSwap32Barrier(int32_t __oldValue, int32_t __newValue,
                                        volatile int32_t *__theValue);
int32_t AtomicCAS(volatile int32_t *__theValue, int32_t __newValue,
                  int32_t __oldValue);

int32_t AtomicCAS(volatile int32_t *__theValue, int32_t __newValue,
                  int32_t __oldValue)
{
	volatile int32_t old = *__theValue;
	OSAtomicCompareAndSwap32Barrier(__oldValue, __newValue, __theValue);
	return old;
}

/* from the OS X man page for OSAtomicCompareAndSwap32Barrier

      These functions are thread and multiprocessor safe.  For each function,
      there is a version that does and another that does not incorporate a
      memory barrier.  Barriers strictly order memory access on a
      weakly-ordered architecture such as PPC.  All loads and stores executed
      in sequential program order before the barrier will complete before any
      load or store executed after the barrier.  On a uniprocessor, the
      barrier operation is typically a nop.  On a multiprocessor, the barrier
      can be quite expensive.

      Most code will want to use the barrier functions to insure that memory
      shared between threads is properly synchronized.  For example, if you
      want to initialize a shared data structure and then atomically
      increment a variable to indicate that the initialization is complete,
      then you must use OSAtomicIncrement32Barrier() to ensure that the
      stores to your data structure complete before the atomic add.
      Likewise, the consumer of that data structure must use
      OSAtomicDecrement32Barrier(), in order to ensure that their loads of
      the structure are not executed before the atomic decrement.  On the
      other hand, if you are simply incrementing a global counter, then it is
      safe and potentially much faster to use OSAtomicIncrement32().  If you
      are unsure which version to use, prefer the barrier variants as they
      are safer.
*/
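
To make that concrete, a small sketch of the publish/consume pattern the man
page describes (assuming a single producer and a single consumer; the names
here are illustrative, not VM code):

#include <libkern/OSAtomic.h>

/* Sketch: the producer fills a shared record, then publishes it with the
   barrier variant so its stores complete before the flag becomes visible. */
static int32_t sharedValue;
static volatile int32_t published = 0;

void publishValue(int32_t v)
{
	sharedValue = v;
	OSAtomicIncrement32Barrier(&published);   /* barrier orders the store above */
}

/* The consumer uses the barrier variant too, so its load of sharedValue is
   not executed before the atomic decrement. */
int consumeValue(int32_t *out)
{
	if (published == 0)
		return 0;
	OSAtomicDecrement32Barrier(&published);
	*out = sharedValue;
	return 1;
}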

/*
on windows this is

#define AtomicCAS(value_ptr, new_value, comparand) \
	InterlockedCompareExchange(value_ptr, new_value, comparand)

where value_ptr, new_value and comparand correspond to the Destination,
Exchange and Comparand parameters in the MSDN documentation:

Destination
A pointer to the destination value. The sign is ignored.
Exchange
The exchange value. The sign is ignored.
Comparand
The value to compare to Destination. The sign is ignored.
Return Value
The function returns the initial value of the Destination parameter.
Remarks
The function compares the Destination value with the Comparand value.  
If the Destination value is equal to the Comparand value, the Exchange  
value is stored in the address specified by Destination. Otherwise, no  
operation is performed.
The parameters for this function must be aligned on a 32-bit boundary;  
otherwise, the function will behave unpredictably on multiprocessor  
x86 systems and any non-x86 systems.
The interlocked functions provide a simple mechanism for synchronizing  
access to a variable that is shared by multiple threads. This function  
is atomic with respect to calls to other interlocked functions.
This function is implemented using a compiler intrinsic where  
possible. For more information, see the header file and  
_InterlockedCompareExchange.
This function generates a full memory barrier (or fence) to ensure  
that memory operations are completed in order.
*/
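
For context, here is roughly what the queue structures must look like,
reconstructed from the usage below; the real Hydra declarations may differ,
and note that the CAS calls treat these pointer fields as int32_t values,
which only holds together on a 32-bit build:

struct vmEvent {
	struct vmEvent * next;   /* link to the next queued event, 0 at the end */
	/* ... event payload ... */
};

struct vmEventQueue {
	struct vmEvent head;     /* dummy node; head.next is the first real event */
	struct vmEvent * tail;   /* last enqueued node, or &queue->head when empty */
};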

void ioInitEventQueue(struct vmEventQueue * queue)
{
	queue->head.next = 0;
	queue->tail = &queue->head;
}

void ioEnqueueEventInto(struct vmEvent * event, struct vmEventQueue * queue)
{
	struct vmEvent * tail;
	struct vmEvent * old_next;

	/* add event to tail */
	event->next = 0;
	do {
		tail = queue->tail;
		/* try to link the new event after the current tail (tail->next: 0 -> event) */
		old_next = (struct vmEvent *)AtomicCAS(&tail->next, event, 0);
		if (old_next != 0)
		{
			/* tail was lagging behind; help swing it forward, then retry */
			AtomicCAS(&queue->tail, old_next, tail);
		}
	} while (old_next != 0);
	/* advertise the newly linked event as the tail */
	AtomicCAS(&queue->tail, event, tail);
}

struct vmEvent * ioDequeueEventFrom(struct vmEventQueue * queue)
{
	struct vmEvent * event;
	struct vmEvent * oldhead;

	do {
		event = queue->head.next;
		if (!event) {
			return 0;
		}
		/* try to unlink the first event (head.next: event -> event->next) */
		oldhead = (struct vmEvent *)AtomicCAS(&queue->head.next, event->next, event);
	} while (oldhead != event);

	/* To prevent queue damage when the queue tail points to the just
	   dequeued event, we should replace the tail with the head. */
	if (event->next == 0)
	{
		/* queue->head.next should be 0, put it as tail */
		if (0 == AtomicCAS(&event->next, &queue->head, 0))
		{
			AtomicCAS(&queue->tail, &queue->head, event);
		}
	}
	return event;
}
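
For completeness, roughly how the two sides are expected to drive the queue
(a sketch only; event allocation and the payload handling are placeholders,
not Hydra code):

/* Producer side, e.g. the main VM handing an event to the secondary VM. */
void exampleProduce(struct vmEventQueue * queue, struct vmEvent * event)
{
	ioEnqueueEventInto(event, queue);
}

/* Consumer side, e.g. the secondary VM draining its queue. */
void exampleConsume(struct vmEventQueue * queue)
{
	struct vmEvent * event;
	while ((event = ioDequeueEventFrom(queue)) != 0) {
		/* process the event, then free or recycle it */
	}
}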

--
========================================================================
John M. McIntosh <johnmci at smalltalkconsulting.com>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
========================================================================




