Hi Frank,


On Mon, Jan 13, 2014 at 11:20 PM, Frank Shearar <frank.shearar@gmail.com> wrote:
> On 13 January 2014 23:42, David T. Lewis <lewis@mail.msen.com> wrote:
> > On Tue, Jan 14, 2014 at 12:28:45AM +0100, Nicolas Cellier wrote:
> >> 2014/1/14 karl ramberg <karlramberg@gmail.com>
> >>
> >> > TraitsFileOutTest is failing
> >> >
> >> > Correct log attached this time
> >> >
> >> > Cheers,
> >> > Karl
> >> >
> >> > id:     #[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
> >> means the file was closed...
> >>
> >> I observed increasing failure rate related to randomly closed change log
> >> the last few months (particularly when loading/merging with MC), it would
> >> be good to identify the root cause...
> >>
> >
> > That sounds like exactly the problem that Frank was describing in another
> > thread. He was ending up with a corrupt changes file IIRC, and the problems
> > he described seemed to be related to primSize checks on a changes file that
> > had been closed (this would presumably be a process trying to append to the
> > changes file when some other process had closed the file stream).
>
> I'm relieved it's not just me, but sad that it's not just me
> experiencing this. So my understanding is that changes are stored by
> accessing SourceFiles at: 2. If you want a readonly copy of sources or
> changes, you use CurrentReadOnlySourceFiles at: 2. I suppose the first
> step is verifying that nothing referencing these _directly_ does
> things like closing files (I'd be tempted to look at CROSF first.)

Forgive me for not replying earlier, but I needed help to track this down. We suffered a very similar symptom a while ago at Cadence. It turned out that in our case the system was running out of file handles because it left lots of files open. The cause was in Monticello package loading, and the fix was to use CurrentReadOnlySourceFiles more. I'm attaching a changeset in the hope that this is the same issue and that our changeset can point you towards a fix.
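
Roughly, the pattern is to do reads through a shared read-only handle instead of opening the file again for each access. A minimal, untested sketch, not lifted from the changeset: it assumes your image has CurrentReadOnlySourceFiles class>>cacheDuring:, and that index 2 is the changes file as described above.

```smalltalk
"Sketch only. Assumption: CurrentReadOnlySourceFiles class>>cacheDuring: exists in this image."
| changes |
CurrentReadOnlySourceFiles cacheDuring: [
	"Read-only view of the changes file (index 2); this may be nil if the file is unavailable."
	changes := CurrentReadOnlySourceFiles at: 2.
	changes ifNotNil: [
		"Reads inside the block go through the same read-only handle."
		Transcript show: changes size printString; cr ] ]
```

The idea is that code reading many method sources within one such block would not keep opening new read-only copies, which is the kind of handle exhaustion we hit.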

HTH
--
best,
Eliot