Date:      Thu, 24 Jan 2008 16:56:29 +0200
From:      Andriy Gapon <avg@icyb.net.ua>
To:        freebsd-geom@freebsd.org
Cc:        Pawel Jakub Dawidek <pjd@freebsd.org>
Subject:   Re: gjournal on 6.2: Cannot delete /var/.deleted/#613759
Message-ID:  <4798A71D.6090902@icyb.net.ua>
In-Reply-To: <474020BD.4030305@icyb.net.ua>
References:  <4732E3C6.5060205@icyb.net.ua> <47343AC5.8090103@icyb.net.ua> <6EBC07A8-054F-476A-8DF5-B54124CEB339@freebsd.org> <4735D203.8010109@icyb.net.ua> <474020BD.4030305@icyb.net.ua>

on 18/11/2007 13:23 Andriy Gapon said the following:
> on 10/11/2007 17:45 Andriy Gapon said the following:
>> on 09/11/2007 14:38 Eric Anderson said the following:
>>> When inodes are reused, their gen count should go up (or NFS handles  
>>> would get broken quickly).  The file is probably being removed in- 
>>> between the readdir and the remove.
>>>
>> Eric,
>> thank you for the reply and the hint. I will try to add i_gen to a name
>> that gets assigned to gjournal-managed files under .deleted and see how
>> that works.
>>
> 
> Tried and it didn't help. The following was obtained during jdk build:
> 
> kernel: UFS_GJGC: Cannot delete /var/.deleted/#1202150:230144382 (error=2)
> [some seconds later]
> $ find /var/ -inum 1202150
> /var/tmp/tmp/hsperfdata_root/72795
> kernel: UFS_GJGC: Cannot delete /var/.deleted/#1202150:230145003 (error=2)
> [some seconds later]
> $ find /var/ -inum 1202150
> /var/tmp/tmp/hsperfdata_root/81211
> ^^^^^^^^^^^^ - btw, my /tmp is symlink to /var/tmp/tmp
> 
> So even adding the generation count doesn't fix the issue. It seems
> that there is some other kind of race condition in the 6.x gjournal
> code.
> 
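
For reference, error=2 in the kernel messages above is ENOENT. The
following is only a userland sketch of the readdir-then-remove window
that Eric described; it is not the kernel's UFS_GJGC code, and the
names sweep() and between are made up for the illustration. Since
adding i_gen to the name did not help, this window may well not be the
whole story.

```python
import errno
import os


def sweep(directory, between=None):
    """List a directory, then unlink every entry that was listed.

    Returns the names whose unlink failed with ENOENT, i.e. entries
    that disappeared between the listing and the removal.
    `between` is a hypothetical hook that runs inside that window.
    """
    names = os.listdir(directory)      # the "readdir" step
    if between is not None:
        between()                      # someone else acts in the window
    vanished = []
    for name in names:
        try:
            os.unlink(os.path.join(directory, name))   # the "remove" step
        except OSError as e:
            if e.errno != errno.ENOENT:
                raise                  # anything else is a real error
            vanished.append(name)      # already gone: the error=2 case
    return vanished
```

If another actor removes one of the listed files inside the window,
sweep() reports that name instead of failing, which is the userland
analogue of the "Cannot delete ... (error=2)" message.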

BTW, some more data points.
I am running buildworld with various -jX options, with /tmp on a
partition managed by gjournal6. This is on an SMP machine with a
2-core CPU.
If I run buildworld without -j or with -j2, everything is OK; if I
use -j3 or higher, the build fails very soon with 'interrupted
system call', and messages like those quoted above appear in the
system log.
I find it curious that the largest number of make jobs that works is
the same as the number of logical CPUs.
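
In case someone wants to try this without a full buildworld, here is a
rough sketch of a smaller load generator. It assumes (and this is only
my guess) that the relevant workload is many workers creating files and
unlinking them while they are still open. The names churn() and run()
are invented for the sketch, and I have not verified that it actually
triggers the messages.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def churn(directory, iterations):
    """Create, unlink-while-open, write and close temp files in a loop.

    Returns the errno values of any OSError seen along the way.
    """
    errors = []
    for _ in range(iterations):
        try:
            fd, path = tempfile.mkstemp(dir=directory)
            try:
                os.unlink(path)        # unlink while the file is still open
                os.write(fd, b"x")
            finally:
                os.close(fd)
        except OSError as e:
            errors.append(e.errno)
    return errors


def run(directory, jobs, iterations=1000):
    """Run `jobs` concurrent churn() workers; return all errnos seen."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = pool.map(churn, [directory] * jobs, [iterations] * jobs)
        return [e for errs in results for e in errs]
```

Calling run("/tmp", jobs=3) would roughly correspond to the -j3 case,
although threads in one process are not the same thing as separate make
jobs, so the comparison is loose at best.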

-- 
Andriy Gapon


