Date:      Wed, 13 Dec 2006 19:09:15 +1100
From:      "Jan Mikkelsen" <janm@transactionware.com>
To:        "Scott Long" <scottl@samsco.org>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: g_vfs_done() failures on 6.2-RC1
Message-ID:  <00d401c71e8d$fb60de00$3301a8c0@janmxp>
References:  <00a601c71e7f$ed63f7a0$3301a8c0@janmxp> <457FAAFD.1080707@samsco.org>

Scott Long wrote:
> Jan Mikkelsen wrote:
>
>> - Daichi Goto's unionfs-p16 has been applied.
>> - The Areca driver is 1.20.00.12 from the Areca website.
>> - sym(4) patch (see PR/89550), but no sym controller present.
>> - SMP + FAST_IPSEC + SUIDDIR + device crypto.
>>
>> So:  I've seen this problem on a few machines under heavy I/O load, with 
>> ataraid and with arcmsr.  I've seen others report similar problems, but 
>> I've seen no resolution.  Does anyone have any idea what the problem is? 
>> Has anyone else seen similar problems?  Where to from here?
>>
>> Thanks,
>>
>
> You mention that you are using a driver from the Areca website.  Have
> you tried using the stock driver that comes with FreeBSD?  I don't know
> if it will be better or not, but I was planning on doing a refresh of
> the stock driver, and I'd hate to introduce instability that wasn't there 
> before.

I haven't run the stock driver recently.  I can roll back to it and see 
whether the problem recurs.  However, I can't reproduce the problem 
reliably, so a clean run probably won't prove that it is gone.

I mentioned that I have seen similar problems on machines with ataraid, like 
this:

DOH! ata_alloc_composite failed! (x5)
FAILURE - out of memory in ata_raid_init_request (x6)
g_vfs_done():ar0s3f[WRITE(offset=113324673024, length=2048)]error = 5
g_vfs_done():ar0s3f[WRITE(offset=113325062144, length=2048)]error = 5
g_vfs_done():ar0s3f[WRITE(offset=113325127680, length=2048)]error = 5
g_vfs_done():ar0s3f[WRITE(offset=113325242368, length=2048)]error = 5
g_vfs_done():ar0s3f[WRITE(offset=113325256704, length=2048)]error = 5
g_vfs_done():ar0s3f[WRITE(offset=113325275136, length=2048)]error = 5

However, looking at this again, I'm not sure that the problem is identical 
anymore because the offset seems to be within the partition rather than just 
plain wrong (assuming the units of the offset message are bytes).  These 
messages are from an HP DL145G1 with two SATA drives and ataraid.
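
As a rough way to test that reading of the log, here is a minimal Python 
sketch that parses the g_vfs_done() lines above and checks whether each 
request falls inside a partition of a given size.  It assumes the offset 
and length are in bytes, as guessed above.  The function names and the 
120 GiB partition size are hypothetical, chosen only for illustration; the 
real size of ar0s3f would have to come from the disklabel.

```python
import re

# Matches console lines of the form quoted in this message, e.g.
# g_vfs_done():ar0s3f[WRITE(offset=113324673024, length=2048)]error = 5
LINE_RE = re.compile(
    r"g_vfs_done\(\):(?P<dev>\w+)"
    r"\[(?P<op>READ|WRITE)\(offset=(?P<offset>-?\d+), length=(?P<length>\d+)\)\]"
    r"\s*error = (?P<errno>\d+)"
)


def parse_line(line):
    """Return the fields of one g_vfs_done() line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "dev": m.group("dev"),
        "op": m.group("op"),
        "offset": int(m.group("offset")),   # assumed to be bytes
        "length": int(m.group("length")),   # assumed to be bytes
        "errno": int(m.group("errno")),     # 5 is EIO on FreeBSD
    }


def in_partition(rec, partition_bytes):
    """True if the whole request lies within [0, partition_bytes)."""
    return rec["offset"] >= 0 and rec["offset"] + rec["length"] <= partition_bytes


if __name__ == "__main__":
    GIB = 2 ** 30
    partition_bytes = 120 * GIB  # hypothetical size, not taken from the machine
    sample = "g_vfs_done():ar0s3f[WRITE(offset=113324673024, length=2048)]error = 5"
    rec = parse_line(sample)
    print(rec["dev"], rec["op"], round(rec["offset"] / GIB, 2), "GiB",
          "inside" if in_partition(rec, partition_bytes) else "outside")
```

A negative offset, or one past the end of the partition, would point to a 
bad block number being generated; an in-range offset with error 5 looks 
more like the request itself failing lower down.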

The workload that caused these messages is very similar:  heavy I/O during 
multiple concurrent removes of deep trees on a filesystem with softupdates.  
The system needs a reboot to get back on track.

Thanks,

Jan.

PS:  Any news on importing the sym(4) patch?



