Date:      Mon, 16 Oct 2000 11:47:23 +0200
From:      Andre Albsmeier <andre.albsmeier@mchp.siemens.de>
To:        dbhague@allstor-sw.co.uk
Cc:        Andre Albsmeier <andre.albsmeier@mchp.siemens.de>, freebsd-scsi@FreeBSD.org, freebsd-fs@FreeBSD.org, smcintyre@allstor-sw.co.uk
Subject:   Re: Stressed SCSI subsystem locks up the system
Message-ID:  <20001016114723.A22193@curry.mchp.siemens.de>
In-Reply-To: <8025697A.00340E6C.00@mail.plasmon.co.uk>; from dbhague@allstor-sw.co.uk on Mon, Oct 16, 2000 at 10:28:34AM +0100
References:  <8025697A.00340E6C.00@mail.plasmon.co.uk>

On Mon, 16-Oct-2000 at 10:28:34 +0100, dbhague@allstor-sw.co.uk wrote:
> Andre,
> What were your SCSI errors ?

Oct 13 10:05:28 <kern.crit> server /ktry: (da2:ahc0:0:2:0): data overrun detected in Data-out phase. +Tag == 0xe.
Oct 13 10:09:40 <kern.crit> server /ktry: (da2:ahc0:0:2:0): Have seen Data Phase.  Length = 65536. +NumSGs = 16.

These appeared with the 3940AU. After replacing it with two 2940s, everything
has worked fine for several days. I am keeping it this way for now. When
Justin makes some driver changes, I will try my 3940AU again...

	-Andre


> 
> We have one system that has now run for five days without failure. Today we
> will start to deconstruct this unit; any advice would be welcome.
> 
> We also ran five systems over the weekend, and all but one, the IDE system,
> failed.
> These were:
>    A repeat of the passing system above, failed with
>      Bad blocks 135666304, inode 5142534
>      6 seconds later, Bad blocks 135666304, inode 5634466
>      then, panic ffs_blkfree: freeing free frag, this is on the /RAID partition.
>    Test run against an IDE  disk, still running but slowly
>    Test run against a SCSI disk
>    Test run using a Symbios dual SCSI card,
>    Test running FreeBSD 3.0
> 
> Two of the above tests have got stuck in iowait, for example:
> root       451  0.0  0.1   368  172  p0  D    Fri06PM   0:17.77 rm -rf /RAID/5
> root       454  0.0  0.2   368  196  p0  D    Fri06PM   0:17.85 rm -rf /RAID/7
> root       455  0.0  0.2   368  196  p0  D    Fri06PM   0:17.42 rm -rf /RAID/1
> root       457  0.0  0.2   368  196  p0  D    Fri06PM   0:17.44 rm -rf /RAID/2
> root       459  0.0  0.2   368  196  p0  D    Fri06PM   0:17.71 rm -rf /RAID/6
> root       461  0.0  0.2   368  196  p0  D    Fri06PM   0:17.10 rm -rf /RAID/4
> root       463  0.0  0.2   368  196  p0  D    Fri06PM   0:17.56 rm -rf /RAID/3
> 
> Just a few minutes ago cron started to die with a signal 10, we don't think this
> is relevant but...
> Oct 16 09:55:02 birch /kernel: pid 3551 (cron), uid 0: exited on signal 10 (core dumped)
> Oct 16 10:00:00 birch /kernel: pid 3555 (cron), uid 0: exited on signal 10 (core dumped)
> Oct 16 10:00:00 birch /kernel: pid 3556 (cron), uid 0: exited on signal 10 (core dumped)
> Oct 16 10:05:01 birch /kernel: pid 3558 (cron), uid 0: exited on signal 10 (core dumped)
> Oct 16 10:10:00 birch /kernel: pid 3560 (cron), uid 0: exited on signal 10 (core dumped)
> Oct 16 10:15:00 birch /kernel: pid 3562 (cron), uid 0: exited on signal 10 (core dumped)
> Oct 16 10:20:00 birch /kernel: pid 3564 (cron), uid 0: exited on signal 10 (core dumped)
> 
> Regards Dave
> 

-- 
Micro$oft: Which virus will you get today?

