Date:      Sat, 8 Mar 2008 18:49:32 -0800
From:      Jeremy Chadwick <koitsu@freebsd.org>
To:        Eilko Bos <eilko@bos-zuidema.nl>
Cc:        Remco van Bekkum <remco@spacemarines.us>, Joe Peterson <joe@skyrush.com>, Nikolaj Farrell <nixx@freebsd.se>, freebsd-stable@freebsd.org, Michael Haro <mharo@FreeBSD.org>
Subject:   Re: ad8: TIMEOUT - WRITE_DMA errors UFS 7.0-RC1
Message-ID:  <20080309024932.GB92566@eos.sc1.parodius.com>
In-Reply-To: <20080308235049.GA74522@webmail.home.brasapen.org>
References:  <479BAC09.7040505@freebsd.se> <20080126223750.GA8397@marshal.spacemarines.us> <479BC21D.10607@skyrush.com> <4cd036390801270001u72363b72v84231956b173bf73@mail.gmail.com> <20080308235049.GA74522@webmail.home.brasapen.org>

On Sun, Mar 09, 2008 at 12:50:49AM +0100, Eilko Bos wrote:
> From the keyboard of Michael Haro, written on Sun, Jan 27, 2008 at 12:01:03AM -0800:
> > > Can anyone else using 7.0 who hasn't already (especially those using ZFS)
> > > check his/her /var/log/messages for disk TIMEOUTs or other disk error
> > > messages?  If this is widespread, I think the chances are slim that it is a
> > > hardware problem in every case.
> > 
> > I've had this problem with Hitachi SATA drives using a Promise SATA controller.
> 
> I am using two 160GB Maxtor disks in geom_mirror. With 6.3 it runs fine. I
> upgraded to 7.0-RELEASE and after the install, problems started. Disk TIMEOUTs
> froze the box as soon as I initiated a lot of disk activity (e.g. make
> buildworld or building a kernel).
> 
> I 'downgraded' the box to 6.3 again (had to rebuild the mirror because it was
> touched by a newer gmirror) and now the problems have gone again. I have the
> strong impression it is not hardware but rather 7.0-RELEASE related.
> Actually I want to get rid of the box at home (want to carry it to a datacenter)
> but if it can be helpful I am willing to have it for another week or two at
> home to upgrade/downgrade/etc. with it.


You should consider posting the SMART stats for both /dev/ad0 and
/dev/ad1 (you didn't say which disk is showing problems; I assume ad0,
though).  You can get these by installing ports/sysutils/smartmontools
and running "smartctl -a /dev/ad0" (and the same for ad1).

This can help narrow down the problem.  Confirming it's *not* a disk
issue is helpful.
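
A minimal sketch of gathering that information follows.  It assumes
smartmontools is installed, that the disks are ad0 and ad1, and that
kernel messages end up in /var/log/messages.  The function names and the
smart-<disk>.txt output files are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names and output file names are illustrative.

# Print log lines that report ATA timeouts, e.g. "ad8: TIMEOUT - WRITE_DMA".
# $1 is the log file to scan (normally /var/log/messages).
disk_timeouts() {
    grep -E 'ad[0-9]+: TIMEOUT' "$1"
}

# Save the full SMART report for each named disk to smart-<disk>.txt.
# smartctl can exit non-zero even when it prints a report, so the loop
# does not stop on a non-zero status.
collect_smart() {
    for disk in "$@"; do
        smartctl -a "/dev/${disk}" > "smart-${disk}.txt" 2>&1
    done
}
```

Typical use would be "disk_timeouts /var/log/messages" to see which
device is logging timeouts, then "collect_smart ad0 ad1" (as root) to
produce the two reports to post.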

More importantly: if the problem is easily reproducible for you under
RELENG_7, you should contact Scott Long, who has offered to help track
this problem down.

-- 
| Jeremy Chadwick                                    jdc at parodius.com |
| Parodius Networking                           http://www.parodius.com/ |
| UNIX Systems Administrator                      Mountain View, CA, USA |
| Making life hard for others since 1977.                  PGP: 4BD6C0CB |



