Date:      Fri, 16 May 2008 17:42:06 +0200
From:      Willy Offermans <Willy@Offermans.Rompen.nl>
To:        Kris Kennaway <kris@FreeBSD.org>
Cc:        freebsd-stable@FreeBSD.org
Subject:   Re: g_vfs_done error third part--PLEASE HELP!
Message-ID:  <20080516154206.GB9388@wiz.vpn.offrom.nl>
In-Reply-To: <482D816C.4060402@FreeBSD.org>
References:  <20080421190403.GA4625@wiz.vpn.offrom.nl> <20080421201047.GB6884@slackbox.xs4all.nl> <20080516121414.GD4618@wiz.vpn.offrom.nl> <482D816C.4060402@FreeBSD.org>

Hello Kris,

On Fri, May 16, 2008 at 02:43:24PM +0200, Kris Kennaway wrote:
> Willy Offermans wrote:
> >Hello Roland and FreeBSD friends,
> >
> >I'm sorry to have been so quiet for a while; I was away on vacation.
> >Now that I'm back, I would like to solve this issue.
> >
> >
> >On Mon, Apr 21, 2008 at 10:10:47PM +0200, Roland Smith wrote:
> >>On Mon, Apr 21, 2008 at 09:04:03PM +0200, Willy Offermans wrote:
> >>>Dear FreeBSD friends,
> >>>
> >>>It is already the third time that I report this error. Can someone help
> >>>me in solving this issue?
> >>Probably the reason that you hear so little is that you provide so
> >>little information. Most of us are not clairvoyant.
> >> 
> >>>Over and over again, and always after heavy disk I/O, I see the
> >>>following errors in the log files. If I force ar0s1g to unmount, the
> >>>machine spontaneously reboots. Nothing seems to be seriously damaged
> >>>by this, but I cannot afford anything bad happening to this
> >>>production machine.
> >>Why would you force an unmount?
> >
> >Otherwise the device keeps on reporting to be unavailable and cannot be
> >unmounted:
> >
> >sun# umount /share/
> >umount: unmount of /share failed: Resource temporarily unavailable
> >
> >>>Apr 18 20:02:19 sun kernel: g_vfs_done():ar0s1g[WRITE(offset=290725068800, length=4096)]error = 5
> >>>
> >>>I have no clue what the errors mean, since offsets of 290725068800,
> >>>290725072896, and 290725074944 seem to be ridiculous. Does anybody 
> >>>have a clue what is going on?
> >>For starters, how big is ar0s1g? If the offset is in bytes, it is around
> >>270 GB, which is not that unusual in this day and age.
> >
> >I have to admit that I was a bit confused by an offset value of 
> >290725068800. There is no indication of a unit, so I assumed that it
> >was in sectors, but it is probably simply bytes, in which case the
> >number does make sense.
> >>>I'm using FreeBSD 7.0, but found the error being reported before with
> >>>previous versions of FreeBSD. I can and will provide more details on
> >>>demand.
> >>What does 'df' say?
> >
> >Filesystem  1K-blocks     Used     Avail Capacity  Mounted on
> >/dev/ar0s1a  20308398   230438  18453290     1%    /
> >devfs               1        1         0   100%    /dev
> >/dev/ar0s1d  21321454  3814482  15801256    19%    /usr
> >/dev/ar0s1e  50777034  5331686  41383186    11%    /var
> >/dev/ar0s1f 101554150 18813760  74616058    20%    /home
> >/dev/ar0s1g 274977824 34564876 218414724    14%    /share
> >
> >pretty normal I would say.
> >
> >>Did you notice any file corruption in the filesystem on ar0s1g?
> >
> >No, the two disks are brand new and I did not encounter any noticeable
> >file corruption. However, I assume that nowadays bad sectors on a hard
> >disk are handled by the hardware and do not need any user interaction
> >to correct. But maybe I'm totally wrong.
> >
> >>Unmount the filesystem and run fsck(8) on it. Does it report any errors?
> >
> >sun# fsck /dev/ar0s1g 
> >** /dev/ar0s1g
> >** Last Mounted on /share
> >** Phase 1 - Check Blocks and Sizes
> >INCORRECT BLOCK COUNT I=34788357 (272 should be 264)
> >CORRECT? [yn] y
> >
> >INCORRECT BLOCK COUNT I=34789217 (296 should be 288)
> >CORRECT? [yn] y
> >
> >** Phase 2 - Check Pathnames
> >** Phase 3 - Check Connectivity
> >** Phase 4 - Check Reference Counts
> >** Phase 5 - Check Cyl groups
> >FREE BLK COUNT(S) WRONG IN SUPERBLK
> >SALVAGE? [yn] y
> >
> >SUMMARY INFORMATION BAD
> >SALVAGE? [yn] y
> >
> >BLK(S) MISSING IN BIT MAPS
> >SALVAGE? [yn] y
> >
> >182863 files, 17282440 used, 120206472 free (12448 frags, 15024253 blocks, 0.0% fragmentation)
> >
> >***** FILE SYSTEM MARKED CLEAN *****
> >
> >***** FILE SYSTEM WAS MODIFIED *****
> >
> >The usual stuff I would say.
> 
> No, any form of filesystem corruption is not usual.
> 
> >
> >>>Any hints are very much appreciated.
> >>Did you manage to create a partition larger than the disk is (using
> >>newfs's -s switch)? In that case it could be that you're trying to write
> >>past the end of the device.
> >
> >No, look at the following output:
> >
> >sun# bsdlabel -A /dev/ar0s1
> ># /dev/ar0s1:
> >type: unknown
> >disk: amnesiac
> >label: 
> >flags:
> >bytes/sector: 512
> >sectors/track: 63
> >tracks/cylinder: 255
> >sectors/cylinder: 16065
> >cylinders: 60799
> >sectors/unit: 976751937
> >rpm: 3600
> >interleave: 1
> >trackskew: 0
> >cylinderskew: 0
> >headswitch: 0           # milliseconds
> >track-to-track seek: 0  # milliseconds
> >drivedata: 0 
> >
> >8 partitions:
> >#        size   offset    fstype   [fsize bsize bps/cpg]
> >  a: 41943040        0    4.2BSD        0     0     0 
> >  b:  8388608 41943040      swap                    
> >  c: 976751937        0    unused        0     0         # "raw" part, don't edit
> >  d: 44040192 50331648    4.2BSD     2048 16384 28552 
> >  e: 104857600 94371840    4.2BSD     2048 16384 28552 
> >  f: 209715200 199229440    4.2BSD     2048 16384 28552 
> >  g: 567807297 408944640    4.2BSD     2048 16384 28552 
> >
> >/dev/ar0s1g starts after 408944640*512/1024/1024=199680MB
> >
> >
> >So I have to conclude that the write error message does make sense and
> >that something seems to be wrong with the disks. The next question is
> >what can I do about it? Should I return the disks to the shop and ask
> >for new ones?
> 
> #define EIO             5               /* Input/output error */
> 
> At least one of your disks is toast.
> 
> Kris
> 

I doubt it, but you could be right. Do you have any suggestions for
investigating this further?

Since this is a production machine I have to operate extremely
carefully. I will transfer the data to other disks, then reboot
and run the system from those disks. After that I will reformat the
original disks and restore the data. Let's see what happens.
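
For anyone who wants to double-check the numbers quoted above, here is a
minimal Python sketch. It rests on two assumptions that are not confirmed
anywhere in this thread: that the offset in the g_vfs_done() message is in
bytes counted from the start of ar0s1g, and that the message always has the
layout of the single line quoted above (the regular expression and the
helper name parse_g_vfs_done are made up for illustration). The sector
counts are copied from the bsdlabel output. Under those assumptions the
reported offset is about 270.76 GiB, and it lies 7732736 bytes (15103
sectors) beyond the 567807297-sector size of partition g, which may be
worth comparing with Roland's question about writing past the end of the
device.

```python
import errno
import re

SECTOR_BYTES = 512           # "bytes/sector: 512" in the bsdlabel output
G_START_SECTORS = 408944640  # offset of partition g within ar0s1
G_SIZE_SECTORS = 567807297   # size of partition g

# Layout assumed from the one kernel message quoted in this thread.
LOG_RE = re.compile(
    r"g_vfs_done\(\):(?P<dev>\w+)\[(?P<op>\w+)"
    r"\(offset=(?P<offset>\d+), length=(?P<length>\d+)\)\]"
    r"error = (?P<error>\d+)"
)


def parse_g_vfs_done(line):
    """Return the fields of a g_vfs_done() message, or None if no match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    return {
        "dev": m.group("dev"),
        "op": m.group("op"),
        "offset": int(m.group("offset")),
        "length": int(m.group("length")),
        "error": int(m.group("error")),
    }


line = ("Apr 18 20:02:19 sun kernel: "
        "g_vfs_done():ar0s1g[WRITE(offset=290725068800, length=4096)]error = 5")
rec = parse_g_vfs_done(line)

# Symbolic name of the error number (5 is EIO, as in Kris's #define).
err_name = errno.errorcode.get(rec["error"], "unknown")

# Where partition g starts inside ar0s1, in MiB (same formula as above).
start_mib = G_START_SECTORS * SECTOR_BYTES // (1024 * 1024)

# Size of partition g in bytes, and how far the offset lies beyond it
# (negative would mean the write starts inside the partition).
part_bytes = G_SIZE_SECTORS * SECTOR_BYTES
past_end = rec["offset"] - part_bytes

print("device:", rec["dev"], "operation:", rec["op"], "error:", err_name)
print("offset in GiB:", round(rec["offset"] / 2**30, 2))
print("partition g starts at MiB:", start_mib)
print("partition g size in bytes:", part_bytes)
print("offset minus partition size:", past_end, "bytes =",
      past_end // SECTOR_BYTES, "sectors")
```

If the offset is instead measured from some other origin, the last line of
output means nothing, so treat it as a hint for further investigation and
not as a diagnosis.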

-- 
Met vriendelijke groeten,
With kind regards,
Mit freundlichen Gruessen,
De jrus wah,

Willy

*************************************
W.K. Offermans
Home:   +31 45 544 49 44
Mobile: +31 653 27 16 23
e-mail: Willy@Offermans.Rompen.nl

                                       Powered by ....

                                            (__)
                                         \\\'',)
                                           \/  \ ^
                                           .\._/_)

                                       www.FreeBSD.org


