Message-ID: <4105FCC9.4000407@elischer.org>
Date: Mon, 26 Jul 2004 23:57:13 -0700
From: Julian Elischer
To: fs@freebsd.org
Subject: anyone know a filesystem corruption that can do this?

This may be hardware, but it acts a lot like persistent bad data somewhere.

Spot the (not so) deliberate error (in the buf)!

One system has been falling over, and this is related to it. Is there any way a filesystem can be corrupted so that this can happen? The resid and b_bcount for this request are bigger than the buffer size! The write fails because of dscheck, but the system survives. However, when the file in question is read back from memory, it goes into an infinite loop on this buf, hanging the system: uiomove moves 0 bytes once the bulk of the buffer has been moved, but the resid is still non-zero, so it just cycles forever. (I can get into the debugger, so a trace of that is available too.)

This is a trace of the original write.
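As an aside before the trace, the hang described above can be modeled in a few lines of C. This is only a sketch under stated assumptions, not the 4.8 kernel code: `struct fake_buf`, `drain_buf`, and the `chunk` parameter are hypothetical names made up for illustration, and the loop merely imitates a reader that keeps copying until its residual count reaches zero while each copy is clamped to the bytes actually backing the buffer.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two struct buf fields that matter here. */
struct fake_buf {
    size_t bufsize;  /* bytes actually backing the buffer (cf. b_bufsize) */
    size_t bcount;   /* bytes the request claims to hold (cf. b_bcount) */
};

/*
 * Model of a read loop that copies out of a buffer until resid reaches 0.
 * Each pass moves at most `chunk` bytes, clamped to what is left in the
 * buffer. Returns the number of passes, or -1 when a pass would move
 * 0 bytes while resid is still nonzero. An unguarded loop would spin
 * forever at that point.
 */
static long
drain_buf(const struct fake_buf *bp, size_t chunk)
{
    size_t off = 0;
    size_t resid = bp->bcount;
    long passes = 0;

    while (resid > 0) {
        size_t left = bp->bufsize > off ? bp->bufsize - off : 0;
        size_t n = resid < chunk ? resid : chunk;

        if (n > left)
            n = left;
        if (n == 0)
            return (-1);    /* zero-byte move with resid != 0 */
        off += n;
        resid -= n;
        passes++;
    }
    return (passes);
}
```

With a consistent buffer (`bufsize` and `bcount` both 0x4000) and a 0x1000-byte chunk, the model finishes in four passes. With the values from this report (`bufsize` 0x4000, `bcount` 0x4020), it moves 0x4000 bytes, is left with a resid of 0x20 and nothing more to copy, and returns -1 where an unguarded loop would cycle forever.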
list .
(gdb) list
278                     }
279             }
280             return (1);
281
282     bad_bcount:
283             printf(
284             "dscheck(%s): b_bcount %ld is not on a sector boundary (ssize %d)\n",
285                 devtoname(bp->b_dev), bp->b_bcount, ssp->dss_secsize);
286             bp->b_error = EINVAL;
287             goto bad;

b_bufsize = 0x4000, b_bcount = 0x4020 (!)

[debugger]
#7  0xc01b48c2 in dscheck (bp=0xcc9767dc, ssp=0xc235b000)
    at /usr/prod/system/VERS_4_8_BRANCH/src/sys/kern/subr_diskslice.c:283
#8  0xc01b4379 in diskstrategy (bp=0xcc9767dc)
    at /usr/prod/system/VERS_4_8_BRANCH/src/sys/kern/subr_disk.c:246
#9  0xc01e3b78 in spec_strategy (ap=0xd6a90ec0)
    at /usr/prod/system/VERS_4_8_BRANCH/src/sys/miscfs/specfs/spec_vnops.c:479
#10 0xc01e35a1 in spec_vnoperate (ap=0xd6a90ec0)
    at /usr/prod/system/VERS_4_8_BRANCH/src/sys/miscfs/specfs/spec_vnops.c:119
#11 0xc0272ded in ufs_vnoperatespec (ap=0xd6a90ec0)
    at /usr/prod/system/VERS_4_8_BRANCH/src/sys/ufs/ufs/ufs_vnops.c:2394
#12 0xc02726ed in ufs_strategy (ap=0xd6a90f04) at vnode_if.h:944
#13 0xc0272dbd in ufs_vnoperate (ap=0xd6a90f04)
    at /usr/prod/system/VERS_4_8_BRANCH/src/sys/ufs/ufs/ufs_vnops.c:2376
#14 0xc01d045a in bwrite (bp=0xcc9767dc) at vnode_if.h:944
#15 0xc01d5eb6 in vop_stdbwrite (ap=0xd6a90f68)
    at /usr/prod/system/VERS_4_8_BRANCH/src/sys/kern/vfs_default.c:344
#16 0xc01d5d01 in vop_defaultop (ap=0xd6a90f68)
    at /usr/prod/system/VERS_4_8_BRANCH/src/sys/kern/vfs_default.c:152
#17 0xc0272dbd in ufs_vnoperate (ap=0xd6a90f68)
    at /usr/prod/system/VERS_4_8_BRANCH/src/sys/ufs/ufs/ufs_vnops.c:2376
#18 0xc01d13cb in vfs_bio_awrite (bp=0xcc9767dc) at vnode_if.h:1193
#19 0xc01d1ae2 in flushbufqueues ()
    at /usr/prod/system/VERS_4_8_BRANCH/src/sys/kern/vfs_bio.c:1930
#20 0xc01d1979 in buf_daemon ()
    at /usr/prod/system/VERS_4_8_BRANCH/src/sys/kern/vfs_bio.c:1855

(gdb) p *bp
$6 = {
  b_hash = {
    le_next = 0x0,
    le_prev = 0xcc9376e4
  },
  b_vnbufs = {
    tqe_next = 0x0,
    tqe_prev = 0xcc9f9274
  },
  b_freelist = {
    tqe_next = 0xcc9d8a60,
    tqe_prev = 0xc036c8b8
  },
  b_act = {
    tqe_next = 0x0,
    tqe_prev = 0xc2324000
  },
  b_flags = 0x21021024,
  b_qindex = 0x0,
  b_xflags = 0x2,
  b_lock = {
    lk_interlock = {
      lock_data = 0x0
    },
    lk_flags = 0x400,
    lk_sharecount = 0x0,
    lk_waitcount = 0x0,
    lk_exclusivecount = 0x1,
    lk_prio = 0x14,
    lk_wmesg = 0xc0326390 "bufwait",
    lk_timo = 0x0,
    lk_lockholder = 0xfffffffe
  },
  b_error = 0x16,
  b_bufsize = 0x4000,
  b_runningbufspace = 0x4000,
  b_bcount = 0x4020,
  b_resid = 0x4020,
  b_dev = 0xc2394b80,
  b_data = 0xceadf000 "Ã\236=}h\002\212^M\022\210ä\201\213\016\001\035ê[He̳J0Òvô«\177ÏÒ\233סã×Ö\200)XE\233I#\224`\026=i\226\021K\025Ü¢A\222",
  b_kvabase = 0xceadf000 "Ã\236=}h\002\212^M\022\210ä\201\213\016\001\035ê[He̳J0Òvô«\177ÏÒ\233סã×Ö\200)XE\233I#\224`\026=i\226\021K\025Ü¢A\222",
  b_kvasize = 0x4000,
  b_lblkno = 0xf0c,
  b_blkno = 0x2a91e3a0,
  b_offset = 0x3c30000,
  b_iodone = 0,
  b_iodone_chain = 0x0,
  b_vp = 0xd7766600,
  b_dirtyoff = 0x0,
  b_dirtyend = 0x0,
  b_rcred = 0x0,
  b_wcred = 0x0,
  b_pblkno = 0x1f7fff7f,
  b_saveaddr = 0x0,
  b_driver1 = 0x0,
  b_driver2 = 0x0,
  b_caller1 = 0x0,
  b_caller2 = 0x0,
  b_pager = {
    pg_spc = 0x0,
    pg_reqpage = 0x0
  },
  b_cluster = { [...]