From owner-freebsd-fs@FreeBSD.ORG Sun Sep 25 08:35:56 2011
Date: Sun, 25 Sep 2011 01:06:05 -0700
From: Garrett Cooper
To: current@freebsd.org, freebsd-fs@freebsd.org
Cc: Xin LI
Subject: Need to force sync(2) before umounting UFS1 filesystems?
X-BeenThere: freebsd-fs@freebsd.org
List-Id: Filesystems

Hi,
    I've been doing builds with FreeNAS on 9.x-BETA2 machines recently,
and I've noticed that I need to add two 'sync' calls before each umount
command in the nanobsd scripts when running repeated builds; otherwise
umount fails with:

umount: unmount of /scratch/freenas/obj.amd64/_.mnt failed: Device busy

    Running fstat -f in the sh EXIT trap doesn't reveal anything
helpful:

USER  CMD    PID    FD  MOUNT                             INUM  MODE        SZ|DV  R/W
root  fstat  79637  wd  /scratch/freenas/obj.amd64/_.mnt  2     drwxr-xr-x  512    r

    Talking to Xin yesterday, he was convinced that this was a
filesystem/kern bug. Before I file a PR, I'm wondering if anyone else
has seen this issue.
Thanks!
-Garrett

PS I've seen the above behavior on the following systems:

FreeBSD bayonetta.local 9.0-BETA2 FreeBSD 9.0-BETA2 #0 r225653M:
    Tue Sep 20 08:36:49 PDT 2011
    gcooper@bayonetta.local:/usr/obj/usr/src/sys/BAYONETTA amd64
FreeBSD burnout.ixsystems.com 9.0-BETA1 FreeBSD 9.0-BETA1 #0 r224989:
    Sun Aug 21 14:12:11 PDT 2011
    gcooper@burnout.ixsystems.com:/usr/obj/usr/src/sys/BURNOUT amd64
FreeBSD fallout.local 9.0-BETA2 FreeBSD 9.0-BETA2 #10 r225587M:
    Thu Sep 15 09:07:08 PDT 2011
    root@fallout.local:/usr/obj/usr/src/sys/FALLOUT amd64
FreeBSD streetfighter.ixsystems.com 9.0-BETA2 FreeBSD 9.0-BETA2 #0 r225558:
    Wed Sep 14 20:29:45 PDT 2011
    gcooper@streetfighter.ixsystems.com:/usr/obj/usr/src/sys/STREETFIGHTER amd64

From owner-freebsd-fs@FreeBSD.ORG Sun Sep 25 09:50:11 2011
Date: Sun, 25 Sep 2011 13:50:03 +0400
From: Lev Serebryakov
Reply-To: lev@FreeBSD.org
Message-ID: <1376234334.20110925135003@serebryakov.spb.ru>
To: Garrett Cooper
Cc: freebsd-fs@freebsd.org, Xin LI, current@freebsd.org
Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?

Hello, Garrett.
You wrote on 25 September 2011, 12:06:05:

>     Talking to Xin yesterday, he was convinced that this was a
> filesystem/kern bug. Before I file a PR, I'm wondering if anyone else
> has seen this issue.

  Yes, and I posted a message about it to embedded@ (Message-ID
<1175277342.20110821215629@serebryakov.spb.ru>). I got a follow-up
question from Warner Losh about the base (underlying) file system, but
no further reaction.
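
The workaround discussed in this thread (calling sync before each umount) can be sketched as a small retry helper. This is only an illustration under stated assumptions, not the nanobsd code: the function name umount_with_sync, the retry count, and the delay are hypothetical, and the helper merely papers over the "Device busy" failure rather than addressing whatever causes it.

```python
import subprocess
import time


def run_command(argv):
    """Run a command and return its exit status (0 means success)."""
    return subprocess.run(argv).returncode


def umount_with_sync(mountpoint, runner=run_command, syncs=2,
                     retries=3, delay=1.0, sleep=time.sleep):
    """Call sync(8) `syncs` times, then try umount(8); retry on failure.

    `runner` and `sleep` are injectable so the logic can be exercised
    without touching a real mount point. All defaults are assumptions
    chosen for illustration.
    """
    for attempt in range(retries):
        for _ in range(syncs):
            runner(["sync"])
        if runner(["umount", mountpoint]) == 0:
            return True
        # Pause only when another attempt will follow.
        if attempt + 1 < retries:
            sleep(delay)
    return False
```

A build script could call umount_with_sync("/scratch/freenas/obj.amd64/_.mnt") from its cleanup path and treat a False result as a hard failure.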

-- 
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Sun Sep 25 10:14:56 2011
Date: Sun, 25 Sep 2011 17:44:37 +0800
From: Adrian Chadd
To: Garrett Cooper
Cc: freebsd-fs@freebsd.org, Xin LI, current@freebsd.org
Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?

Now that you mention it, yes.

adrian

From owner-freebsd-fs@FreeBSD.ORG Sun Sep 25 15:51:34 2011
Message-ID: <4E7F49A7.1020909@platinum.linux.pl>
Date: Sun, 25 Sep 2011 17:32:55 +0200
From: Adam Nowacki
To: freebsd-fs@freebsd.org
Subject: ZFS and 3ware controller resets

I have a 20 disk storage system, every now and then a disk dies and
causes 3ware controller to reset because of disk timeouts. This cuts
out ZFS from all disks, even healthy ones and the system requires a
hard reset.
Two issues here:
1) Why the controller has to reset? That's a completely insane way of
dealing with drive timeout.
2) ZFS not reopening the disk after controller reset.

FreeBSD version: 8.1-RELEASE-p1

/c0 Driver Version = 3.80.06.003
/c0 Model = 9650SE-16ML
/c0 Available Memory = 224MB
/c0 Firmware Version = FE9X 4.10.00.007
/c0 Bios Version = BE9X 4.08.00.002
/c0 Boot Loader Version = BL9X 3.08.00.001

  pool: zp2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zp2         ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            da1p1   ONLINE       0     0     0
            da2p1   ONLINE       0     0     0
            da3p1   ONLINE       0     0     0
            da4p1   ONLINE       0     0     0
            da5p1   ONLINE       0     0     0
            da6p1   ONLINE       0     0     0
            da7p1   ONLINE       0     0     0
            da9p1   ONLINE       0     0     0
            da8p1   ONLINE       0     0     0
            da10p1  ONLINE       0     0     0


Then when a disk starts misbehaving:


twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
(da3:twa0:0:3:0): READ(10). CDB: 28 0 a3 f4 e7 60 0 0 8 0
(da3:twa0:0:3:0): CAM status: SCSI Status Error
(da3:twa0:0:3:0): SCSI status: Check Condition
(da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
(da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0
(da3:twa0:0:3:0): CAM status: SCSI Status Error
(da3:twa0:0:3:0): SCSI status: Check Condition
(da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
(da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0
(da3:twa0:0:3:0): CAM status: SCSI Status Error
(da3:twa0:0:3:0): SCSI status: Check Condition
(da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
(da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0
(da3:twa0:0:3:0): CAM status: SCSI Status Error
(da3:twa0:0:3:0): SCSI status: Check Condition
(da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
(da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0
(da3:twa0:0:3:0): CAM status: SCSI Status Error
(da3:twa0:0:3:0): SCSI status: Check Condition
(da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
(da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0
(da3:twa0:0:3:0): CAM status: SCSI Status Error
(da3:twa0:0:3:0): SCSI status: Check Condition
(da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
(da3:twa0:0:3:0): READ(10). CDB: 28 0 cb 7c 43 b8 0 0 10 0
(da3:twa0:0:3:0): CAM status: SCSI Status Error
(da3:twa0:0:3:0): SCSI status: Check Condition
(da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
(da3:twa0:0:3:0): READ(10). CDB: 28 0 ce e5 ca 30 0 0 20 0
(da3:twa0:0:3:0): CAM status: SCSI Status Error
(da3:twa0:0:3:0): SCSI status: Check Condition
(da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
(da3:twa0:0:3:0): READ(10). CDB: 28 0 a4 2d 2d f8 0 0 8 0
(da3:twa0:0:3:0): CAM status: SCSI Status Error
(da3:twa0:0:3:0): SCSI status: Check Condition
(da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
(da3:twa0:0:3:0): READ(10). CDB: 28 0 cb 91 7c f8 0 0 20 0
(da3:twa0:0:3:0): CAM status: SCSI Status Error
(da3:twa0:0:3:0): SCSI status: Check Condition
(da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
twa0: Request 72 timed out!
twa0: INFO: (0x16: 0x1108): Resetting controller...:
twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0
twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=3
twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1
twa0: [ITHREAD]
(da1:twa0:0:1:0): lost device
(da2:twa0:0:2:0): lost device
(da3:twa0:0:3:0): lost device
(da4:twa0:0:4:0): lost device
(da5:twa0:0:5:0): lost device
(da6:twa0:0:6:0): lost device
(da7:twa0:0:7:0): lost device
(da8:twa0:0:8:0): lost device
(da9:twa0:0:9:0): lost device
(da10:twa0:0:10:0): lost device
(da11:twa0:0:11:0): lost device
(da12:twa0:0:12:0): lost device
(da13:twa0:0:13:0): lost device
(da1:twa0:0:1:0): removing device entry
da1 at twa0 bus 0 scbus0 target 1 lun 0
da1: Fixed Direct Access SCSI-5 device
da1: 100.000MB/s transfers
da1: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da2:twa0:0:2:0): removing device entry
da2 at twa0 bus 0 scbus0 target 2 lun 0
da2: Fixed Direct Access SCSI-5 device
da2: 100.000MB/s transfers
da2: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da3:twa0:0:3:0): removing device entry
da3 at twa0 bus 0 scbus0 target 3 lun 0
da3: Fixed Direct Access SCSI-5 device
da3: 100.000MB/s transfers
da3: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da4:twa0:0:4:0): removing device entry
da4 at twa0 bus 0 scbus0 target 4 lun 0
da4: Fixed Direct Access SCSI-5 device
da4: 100.000MB/s transfers
da4: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da5:twa0:0:5:0): removing device entry
da5 at twa0 bus 0 scbus0 target 5 lun 0
da5: Fixed Direct Access SCSI-5 device
da5: 100.000MB/s transfers
da5: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da6:twa0:0:6:0): removing device entry
da6 at twa0 bus 0 scbus0 target 6 lun 0
da6: Fixed Direct Access SCSI-5 device
da6: 100.000MB/s transfers
da6: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da7:twa0:0:7:0): removing device entry
da7 at twa0 bus 0 scbus0 target 7 lun 0
da7: Fixed Direct Access SCSI-5 device
da7: 100.000MB/s transfers
da7: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da8:twa0:0:8:0): removing device entry
da8 at twa0 bus 0 scbus0 target 8 lun 0
da8: Fixed Direct Access SCSI-5 device
da8: 100.000MB/s transfers
da8: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da9:twa0:0:9:0): removing device entry
da9 at twa0 bus 0 scbus0 target 9 lun 0
da9: Fixed Direct Access SCSI-5 device
da9: 100.000MB/s transfers
da9: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da10:twa0:0:10:0): removing device entry
da10 at twa0 bus 0 scbus0 target 10 lun 0
da10: Fixed Direct Access SCSI-5 device
da10: 100.000MB/s transfers
da10: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da11:twa0:0:11:0): removing device entry
da11 at twa0 bus 0 scbus0 target 11 lun 0
da11: Fixed Direct Access SCSI-5 device
da11: 100.000MB/s transfers
da11: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da12:twa0:0:12:0): removing device entry
da12 at twa0 bus 0 scbus0 target 12 lun 0
da12: Fixed Direct Access SCSI-5 device
da12: 100.000MB/s transfers
da12: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
(da13:twa0:0:13:0): removing device entry
da13 at twa0 bus 0 scbus0 target 13 lun 0
da13: Fixed Direct Access SCSI-5 device
da13: 100.000MB/s transfers
da13: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)

  pool: zp2
 state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-HC
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zp2         ONLINE       7    11     0
          raidz2    ONLINE      16    32     0
            da1p1   ONLINE       4    10     0
            da2p1   ONLINE       4    10     0
            da3p1   ONLINE       5   642     1
            da4p1   ONLINE       3     8     0
            da5p1   ONLINE       3    12     0
            da6p1   ONLINE       3    12     0
            da7p1   ONLINE       3    12     0
            da9p1   ONLINE       3    12     0
            da8p1   ONLINE       3    14     0
            da10p1  ONLINE       3    10     0

errors: 10 data errors, use '-v' for a list

From owner-freebsd-fs@FreeBSD.ORG Sun Sep 25 16:01:02 2011
References: <201108102152.p7ALqUl4075207@red.freebsd.org>
    <201108102200.p7AM0Nu9026320@freefall.freebsd.org>
Date: Sun, 25 Sep 2011 17:32:27 +0200
From: Robert Millan
To: FreeBSD-gnats-submit@freebsd.org, freebsd-bugs@freebsd.org
Cc: Josef Karthauser, Adrian Chadd, freebsd-fs@freebsd.org
Subject: Re: kern/159663: sockets don't work though nullfs mounts

2011/9/24 Robert Millan:
> I found a thread from 2007 with further discussion about this problem:
>
> http://lists.freebsd.org/pipermail/freebsd-fs/2007-February/002669.html

Hi,

I've looked at the situation in a bit more detail, for now only with
sockets in mind (not named pipes).

My understanding is (please correct me if I'm wrong):

- nullfs holds reference counts for each vnode, but sockets have their
  own mechanism for reference counting (so_count / soref / sorele).
  vnode reference counting doesn't protect against the socket being
  closed, which would leave a stale pointer in the upper nullfs layer.

- Increasing the reference count of the socket itself can't be done in
  null_nodeget(), because this function is merely a getter whose call
  doesn't indicate any meaningful event.

- It's not clear to me that there's any point in time at which the
  socket reference can be increased. If mounting a nullfs were that
  event, then all existing sockets would be soref'ed, but we wouldn't
  be soref'ing future sockets created in the lower layer after the
  mount. This doesn't seem correct.

- Possible solution: null_nodeget() semantics are replaced with
  something that actually allows vnodes in the upper layer to be
  created and destroyed.

- Possible solution: the upper layer has a memory structure to keep
  track of which sockets in the lower layer have been soref'ed.
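
The last proposal in the list above (the upper layer tracking which lower-layer sockets it has soref'ed) can be illustrated with a toy user-space model. This is a sketch for reasoning about the lifetime problem only, not kernel code: the Socket and UpperLayer classes are hypothetical, and the names so_count, soref and sorele are borrowed from the message purely as labels, without the locking the real kernel primitives involve.

```python
class Socket:
    """Toy stand-in for a socket that carries its own reference count."""

    def __init__(self):
        self.so_count = 1      # reference held by whoever created it
        self.freed = False

    def soref(self):
        if self.freed:
            raise RuntimeError("soref on a freed socket (stale pointer)")
        self.so_count += 1

    def sorele(self):
        self.so_count -= 1
        if self.so_count == 0:
            self.freed = True  # models the socket being destroyed


class UpperLayer:
    """Toy upper layer that remembers which lower sockets it references."""

    def __init__(self):
        self._held = {}        # path -> Socket we have taken a reference on

    def attach(self, path, sock):
        # Take exactly one reference per socket, however often it is looked up.
        if path not in self._held:
            sock.soref()
            self._held[path] = sock
        return self._held[path]

    def detach(self, path):
        # Drop our reference when the upper-layer node goes away.
        sock = self._held.pop(path, None)
        if sock is not None:
            sock.sorele()
```

In this model, a socket attached through UpperLayer survives the lower layer's own sorele() and is only destroyed after detach(), whereas an untracked socket is freed as soon as its creator releases it, which is the stale-pointer case described in the first bullet.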

From owner-freebsd-fs@FreeBSD.ORG Sun Sep 25 16:59:50 2011
Date: Sun, 25 Sep 2011 09:59:46 -0700
From: Jeremy Chadwick
To: Adam Nowacki
Message-ID: <20110925165946.GA42447@icarus.home.lan>
References: <4E7F49A7.1020909@platinum.linux.pl>
In-Reply-To: <4E7F49A7.1020909@platinum.linux.pl>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and 3ware controller resets

On Sun, Sep 25, 2011 at 05:32:55PM +0200, Adam Nowacki wrote:
> I have a 20 disk storage system, every now and then a disk dies and
> causes 3ware controller to reset because of disk timeouts. This cuts
> out ZFS from all disks, even healthy ones and the system requires a
> hard reset.
> Two issues here:
> 1) Why the controller has to reset? That's a completely insane way of
> dealing with drive timeout.
> 2) ZFS not reopening the disk after controller reset.
>
> FreeBSD version: 8.1-RELEASE-p1
>
> /c0 Driver Version = 3.80.06.003
> /c0 Model = 9650SE-16ML
> /c0 Available Memory = 224MB
> /c0 Firmware Version = FE9X 4.10.00.007
> /c0 Bios Version = BE9X 4.08.00.002
> /c0 Boot Loader Version = BL9X 3.08.00.001
>
>   pool: zp2
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         zp2         ONLINE       0     0     0
>           raidz2    ONLINE       0     0     0
>             da1p1   ONLINE       0     0     0
>             da2p1   ONLINE       0     0     0
>             da3p1   ONLINE       0     0     0
>             da4p1   ONLINE       0     0     0
>             da5p1   ONLINE       0     0     0
>             da6p1   ONLINE       0     0     0
>             da7p1   ONLINE       0     0     0
>             da9p1   ONLINE       0     0     0
>             da8p1   ONLINE       0     0     0
>             da10p1  ONLINE       0     0     0
>
>
> Then when a disk starts misbehaving:
>
>
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a3 f4 e7 60 0 0 8 0
> (da3:twa0:0:3:0): CAM status: SCSI Status Error
> (da3:twa0:0:3:0): SCSI status: Check Condition
> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0
> (da3:twa0:0:3:0): CAM status: SCSI Status Error
> (da3:twa0:0:3:0): SCSI status: Check Condition
> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0
> (da3:twa0:0:3:0): CAM status: SCSI Status Error
> (da3:twa0:0:3:0): SCSI status: Check Condition
> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0
> (da3:twa0:0:3:0): CAM status: SCSI Status Error
> (da3:twa0:0:3:0): SCSI status: Check Condition
> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0
> (da3:twa0:0:3:0): CAM status: SCSI Status Error
> (da3:twa0:0:3:0): SCSI status: Check Condition
> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0
> (da3:twa0:0:3:0): CAM status: SCSI Status Error
> (da3:twa0:0:3:0): SCSI status: Check Condition
> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> (da3:twa0:0:3:0): READ(10). CDB: 28 0 cb 7c 43 b8 0 0 10 0
> (da3:twa0:0:3:0): CAM status: SCSI Status Error
> (da3:twa0:0:3:0): SCSI status: Check Condition
> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> (da3:twa0:0:3:0): READ(10). CDB: 28 0 ce e5 ca 30 0 0 20 0
> (da3:twa0:0:3:0): CAM status: SCSI Status Error
> (da3:twa0:0:3:0): SCSI status: Check Condition
> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a4 2d 2d f8 0 0 8 0
> (da3:twa0:0:3:0): CAM status: SCSI Status Error
> (da3:twa0:0:3:0): SCSI status: Check Condition
> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2
> (da3:twa0:0:3:0): READ(10). CDB: 28 0 cb 91 7c f8 0 0 20 0
> (da3:twa0:0:3:0): CAM status: SCSI Status Error
> (da3:twa0:0:3:0): SCSI status: Check Condition
> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> twa0: Request 72 timed out!
> twa0: INFO: (0x16: 0x1108): Resetting controller...:
> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0
> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=3
> twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1
> twa0: [ITHREAD]
> (da1:twa0:0:1:0): lost device
> (da2:twa0:0:2:0): lost device
> (da3:twa0:0:3:0): lost device
> (da4:twa0:0:4:0): lost device
> (da5:twa0:0:5:0): lost device
> (da6:twa0:0:6:0): lost device
> (da7:twa0:0:7:0): lost device
> (da8:twa0:0:8:0): lost device
> (da9:twa0:0:9:0): lost device
> (da10:twa0:0:10:0): lost device
> (da11:twa0:0:11:0): lost device
> (da12:twa0:0:12:0): lost device
> (da13:twa0:0:13:0): lost device
> (da1:twa0:0:1:0): removing device entry
> da1 at twa0 bus 0 scbus0 target 1 lun 0
> da1: Fixed Direct Access SCSI-5 device
> da1: 100.000MB/s transfers
> da1: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da2:twa0:0:2:0): removing device entry
> da2 at twa0 bus 0 scbus0 target 2 lun 0
> da2: Fixed Direct Access SCSI-5 device
> da2: 100.000MB/s transfers
> da2: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da3:twa0:0:3:0): removing device entry
> da3 at twa0 bus 0 scbus0 target 3 lun 0
> da3: Fixed Direct Access SCSI-5 device
> da3: 100.000MB/s transfers
> da3: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da4:twa0:0:4:0): removing device entry
> da4 at twa0 bus 0 scbus0 target 4 lun 0
> da4: Fixed Direct Access SCSI-5 device
> da4: 100.000MB/s transfers
> da4: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da5:twa0:0:5:0): removing device entry
> da5 at twa0 bus 0 scbus0 target 5 lun 0
> da5: Fixed Direct Access SCSI-5 device
> da5: 100.000MB/s transfers
> da5: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da6:twa0:0:6:0): removing device entry
> da6 at twa0 bus 0 scbus0 target 6 lun 0
> da6: Fixed Direct Access SCSI-5 device
> da6: 100.000MB/s transfers
> da6: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da7:twa0:0:7:0): removing device entry
> da7 at twa0 bus 0 scbus0 target 7 lun 0
> da7: Fixed Direct Access SCSI-5 device
> da7: 100.000MB/s transfers
> da7: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da8:twa0:0:8:0): removing device entry
> da8 at twa0 bus 0 scbus0 target 8 lun 0
> da8: Fixed Direct Access SCSI-5 device
> da8: 100.000MB/s transfers
> da8: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da9:twa0:0:9:0): removing device entry
> da9 at twa0 bus 0 scbus0 target 9 lun 0
> da9: Fixed Direct Access SCSI-5 device
> da9: 100.000MB/s transfers
> da9: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da10:twa0:0:10:0): removing device entry
> da10 at twa0 bus 0 scbus0 target 10 lun 0
> da10: Fixed Direct Access SCSI-5 device
> da10: 100.000MB/s transfers
> da10: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da11:twa0:0:11:0): removing device entry
> da11 at twa0 bus 0 scbus0 target 11 lun 0
> da11: Fixed Direct Access SCSI-5 device
> da11: 100.000MB/s transfers
> da11: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da12:twa0:0:12:0): removing device entry
> da12 at twa0 bus 0 scbus0 target 12 lun 0
> da12: Fixed Direct Access SCSI-5 device
> da12: 100.000MB/s transfers
> da12: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
> (da13:twa0:0:13:0): removing device entry
> da13 at twa0 bus 0 scbus0 target 13 lun 0
> da13: Fixed Direct Access SCSI-5 device
> da13: 100.000MB/s transfers
> da13: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
>
>   pool: zp2
>  state: ONLINE
> status: One or more devices are faulted in response to IO failures.
> action: Make sure the affected devices are connected, then run
>         'zpool clear'.
>    see: http://www.sun.com/msg/ZFS-8000-HC
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         zp2         ONLINE       7    11     0
>           raidz2    ONLINE      16    32     0
>             da1p1   ONLINE       4    10     0
>             da2p1   ONLINE       4    10     0
>             da3p1   ONLINE       5   642     1
>             da4p1   ONLINE       3     8     0
>             da5p1   ONLINE       3    12     0
>             da6p1   ONLINE       3    12     0
>             da7p1   ONLINE       3    12     0
>             da9p1   ONLINE       3    12     0
>             da8p1   ONLINE       3    14     0
>             da10p1  ONLINE       3    10     0
>
> errors: 10 data errors, use '-v' for a list

The behaviour here seems to match something reported here:

http://www.freebsd.org/cgi/query-pr.cgi?pr=149968

Now before someone flames me and says "that's a different issue", one
has to look closely at the driver diff. It seems that a different type
of controller reset is implemented (soft vs. hard), amongst some other
details. I am very inclined to believe an updated twa(4) driver will
address your problem.

I would suggest you try FreeBSD 8.2-STABLE instead. Do not try
8.2-RELEASE, as it will not have this fix; 8.2-RELEASE is from July
2010, while this commit was done September 2010.

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/twa/

Otherwise you can try to build src/sys/dev/twa from a RELENG_8 checkout
on your 8.1 box, but I make no guarantees this will work.

As for your comments about "why is a reset required, insane blah blah",
this is often done when a single port itself cannot be reset (e.g. the
controller firmware, or silicon itself, does not truly have a way of
"hard resetting" a single port).
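
The kernel messages quoted above follow a regular pattern, so the port that triggers the reset can be picked out mechanically before swapping hardware or drivers. The following is a minimal sketch, assuming messages formatted exactly like the ones in the quoted log; the function name summarize and the sample lines are illustrative, and other twa(4) driver versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns copied from the message format seen in the quoted log.
TIMEOUT_RE = re.compile(
    r"twa\d+: ERROR: \(0x04: 0x0009\): Drive timeout detected: port=(\d+)")
RESET_RE = re.compile(
    r"twa\d+: INFO: \(0x04: 0x0001\): Controller reset occurred")


def summarize(lines):
    """Return (timeouts per port, number of controller resets)."""
    timeouts = Counter()
    resets = 0
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            timeouts[int(match.group(1))] += 1
        elif RESET_RE.search(line):
            resets += 1
    return timeouts, resets


SAMPLE = [
    "twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2",
    "(da3:twa0:0:3:0): READ(10). CDB: 28 0 a3 f4 e7 60 0 0 8 0",
    "twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2",
    "twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2",
    "twa0: Request 72 timed out!",
    "twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1",
]

if __name__ == "__main__":
    per_port, resets = summarize(SAMPLE)
    for port, count in per_port.most_common():
        print(f"port {port}: {count} drive timeouts")
    print(f"controller resets: {resets}")
```

Feeding it the lines of /var/log/messages instead of SAMPLE would give a per-port count that points at the drive to replace.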
Finally, I do not understand what you mean by "ZFS not reopening the disk after controller reset". You'll need to explain what you mean by that. And besides, when an underlying storage controller says "this disk is having problems" and drops it from the bus (which is what should be happening -- see beginning of my comments, your complaint, etc.), you **do not** want the OS to re-attach the same disk it just dropped, else you end up in this infinite loop where the controller is dropping a drive from the bus and reattaching, over and over. Makes no sense, even if the issue is bad cabling or otherwise. Administrator intervention is always required in this situation. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Sun Sep 25 17:15:48 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E29BF1065670 for ; Sun, 25 Sep 2011 17:15:47 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id 7B8BF8FC0A for ; Sun, 25 Sep 2011 17:15:47 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id 2639E47E26; Sun, 25 Sep 2011 19:15:45 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.1 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [172.19.191.2] (078088011125.bialystok.vectranet.pl [78.88.11.125]) by platinum.linux.pl (Postfix) with ESMTPA id DE1D847E1D; Sun, 25 Sep 2011 19:15:40 +0200 (CEST) Message-ID: <4E7F61A2.5060908@platinum.linux.pl> Date: Sun, 25 Sep 2011 19:15:14 +0200 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows; U; Windows NT 
6.1; en-GB; rv:1.9.2.22) Gecko/20110902 Thunderbird/3.1.14 MIME-Version: 1.0 To: Jeremy Chadwick References: <4E7F49A7.1020909@platinum.linux.pl> <20110925165946.GA42447@icarus.home.lan> In-Reply-To: <20110925165946.GA42447@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and 3ware controller resets X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Sep 2011 17:15:48 -0000 On 2011-09-25 18:59, Jeremy Chadwick wrote: > On Sun, Sep 25, 2011 at 05:32:55PM +0200, Adam Nowacki wrote: >> I have a 20 disk storage system, every now and then a disk dies and >> causes 3ware controller to reset because of disk timeouts. This cuts >> out ZFS from all disks, even healthy ones and the system requires a >> hard reset. >> Two issues here: >> 1) Why the controller has to reset? Thats a completely insane way of >> dealing with drive timeout. >> 2) ZFS not reopening the disk after controller reset. >> >> FreeBSD version: 8.1-RELEASE-p1 >> >> /c0 Driver Version = 3.80.06.003 >> /c0 Model = 9650SE-16ML >> /c0 Available Memory = 224MB >> /c0 Firmware Version = FE9X 4.10.00.007 >> /c0 Bios Version = BE9X 4.08.00.002 >> /c0 Boot Loader Version = BL9X 3.08.00.001 >> >> pool: zp2 >> state: ONLINE >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> zp2 ONLINE 0 0 0 >> raidz2 ONLINE 0 0 0 >> da1p1 ONLINE 0 0 0 >> da2p1 ONLINE 0 0 0 >> da3p1 ONLINE 0 0 0 >> da4p1 ONLINE 0 0 0 >> da5p1 ONLINE 0 0 0 >> da6p1 ONLINE 0 0 0 >> da7p1 ONLINE 0 0 0 >> da9p1 ONLINE 0 0 0 >> da8p1 ONLINE 0 0 0 >> da10p1 ONLINE 0 0 0 >> >> >> Then when disk starts behaving: >> >> >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> (da3:twa0:0:3:0): READ(10). 
CDB: 28 0 a3 f4 e7 60 0 0 8 0 >> (da3:twa0:0:3:0): CAM status: SCSI Status Error >> (da3:twa0:0:3:0): SCSI status: Check Condition >> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0 >> (da3:twa0:0:3:0): CAM status: SCSI Status Error >> (da3:twa0:0:3:0): SCSI status: Check Condition >> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0 >> (da3:twa0:0:3:0): CAM status: SCSI Status Error >> (da3:twa0:0:3:0): SCSI status: Check Condition >> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0 >> (da3:twa0:0:3:0): CAM status: SCSI Status Error >> (da3:twa0:0:3:0): SCSI status: Check Condition >> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a5 4 83 80 0 0 80 0 >> (da3:twa0:0:3:0): CAM status: SCSI Status Error >> (da3:twa0:0:3:0): SCSI status: Check Condition >> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> (da3:twa0:0:3:0): READ(10). 
CDB: 28 0 a5 4 83 80 0 0 80 0 >> (da3:twa0:0:3:0): CAM status: SCSI Status Error >> (da3:twa0:0:3:0): SCSI status: Check Condition >> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> (da3:twa0:0:3:0): READ(10). CDB: 28 0 cb 7c 43 b8 0 0 10 0 >> (da3:twa0:0:3:0): CAM status: SCSI Status Error >> (da3:twa0:0:3:0): SCSI status: Check Condition >> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> (da3:twa0:0:3:0): READ(10). CDB: 28 0 ce e5 ca 30 0 0 20 0 >> (da3:twa0:0:3:0): CAM status: SCSI Status Error >> (da3:twa0:0:3:0): SCSI status: Check Condition >> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) >> (da3:twa0:0:3:0): READ(10). CDB: 28 0 a4 2d 2d f8 0 0 8 0 >> (da3:twa0:0:3:0): CAM status: SCSI Status Error >> (da3:twa0:0:3:0): SCSI status: Check Condition >> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=2 >> (da3:twa0:0:3:0): READ(10). 
CDB: 28 0 cb 91 7c f8 0 0 20 0 >> (da3:twa0:0:3:0): CAM status: SCSI Status Error >> (da3:twa0:0:3:0): SCSI status: Check Condition >> (da3:twa0:0:3:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) >> twa0: Request 72 timed out! >> twa0: INFO: (0x16: 0x1108): Resetting controller...: >> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0 >> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=3 >> twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1 >> twa0: [ITHREAD] >> (da1:twa0:0:1:0): lost device >> (da2:twa0:0:2:0): lost device >> (da3:twa0:0:3:0): lost device >> (da4:twa0:0:4:0): lost device >> (da5:twa0:0:5:0): lost device >> (da6:twa0:0:6:0): lost device >> (da7:twa0:0:7:0): lost device >> (da8:twa0:0:8:0): lost device >> (da9:twa0:0:9:0): lost device >> (da10:twa0:0:10:0): lost device >> (da11:twa0:0:11:0): lost device >> (da12:twa0:0:12:0): lost device >> (da13:twa0:0:13:0): lost device >> (da1:twa0:0:1:0): removing device entry >> da1 at twa0 bus 0 scbus0 target 1 lun 0 >> da1: Fixed Direct Access SCSI-5 device >> da1: 100.000MB/s transfers >> da1: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da2:twa0:0:2:0): removing device entry >> da2 at twa0 bus 0 scbus0 target 2 lun 0 >> da2: Fixed Direct Access SCSI-5 device >> da2: 100.000MB/s transfers >> da2: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da3:twa0:0:3:0): removing device entry >> da3 at twa0 bus 0 scbus0 target 3 lun 0 >> da3: Fixed Direct Access SCSI-5 device >> da3: 100.000MB/s transfers >> da3: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da4:twa0:0:4:0): removing device entry >> da4 at twa0 bus 0 scbus0 target 4 lun 0 >> da4: Fixed Direct Access SCSI-5 device >> da4: 100.000MB/s transfers >> da4: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da5:twa0:0:5:0): removing device entry >> da5 at twa0 bus 0 scbus0 target 5 lun 0 >> da5: Fixed Direct Access SCSI-5 device >> 
da5: 100.000MB/s transfers >> da5: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da6:twa0:0:6:0): removing device entry >> da6 at twa0 bus 0 scbus0 target 6 lun 0 >> da6: Fixed Direct Access SCSI-5 device >> da6: 100.000MB/s transfers >> da6: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da7:twa0:0:7:0): removing device entry >> da7 at twa0 bus 0 scbus0 target 7 lun 0 >> da7: Fixed Direct Access SCSI-5 device >> da7: 100.000MB/s transfers >> da7: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da8:twa0:0:8:0): removing device entry >> da8 at twa0 bus 0 scbus0 target 8 lun 0 >> da8: Fixed Direct Access SCSI-5 device >> da8: 100.000MB/s transfers >> da8: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da9:twa0:0:9:0): removing device entry >> da9 at twa0 bus 0 scbus0 target 9 lun 0 >> da9: Fixed Direct Access SCSI-5 device >> da9: 100.000MB/s transfers >> da9: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da10:twa0:0:10:0): removing device entry >> da10 at twa0 bus 0 scbus0 target 10 lun 0 >> da10: Fixed Direct Access SCSI-5 device >> da10: 100.000MB/s transfers >> da10: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da11:twa0:0:11:0): removing device entry >> da11 at twa0 bus 0 scbus0 target 11 lun 0 >> da11: Fixed Direct Access SCSI-5 device >> da11: 100.000MB/s transfers >> da11: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da12:twa0:0:12:0): removing device entry >> da12 at twa0 bus 0 scbus0 target 12 lun 0 >> da12: Fixed Direct Access SCSI-5 device >> da12: 100.000MB/s transfers >> da12: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> (da13:twa0:0:13:0): removing device entry >> da13 at twa0 bus 0 scbus0 target 13 lun 0 >> da13: Fixed Direct Access SCSI-5 device >> da13: 100.000MB/s transfers >> da13: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) >> >> pool: zp2 >> state: ONLINE >> status: One or more devices are 
faulted in response to IO failures. >> action: Make sure the affected devices are connected, then run >> 'zpool clear'. >> see: http://www.sun.com/msg/ZFS-8000-HC >> scrub: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> zp2 ONLINE 7 11 0 >> raidz2 ONLINE 16 32 0 >> da1p1 ONLINE 4 10 0 >> da2p1 ONLINE 4 10 0 >> da3p1 ONLINE 5 642 1 >> da4p1 ONLINE 3 8 0 >> da5p1 ONLINE 3 12 0 >> da6p1 ONLINE 3 12 0 >> da7p1 ONLINE 3 12 0 >> da9p1 ONLINE 3 12 0 >> da8p1 ONLINE 3 14 0 >> da10p1 ONLINE 3 10 0 >> >> errors: 10 data errors, use '-v' for a list > > The behaviour here seems to match something reported here: > > http://www.freebsd.org/cgi/query-pr.cgi?pr=149968 > > Now before someone flames me and says "that's a different issue", one > has to look closely at the driver diff. It seems that a different type > of controller reset is implemented (soft vs. hard), amongst some other > details. I am very inclined to believe an updated twa(4) driver will > address your problem. > > I would suggest you try FreeBSD 8.2-STABLE instead. Do not try > 8.2-RELEASE, as it will not have this fix; 8.2-RELEASE is from July > 2010, while this commit was done September 2010. > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/twa/ > > Otherwise you can try to build src/sys/dev/twa from a RELENG_8 > checkout on your 8.1 box, but I make no guarantees this will work. This system is already running on the patched driver, twa_cli reports driver version 3.80.06.003 > As for your comments about "why is a reset required, insane blah blah", > this is often done when a single port itself cannot be reset (e.g. the > controller firmware, or silicon itself, does not truly have a way of > "hard resetting" a single port). > > Finally, I do not understand what you mean by "ZFS not reopening the > disk after controller reset". You'll need to explain what you mean by > that. 
> > And besides, when an underlying storage controller says "this disk is > having problems" and drops it from the bus (which is what should be > happening -- see beginning of my comments, your complaint, etc.), you > **do not** want the OS to re-attach the same disk it just dropped, else > you end up in this infinite loop where the controller is dropping a > drive from the bus and reattaching, over and over. Makes no sense, even > if the issue is bad cabling or otherwise. Administrator intervention is > always required in this situation.

I mean that not only the disk that is timing out is affected, but all disks on the controller. Every single one stops working for ZFS; you can see that in the zpool status output, where each disk reports read and write errors. zpool clear won't fix it: ZFS simply loses access to all disks on the controller, while dd, for example, can read from each disk just fine. Also, on the same controller I have a disk with a UFS filesystem that was mounted when the controller reset; it survived the reset as if it hadn't even happened. For ZFS the only fix is to hard reset the whole system.
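The dd check described above (the raw disks stay readable even though ZFS reports errors on all of them) can be approximated with a small read probe. This is only a sketch under assumptions: probe_read is a made-up helper name, and reading one 512-byte block from offset 0 is an arbitrary choice that merely shows the device node still answers reads. It says nothing about why ZFS itself cannot reopen the vdevs.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any FreeBSD tool): read the first
 * `len` bytes of `path`, roughly what a one-block dd(1) would do.
 * Returns 0 on success or an errno value on failure.
 */
static int
probe_read(const char *path, size_t len)
{
	char *buf;
	ssize_t n;
	int fd, err = 0;

	assert(path != NULL && len > 0);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (errno);
	buf = malloc(len);
	if (buf == NULL) {
		close(fd);
		return (ENOMEM);
	}
	n = pread(fd, buf, len, 0);
	if (n < 0)
		err = errno;
	else if ((size_t)n != len)
		err = EIO;	/* short read: treat it as a failure */
	free(buf);
	close(fd);
	return (err);
}
```

Calling probe_read("/dev/da1", 512) through probe_read("/dev/da13", 512) after a controller reset would show which device nodes still respond, assuming the da1..da13 names from the dmesg output in this thread.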
From owner-freebsd-fs@FreeBSD.ORG Sun Sep 25 19:07:20 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7920B1065677; Sun, 25 Sep 2011 19:07:20 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com [209.85.216.175]) by mx1.freebsd.org (Postfix) with ESMTP id E12068FC0C; Sun, 25 Sep 2011 19:07:19 +0000 (UTC) Received: by qyk10 with SMTP id 10so9138603qyk.13 for ; Sun, 25 Sep 2011 12:07:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=mnRDf6XnGenHLQO0vTfap5NScGSGfvsVR4aaEWsMSUo=; b=CCrlrZp1zOZGeW3Trpgc6MtUKzg7/q0pOZm9FOC51TqP98RqHI3NPss4GLtym6LYXv b/u0hp0NyDgFsweooZNZn6558IIg+/3j8nWsgJ6vK3MRjMAYWpLlr2gDmVCq3zvCUfSV oGGjOIYMd5Fqod7ZzViW5d0eAfDQoKC1/5IEk= MIME-Version: 1.0 Received: by 10.224.175.82 with SMTP id w18mr4364657qaz.374.1316977639011; Sun, 25 Sep 2011 12:07:19 -0700 (PDT) Received: by 10.224.74.82 with HTTP; Sun, 25 Sep 2011 12:07:18 -0700 (PDT) In-Reply-To: <1376234334.20110925135003@serebryakov.spb.ru> References: <1376234334.20110925135003@serebryakov.spb.ru> Date: Sun, 25 Sep 2011 12:07:18 -0700 Message-ID: From: Garrett Cooper To: lev@freebsd.org Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, Xin LI , current@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Sep 2011 19:07:20 -0000 2011/9/25 Lev Serebryakov : > Hello, Garrett. 
> You wrote 25 September 2011, 12:06:05: > >> Talking to Xin yesterday, he was convinced that this was a >> filesystem//kern bug. Before I file a PR, I'm wondering if anyone else >> has seen this issue.. > Yes, and I posted message about it in embedded@ (Message-ID > <1175277342.20110821215629@serebryakov.spb.ru>), I've got additional > question from Warner Losh about base (underlying) file system, without > any additional reaction. Thanks for the comments Adrian and Lev! I've filed PR 161016 to track the issue, because it might be due to changes in the SU code, md, or a subtle race condition in umount (highly unlikely, but it's been noted). -Garrett From owner-freebsd-fs@FreeBSD.ORG Sun Sep 25 22:23:52 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 570BC106566C; Sun, 25 Sep 2011 22:23:52 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 580A58FC0C; Sun, 25 Sep 2011 22:23:50 +0000 (UTC) Received: by bkbzs8 with SMTP id zs8so6257710bkb.13 for ; Sun, 25 Sep 2011 15:23:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=from:to:cc:subject:references:x-comment-to:sender:date:message-id :user-agent:mime-version:content-type; bh=nVTseTzBHjAAJZGY79dc1b1Q/VQZAQ8f7WGJNfOr+KE=; b=coNH2oZuMLNODo4zGknIPGeOtiishYX3q+Xygbg34Xsi23NRX/Z+0Lpt7uyIES+rEQ M2SZBEx2DXy3g8/tMf/GzZ7P8HAaXGaYIknd+3BfQKowklIhykYYxVEh7rRSRYlDcd3D J60iO/FCA40wMxSIbWl6m4PF+N9HFUIZSl1S4= Received: by 10.204.133.7 with SMTP id d7mr4054865bkt.104.1316987887748; Sun, 25 Sep 2011 14:58:07 -0700 (PDT) Received: from localhost ([95.69.173.122]) by mx.google.com with ESMTPS id z7sm18631390bkt.5.2011.09.25.14.58.05 (version=TLSv1/SSLv3 cipher=OTHER); Sun, 25 Sep 2011 14:58:06 -0700 (PDT) From: Mikolaj 
Golub To: Robert Millan References: <201108102152.p7ALqUl4075207@red.freebsd.org> <201108102200.p7AM0Nu9026320@freefall.freebsd.org> X-Comment-To: Robert Millan Sender: Mikolaj Golub Date: Mon, 26 Sep 2011 00:58:03 +0300 Message-ID: <86k48wz3mc.fsf@kopusha.home.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Cc: Josef Karthauser , freebsd-bugs@freebsd.org, Adrian Chadd , freebsd-fs@freebsd.org, FreeBSD-gnats-submit@freebsd.org Subject: Re: kern/159663: sockets don't work though nullfs mounts X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Sep 2011 22:23:52 -0000 --=-=-= Hi, On Sun, 25 Sep 2011 17:32:27 +0200 Robert Millan wrote: RM> 2011/9/24 Robert Millan : >> I found a thread from 2007 with further discussion about this problem: >> >> http://lists.freebsd.org/pipermail/freebsd-fs/2007-February/002669.html RM> Hi, RM> I've looked at the situation in a bit more detail, for now only with RM> sockets in mind (not named pipes). My understanding is (please RM> correct me if I'm wrong): RM> - nullfs holds reference counts for each vnode, but sockets have their RM> own mechanism for reference counting (so_count / soref / sorele). RM> vnode reference counting doesn't protect against socket being closed, RM> which would leave a stale pointer in the upper nullfs layer. RM> - Increasing the reference count of the socket itself can't be done in RM> null_nodeget() because this function is merely a getter whose call RM> doesn't indicate any meaningful event. RM> - It's not clear to me that there's any event in time where the socket RM> reference can be increased. If mounting a nullfs were that event, RM> then all existing sockets would be soref'ed but we wouldn't be RM> soref'ing future sockets created in the lower layer after the mount. 
RM> This doesn't seem correct. RM> - Possible solution: null_nodeget() semantics are replaced with RM> something that actually allows vnodes in the upper layer to be created RM> and destroyed. RM> - Possible solution: upper layer has a memory structure to keep track RM> of which sockets in the lower layer have been soref'ed.

It looks like there is no need to set vp->v_un = lowervp->v_un for VFIFO. FIFOs work without this modification: their vnode operations are bypassed to the lower node, so lowervp->v_un is used. The issue is only with local sockets, because when bind or connect is called on a nullfs file the upper v_un is used.

To me the "vp->v_un = lowervp->v_un" approach has many complications. Maybe it is much easier to always use only the lower vnode? What we need for this is to make bind and connect get the lower vnode when they are called on a nullfs file.

As a proof of concept, below is a patch that implements this. I am not yet sure that the vrele/vref handling is done properly, but it appears to work for me.

The issues I see with this approach so far:
- we need an additional flag for namei;
- nullfs can be unmounted while a socket file is still open.
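For anyone who wants to exercise the bind/connect paths discussed above from userland, here is a minimal sketch. It is not taken from the PR. listen_unix, connect_unix and fill_addr are illustrative names, and the idea of binding through the lower path and connecting through the nullfs path (or the other way round) is an assumption about how one might reproduce the report.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill a sockaddr_un for `path`; returns 0 or ENAMETOOLONG. */
static int
fill_addr(struct sockaddr_un *addr, const char *path)
{
	assert(addr != NULL && path != NULL);
	if (strlen(path) >= sizeof(addr->sun_path))
		return (ENAMETOOLONG);
	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	snprintf(addr->sun_path, sizeof(addr->sun_path), "%s", path);
	return (0);
}

/* Bind a stream socket at `path` and listen; returns the fd, or -1 with errno set. */
static int
listen_unix(const char *path)
{
	struct sockaddr_un addr;
	int fd, err;

	if ((err = fill_addr(&addr, path)) != 0) {
		errno = err;
		return (-1);
	}
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return (-1);
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(fd, 1) < 0) {
		err = errno;
		close(fd);
		errno = err;
		return (-1);
	}
	return (fd);
}

/* Try to connect to the socket file at `path`; returns 0 or an errno value. */
static int
connect_unix(const char *path)
{
	struct sockaddr_un addr;
	int fd, err;

	if ((err = fill_addr(&addr, path)) != 0)
		return (err);
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return (errno);
	err = 0;
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		err = errno;
	close(fd);
	return (err);
}
```

With a hypothetical layout where /lower is nullfs-mounted on /upper, one would call listen_unix("/lower/sock") and then compare connect_unix("/lower/sock") with connect_unix("/upper/sock"). According to the PR, the connect through the nullfs path is the case that does not work without a fix.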
-- Mikolaj Golub

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=nullfs.sockets.patch

Index: sys/sys/namei.h
===================================================================
--- sys/sys/namei.h	(revision 225716)
+++ sys/sys/namei.h	(working copy)
@@ -149,7 +149,8 @@ struct nameidata {
 #define	AUDITVNODE1	0x04000000 /* audit the looked up vnode information */
 #define	AUDITVNODE2	0x08000000 /* audit the looked up vnode information */
 #define	TRAILINGSLASH	0x10000000 /* path ended in a slash */
-#define	PARAMASK	0x1ffffe00 /* mask of parameter descriptors */
+#define	LOWERVNODE	0x20000000 /* if it is a stackable fs return lower vnode */
+#define	PARAMASK	0x3ffffe00 /* mask of parameter descriptors */
 
 #define	NDHASGIANT(NDP)	(((NDP)->ni_cnd.cn_flags & GIANTHELD) != 0)
 
Index: sys/kern/uipc_usrreq.c
===================================================================
--- sys/kern/uipc_usrreq.c	(revision 225716)
+++ sys/kern/uipc_usrreq.c	(working copy)
@@ -493,7 +493,7 @@ uipc_bind(struct socket *so, struct sockaddr *nam,
 
 restart:
 	vfslocked = 0;
-	NDINIT(&nd, CREATE, MPSAFE | NOFOLLOW | LOCKPARENT | SAVENAME,
+	NDINIT(&nd, CREATE, MPSAFE | NOFOLLOW | LOCKPARENT | SAVENAME | LOWERVNODE,
 	    UIO_SYSSPACE, buf, td);
 /* SHOULD BE ABLE TO ADOPT EXISTING AND wakeup() ALA FIFO's */
 	error = namei(&nd);
@@ -1268,7 +1268,7 @@ unp_connect(struct socket *so, struct sockaddr *na
 	UNP_PCB_UNLOCK(unp);
 
 	sa = malloc(sizeof(struct sockaddr_un), M_SONAME, M_WAITOK);
-	NDINIT(&nd, LOOKUP, MPSAFE | FOLLOW | LOCKLEAF, UIO_SYSSPACE, buf,
+	NDINIT(&nd, LOOKUP, MPSAFE | FOLLOW | LOCKLEAF | LOWERVNODE, UIO_SYSSPACE, buf,
 	    td);
 	error = namei(&nd);
 	if (error)
Index: sys/fs/nullfs/null_vnops.c
===================================================================
--- sys/fs/nullfs/null_vnops.c	(revision 225756)
+++ sys/fs/nullfs/null_vnops.c	(working copy)
@@ -365,16 +365,40 @@ null_lookup(struct vop_lookup_args *ap)
 			vrele(lvp);
 		} else {
 			error = null_nodeget(dvp->v_mount, lvp, &vp);
-			if (error)
+			if (error) {
 				vput(lvp);
-			else
-				*ap->a_vpp = vp;
+			} else if ((flags & LOWERVNODE) != 0) {
+				vref(lvp);
+				vrele(vp);
+				*ap->a_vpp = lvp;
+			} else {
+				*ap->a_vpp = vp;
+			}
 		}
 	}
 	return (error);
 }
 
 static int
+null_create(struct vop_create_args *ap)
+{
+	struct componentname *cnp = ap->a_cnp;
+	int flags = cnp->cn_flags;
+	int retval;
+	struct vnode *vp, *lvp;
+
+	retval = null_bypass(&ap->a_gen);
+	if (retval == 0 && (flags & LOWERVNODE) != 0) {
+		vp = *ap->a_vpp;
+		lvp = NULLVPTOLOWERVP(vp);
+		vref(lvp);
+		vrele(vp);
+		*ap->a_vpp = lvp;
+	}
+	return (retval);
+}
+
+static int
 null_open(struct vop_open_args *ap)
 {
 	int retval;
@@ -826,6 +850,7 @@ struct vop_vector null_vnodeops = {
 	.vop_accessx =		null_accessx,
 	.vop_advlockpurge =	vop_stdadvlockpurge,
 	.vop_bmap =		VOP_EOPNOTSUPP,
+	.vop_create =		null_create,
 	.vop_getattr =		null_getattr,
 	.vop_getwritemount =	null_getwritemount,
 	.vop_inactive =		null_inactive,
--=-=-=--

From owner-freebsd-fs@FreeBSD.ORG Mon Sep 26 11:07:01 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 69DD91065675 for ; Mon, 26 Sep 2011 11:07:01 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 57CA78FC19 for ; Mon, 26 Sep 2011 11:07:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p8QB71Da088137 for ; Mon, 26 Sep 2011 11:07:01 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p8QB70KZ088135 for freebsd-fs@FreeBSD.org; Mon, 26 Sep 2011 11:07:00 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 26 Sep 2011 11:07:00 GMT Message-Id: <201109261107.p8QB70KZ088135@freefall.freebsd.org> X-Authentication-Warning:
freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Sep 2011 11:07:01 -0000

Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/160801  fs         [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o
o kern/160790  fs         [fusefs] [panic] VPUTX: negative ref count with FUSE
o kern/160777  fs         [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo
o kern/160706  fs         [zfs] zfs bootloader fails when a non-root vdev exists
o kern/160591  fs         [zfs] Fail to boot on zfs root with degraded raidz2 [r
o kern/160410  fs         [smbfs] [hang] smbfs hangs when transferring large fil
o kern/160283  fs         [zfs] [patch] 'zfs list' does abort in make_dataset_ha
o kern/159971  fs         [ffs] [panic] panic with soft updates journaling durin
o kern/159930  fs         [ufs] [panic] kernel core
o kern/159418  fs         [tmpfs] [panic] tmpfs kernel panic: recursing on non r
o kern/159402  fs         [zfs][loader] symlinks cause I/O errors
o kern/159357  fs         [zfs] ZFS MAXNAMELEN macro has confusing name (off-by-
o kern/159356  fs         [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s
o kern/159351  fs         [nfs] [patch] - divide by zero in mountnfs()
o kern/159251  fs         [zfs] [request]: add FLETCHER4 as DEDUP hash option
o kern/159233  fs         [ext2fs] [patch] fs/ext2fs: finish reallocblk implemen
o kern/159232  fs         [ext2fs] [patch] fs/ext2fs: merge ext2_readwrite into
o kern/159077  fs         [zfs] Can't cd .. with latest zfs version
o kern/159048  fs         [smbfs] smb mount corrupts large files
o kern/159045  fs         [zfs] [hang] ZFS scrub freezes system
o kern/158839  fs         [zfs] ZFS Bootloader Fails if there is a Dead Disk
o kern/158802  fs         [amd] amd(8) ICMP storm and unkillable process.
o kern/158711  fs         [ffs] [panic] panic in ffs_blkfree and ffs_valloc
o kern/158231  fs         [nullfs] panic on unmounting nullfs mounted over ufs o
f kern/157929  fs         [nfs] NFS slow read
o kern/157722  fs         [geli] unable to newfs a geli encrypted partition
o kern/157399  fs         [zfs] trouble with: mdconfig force delete && zfs strip
o kern/157179  fs         [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov
o kern/156797  fs         [zfs] [panic] Double panic with FreeBSD 9-CURRENT and
o kern/156781  fs         [zfs] zfs is losing the snapshot directory,
p kern/156545  fs         [ufs] mv could break UFS on SMP systems
o kern/156193  fs         [ufs] [hang] UFS snapshot hangs && deadlocks processes
o kern/156168  fs         [nfs] [panic] Kernel panic under concurrent access ove
o kern/156039  fs         [nullfs] [unionfs] nullfs + unionfs do not compose, re
o kern/155615  fs         [zfs] zfs v28 broken on sparc64 -current
o kern/155587  fs         [zfs] [panic] kernel panic with zfs
o kern/155411  fs         [regression] [8.2-release] [tmpfs]: mount: tmpfs : No
o kern/155199  fs         [ext2fs] ext3fs mounted as ext2fs gives I/O errors
o bin/155104   fs         [zfs][patch] use /dev prefix by default when importing
o kern/154930  fs         [zfs] cannot delete/unlink file from full volume -> EN
o kern/154828  fs         [msdosfs] Unable to create directories on external USB
o kern/154491  fs         [smbfs] smb_co_lock: recursive lock for object 1
o kern/154447  fs         [zfs] [panic] Occasional panics - solaris assert somew
p kern/154228  fs         [md] md getting stuck in wdrain state
o kern/153996  fs         [zfs] zfs root mount error while kernel is not located
o kern/153847  fs         [nfs] [panic] Kernel panic from incorrect m_free in nf
o kern/153753  fs         [zfs] ZFS v15 - grammatical error when attempting to u
o kern/153716  fs         [zfs] zpool scrub time remaining is incorrect
o kern/153695  fs         [patch] [zfs] Booting from zpool created on 4k-sector
o kern/153680  fs         [xfs] 8.1 failing to mount XFS partitions
o kern/153520  fs         [zfs] Boot from GPT ZFS root on HP BL460c G1 unstable
o kern/153418  fs         [zfs] [panic] Kernel Panic occurred writing to zfs vol
o kern/153351  fs         [zfs] locking directories/files in ZFS
o bin/153258   fs         [patch][zfs] creating ZVOLs requires `refreservation'
s kern/153173  fs         [zfs] booting from a gzip-compressed dataset doesn't w
o kern/153126  fs         [zfs] vdev failure, zpool=peegel type=vdev.too_small
p kern/152488  fs         [tmpfs] [patch] mtime of file updated when only inode
o kern/152022  fs         [nfs] nfs service hangs with linux client [regression]
o kern/151942  fs         [zfs] panic during ls(1) zfs snapshot directory
o kern/151905  fs         [zfs] page fault under load in /sbin/zfs
o kern/151845  fs         [smbfs] [patch] smbfs should be upgraded to support Un
o bin/151713   fs         [patch] Bug in growfs(8) with respect to 32-bit overfl
o kern/151648  fs         [zfs] disk wait bug
o kern/151629  fs         [fs] [patch] Skip empty directory entries during name
o kern/151330  fs         [zfs] will unshare all zfs filesystem after execute a
o kern/151326  fs         [nfs] nfs exports fail if netgroups contain duplicate
o kern/151251  fs         [ufs] Can not create files on filesystem with heavy us
o kern/151226  fs         [zfs] can't delete zfs snapshot
o kern/151111  fs         [zfs] vnodes leakage during zfs unmount
o kern/150503  fs         [zfs] ZFS disks are UNAVAIL and corrupted after reboot
o kern/150501  fs         [zfs] ZFS vdev failure vdev.bad_label on amd64
o kern/150390  fs         [zfs] zfs deadlock when arcmsr reports drive faulted
o kern/150336  fs         [nfs] mountd/nfsd became confused; refused to reload n
o kern/150207  fs         zpool(1): zpool import -d /dev tries to open weird dev
o kern/149208  fs         mksnap_ffs(8) hang/deadlock
o kern/149173  fs         [patch] [zfs] make OpenSolaris installa
o kern/149015  fs         [zfs] [patch] misc fixes for ZFS code to build on Glib
o kern/149014  fs         [zfs] [patch] declarations in ZFS libraries/utilities
o kern/149013  fs         [zfs] [patch] make ZFS makefiles use the libraries fro
o kern/148504  fs         [zfs] ZFS' zpool does not allow replacing drives to be
o kern/148490  fs         [zfs]: zpool attach - resilver bidirectionally, and re
o kern/148368  fs         [zfs] ZFS hanging forever on 8.1-PRERELEASE
o kern/148204  fs         [nfs] UDP NFS causes overload
o kern/148138  fs         [zfs] zfs raidz pool commands freeze
o kern/147903  fs         [zfs] [panic] Kernel panics on faulty zfs device
o kern/147881  fs         [zfs] [patch] ZFS "sharenfs" doesn't allow different "
o kern/147790  fs         [zfs] zfs set acl(mode|inherit) fails on existing zfs
o kern/147560  fs         [zfs] [boot] Booting 8.1-PRERELEASE raidz system take
o kern/147420  fs         [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt
o kern/146941  fs         [zfs] [panic] Kernel Double Fault - Happens constantly
o kern/146786  fs         [zfs] zpool import hangs with checksum errors
o kern/146708  fs         [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528  fs         [zfs] Severe memory leak in ZFS on i386
o kern/146502  fs         [nfs] FreeBSD 8 NFS Client Connection to Server
s kern/145712  fs         [zfs] cannot offline two drives in a raidz2 configurat
o kern/145411  fs         [xfs] [panic] Kernel panics shortly after mounting an
o bin/145309   fs         bsdlabel: Editing disk label invalidates the whole dev
o kern/145272  fs         [zfs] [panic] Panic during boot when accessing zfs on
o kern/145246  fs         [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238  fs         [zfs] [panic] kernel panic on zpool clear tank
o kern/145229  fs         [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189  fs         [nfs] nfsd performs abysmally under load
o kern/144929  fs         [ufs] [lor] vfs_bio.c + ufs_dirhash.c
p kern/144447  fs         [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416  fs         [panic] Kernel panic on online filesystem optimization
s kern/144415  fs         [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234  fs         [zfs] Cannot boot machine with recent gptzfsboot code
o kern/143825  fs         [nfs] [panic] Kernel panic on NFS client
o bin/143572   fs         [zfs] zpool(1): [patch] The verbose output from iostat
o kern/143212  fs         [nfs] NFSv4 client strange work ...
o kern/143184  fs         [zfs] [lor] zfs/bufwait LOR
o kern/142878  fs         [zfs] [vfs] lock order reversal
o kern/142597  fs         [ext2fs] ext2fs does not work on filesystems with real
o kern/142489  fs         [zfs] [lor] allproc/zfs LOR
o kern/142466  fs         Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142306  fs         [zfs] [panic] ZFS drive (from OSX Leopard) causes two
o kern/142068  fs         [ufs] BSD labels are got deleted spontaneously
o kern/141897  fs         [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463  fs         [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305  fs         [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091  fs         [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086  fs         [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010  fs         [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888  fs         [zfs] boot fail from zfs root while the pool resilveri
o kern/140661  fs         [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640  fs         [zfs] snapshot crash
o kern/140068  fs         [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725  fs         [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715  fs         [zfs] vfs.numvnodes leak on busy zfs
p bin/139651   fs         [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597  fs         [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564  fs         [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407  fs         [smbfs] [panic] smb mount causes system crash if remot
o kern/138662  fs         [panic] ffs_blkfree: freeing free block
o kern/138421  fs         [ufs] [patch] remove UFS label limitations
o kern/138202  fs         mount_msdosfs(1) see only 2Gb
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
p kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133174  fs         [msdosfs] [patch] msdosfs must support multibyte inter
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
o kern/131441  fs         [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor" on the mount poin
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130210  fs         [nullfs] Error by check nullfs
f kern/130133  fs         [panic] [zfs] 'kmem_map too small' caused by make clea
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/127787  fs         [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs
f kern/127375  fs         [zfs] If vm.kmem_size_max>"1073741823" then write spee
o bin/127270   fs         fsck_msdosfs(8) may crash if BytesPerSec is zero
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
f kern/126703  fs         [panic] [zfs] _mtx_lock_sleep: recursed on
non-recursi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files f sparc/123566 fs [zfs] zpool import issue: EOVERFLOW o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F f kern/120210 fs [zfs] [panic] reboot after panic: solaris assert: arc_ o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118126 fs [nfs] [patch] Poor NFS server write performance o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: 
snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] [iconv] mount_msdosfs: msdosfs_iconv: Operat o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing 
a samba shar o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64 o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 249 problems total. 
From owner-freebsd-fs@FreeBSD.ORG Mon Sep 26 13:29:33 2011
Message-ID: <4E8079EC.1000803@tefre.com>
Date: Mon, 26 Sep 2011 15:11:08 +0200
From: Erik Stian Tefre
To: Adam Nowacki
References: <4E7F49A7.1020909@platinum.linux.pl>
In-Reply-To: <4E7F49A7.1020909@platinum.linux.pl>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and 3ware controller resets

On 09/25/2011 05:32 PM, Adam Nowacki wrote:
> I have a 20 disk storage system, every now and then a disk dies and
> causes 3ware controller to reset because of disk timeouts. This cuts out
> ZFS from all disks, even healthy ones and the system requires a hard reset.
> Two issues here:
> 1) Why the controller has to reset? That's a completely insane way of
> dealing with drive timeout.
> 2) ZFS not reopening the disk after controller reset.
>
> FreeBSD version: 8.1-RELEASE-p1
>
> /c0 Driver Version = 3.80.06.003
> /c0 Model = 9650SE-16ML
> /c0 Available Memory = 224MB
> /c0 Firmware Version = FE9X 4.10.00.007
> /c0 Bios Version = BE9X 4.08.00.002
> /c0 Boot Loader Version = BL9X 3.08.00.001
[...]

I'd try upgrading from firmware 4.10.00.007 to the current stable
firmware 4.10.00.021. That seems to have solved similar controller
resets and OS/block device hangs for me on several servers. Those
servers are running Linux, by the way, but if there's a weakness in the
old firmware I guess it may affect all operating systems.

Take a look at the .021 firmware release notes. If I remember correctly,
they mention better handling of this kind of drive event.

--
Erik

From owner-freebsd-fs@FreeBSD.ORG Mon Sep 26 14:01:00 2011
From: John Baldwin
To: freebsd-fs@freebsd.org
Date: Mon, 26 Sep 2011 09:50:23 -0400
Message-Id: <201109260950.23373.jhb@freebsd.org>
Cc: Andriy Gapon
Subject: Re: bootloader block cache improvement

On Thursday, September 22, 2011 2:02:22 pm Artem Belevich wrote:
> Hi,
>
> I've had ZFS-only box that boots off 8-drive raidz2 array. I've noticed
> that on this machine it takes noticeably longer to load kernel and
> modules than on a similar box that boots off 1-drive ZFS filesystem.
>
> It turns out that block cache in loader only caches data from one disk
> only and invalidates the cache as soon as we read from another
> drive. With ZFS reading from multiple drives when filesystem is on a
> raidz pool the cache was effectively useless in that scenario. I've got
> literally 0 hits reported by bcachestat command.

One thing to keep in mind is that the cache was designed to optimize the
experience with floppy disks. There is one implication from this which is
non-obvious. Floppy drives do not have a reliable signal for "the disk has
changed", so the loader flushes the cache on certain operations (file close,
or at least it used to). I think the logic to flush when the unit changed is
what provided that feature, but it's not as obvious now. It would be nice to
still flush the cache for floppies, but I'm not sure how feasible that would
be to maintain. At the very least you could perhaps add a bcache_flush_unit()
and call that from bd_closedisk() in libi386/biosdisk.c if OD_FLAGS_FLOPPY is
set.

Aside from that, I think the patch is good.

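The suggestion above can be illustrated with a minimal sketch of a block cache whose entries are tagged with the unit they came from, so that one unit can be flushed without discarding blocks from the others. This is a toy model, not the loader's actual bcache code: the names bc_lookup(), bc_insert() and bc_flush_unit(), the slot count and the LRU policy are all assumptions made for illustration. Only bcache_flush_unit(), bd_closedisk() and OD_FLAGS_FLOPPY come from the message itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BC_NSLOTS  8    /* toy cache size, chosen for illustration */
#define BC_BLKSIZE 512  /* bytes per cached block */

/* One cached block, tagged with the unit (disk) it was read from. */
struct bc_entry {
	int      valid;
	int      unit;
	long     blkno;
	unsigned stamp;              /* last-use tick, for LRU eviction */
	char     data[BC_BLKSIZE];
};

static struct bc_entry bc_slots[BC_NSLOTS];
static unsigned bc_tick;

/* Return the cached copy of (unit, blkno), or NULL on a miss. */
const char *
bc_lookup(int unit, long blkno)
{
	for (size_t i = 0; i < BC_NSLOTS; i++) {
		struct bc_entry *e = &bc_slots[i];

		if (e->valid && e->unit == unit && e->blkno == blkno) {
			e->stamp = ++bc_tick;
			return (e->data);
		}
	}
	return (NULL);
}

/*
 * Cache one block. Reuses an existing copy of the same block if there
 * is one, else a free slot, else the least recently used slot.
 */
void
bc_insert(int unit, long blkno, const char *data)
{
	struct bc_entry *victim = NULL;

	assert(data != NULL);
	for (size_t i = 0; i < BC_NSLOTS; i++) {
		struct bc_entry *e = &bc_slots[i];

		if (e->valid && e->unit == unit && e->blkno == blkno) {
			victim = e;
			break;
		}
		if (victim == NULL || (victim->valid &&
		    (!e->valid || e->stamp < victim->stamp)))
			victim = e;
	}
	victim->valid = 1;
	victim->unit = unit;
	victim->blkno = blkno;
	victim->stamp = ++bc_tick;
	memcpy(victim->data, data, BC_BLKSIZE);
}

/* Drop every block cached for one unit; returns how many were dropped. */
int
bc_flush_unit(int unit)
{
	int dropped = 0;

	for (size_t i = 0; i < BC_NSLOTS; i++) {
		if (bc_slots[i].valid && bc_slots[i].unit == unit) {
			bc_slots[i].valid = 0;
			dropped++;
		}
	}
	return (dropped);
}
```

In this model, reads from several drives can share the cache, and a close of a floppy unit would call bc_flush_unit() for that unit only, which is roughly the role proposed for bcache_flush_unit() in bd_closedisk() when OD_FLAGS_FLOPPY is set.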
--
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG Mon Sep 26 16:53:37 2011
Message-ID: <1317056016.52758.YahooMailClassic@web121207.mail.ne1.yahoo.com>
Date: Mon, 26 Sep 2011 09:53:36 -0700 (PDT)
From: Jason Usher
To: Peter Jeremy
In-Reply-To: <20110923085408.GA16726@server.vk2pj.dyndns.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: redux: 48 or 96 sata3 paths ... specific ZFS hardware proposal

Peter, et al.,

--- On Fri, 9/23/11, Peter Jeremy wrote:

> >About the only possible downside I can see to this board is that it
> >only takes 192GB of ram, but that is a LOT of ram ...
>
> Not for a 60TB ZFS system with lots (metadata in particular) activity.
> Even if you don't think you need 192GB, I suggest you populate the
> board with the largest DIMMs you can afford and leave a bank of slots
> free so you can fit more RAM in the future.

Ok, I decided to go with the X8DAH+-F motherboard. It still has 7 PCIe
slots and goes to 288 GB. That should be scalable well into the future.

Now I am researching the drives - people seem to have fairly strong
opinions on the 5k rpm Hitachi 3TB vs. the 7k rpm Hitachi 3TB drives.
I'd like to optimize for performance, so I think it will be the 7200s....

From owner-freebsd-fs@FreeBSD.ORG Tue Sep 27 05:36:29 2011
In-Reply-To: <86k48wz3mc.fsf@kopusha.home.net>
References: <201108102152.p7ALqUl4075207@red.freebsd.org> <201108102200.p7AM0Nu9026320@freefall.freebsd.org> <86k48wz3mc.fsf@kopusha.home.net>
Date: Tue, 27 Sep 2011 07:36:28 +0200
From: Robert Millan
To: Mikolaj Golub
Cc: Josef Karthauser, freebsd-bugs@freebsd.org, Adrian Chadd, freebsd-fs@freebsd.org, FreeBSD-gnats-submit@freebsd.org
Subject: Re: kern/159663: sockets don't work though nullfs mounts

2011/9/25 Mikolaj Golub:
> As a proof of concept below is a patch that implements it.

This works very well, I'm currently using your patch to run X11 over a
nullfs-mounted /tmp.

> The issues with this approach I see so far:
>
> - we need an additional flag for namei;

What does this involve?

From owner-freebsd-fs@FreeBSD.ORG Tue Sep 27 06:49:13 2011
From: Mikolaj Golub
To: Robert Millan
Organization: TOA Ukraine
References: <201108102152.p7ALqUl4075207@red.freebsd.org> <201108102200.p7AM0Nu9026320@freefall.freebsd.org> <86k48wz3mc.fsf@kopusha.home.net>
Date: Tue, 27 Sep 2011 09:49:08 +0300
In-Reply-To: (Robert Millan's message of "Tue, 27 Sep 2011 07:36:28 +0200")
Message-ID: <86litajx97.fsf@in138.ua3>
Cc: Josef Karthauser, freebsd-fs@freebsd.org, freebsd-bugs@freebsd.org, Adrian Chadd, Mikolaj Golub, FreeBSD-gnats-submit@freebsd.org
Subject: Re: kern/159663: sockets don't work though nullfs mounts

On Tue, 27 Sep 2011 07:36:28 +0200 Robert Millan wrote:

 RM> 2011/9/25 Mikolaj Golub:
 >> As a proof of concept below is a patch that implements it.

 RM> This works very well, I'm currently using your patch to run X11 over a
 RM> nullfs-mounted /tmp.

 >> The issues with this approach I see so far:
 >>
 >> - we need an additional flag for namei;

 RM> What does this involve?

Well, adding yet another flag just to handle this one case might not be a
very good idea :-) But it is actually possible to do without the additional
flag, with the only hack in the nullfs code: in lookup and create, return the
lower vnode if it is a socket, as in the patch below. It works for me, but I
have not tested it much and have not yet checked whether there are use cases
where this has an undesirable effect.

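For anyone wanting to exercise the behavior from userland, here is a minimal sketch, under stated assumptions, of a check that binds a UNIX-domain socket at one path and then tries to connect to it through a second path. The helper name unix_socket_reachable() and both paths are hypothetical; on a real test system the second path would name the same socket file as seen through the nullfs mount. Given the PR title, that connect would be expected to fail on an affected kernel, though the exact error is not stated in this thread.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Fill in a sockaddr_un for path; returns -1 if the path does not fit. */
static int
make_addr(struct sockaddr_un *sun, const char *path)
{
	if (strlen(path) >= sizeof(sun->sun_path))
		return (-1);
	memset(sun, 0, sizeof(*sun));
	sun->sun_family = AF_UNIX;
	strcpy(sun->sun_path, path);
	return (0);
}

/*
 * Hypothetical helper: listen on bind_path, then connect via
 * connect_path. Returns 0 if the connect succeeded, -1 otherwise.
 * The socket file at bind_path is removed before and after the check.
 */
int
unix_socket_reachable(const char *bind_path, const char *connect_path)
{
	struct sockaddr_un sun;
	int srv, cli = -1, rv = -1;

	assert(bind_path != NULL && connect_path != NULL);
	if (make_addr(&sun, bind_path) != 0)
		return (-1);
	(void)unlink(bind_path);
	srv = socket(AF_UNIX, SOCK_STREAM, 0);
	if (srv < 0)
		return (-1);
	if (bind(srv, (struct sockaddr *)&sun, sizeof(sun)) != 0 ||
	    listen(srv, 1) != 0)
		goto out;
	if (make_addr(&sun, connect_path) != 0)
		goto out;
	cli = socket(AF_UNIX, SOCK_STREAM, 0);
	if (cli < 0)
		goto out;
	/* The connection is queued on the listener; no accept() is needed. */
	if (connect(cli, (struct sockaddr *)&sun, sizeof(sun)) == 0)
		rv = 0;
out:
	if (cli >= 0)
		close(cli);
	close(srv);
	(void)unlink(bind_path);
	return (rv);
}
```

With /tmp null-mounted at some other directory, one would pass the socket's path under /tmp as bind_path and the corresponding path under the nullfs mount point as connect_path, and compare the result before and after applying the patch.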
--
Mikolaj Golub

[Attachment: nullfs.VSOCK.patch, text/x-diff]

Index: sys/fs/nullfs/null_vnops.c
===================================================================
--- sys/fs/nullfs/null_vnops.c	(revision 225757)
+++ sys/fs/nullfs/null_vnops.c	(working copy)
@@ -365,16 +365,38 @@ null_lookup(struct vop_lookup_args *ap)
 			vrele(lvp);
 		} else {
 			error = null_nodeget(dvp->v_mount, lvp, &vp);
-			if (error)
+			if (error) {
 				vput(lvp);
-			else
+			} else if (vp->v_type == VSOCK) {
+				vref(lvp);
+				vrele(vp);
+				*ap->a_vpp = lvp;
+			} else {
 				*ap->a_vpp = vp;
+			}
 		}
 	}
 	return (error);
 }
 
 static int
+null_create(struct vop_create_args *ap)
+{
+	struct vnode *vp, *lvp;
+	int retval;
+
+	retval = null_bypass(&ap->a_gen);
+	vp = *ap->a_vpp;
+	if (retval == 0 && vp->v_type == VSOCK) {
+		lvp = NULLVPTOLOWERVP(vp);
+		vref(lvp);
+		vrele(vp);
+		*ap->a_vpp = lvp;
+	}
+	return (retval);
+}
+
+static int
 null_open(struct vop_open_args *ap)
 {
 	int retval;
@@ -826,6 +848,7 @@ struct vop_vector null_vnodeops = {
 	.vop_accessx =		null_accessx,
 	.vop_advlockpurge =	vop_stdadvlockpurge,
 	.vop_bmap =		VOP_EOPNOTSUPP,
+	.vop_create =		null_create,
 	.vop_getattr =		null_getattr,
 	.vop_getwritemount =	null_getwritemount,
 	.vop_inactive =		null_inactive,

From owner-freebsd-fs@FreeBSD.ORG Tue Sep 27 19:17:43 2011
From: David P Discher
In-Reply-To: <4E7F61A2.5060908@platinum.linux.pl>
Date: Tue, 27 Sep 2011 12:17:38 -0700
Message-Id: <299DCA15-FD90-4238-9DD9-C1B8F94CC726@bitgravity.com>
References: <4E7F49A7.1020909@platinum.linux.pl> <20110925165946.GA42447@icarus.home.lan> <4E7F61A2.5060908@platinum.linux.pl>
To: Adam Nowacki
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and 3ware controller resets

We use a lot of this exact 3ware controller (and firmware) with zfs and
8.1-RELEASE. Though I have seen controller resets, I have not seen this
exact error with zfs and 3ware. We do 2x RAID-1, and a 14-disk RAID 5,50
or 10, and the controller seems to survive disk failures in RAID config
with ZFS. However, sometimes we will hit the "calru" ... time-went
backwards while the controller resets and the kernel tries to figure
things out. Of course this is likely service impacting.

When multiple controller resets are detected, we have typically declared
the card as bad, and RMA'd or replaced the card. So far, our VAR has not
rejected replacing the card while in the standard 3-year warranty.

I would recommend replacing the controller.

HOWEVER - I have seen this ZFS behavior with a different controller/HBA
setup. We have older Xyratex 5400-series 48 bay what-evers connected to
the freebsd host via fiber channel and an LSI 7404EP HBA (mpt). Legacy
setups exported LUN/arrays from the Xyratex at RAID-5, and then
gstripe'ed to form single volumes. Setups upgraded to the ZFS setup, of
course, do away with the gstripe.

With gstripe (and ufs2), when a Xyratex controller crashes and resets,
geom gets confused, produces read/write errors, and eventually panics.
In the ZFS world, these failures are almost silent; zpool never reports
an error (we're striping the luns in the zpool, no raidz or raidz2).
Eventually all the processes accessing the disk hang in D-state, and the
machine grinds to a halt.

The recommendation from the community was to use gmountver(8) from -head
and use those vdevs in the zpool. We got it back ported to 8.1.
However, there were some issues with geom-tasting order, and what vdevs
will get picked up by the zpool. I have since abandoned this testing.
We were never able to get multi-pathing working under freebsd.

---
David P. Discher
dpd@bitgravity.com * AIM: bgDavidDPD
BITGRAVITY * http://www.bitgravity.com

On Sep 25, 2011, at 10:15 AM, Adam Nowacki wrote:

> On 2011-09-25 18:59, Jeremy Chadwick wrote:
>> On Sun, Sep 25, 2011 at 05:32:55PM +0200, Adam Nowacki wrote:
>>> I have a 20 disk storage system, every now and then a disk dies and
>>> causes 3ware controller to reset because of disk timeouts. This cuts
>>> out ZFS from all disks, even healthy ones and the system requires a
>>> hard reset.
>>> Two issues here:
>>> 1) Why the controller has to reset? That's a completely insane way of
>>> dealing with drive timeout.
>>> 2) ZFS not reopening the disk after controller reset.
>>>
>>> FreeBSD version: 8.1-RELEASE-p1
>>>
>>> /c0 Driver Version = 3.80.06.003
>>> /c0 Model = 9650SE-16ML
>>> /c0 Available Memory = 224MB
>>> /c0 Firmware Version = FE9X 4.10.00.007
>>> /c0 Bios Version = BE9X 4.08.00.002
>>> /c0 Boot Loader Version = BL9X 3.08.00.001
...
>
> I mean that not only the timeouting disk is affected but all disks
> that are on the controller. Every single one stops working for ZFS, you
> can see that in the zpool status output, each disk reports read and
> write errors. zpool clear won't fix it, ZFS simply loses access to all
> disks on the controller while for example dd can read from each disk
> just fine. Also on the same controller I have a disk with UFS
> filesystem, mounted when the controller resets, this survives the reset
> as if it didn't even happen. For ZFS the only fix is to hard reset the
> whole system.

From owner-freebsd-fs@FreeBSD.ORG Wed Sep 28 00:19:30 2011
Message-Id: <201109280019.p8S0JVUW067163@chez.mckusick.com>
To: Garrett Cooper
Date: Tue, 27 Sep 2011 17:19:31 -0700
From: Kirk McKusick
Cc: freebsd-fs@freebsd.org, Xin LI, bug-followup@freebsd.org
Subject: Re: PR kern/161016 Need to force sync(2) before umounting UFS1 filesystems?
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Sep 2011 00:19:30 -0000 > Date: Sun, 25 Sep 2011 12:07:18 -0700 > From: Garrett Cooper > To: lev@freebsd.org > Cc: freebsd-fs@freebsd.org, Xin LI , current@freebsd.org > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? > > 2011/9/25 Lev Serebryakov : > > Hello, Garrett. > > You wrote on 25 September 2011, 12:06:05: > > > >> Talking to Xin yesterday, he was convinced that this was a > >> filesystem//kern bug. Before I file a PR, I'm wondering if anyone else > >> has seen this issue.. > > Yes, and I posted message about it in embedded@ (Message-ID > > <1175277342.20110821215629@serebryakov.spb.ru>), I've got additional > > question from Warner Losh about base (underlying) file system, without > > any additional reaction. > > Thanks for the comments Adrian and Lev! I've filed PR 161016 to track > the issue, because it might be due to changes in the SU code, md, or a > subtle race condition in umount (highly unlikely, but it's been > noted). > -Garrett > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" I have taken responsibility for working on this bug report (PR kern/161016).
I propose the following change to correct it:

Index: sys/kern/vfs_mount.c
===================================================================
--- sys/kern/vfs_mount.c (revision 225807)
+++ sys/kern/vfs_mount.c (working copy)
@@ -1227,18 +1227,6 @@
                 mp->mnt_kern_flag |= MNTK_UNMOUNTF;
         error = 0;
         if (mp->mnt_lockref) {
-                if ((flags & MNT_FORCE) == 0) {
-                        mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
-                            MNTK_UNMOUNTF);
-                        if (mp->mnt_kern_flag & MNTK_MWAIT) {
-                                mp->mnt_kern_flag &= ~MNTK_MWAIT;
-                                wakeup(mp);
-                        }
-                        MNT_IUNLOCK(mp);
-                        if (coveredvp)
-                                VOP_UNLOCK(coveredvp, 0);
-                        return (EBUSY);
-                }
                 mp->mnt_kern_flag |= MNTK_DRAINING;
                 error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
                     "mount drain", 0);

The things to check for are:

1) That it fixes the EBUSY on unmount.

2) That it does not cause unmount to hang.

I would appreciate feedback as to whether this fix helps.

Kirk McKusick

From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 01:36:49 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AB42D106566B; Thu, 29 Sep 2011 01:36:49 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 470348FC0A; Thu, 29 Sep 2011 01:36:48 +0000 (UTC) Received: from alf.home (alf.kiev.zoral.com.ua [10.1.1.177]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p8T1aZVY088478 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 29 Sep 2011 04:36:35 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from alf.home (kostik@localhost [127.0.0.1]) by alf.home (8.14.5/8.14.5) with ESMTP id p8T1aZPm050811; Thu, 29 Sep 2011 04:36:35 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by alf.home (8.14.5/8.14.5/Submit) id p8T1aZUG050810; Thu, 29 Sep 2011 04:36:35 +0300 (EEST) (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: alf.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 29 Sep 2011 04:36:35 +0300 From: Kostik Belousov To: Kirk McKusick Message-ID: <20110929013635.GG1511@deviant.kiev.zoral.com.ua> References: <201109280019.p8S0JVUW067163@chez.mckusick.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ssSfcPohcXNs3135" Content-Disposition: inline In-Reply-To: <201109280019.p8S0JVUW067163@chez.mckusick.com> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.9 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: Garrett Cooper , freebsd-fs@freebsd.org, Xin LI , bug-followup@freebsd.org Subject: Re: PR kern/161016 Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2011 01:36:49 -0000 --ssSfcPohcXNs3135 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Sep 27, 2011 at 05:19:31PM -0700, Kirk McKusick wrote: > > Date: Sun, 25 Sep 2011 12:07:18 -0700 > > From: Garrett Cooper > > To: lev@freebsd.org > > Cc: freebsd-fs@freebsd.org, Xin LI , current@freebsd.org > > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? > > > > 2011/9/25 Lev Serebryakov : > > > Hello, Garrett. > > > You wrote on 25 September 2011, 12:06:05: > > > > > >> Talking to Xin yesterday, he was convinced that this was a > > >> filesystem//kern bug. Before I file a PR, I'm wondering if anyone else > > >> has seen this issue..
> > > Yes, and I posted message about it in embedded@ (Message-ID > > > <1175277342.20110821215629@serebryakov.spb.ru>), I've got additional > > > question from Warner Losh about base (underlying) file system, without > > > any additional reaction. > > > > Thanks for the comments Adrian and Lev! I've filed PR 161016 to track > > the issue, because it might be due to changes in the SU code, md, or a > > subtle race condition in umount (highly unlikely, but it's been > > noted). > > -Garrett > > _______________________________________________ > > freebsd-fs@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > > I have taken responsibility for working on this bug report (PR kern/161016). > > I propose the following change to correct it:
>
> Index: sys/kern/vfs_mount.c
> ===================================================================
> --- sys/kern/vfs_mount.c (revision 225807)
> +++ sys/kern/vfs_mount.c (working copy)
> @@ -1227,18 +1227,6 @@
>                  mp->mnt_kern_flag |= MNTK_UNMOUNTF;
>          error = 0;
>          if (mp->mnt_lockref) {
> -                if ((flags & MNT_FORCE) == 0) {
> -                        mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
> -                            MNTK_UNMOUNTF);
> -                        if (mp->mnt_kern_flag & MNTK_MWAIT) {
> -                                mp->mnt_kern_flag &= ~MNTK_MWAIT;
> -                                wakeup(mp);
> -                        }
> -                        MNT_IUNLOCK(mp);
> -                        if (coveredvp)
> -                                VOP_UNLOCK(coveredvp, 0);
> -                        return (EBUSY);
> -                }
>                  mp->mnt_kern_flag |= MNTK_DRAINING;
>                  error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
>                      "mount drain", 0);
>
> The things to check for are: > > 1) That it fixes the EBUSY on unmount. > > 2) That it does not cause unmount to hang. > > I would appreciate feedback as to whether this fix helps.
I think the item 2) should be tested mostly on the hung NFS server. I understand what you are doing, you do not want a transient mount point busy caller to fail the unmount. But my belief is that this is the intended mode of operation for non-forced unmounts. As I compare the original bug report and your change, the reason that UFS gives spurious EBUSY on soft unmounts is that SU code busies mp around some processing. Is my guess right ? Then, restoring some amount of sync(2) before the unmount would be useful, please see r222466 for the most likely reason why the issue appeared. Might be, the best route would be to add a kludge mnt_flag that request dounmount() to do a VFS_SYNC() before checking for the busy holder ? --ssSfcPohcXNs3135 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iEYEARECAAYFAk6Dy6MACgkQC3+MBN1Mb4hhwQCgzuj/4OgfYVYgROYIjridzOs5 wooAnje938vnGjgW9UincSwhn0+Sj7Fq =4iJ6 -----END PGP SIGNATURE----- --ssSfcPohcXNs3135-- From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 02:19:54 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 16A65106564A; Thu, 29 Sep 2011 02:19:54 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com [209.85.216.175]) by mx1.freebsd.org (Postfix) with ESMTP id 90BE48FC1A; Thu, 29 Sep 2011 02:19:53 +0000 (UTC) Received: by qyk10 with SMTP id 10so3403735qyk.13 for ; Wed, 28 Sep 2011 19:19:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=wN/t/EtuC6GgjdP3mnmpJrSXXlQfyFD/2zCxSnlgdco=; b=HFpvz0l5UENN5laaQRB8In6A1zMTfmMi24i3yfyb5w0cVbG5xup/WLEYXXgmlR9+2o /Qdj7mPGyI/bn8YI4bAtxRqBgyr8rC7S8QM0IimnFtTKAi9KQFshxklL2205glwUPAhm 
LNum7FXvaVbZ9aIuNfVzTW3EI5Bk37gWI6dmw= MIME-Version: 1.0 Received: by 10.224.217.137 with SMTP id hm9mr7564312qab.124.1317262792677; Wed, 28 Sep 2011 19:19:52 -0700 (PDT) Received: by 10.224.74.82 with HTTP; Wed, 28 Sep 2011 19:19:52 -0700 (PDT) In-Reply-To: <20110929013635.GG1511@deviant.kiev.zoral.com.ua> References: <201109280019.p8S0JVUW067163@chez.mckusick.com> <20110929013635.GG1511@deviant.kiev.zoral.com.ua> Date: Wed, 28 Sep 2011 19:19:52 -0700 Message-ID: From: Garrett Cooper To: Kostik Belousov Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: Kirk McKusick , freebsd-fs@freebsd.org, Xin LI , bug-followup@freebsd.org Subject: Re: PR kern/161016 Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2011 02:19:54 -0000 On Wed, Sep 28, 2011 at 6:36 PM, Kostik Belousov wrote: > On Tue, Sep 27, 2011 at 05:19:31PM -0700, Kirk McKusick wrote: >> > Date: Sun, 25 Sep 2011 12:07:18 -0700 >> > From: Garrett Cooper >> > To: lev@freebsd.org >> > Cc: freebsd-fs@freebsd.org, Xin LI , current@freebsd.org >> > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? >> > >> > 2011/9/25 Lev Serebryakov : >> > > Hello, Garrett. >> > > You wrote on 25 September 2011, 12:06:05: >> > > >> > >> Talking to Xin yesterday, he was convinced that this was a >> > >> filesystem//kern bug. Before I file a PR, I'm wondering if anyone else >> > >> has seen this issue.. >> > > Yes, and I posted message about it in embedded@ (Message-ID >> > > <1175277342.20110821215629@serebryakov.spb.ru>), I've got additional >> > > question from Warner Losh about base (underlying) file system, without >> > > any additional reaction.
>> > >> > Thanks for the comments Adrian and Lev! I've filed PR 161016 to track >> > the issue, because it might be due to changes in the SU code, md, or a >> > subtle race condition in umount (highly unlikely, but it's been >> > noted). >> > -Garrett >> > _______________________________________________ >> > freebsd-fs@freebsd.org mailing list >> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs >> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >> >> I have taken responsibility for working on this bug report (PR kern/161016). >> >> I propose the following change to correct it:
>>
>> Index: sys/kern/vfs_mount.c
>> ===================================================================
>> --- sys/kern/vfs_mount.c (revision 225807)
>> +++ sys/kern/vfs_mount.c (working copy)
>> @@ -1227,18 +1227,6 @@
>>                  mp->mnt_kern_flag |= MNTK_UNMOUNTF;
>>          error = 0;
>>          if (mp->mnt_lockref) {
>> -                if ((flags & MNT_FORCE) == 0) {
>> -                        mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
>> -                            MNTK_UNMOUNTF);
>> -                        if (mp->mnt_kern_flag & MNTK_MWAIT) {
>> -                                mp->mnt_kern_flag &= ~MNTK_MWAIT;
>> -                                wakeup(mp);
>> -                        }
>> -                        MNT_IUNLOCK(mp);
>> -                        if (coveredvp)
>> -                                VOP_UNLOCK(coveredvp, 0);
>> -                        return (EBUSY);
>> -                }
>>                  mp->mnt_kern_flag |= MNTK_DRAINING;
>>                  error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
>>                      "mount drain", 0);
>>
>> The things to check for are: >> >> 1) That it fixes the EBUSY on unmount. >> >> 2) That it does not cause unmount to hang. >> >> I would appreciate feedback as to whether this fix helps. > > I think the item 2) should be tested mostly on the hung NFS server. > > I understand what you are doing, you do not want a transient mount point > busy caller to fail the unmount. But my belief is that this is the > intended mode of operation for non-forced unmounts. > > As I compare the original bug report and your change, the reason that > UFS gives spurious EBUSY on soft unmounts is that SU code busies mp > around some processing. Is my guess right ? Then, restoring some amount > of sync(2) before the unmount would be useful, please see r222466 for > the most likely reason why the issue appeared. > > Might be, the best route would be to add a kludge mnt_flag that request > dounmount() to do a VFS_SYNC() before checking for the busy holder ?

This would undo some of the changes Attilio added for the locking stuff in r184554. Not sure what the prior behavior was because I only traced back the change a little bit. Thanks, -Garrett 1.
http://svnweb.freebsd.org/base/head/sys/kern/vfs_mount.c?revision=184554&view=markup

From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 06:20:27 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A381B1065670; Thu, 29 Sep 2011 06:20:27 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [70.36.157.235]) by mx1.freebsd.org (Postfix) with ESMTP id 72BC08FC0C; Thu, 29 Sep 2011 06:20:27 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p8T6KGCl057169; Wed, 28 Sep 2011 23:20:16 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201109290620.p8T6KGCl057169@chez.mckusick.com> To: attilio@freebsd.org X-URL: http://WWW.McKusick.COM/ Date: Wed, 28 Sep 2011 23:20:16 -0700 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: Garrett Cooper , freebsd-fs@freebsd.org, Xin LI Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Kirk McKusick List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2011 06:20:27 -0000

Hi Attilio,

I have been looking into the problem described below and since you appear to be the person that put in the change in question, I would like to get your opinion on what (if anything) should be changed here.

A bit of background. Historically (i.e., since UNIX was written and up until this change went in) the unmount command would always fully sync out the filesystem and return with it unmounted.
The exception was if some file or directory in the filesystem was actively being used. In this case, the unmount would fail with EBUSY. After this change, an unmount will fail with EBUSY if there are dirty blocks that need to be synced, even if there are no active users of the filesystem. This affects UFS (where the soft-updates code may be doing background tasks), NFS (where the NFS daemon may be doing background tasks), and ZFS (where its syncer may be doing background tasks). The only way to reliably unmount an idle filesystem is to loop doing sync's and unmount attempts until it succeeds. Now it is possible to get the unmount to succeed by doing a forcible unmount, but that is often not what is desired as that will kill any legitimate users on the filesystem. My argument below is that we should revert to the historic semantics of unmount which is to always succeed unless there are active users on the filesystem. So, please look over my suggested change and let me know what you think. Kirk McKusick =-=-= Date: Thu, 29 Sep 2011 04:36:35 +0300 From: Kostik Belousov To: Kirk McKusick Cc: Garrett Cooper , freebsd-fs@freebsd.org, Xin LI Subject: Re: PR kern/161016 Need to force sync(2) before umounting UFS1 filesystems? On Tue, Sep 27, 2011 at 05:19:31PM -0700, Kirk McKusick wrote: > > Date: Sun, 25 Sep 2011 12:07:18 -0700 > > From: Garrett Cooper > > To: lev@freebsd.org > > Cc: freebsd-fs@freebsd.org, Xin LI , current@freebsd.org > > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? > > > > 2011/9/25 Lev Serebryakov : > > > Hello, Garrett. > > > You wrote on 25 September 2011, 12:06:05: > > > > > >> Talking to Xin yesterday, he was convinced that this was a > > >> filesystem//kern bug. Before I file a PR, I'm wondering if anyone else > > >> has seen this issue..
> > > Yes, and I posted message about it in embedded@ (Message-ID > > > <1175277342.20110821215629@serebryakov.spb.ru>), I've got additional > > > question from Warner Losh about base (underlying) file system, without > > > any additional reaction. > > > > Thanks for the comments Adrian and Lev! I've filed PR 161016 to track > > the issue, because it might be due to changes in the SU code, md, or a > > subtle race condition in umount (highly unlikely, but it's been > > noted). > > -Garrett > > _______________________________________________ > > freebsd-fs@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > > I have taken responsibility for working on this bug report (PR kern/161016). > > I propose the following change to correct it:
>
> Index: sys/kern/vfs_mount.c
> ===================================================================
> --- sys/kern/vfs_mount.c (revision 225807)
> +++ sys/kern/vfs_mount.c (working copy)
> @@ -1227,18 +1227,6 @@
>                  mp->mnt_kern_flag |= MNTK_UNMOUNTF;
>          error = 0;
>          if (mp->mnt_lockref) {
> -                if ((flags & MNT_FORCE) == 0) {
> -                        mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
> -                            MNTK_UNMOUNTF);
> -                        if (mp->mnt_kern_flag & MNTK_MWAIT) {
> -                                mp->mnt_kern_flag &= ~MNTK_MWAIT;
> -                                wakeup(mp);
> -                        }
> -                        MNT_IUNLOCK(mp);
> -                        if (coveredvp)
> -                                VOP_UNLOCK(coveredvp, 0);
> -                        return (EBUSY);
> -                }
>                  mp->mnt_kern_flag |= MNTK_DRAINING;
>                  error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
>                      "mount drain", 0);
>
> The things to check for are: > > 1) That it fixes the EBUSY on unmount. > > 2) That it does not cause unmount to hang. > > I would appreciate feedback as to whether this fix helps.

I think the item 2) should be tested mostly on the hung NFS server. I understand what you are doing, you do not want a transient mount point busy caller to fail the unmount.
But my belief is that this is the intended mode of operation for non-forced unmounts. As I compare the original bug report and your change, the reason that UFS gives spurious EBUSY on soft unmounts is that SU code busies mp around some processing. Is my guess right ? Then, restoring some amount of sync(2) before the unmount would be useful, please see r222466 for the most likely reason why the issue appeared. Might be, the best route would be to add a kludge mnt_flag that request dounmount() to do a VFS_SYNC() before checking for the busy holder ? From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 10:04:26 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DCC7F106564A for ; Thu, 29 Sep 2011 10:04:26 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 4F4E28FC16 for ; Thu, 29 Sep 2011 10:04:25 +0000 (UTC) Received: by wyj26 with SMTP id 26so128806wyj.13 for ; Thu, 29 Sep 2011 03:04:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=Kw9hO6oh+/Fh2RRPEikdRSS91q+h0XV4Sp+BPVHZC2w=; b=YdIlrWY/83t5vhl/qKdMdkr0Hc5naplwNZup2iV8P3ZSZSsOQRYLS1BHX/UlBzwbcd ZGD74nJd9jVfGXu5RhaZvEq+PTfrb5nXj5tXa7rQUvw4oa4PXa+tT6nMszA3E0Vm5dR5 YFnEvTIght4RchH7yaM7AzlMJ+xDEb2EcgXlM= MIME-Version: 1.0 Received: by 10.216.203.69 with SMTP id e47mr8197082weo.57.1317290664950; Thu, 29 Sep 2011 03:04:24 -0700 (PDT) Sender: asmrookie@gmail.com Received: by 10.216.182.3 with HTTP; Thu, 29 Sep 2011 03:04:24 -0700 (PDT) In-Reply-To: <201109290620.p8T6KGCl057169@chez.mckusick.com> References: <201109290620.p8T6KGCl057169@chez.mckusick.com> Date: Thu, 29 Sep 2011 12:04:24 +0200 X-Google-Sender-Auth: 
bKv4SGopmXn09ZjMvp037WTWOW8 Message-ID: From: Attilio Rao To: Kirk McKusick Content-Type: text/plain; charset=UTF-8 Cc: Garrett Cooper , freebsd-fs@freebsd.org, Xin LI Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2011 10:04:26 -0000 2011/9/29 Kirk McKusick : > Hi Attilio, > > I have been looking into the problem described below and since you > appear to be the person that put in the change in question, I would > like to get you opinion on what (if anything) should be changed here. Kirk, please note that I didn't add/change anything wrt. that codepath. In the old code it was present a lockmgr() acquisition with LK_DRAIN and LK_NOWAIT. This means that if the lockmgr() lock on the struct mount was already held by any other consumer it was going to fallback in the codepath you outlined in the patch immediately, rather than just sleeping (and note that LK_NOWAIT was just passed in the case of a non-forced unmount). Said that, I don't really have an objection with making the forced unmount case as the default, but I still didn't go through the whole thread you outlined and I don't have any context on it, thus I'm not sure if this is the right approach or not. If you want to share more context on the problem you are trying to solve by switching that policy we may discuss this too, but in general I don't have a problem about adopting forced unmount policy on unmount for all the cases. Attilio -- Peace can only be achieved by understanding - A. 
Einstein From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 11:30:16 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 80C2E1065673 for ; Thu, 29 Sep 2011 11:30:16 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 59DAB8FC13 for ; Thu, 29 Sep 2011 11:30:16 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p8TBUG44089704 for ; Thu, 29 Sep 2011 11:30:16 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p8TBUGQZ089701; Thu, 29 Sep 2011 11:30:16 GMT (envelope-from gnats) Date: Thu, 29 Sep 2011 11:30:16 GMT Message-Id: <201109291130.p8TBUGQZ089701@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Gavin Atkinson Cc: Subject: Re: kern/159971: [ffs] [panic] panic with soft updates journaling during load testing X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Gavin Atkinson List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2011 11:30:16 -0000 The following reply was made to PR kern/159971; it has been noted by GNATS. From: Gavin Atkinson To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/159971: [ffs] [panic] panic with soft updates journaling during load testing Date: Thu, 29 Sep 2011 12:14:21 +0100 Regression test for this PR committed as r225871. 
From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 14:10:54 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8C4531065675 for ; Thu, 29 Sep 2011 14:10:54 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 2E4828FC1A for ; Thu, 29 Sep 2011 14:10:54 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:906c:6af3:5301:18c6]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id D1B604AC1C for ; Thu, 29 Sep 2011 18:10:52 +0400 (MSD) Date: Thu, 29 Sep 2011 18:10:48 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <736310478.20110929181048@serebryakov.spb.ru> To: freebsd-fs@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: Subject: Random access benchamr with different wwights for different files? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2011 14:10:54 -0000 Hello, Freebsd-fs. Is here any FS benchmark with random access test, which emulates different weight for different files? Real life examples of such access is big file server with multiple clients, where some files are actual and some are historic or big torrent box with torrents of different popularity. Read-only is Ok. Auto-generated weights are Ok (for example, "normal" distribution of probabilities with given sigma). 
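A minimal sketch of the access pattern asked about above, in Python, and not an existing benchmark tool. It assumes read-only random block reads where each file's probability of being picked follows a half-normal curve over its popularity rank, with sigma given in ranks. The names popularity_weights and weighted_random_reads are hypothetical, made up for this illustration. Reads go through the buffer cache here, so a real measurement would need a data set larger than RAM or some other way of defeating caching.

```python
import math
import os
import random
import tempfile
import time


def popularity_weights(n_files, sigma):
    """Half-normal weights: rank 0 is the hottest file, colder with rank."""
    return [math.exp(-0.5 * (rank / sigma) ** 2) for rank in range(n_files)]


def weighted_random_reads(paths, weights, n_reads, block_size=4096, seed=None):
    """Read n_reads random blocks, picking the file for each read by weight."""
    rng = random.Random(seed)
    sizes = [os.path.getsize(p) for p in paths]
    hits = [0] * len(paths)
    bytes_read = 0
    start = time.perf_counter()
    for idx in rng.choices(range(len(paths)), weights=weights, k=n_reads):
        # Pick a block-aligned offset inside the chosen file.
        n_blocks = max(sizes[idx] // block_size, 1)
        offset = rng.randrange(n_blocks) * block_size
        with open(paths[idx], "rb", buffering=0) as f:
            f.seek(offset)
            bytes_read += len(f.read(block_size))
        hits[idx] += 1
    return hits, bytes_read, time.perf_counter() - start


if __name__ == "__main__":
    # Toy data set (an assumption for the demo): 8 files of 64 KiB each.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(8):
            path = os.path.join(tmp, "file%02d" % i)
            with open(path, "wb") as f:
                f.write(os.urandom(64 * 1024))
            paths.append(path)
        weights = popularity_weights(len(paths), sigma=2.0)
        hits, nbytes, secs = weighted_random_reads(paths, weights, 2000, seed=1)
        print("reads per file:", hits)
        print("%d bytes in %.3f s" % (nbytes, secs))
```

Passing an explicit weights list instead of popularity_weights() would cover the "some files are current, some are historic" case with hand-picked weights.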
-- 
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 15:31:26 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D86A1106566C; Thu, 29 Sep 2011 15:31:26 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [70.36.157.235]) by mx1.freebsd.org (Postfix) with ESMTP id 625E78FC08; Thu, 29 Sep 2011 15:31:26 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p8TFVRka077669; Thu, 29 Sep 2011 08:31:28 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201109291531.p8TFVRka077669@chez.mckusick.com> To: Attilio Rao In-reply-to: Date: Thu, 29 Sep 2011 08:31:27 -0700 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: Garrett Cooper , freebsd-fs@freebsd.org, Xin LI Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2011 15:31:26 -0000 > Date: Thu, 29 Sep 2011 12:04:24 +0200 > From: Attilio Rao > To: Kirk McKusick > Cc: Garrett Cooper , freebsd-fs@freebsd.org, > Xin LI > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? > > 2011/9/29 Kirk McKusick : > > Hi Attilio, > > > > I have been looking into the problem described below and since you > > appear to be the person that put in the change in question, I would > > like to get you opinion on what (if anything) should be changed here. > > Kirk, > please note that I didn't add/change anything wrt. that codepath.
> > In the old code it was present a lockmgr() acquisition with LK_DRAIN > and LK_NOWAIT. This means that if the lockmgr() lock on the struct > mount was already held by any other consumer it was going to fallback > in the codepath you outlined in the patch immediately, rather than > just sleeping (and note that LK_NOWAIT was just passed in the case of > a non-forced unmount). > > Said that, I don't really have an objection with making the forced > unmount case as the default, but I still didn't go through the whole > thread you outlined and I don't have any context on it, thus I'm not > sure if this is the right approach or not. > > If you want to share more context on the problem you are trying to > solve by switching that policy we may discuss this too, but in general > I don't have a problem about adopting forced unmount policy on unmount > for all the cases. > > Attilio > -- > Peace can only be achieved by understanding - A. Einstein Thanks for providing a bit more of the history on this codepath. Since 9-stable has now been branched, I believe that the best path forward is to check this change into head and let it sit there for several months so that we can get some experience with it. If it causes folks problems we can back it out. If it does not cause problems, then we can MFC it to 9-stable. Does this seem like a reasonable approach? 
Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 15:39:01 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E4F7E106564A; Thu, 29 Sep 2011 15:39:00 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-qy0-f182.google.com (mail-qy0-f182.google.com [209.85.216.182]) by mx1.freebsd.org (Postfix) with ESMTP id 7AF2B8FC08; Thu, 29 Sep 2011 15:39:00 +0000 (UTC) Received: by qyk4 with SMTP id 4so1074208qyk.13 for ; Thu, 29 Sep 2011 08:38:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=sGLdW508RBabRkHp4rwYAs0pkFQfvlgshmMuQTBoG/0=; b=rkDnuyuE93Dbuq7c+8Rl227zFi1VuM4gvGRnAbXo+4LffgtTbqFaUQvwnnVDB6NWx6 M01UM3s2fnYVzUXA9Hkv3NpFPIvP2258CQ03aFNEA+OStlDf2PfhEC8NuY9Mb8B29h7P kJI/MDOvOk+VH+wv2Gwl4zG5R+64LJF/NTkeA= MIME-Version: 1.0 Received: by 10.224.217.137 with SMTP id hm9mr8098428qab.124.1317310739648; Thu, 29 Sep 2011 08:38:59 -0700 (PDT) Received: by 10.224.74.82 with HTTP; Thu, 29 Sep 2011 08:38:59 -0700 (PDT) In-Reply-To: <201109291531.p8TFVRka077669@chez.mckusick.com> References: <201109291531.p8TFVRka077669@chez.mckusick.com> Date: Thu, 29 Sep 2011 08:38:59 -0700 Message-ID: From: Garrett Cooper To: Kirk McKusick Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: Attilio Rao , freebsd-fs@freebsd.org, Xin LI Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? 
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Thu, 29 Sep 2011 15:39:01 -0000

On Thu, Sep 29, 2011 at 8:31 AM, Kirk McKusick wrote:
>> Date: Thu, 29 Sep 2011 12:04:24 +0200
>> From: Attilio Rao
>> To: Kirk McKusick
>> Cc: Garrett Cooper, freebsd-fs@freebsd.org,
>>         Xin LI
>> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
>>
>> 2011/9/29 Kirk McKusick:
>> > Hi Attilio,
>> >
>> > I have been looking into the problem described below and since you
>> > appear to be the person that put in the change in question, I would
>> > like to get your opinion on what (if anything) should be changed here.
>>
>> Kirk,
>> please note that I didn't add/change anything wrt. that codepath.
>>
>> In the old code it was present a lockmgr() acquisition with LK_DRAIN
>> and LK_NOWAIT. This means that if the lockmgr() lock on the struct
>> mount was already held by any other consumer it was going to fallback
>> in the codepath you outlined in the patch immediately, rather than
>> just sleeping (and note that LK_NOWAIT was just passed in the case of
>> a non-forced unmount).
>>
>> Said that, I don't really have an objection with making the forced
>> unmount case as the default, but I still didn't go through the whole
>> thread you outlined and I don't have any context on it, thus I'm not
>> sure if this is the right approach or not.
>>
>> If you want to share more context on the problem you are trying to
>> solve by switching that policy we may discuss this too, but in general
>> I don't have a problem about adopting forced unmount policy on unmount
>> for all the cases.
>>
>> Attilio
>> --
>> Peace can only be achieved by understanding - A. Einstein
>
> Thanks for providing a bit more of the history on this codepath.
> > Since 9-stable has now been branched, I believe that the best path > forward is to check this change into head and let it sit there for > several months so that we can get some experience with it. If it > causes folks problems we can back it out. If it does not cause > problems, then we can MFC it to 9-stable. > > Does this seem like a reasonable approach? I'll give it a quick run through first on some machines this weekend, with NFS, UFS, and ZFS. It seems like this could negatively affect a number of users, so I want to make sure that it passes a smoke test before committing directly to HEAD. Thanks! -Garrett From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 15:40:45 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1629E106564A; Thu, 29 Sep 2011 15:40:45 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 7C0058FC1A; Thu, 29 Sep 2011 15:40:44 +0000 (UTC) Received: by wwe3 with SMTP id 3so1159952wwe.31 for ; Thu, 29 Sep 2011 08:40:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=Cq5V2On8BFRcjOal0Tg5AYNq09UvZrnAZgwMrCIQvX4=; b=rZHIlOPc/qqzTMBBd5aoVrlF/Hn4WtpxNcISE8a/ARCmMP0DWmKUvpiTzHnNeZQ1Zi YIGeEzSS83risBVNi1IJtR4bbkMrdMNu9AP8rASh/0rOU75dKoqdgF4IqPF5EBk17JtP Sola+QseuDI/5dIJxIS9IV+wWukdaytPnGYa0= MIME-Version: 1.0 Received: by 10.216.229.134 with SMTP id h6mr1107735weq.42.1317310843383; Thu, 29 Sep 2011 08:40:43 -0700 (PDT) Sender: asmrookie@gmail.com Received: by 10.216.182.3 with HTTP; Thu, 29 Sep 2011 08:40:43 -0700 (PDT) In-Reply-To: <201109291531.p8TFVRka077669@chez.mckusick.com> References: 
<201109291531.p8TFVRka077669@chez.mckusick.com>
Date: Thu, 29 Sep 2011 17:40:43 +0200
X-Google-Sender-Auth: G_fTv5gjl3Jp5LDF4UsoSZrZ9tM
From: Attilio Rao
To: Kirk McKusick
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Cc: Garrett Cooper, freebsd-fs@freebsd.org, Xin LI
Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Thu, 29 Sep 2011 15:40:45 -0000

2011/9/29 Kirk McKusick:
>> Date: Thu, 29 Sep 2011 12:04:24 +0200
>> From: Attilio Rao
>> To: Kirk McKusick
>> Cc: Garrett Cooper, freebsd-fs@freebsd.org,
>>         Xin LI
>> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
>>
>> 2011/9/29 Kirk McKusick:
>> > Hi Attilio,
>> >
>> > I have been looking into the problem described below and since you
>> > appear to be the person that put in the change in question, I would
>> > like to get your opinion on what (if anything) should be changed here.
>>
>> Kirk,
>> please note that I didn't add/change anything wrt. that codepath.
>>
>> In the old code it was present a lockmgr() acquisition with LK_DRAIN
>> and LK_NOWAIT. This means that if the lockmgr() lock on the struct
>> mount was already held by any other consumer it was going to fallback
>> in the codepath you outlined in the patch immediately, rather than
>> just sleeping (and note that LK_NOWAIT was just passed in the case of
>> a non-forced unmount).
>>
>> Said that, I don't really have an objection with making the forced
>> unmount case as the default, but I still didn't go through the whole
>> thread you outlined and I don't have any context on it, thus I'm not
>> sure if this is the right approach or not.
>>
>> If you want to share more context on the problem you are trying to
>> solve by switching that policy we may discuss this too, but in general
>> I don't have a problem about adopting forced unmount policy on unmount
>> for all the cases.
>>
>> Attilio
>> --
>> Peace can only be achieved by understanding - A. Einstein
>
> Thanks for providing a bit more of the history on this codepath.
>
> Since 9-stable has now been branched, I believe that the best path
> forward is to check this change into head and let it sit there for
> several months so that we can get some experience with it. If it
> causes folks problems we can back it out. If it does not cause
> problems, then we can MFC it to 9-stable.
>
> Does this seem like a reasonable approach?

In general yes, but I'd like to understand why unmount should fail so
much with SU... do we do extended period with vfs_busy()'ed
filesystem?

I need more context here, likely I'd need to look into the PRs too
before to give an informative answer.

Attilio

--
Peace can only be achieved by understanding - A.
Einstein From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 15:55:51 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3DBDE106567B; Thu, 29 Sep 2011 15:55:51 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-qy0-f182.google.com (mail-qy0-f182.google.com [209.85.216.182]) by mx1.freebsd.org (Postfix) with ESMTP id C8F028FC17; Thu, 29 Sep 2011 15:55:50 +0000 (UTC) Received: by qyk4 with SMTP id 4so1097981qyk.13 for ; Thu, 29 Sep 2011 08:55:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=EdAKtOJxanC/LJ7LYwyuI29dQbxlM9defS5miZrM3gc=; b=tVmXgV0qg1Ic0XlbPYWqh1KQ+wOtB7QlGKVA1DgYo08TiFtsoUiwteJ29+XhzcSDtl LxUFjSAe0YJSmbRA6Vxd42VlWt/emXj+B854ukN2ewzZ7ci8vXPFlPGvx4XiVNy840nt JmvkefAdEXZ+u9VyOwhNavbqlnR8MrQuLgF1E= MIME-Version: 1.0 Received: by 10.224.175.82 with SMTP id w18mr8027795qaz.374.1317311749846; Thu, 29 Sep 2011 08:55:49 -0700 (PDT) Received: by 10.224.74.82 with HTTP; Thu, 29 Sep 2011 08:55:49 -0700 (PDT) In-Reply-To: References: <201109291531.p8TFVRka077669@chez.mckusick.com> Date: Thu, 29 Sep 2011 08:55:49 -0700 Message-ID: From: Garrett Cooper To: Attilio Rao Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: Kirk McKusick , freebsd-fs@freebsd.org, Xin LI Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? 
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Thu, 29 Sep 2011 15:55:51 -0000

On Thu, Sep 29, 2011 at 8:40 AM, Attilio Rao wrote:
> 2011/9/29 Kirk McKusick:
>>> Date: Thu, 29 Sep 2011 12:04:24 +0200
>>> From: Attilio Rao
>>> To: Kirk McKusick
>>> Cc: Garrett Cooper, freebsd-fs@freebsd.org,
>>>         Xin LI
>>> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
>>>
>>> 2011/9/29 Kirk McKusick:
>>> > Hi Attilio,
>>> >
>>> > I have been looking into the problem described below and since you
>>> > appear to be the person that put in the change in question, I would
>>> > like to get your opinion on what (if anything) should be changed here.
>>>
>>> Kirk,
>>> please note that I didn't add/change anything wrt. that codepath.
>>>
>>> In the old code it was present a lockmgr() acquisition with LK_DRAIN
>>> and LK_NOWAIT. This means that if the lockmgr() lock on the struct
>>> mount was already held by any other consumer it was going to fallback
>>> in the codepath you outlined in the patch immediately, rather than
>>> just sleeping (and note that LK_NOWAIT was just passed in the case of
>>> a non-forced unmount).
>>>
>>> Said that, I don't really have an objection with making the forced
>>> unmount case as the default, but I still didn't go through the whole
>>> thread you outlined and I don't have any context on it, thus I'm not
>>> sure if this is the right approach or not.
>>>
>>> If you want to share more context on the problem you are trying to
>>> solve by switching that policy we may discuss this too, but in general
>>> I don't have a problem about adopting forced unmount policy on unmount
>>> for all the cases.
>>>
>>> Attilio
>>> --
>>> Peace can only be achieved by understanding - A. Einstein
>>
>> Thanks for providing a bit more of the history on this codepath.
>> >> Since 9-stable has now been branched, I believe that the best path >> forward is to check this change into head and let it sit there for >> several months so that we can get some experience with it. If it >> causes folks problems we can back it out. If it does not cause >> problems, then we can MFC it to 9-stable. >> >> Does this seem like a reasonable approach? > > In general yes, but I'd like to understand why unmount should fail so > much with SU... do we do extended period with vfs_busy()'ed > filesystem? > > I need more context here, likely I'd need to look into the PRs too > before to give an informative answer. The case noted in PR 161016 is that data isn't being completely flushed out to disk (or in this case memory disk) at the end of each nanobsd build when it's creating an md(4) image on the 2nd (and subsequent tries) when creating the disk image; so we have to place hacks in nanobsd.sh to sync out the SU data before umount so we can unmount and destroy the md device to prevent the build from barfing. I know that another company that does something similar to nanobsd disk images in a different way for appliance builds (I don't remember if the md generation scripts syncs out to disk though), and I'm sure that there are more companies that do the same thing. This was a behavior change between 8.x and 9.x that I noticed recently only because I started using nanobsd ~1 month ago (the other company I used to work for might have had this hack in place and I just didn't realize it at the time). 
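For illustration, the sync-before-umount workaround described above could be sketched roughly as follows. This is a minimal sketch, not the actual nanobsd.sh change: the function name umount_with_sync, the retry count, and the one-second pause are assumptions introduced here. Only the two sync calls ahead of umount come from the description in this thread.

```sh
#!/bin/sh
# Hypothetical helper (not part of nanobsd.sh): flush dirty buffers before
# each unmount attempt, retrying a bounded number of times.
#   $1 = mount point
#   $2 = maximum number of attempts (assumed default: 3)
umount_with_sync() {
    mnt=$1
    tries=${2:-3}
    attempt=1
    while [ "$attempt" -le "$tries" ]; do
        # Two sync calls, mirroring the workaround described in this thread.
        sync
        sync
        if umount "$mnt"; then
            return 0
        fi
        # Pause only when another attempt will follow.
        if [ "$attempt" -lt "$tries" ]; then
            sleep 1
        fi
        attempt=$((attempt + 1))
    done
    echo "umount_with_sync: could not unmount $mnt after $tries attempts" >&2
    return 1
}
```

A build script might call it as `umount_with_sync /scratch/freenas/obj.amd64/_.mnt || exit 1`. Note that this only hides the symptom; it does not address the underlying EBUSY behaviour discussed in this thread.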
Thanks, -Garrett From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 15:59:36 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CCA3B106566C; Thu, 29 Sep 2011 15:59:36 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [70.36.157.235]) by mx1.freebsd.org (Postfix) with ESMTP id A8BBB8FC19; Thu, 29 Sep 2011 15:59:36 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p8TFxc63084067; Thu, 29 Sep 2011 08:59:38 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201109291559.p8TFxc63084067@chez.mckusick.com> To: Attilio Rao In-reply-to: Date: Thu, 29 Sep 2011 08:59:38 -0700 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: Garrett Cooper , freebsd-fs@freebsd.org, Xin LI Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2011 15:59:36 -0000 > Date: Thu, 29 Sep 2011 17:40:43 +0200 > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? > From: Attilio Rao > To: Kirk McKusick > Cc: Garrett Cooper , freebsd-fs@freebsd.org, > Xin LI > > 2011/9/29 Kirk McKusick : > > > Thanks for providing a bit more of the history on this codepath. > > > > Since 9-stable has now been branched, I believe that the best path > > forward is to check this change into head and let it sit there for > > several months so that we can get some experience with it. If it > > causes folks problems we can back it out. 
If it does not cause > > problems, then we can MFC it to 9-stable. > > > > Does this seem like a reasonable approach? > > In general yes, but I'd like to understand why unmount should fail so > much with SU... do we do extended period with vfs_busy()'ed > filesystem? > > I need more context here, likely I'd need to look into the PRs too > before to give an informative answer. > > Attilio I am definitely not in a rush on this, so by all means take some time to look it over. The EBUSY unmount has been in its current state for several years, so I am fine with taking a few weeks to sort out the correct solution. Indeed, I am glad that Garrett has volunteered to do some more serious testing. If this general approach is not correct, I can put a hook in for just UFS so that it can have its historic behavior. As you have noted, the SU code has a lot of activity that gets done under the protection of vfs_busy. So it may be the only filesystem for which draining the vfs_busy lock during unmount is needed. Will you be at the EuroBSD conference next week? If so we can discuss this there. > Date: Thu, 29 Sep 2011 08:38:59 -0700 > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? > From: Garrett Cooper > To: Kirk McKusick > Cc: Attilio Rao , freebsd-fs@freebsd.org, > Xin LI > > > Does this seem like a reasonable approach? > > I'll give it a quick run through first on some machines this weekend, > with NFS, UFS, and ZFS. It seems like this could negatively affect a > number of users, so I want to make sure that it passes a smoke test > before committing directly to HEAD. > > Thanks! > -Garrett Thanks for doing these tests to help us find out if there are landmines in this change. 
Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 16:00:28 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 17410106568D for ; Thu, 29 Sep 2011 16:00:28 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id E405C8FC0A for ; Thu, 29 Sep 2011 16:00:24 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p8TG0OIA040955 for ; Thu, 29 Sep 2011 16:00:24 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p8TG0OI4040954; Thu, 29 Sep 2011 16:00:24 GMT (envelope-from gnats) Date: Thu, 29 Sep 2011 16:00:24 GMT Message-Id: <201109291600.p8TG0OI4040954@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Mark Saad Cc: Subject: Re: kern/156168: [nfs] [panic] Kernel panic under concurrent access over NFS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Mark Saad List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2011 16:00:28 -0000 The following reply was made to PR kern/156168; it has been noted by GNATS. From: Mark Saad To: bug-followup@FreeBSD.org, niakrisn@gmail.com Cc: Subject: Re: kern/156168: [nfs] [panic] Kernel panic under concurrent access over NFS Date: Thu, 29 Sep 2011 11:32:12 -0400 All I am seeing a similar crash on 7.3-RELEASE-p2 amd64 when using apache-1.3.34 with accf_httpd and a nfs docroot The servers that have crashed are all FreeBSD 7.3-RELEASE amd64. Hardware is HP Dl145 g2 They have 2G of ram and 2G swap with one single core opteron cpu. We are using the following sysctls . 
kern.ipc.maxsockbuf=2097152 kern.ipc.nmbclusters=32768 kern.ipc.somaxconn=1024 kern.maxfiles=131072 kern.maxfilesperproc=32768 net.inet.tcp.inflight.enable=0 net.inet.tcp.path_mtu_discovery=0 net.inet.tcp.recvbuf_inc=524288 net.inet.tcp.recvbuf_max=8388608 net.inet.tcp.recvspace=32768 net.inet.tcp.sendbuf_inc=16384 net.inet.tcp.sendbuf_max=8388608 net.inet.tcp.sendspace=32768 net.inet.udp.recvspace=42080 net.isr.direct=1 vm.pmap.shpgperproc=600 Up time prior to the crash was not the other system was up for 11 days this one was 6 days. Here is the contents of my crash [root@web29 /var/crash]# kgdb /boot/kernel/kernel /var/crash/vmcore.0 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd"... 
Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x258 fault code = supervisor read data, page not present instruction pointer = 0x8:0xffffffff8051a66d stack pointer = 0x10:0xffffff803e69b1c0 frame pointer = 0x10:0xffffff0001b50ae0 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 9336 (libhttpd.ep) trap number = 12 panic: page fault cpuid = 0 Uptime: 6d5h18m39s Physical memory: 2034 MB Dumping 1451 MB: 1436 1420 1404 1388 1372 1356 1340 1324 1308 1292 1276 1260 1244 1228 1212 1196 1180 1164 1148 1132 1116 1100 1084 1068 1052 1036 1020 1004 988 972 956 940 924 908 892 876 860 844 828 812 796 780 764 748 732 716 700 684 668 652 636 620 604 588 572 556 540 524 508 492 476 460 444 428 412 396 380 364 348 332 316 300 284 268 252 236 220 204 188 172 156 140 124 108 92 76 60 44 28 12 Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from /boot/kernel/accf_http.ko.symbols...done. done. Loaded symbols for /boot/kernel/accf_http.ko #0 doadump () at pcpu.h:195 195 pcpu.h: No such file or directory. in pcpu.h (kgdb) bt #0 doadump () at pcpu.h:195 #1 0x0000000000000004 in ?? () #2 0xffffffff805285f9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418 #3 0xffffffff80528a02 in panic (fmt=0x104
) at /usr/src/sys/kern/kern_shutdown.c:574 #4 0xffffffff807ec813 in trap_fatal (frame=0xffffff0001b50ae0, eva=Variable "eva" is not available. ) at /usr/src/sys/amd64/amd64/trap.c:777 #5 0xffffffff807ecbe5 in trap_pfault (frame=0xffffff803e69b110, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:693 #6 0xffffffff807ed50c in trap (frame=0xffffff803e69b110) at /usr/src/sys/amd64/amd64/trap.c:464 #7 0xffffffff807d614e in calltrap () at /usr/src/sys/amd64/amd64/exception.S:218 #8 0xffffffff8051a66d in _mtx_lock_sleep (m=0xffffff002f3d7a80, tid=18446742974226565856, opts=Variable "opts" is not available. ) at /usr/src/sys/kern/kern_mutex.c:339 #9 0xffffffff80701f60 in clnt_dg_create (so=0xffffff00017755a0, svcaddr=0xffffff803e69b310, program=100000, version=4, sendsz=Variable "sendsz" is not available. ) at /usr/src/sys/rpc/clnt_dg.c:259 #10 0xffffffff806e97c9 in nlm_get_rpc (sa=Variable "sa" is not available. ) at /usr/src/sys/nlm/nlm_prot_impl.c:327 #11 0xffffffff806e9d39 in nlm_host_get_rpc (host=0xffffff0001705000) at /usr/src/sys/nlm/nlm_prot_impl.c:1199 #12 0xffffffff806e680f in nlm_clearlock (host=0xffffff0001705000, ext=0xffffff803e69b9a0, vers=4, timo=0xffffff803e69b9d0, retries=2147483647, vp=0xffffff004881edc8, op=2, fl=0xffffff803e69bac0, flags=64, svid=9336, fhlen=32, fh=0xffffff803e69b750, size=689) at /usr/src/sys/nlm/nlm_advlock.c:943 #13 0xffffffff806e7801 in nlm_advlock_internal (vp=0xffffff004881edc8, id=Variable "id" is not available. ) at /usr/src/sys/nlm/nlm_advlock.c:355 #14 0xffffffff806e8166 in nlm_advlock (ap=Variable "ap" is not available. ) at /usr/src/sys/nlm/nlm_advlock.c:392 #15 0xffffffff806ced28 in nfs_advlock (ap=0xffffff803e69ba90) at /usr/src/sys/nfsclient/nfs_vnops.c:3153 #16 0xffffffff804f40e2 in closef (fp=0xffffff0073716d80, td=0xffffff0001b50ae0) at vnode_if.h:1036 #17 0xffffffff804f462b in kern_close (td=0xffffff0001b50ae0, fd=Variable "fd" is not available. 
) at /usr/src/sys/kern/kern_descrip.c:1125 #18 0xffffffff807ece67 in syscall (frame=0xffffff803e69bc80) at /usr/src/sys/amd64/amd64/trap.c:920 #19 0xffffffff807d635b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:339 #20 0x00000008009c5b1c in ?? () Previous frame inner to this frame (corrupt stack?) -- mark saad | nonesuch@longcount.org From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 16:13:37 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 150941065688; Thu, 29 Sep 2011 16:13:37 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 3F5078FC0A; Thu, 29 Sep 2011 16:13:35 +0000 (UTC) Received: by wyj26 with SMTP id 26so207208wyj.13 for ; Thu, 29 Sep 2011 09:13:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=Aiohl0RQJzS3yU6BnakC+UJWMqxybg6JUodr7NeyUrc=; b=PySw9qQBPENCCK0E1QjiEgwuIhWnzkXsNh+dzVoZSHb6wuSNAj8wOtdUu8kREP3Mil u89TZAqCrftgSIN7RZXfId/A/LFI16JYOBlPZLDoe24mIGrCrykYcy8CfIiuk/gar33X jG9gvEABsJOAROc3siSW9qM8k4/UOAsFEULxk= MIME-Version: 1.0 Received: by 10.216.229.134 with SMTP id h6mr1148480weq.42.1317312814929; Thu, 29 Sep 2011 09:13:34 -0700 (PDT) Sender: asmrookie@gmail.com Received: by 10.216.182.3 with HTTP; Thu, 29 Sep 2011 09:13:34 -0700 (PDT) In-Reply-To: <201109291559.p8TFxc63084067@chez.mckusick.com> References: <201109291559.p8TFxc63084067@chez.mckusick.com> Date: Thu, 29 Sep 2011 18:13:34 +0200 X-Google-Sender-Auth: 7AwbG_xLw2EqSfE7WfRbKHRk_sc Message-ID: From: Attilio Rao To: Kirk McKusick , Konstantin Belousov Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: Garrett Cooper , 
freebsd-fs@freebsd.org, Xin LI
Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
X-List-Received-Date: Thu, 29 Sep 2011 16:13:37 -0000

2011/9/29 Kirk McKusick:
>> Date: Thu, 29 Sep 2011 17:40:43 +0200
>> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
>> From: Attilio Rao
>> To: Kirk McKusick
>> Cc: Garrett Cooper, freebsd-fs@freebsd.org,
>>         Xin LI
>>
>> 2011/9/29 Kirk McKusick:
>>
>> > Thanks for providing a bit more of the history on this codepath.
>> >
>> > Since 9-stable has now been branched, I believe that the best path
>> > forward is to check this change into head and let it sit there for
>> > several months so that we can get some experience with it. If it
>> > causes folks problems we can back it out. If it does not cause
>> > problems, then we can MFC it to 9-stable.
>> >
>> > Does this seem like a reasonable approach?
>>
>> In general yes, but I'd like to understand why unmount should fail so
>> much with SU... do we do extended period with vfs_busy()'ed
>> filesystem?
>>
>> I need more context here, likely I'd need to look into the PRs too
>> before to give an informative answer.
>>
>> Attilio
>
> I am definitely not in a rush on this, so by all means take some time
> to look it over. The EBUSY unmount has been in its current state
> for several years, so I am fine with taking a few weeks to sort out
> the correct solution. Indeed, I am glad that Garrett has volunteered
> to do some more serious testing.
>
> If this general approach is not correct, I can put a hook in for just
> UFS so that it can have its historic behavior. As you have noted, the
> SU code has a lot of activity that gets done under the protection of
> vfs_busy. So it may be the only filesystem for which draining the
> vfs_busy lock during unmount is needed.

Honestly, my first thought was exactly that -- an option that was
forcing the unmount sleep if SU is compiled in the kernel.
When you mention 'historical behaviour', do you mean the behaviour UFS
had even prior to the introduction of the 'complete' VFS layer, or was
it the behaviour unmount(2) was expected to implement since the
beginning?

My guess is that recent SU improvements by you and Jeff may have led
to higher vfs_busy() contention, thus making this behaviour just more
visible.

BTW, I'm afraid the forced unmount case may have a possible deadlock.
thread1 is doing whatever codepath it wants and thread2 is doing
unmount (forced right now):

thread1::vfs_busy()
thread2::lock coveredvnode
thread1::contests coveredvnode
thread2::sleep because of thread1 vfs_busy

I think this deadlock was actually possible even with the old code, it
was just a LOR between a vnode lock and mount lock.
I'm not sure whether there is some invariant, which I discussed with
Kostik in the past and am now forgetting about, that prevents it one
way or another, but it seems a possible deadlock to me.

If you see this issue I'll make a patch for it.

Attilio

--
Peace can only be achieved by understanding - A.
Einstein From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 18:51:55 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 72A77106566B for ; Thu, 29 Sep 2011 18:51:55 +0000 (UTC) (envelope-from dorionpatrick@gmail.com) Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com [209.85.216.175]) by mx1.freebsd.org (Postfix) with ESMTP id 2D6AC8FC0C for ; Thu, 29 Sep 2011 18:51:54 +0000 (UTC) Received: by qyk10 with SMTP id 10so4436274qyk.13 for ; Thu, 29 Sep 2011 11:51:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=message-id:from:to:content-type:content-transfer-encoding :mime-version:subject:date:x-mailer; bh=qYfS43ERCrDsleK982NE5WbwXTOa6RCYeWcCVb94i5U=; b=c6yzYH+lDj/PbcIZBUssQuCso7ddXcJQHZU2qRwW+Cj33piUgzRgyKRhe2xuDMwQdD D8kvIbEpFPeKFIXnOB2RsKMo9jlnBN1/oJY76y+ZEWz0uHiDEYdfHMGegfQo3U0F0UwX 58ah/gAFjVvy7merdmVtyWMsTJ3x7HXyHvOUU= Received: by 10.224.196.3 with SMTP id ee3mr8106211qab.229.1317320571564; Thu, 29 Sep 2011 11:22:51 -0700 (PDT) Received: from [192.168.5.245] (178-82-252-216-static.colba.net. [216.252.82.178]) by mx.google.com with ESMTPS id cl1sm2627181qab.0.2011.09.29.11.22.49 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 29 Sep 2011 11:22:50 -0700 (PDT) Message-Id: From: Patrick Dorion To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v936) Date: Thu, 29 Sep 2011 14:22:45 -0400 X-Mailer: Apple Mail (2.936) Subject: 'kernel' not found - ZFS on GPT boot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2011 18:51:55 -0000 Gentlemen, and women perhaps, I would greatly appreciate any input that could shed light on my issue. 
Essentially, a seemingly well-configured ZFS system residing on a GPT
partition is not detected by zfsloader. The formatted message is
posted in the forums at
http://forums.freebsd.org/showthread.php?p=148989 for your
convenience, and the original message reproduced below, while the
output might be difficult to distinguish.

========================================================================

I'm trying to boot from a ZFS pool on GPT partition.

BTX loader 1.00 BTX version is 1.02
BIOS drive C: is disk0

FreeBSD/x86 ZFS enabled bootstrap loader, Revision 1.1
can't load 'kernel'

Type '?' for a list of commands, 'help' for more detailed help.
OK lsdev
disk devices:
disk0 BIOS drive C:
disk0s1: FFS bad disklabel
zfs devices:
OK lsmod
OK

FreeBSD-8.2-RELEASE-amd64-livefs.iso
SHA256=f72ff7e9043f200651ca6dff3a4b71ec9447319c6efc419a2f6922a921bdfc68

Fixit# gpart show -l
=> ad4 GPT
1 /dev/ad4p1 (freebsd-boot)
3 /dev/ad4p3 (freebsd-zfs)
Fixit# gpart bootcode -b /dist/boot/pmbr -p /dist/boot/gptzfsloader -i 1 /dev/ad4p1
Fixit# zpool status
pool: zpool
state: ONLINE
config:
zpool ONLINE
ad4p3 ONLINE
Fixit# zpool get bootfs zpool
zpool bootfs zpool local
Fixit# zfs get mountpoint zpool
zpool mountpoint legacy local
Fixit# cd /zpool/boot
Fixit# ls -l
drw------- 2 root 0 2 Feb 17 2011 zfs/
Fixit# cp -f defaults/loader.conf .
Fixit# cat loader.conf
vfs.root.mountfrom="zfs:zpool"
zfs_load="YES"

Filesystem was taken from /dist/ on the livefs. All of it is read-only
except /boot/zfs.

Thoughts?

========================================================================

Once again, I thank you for the time and the attention that you bring
to my issue and any light that you may be able to shed on this; I am
fairly certain that it is an oversight, unfortunately I don't
understand enough about the process at this point to be able to
formulate a hypothesis.
Patrick Dorion From owner-freebsd-fs@FreeBSD.ORG Thu Sep 29 21:35:18 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9C3A8106566B for ; Thu, 29 Sep 2011 21:35:18 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-yx0-f182.google.com (mail-yx0-f182.google.com [209.85.213.182]) by mx1.freebsd.org (Postfix) with ESMTP id 5E2B28FC1A for ; Thu, 29 Sep 2011 21:35:18 +0000 (UTC) Received: by yxk36 with SMTP id 36so1310293yxk.13 for ; Thu, 29 Sep 2011 14:35:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=QiV5X1e0r6EFuQ+5+Hdiy68qo/4Y+HJXH9xveE72BW0=; b=EkGps6Eo4XxyrexyqgCeRzGCo3ZyTeqKDIU2Pd0owhR/x85QQJSOAFjwXcEmv904c7 HINF2MEzidG+mhg+uajiqaN9m87pSUtzi/JnUpssj2hmYCcbHiLO0yiPJ53jRcKspDmJ HGNc/FqdcnkZRWFtkc4DCwI1KZ4H5n9wlAl0U= MIME-Version: 1.0 Received: by 10.236.191.71 with SMTP id f47mr67021247yhn.125.1317332117571; Thu, 29 Sep 2011 14:35:17 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.236.102.147 with HTTP; Thu, 29 Sep 2011 14:35:17 -0700 (PDT) In-Reply-To: References: Date: Thu, 29 Sep 2011 14:35:17 -0700 X-Google-Sender-Auth: D1CGLAHEI1JypVOl-IWl5EQDMms Message-ID: From: Artem Belevich To: Patrick Dorion Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org Subject: Re: 'kernel' not found - ZFS on GPT boot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2011 21:35:18 -0000 On Thu, Sep 29, 2011 at 11:22 AM, Patrick Dorion wrote: > Fixit# cd /zpool/boot > Fixit# ls -l > drw------- 2 root 0 2 Feb 17 2011 zfs/ Bootloader seems to have good reason to complain as there seems to be no kernel installed 
on the filesystem you're booting from. --Artem From owner-freebsd-fs@FreeBSD.ORG Fri Sep 30 10:46:49 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 984A21065670 for ; Fri, 30 Sep 2011 10:46:49 +0000 (UTC) (envelope-from peterjeremy@acm.org) Received: from mail26.syd.optusnet.com.au (mail26.syd.optusnet.com.au [211.29.133.167]) by mx1.freebsd.org (Postfix) with ESMTP id E3AA68FC0C for ; Fri, 30 Sep 2011 10:46:48 +0000 (UTC) Received: from server.vk2pj.dyndns.org (c220-239-116-103.belrs4.nsw.optusnet.com.au [220.239.116.103]) by mail26.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p8UAkjEQ005484 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 30 Sep 2011 20:46:46 +1000 X-Bogosity: Ham, spamicity=0.000000 Received: from server.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1]) by server.vk2pj.dyndns.org (8.14.5/8.14.4) with ESMTP id p8UAkiMk073875; Fri, 30 Sep 2011 20:46:44 +1000 (EST) (envelope-from peter@server.vk2pj.dyndns.org) Received: (from peter@localhost) by server.vk2pj.dyndns.org (8.14.5/8.14.4/Submit) id p8UAkhab073862; Fri, 30 Sep 2011 20:46:43 +1000 (EST) (envelope-from peter) Date: Fri, 30 Sep 2011 20:46:43 +1000 From: Peter Jeremy To: Patrick Dorion Message-ID: <20110930104643.GB51227@server.vk2pj.dyndns.org> References: MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="IrhDeMKUP4DT/M7F" Content-Disposition: inline In-Reply-To: X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: 'kernel' not found - ZFS on GPT boot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Sep 2011 10:46:49 -0000 
--IrhDeMKUP4DT/M7F Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On 2011-Sep-29 14:22:45 -0400, Patrick Dorion wrote:
>I would greatly appreciate any input that could shed light on my
>issue. Essentially, a seemingly well-configured ZFS system residing
>on a GPT partition is not detected by zfsloader.

Your log shows you've correctly used gptzfsloader.

>Fixit# cd /zpool/boot
>Fixit# ls -l
>drw------- 2 root 0 2 Feb 17 2011 zfs/
>
>Filesystem was taken from /dist/ on the livefs. All of it is
>read-only except /boot/zfs.

Do you have /boot/zfs/zpool.cache on zpool? That's the most obvious
item that I don't see in your log.

I presume you're aware of http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot
If that's not the procedure you followed, you might like to
cross-check the steps you took against that page.

-- 
Peter Jeremy

--IrhDeMKUP4DT/M7F Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iEYEARECAAYFAk6FnhMACgkQ/opHv/APuIf7wgCfYrBe/XC1Jaztw1npTX4xM394 bawAn3Ma9rNdUzNPpRvpIv0j1UQ9GAE0 =kL8r -----END PGP SIGNATURE----- --IrhDeMKUP4DT/M7F--
From owner-freebsd-fs@FreeBSD.ORG Fri Sep 30 12:48:25 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5125C106564A for ; Fri, 30 Sep 2011 12:48:25 +0000 (UTC) (envelope-from dorionpatrick@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id 0A0798FC13 for ; Fri, 30 Sep 2011 12:48:24 +0000 (UTC) Received: by qadz30 with SMTP id z30so327835qad.13 for ; Fri, 30 Sep 2011 05:48:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=message-id:from:to:in-reply-to:content-type :content-transfer-encoding:mime-version:subject:date:references :x-mailer;
bh=APYXleKxONhwuagsW3YmwUQq+03Tx67+tvYDjku6H5s=; b=SfiONtvVUkIqgBPgHVbSNsYkpBl0YyJgkfbuPJPQL98m/2rhqOc0I3pF3xwK0lWdiH W1VIYA0zpPI5UAOkkkiGgVV2teZJI39bShi6OBkaAZO4nvJ9uXY5iPmz1/iqHAWckaac lYajNOH9obGrTIBBEpGJakNw97y8BZMoW4RC0= Received: by 10.224.176.4 with SMTP id bc4mr8925448qab.6.1317386904153; Fri, 30 Sep 2011 05:48:24 -0700 (PDT) Received: from [192.168.5.245] (178-82-252-216-static.colba.net. [216.252.82.178]) by mx.google.com with ESMTPS id gu10sm5381975qab.2.2011.09.30.05.48.15 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 30 Sep 2011 05:48:22 -0700 (PDT) Message-Id: <0BC125B4-A1BA-442B-B498-7A14EE07441C@gmail.com> From: Patrick Dorion To: freebsd-fs@freebsd.org In-Reply-To: <20110930104643.GB51227@server.vk2pj.dyndns.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v936) Date: Fri, 30 Sep 2011 08:48:09 -0400 References: <20110930104643.GB51227@server.vk2pj.dyndns.org> X-Mailer: Apple Mail (2.936) Subject: Resolved: 'kernel' not found - ZFS on GPT boot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Sep 2011 12:48:25 -0000 Well I am pleased to report that I was successful in booting FreeBSD 8.2-RELEASE from a ZFS filesystem on a GPT partition. I followed the guide. I don't believe to have done anything different this time around, aside installing the default boot loader using fdisk. Perhaps there was a conflict, I'm not certain... It works! Thank you all, Patrick Dorion On Sep 30, 2011, at 6:46 AM, Peter Jeremy wrote: > On 2011-Sep-29 14:22:45 -0400, Patrick Dorion > wrote: >> I would greatly appreciate any input that could shed light on my >> issue. Essentially, a seemingly well-configured ZFS system residing >> on a GPT partition is not detected by zfsloader. 
> > Your log shows you've correctly used gptzfsloader. > >> Fixit# cd /zpool/boot >> Fixit# ls -l >> drw------- 2 root 0 2 Feb 17 2011 zfs/ >> >> Filesystem was taken from /dist/ on the livefs. All of it is read- >> only except /boot/zfs. > > Do you have /boot/zfs/zpool.cache on zpool? That's the most obvious > item that I don't see in your log. > > I presume you're aware of http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot > If that's not the procedure you followed, you might like to cross- > check the steps fou took against that page. > > -- > Peter Jeremy From owner-freebsd-fs@FreeBSD.ORG Fri Sep 30 13:25:24 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 37EBF106566B; Fri, 30 Sep 2011 13:25:24 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [70.36.157.235]) by mx1.freebsd.org (Postfix) with ESMTP id 07B0E8FC15; Fri, 30 Sep 2011 13:25:23 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p8UDPPG2072859; Fri, 30 Sep 2011 06:25:25 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201109301325.p8UDPPG2072859@chez.mckusick.com> To: Attilio Rao In-reply-to: Date: Fri, 30 Sep 2011 06:25:25 -0700 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: Garrett Cooper , freebsd-fs@freebsd.org, Konstantin Belousov , Xin LI Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Sep 2011 13:25:24 -0000

> Date: Thu, 29 Sep 2011 18:13:34 +0200
> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
> From: Attilio Rao
> To: Kirk McKusick ,
> Konstantin Belousov
> Cc: Garrett Cooper , freebsd-fs@freebsd.org,
> Xin LI
>
> 2011/9/29 Kirk McKusick :
>
> > I am definitely not in a rush on this, so by all means take some time
> > to look it over. The EBUSY unmount has been in its current state
> > for several years, so I am fine with taking a few weeks to sort out
> > the correct solution. Indeed, I am glad that Garrett has volunteered
> > to do some more serious testing.
> >
> > If this general approach is not correct, I can put a hook in for just
> > UFS so that it can have its historic behavior. As you have noted, the
> > SU code has a lot of activity that gets done under the protection of
> > vfs_busy. So it may be the only filesystem for which draining the
> > vfs_busy lock during unmount is needed.
>
> Honestly, my first thought was exactly that -- an option that was
> forcing the unmount sleep if SU is compiled in the kernel.

The above would solve the problem at hand (NanoBSD builds). But I
believe that this issue can arise with any filesystem that has
background behavior such as ZFS and NFS. It is just most evident
with UFS using SU.

> When you mention 'historical behaviour' you mean the behaviour UFS had
> even prior to the introduction of the 'complete' VFS layer or it was the
> behaviour unmount(2) was expected to implement since the beginning?

Synchronous unmount has existed since at least the UNIX versions
that I used in the 1970's.

> My guess is that recent SU improvement by you and Jeff may have led
> to higher vfs_busy() contention, thus making this behaviour just more
> visible.

You are correct. SU has a lot of background activity. And as memories
have grown bigger, the backlog has been able to get bigger and hence
more noticeable. On a busy system I have measured over forty calls to
"sync; sleep 1; umount" before the filesystem would finally unmount
successfully.

> BTW, I'm afraid the forced unmount case may have a possible deadlock.
>
> thread1 is doing whatever codepath it wants and thread2 is doing
> unmount (forced right now):
>
> thread1::vfs_busy()
> thread2::lock coveredvnode
> thread1::contests coveredvnode
> thread2::sleep because of thread1 vfs_busy

I agree that the above deadlock is possible. See suggested solution below.

> I think this deadlock was actually possible even with the old code, it
> was just a LOR between a vnode lock and mount lock.

Yes, this is a long-standing issue.

> I'm not sure if there was any invariant I discussed with Kostik in the
> past, preventing one way or another I'm forgetting about, but it seems
> a possible deadlock to me.
>
> If you see this issue I'll make a patch for it.
>
> Attilio

Here is my proposed fix. It does the unroll originally found in the
non-FORCE case before sleeping waiting for the vfs_busy to clear.
Is it acceptable to hold the mount mutex while calling VOP_UNLOCK?
If not, then it needs to be released before the unlock, reacquired
afterwards, and the check to see if the sleep is needed redone.

Index: vfs_mount.c
===================================================================
--- vfs_mount.c	(revision 225881)
+++ vfs_mount.c	(working copy)
@@ -1187,6 +1187,7 @@
 
 	mtx_assert(&Giant, MA_OWNED);
 
+top:
 	if ((coveredvp = mp->mnt_vnodecovered) != NULL) {
 		mnt_gen_r = mp->mnt_gen;
 		VI_LOCK(coveredvp);
@@ -1227,21 +1228,19 @@
 		mp->mnt_kern_flag |= MNTK_UNMOUNTF;
 	error = 0;
 	if (mp->mnt_lockref) {
-		if ((flags & MNT_FORCE) == 0) {
-			mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
-			    MNTK_UNMOUNTF);
-			if (mp->mnt_kern_flag & MNTK_MWAIT) {
-				mp->mnt_kern_flag &= ~MNTK_MWAIT;
-				wakeup(mp);
-			}
-			MNT_IUNLOCK(mp);
-			if (coveredvp)
-				VOP_UNLOCK(coveredvp, 0);
-			return (EBUSY);
+		mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
+		    MNTK_UNMOUNTF);
+		if (mp->mnt_kern_flag & MNTK_MWAIT) {
+			mp->mnt_kern_flag &= ~MNTK_MWAIT;
+			wakeup(mp);
 		}
+		if (coveredvp)
+			VOP_UNLOCK(coveredvp, 0);
 		mp->mnt_kern_flag |= MNTK_DRAINING;
 		error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
 		    "mount drain", 0);
+		MNT_IUNLOCK(mp);
+		goto top;
 	}
 	MNT_IUNLOCK(mp);
 	KASSERT(mp->mnt_lockref == 0,

Does this seem like the correct fix to you?

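For readers weighing the alternative raised just before the patch (release the mount mutex, do the other unlock, reacquire the mutex, then redo the check for whether a sleep is needed), a userland analogy may help. This is only a sketch using POSIX threads, not kernel code: struct mount_model, busy_release(), drain_busy_refs() and run_demo() are hypothetical names, a condition variable stands in for msleep()/wakeup(), and a plain boolean stands in for the covered vnode lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userland stand-ins only; none of these names exist in the kernel. */
struct mount_model {
	pthread_mutex_t	mtx;		/* plays the role of the mount mutex */
	pthread_cond_t	drained;	/* signalled when lockref drops to 0 */
	int		lockref;	/* plays the role of mnt_lockref */
	bool		covered_locked;	/* plays the role of the covered vnode lock */
};

/* Drop one busy reference and wake the drainer when the last one goes. */
static void
busy_release(struct mount_model *m)
{
	pthread_mutex_lock(&m->mtx);
	if (--m->lockref == 0)
		pthread_cond_broadcast(&m->drained);
	pthread_mutex_unlock(&m->mtx);
}

/*
 * Called and returns with m->mtx held.  covered_locked is touched only by
 * the draining thread, so it needs no extra protection in this model.
 */
static void
drain_busy_refs(struct mount_model *m)
{
	while (m->lockref > 0) {
		if (m->covered_locked) {
			/* Release the mutex before doing the other unlock... */
			pthread_mutex_unlock(&m->mtx);
			m->covered_locked = false;
			/* ...reacquire it, then redo the check from scratch. */
			pthread_mutex_lock(&m->mtx);
			continue;
		}
		/* Drops mtx while asleep and retakes it before returning. */
		pthread_cond_wait(&m->drained, &m->mtx);
	}
	assert(m->lockref == 0);
}

/* Thread body for the single busy holder in the demo. */
static void *
releaser(void *arg)
{
	busy_release(arg);
	return (NULL);
}

/* One busy holder, one drainer; returns the final reference count. */
int
run_demo(void)
{
	struct mount_model m;
	pthread_t t;

	pthread_mutex_init(&m.mtx, NULL);
	pthread_cond_init(&m.drained, NULL);
	m.lockref = 1;
	m.covered_locked = true;

	pthread_mutex_lock(&m.mtx);
	if (pthread_create(&t, NULL, releaser, &m) != 0) {
		pthread_mutex_unlock(&m.mtx);
		return (-1);
	}
	drain_busy_refs(&m);
	pthread_mutex_unlock(&m.mtx);
	pthread_join(t, NULL);

	pthread_cond_destroy(&m.drained);
	pthread_mutex_destroy(&m.mtx);
	return (m.lockref);
}
```

The `continue` is the part that corresponds to "the check to see if the sleep is needed redone": the reference count may have reached zero while the mutex was dropped, in which case sleeping would wait for a wakeup that has already happened.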
Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Fri Sep 30 13:31:58 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F13EF1065762; Fri, 30 Sep 2011 13:31:58 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-ww0-f42.google.com (mail-ww0-f42.google.com [74.125.82.42]) by mx1.freebsd.org (Postfix) with ESMTP id 26D058FC18; Fri, 30 Sep 2011 13:31:57 +0000 (UTC) Received: by wwn22 with SMTP id 22so751229wwn.1 for ; Fri, 30 Sep 2011 06:31:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=sr6m1VuhRFOYn0RvmWXeS0rBf4Piz5oQGKmGUoUy5rU=; b=gVUeR3B01xnTSJ1duuy4jUYhTt5OzaME23Raxdf419tBHmMlmLj63+8T+285UVvvEU Ds0rL3Smw1G/9v8Rt6yNZrm/h/xNHfe0A8NXqAU0aXDP53l3KUUPNVOeKR5QCOFAn3Ly 7JG9MuFuLjf1qCVU7DzWf05l1Rp75NC6aTJ24= MIME-Version: 1.0 Received: by 10.216.229.134 with SMTP id h6mr1426921weq.42.1317389516673; Fri, 30 Sep 2011 06:31:56 -0700 (PDT) Sender: asmrookie@gmail.com Received: by 10.216.182.3 with HTTP; Fri, 30 Sep 2011 06:31:56 -0700 (PDT) In-Reply-To: <201109301325.p8UDPPG2072859@chez.mckusick.com> References: <201109301325.p8UDPPG2072859@chez.mckusick.com> Date: Fri, 30 Sep 2011 15:31:56 +0200 X-Google-Sender-Auth: vuCQk4J-SY05wOt9YBfo8Rp6mGI Message-ID: From: Attilio Rao To: Kirk McKusick Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: Garrett Cooper , freebsd-fs@freebsd.org, Konstantin Belousov , Xin LI Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Sep 2011 13:31:59 -0000

2011/9/30 Kirk McKusick :
>> Date: Thu, 29 Sep 2011 18:13:34 +0200
>> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
>> From: Attilio Rao
>> To: Kirk McKusick ,
>>         Konstantin Belousov
>> Cc: Garrett Cooper , freebsd-fs@freebsd.org,
>>         Xin LI
>>
>> 2011/9/29 Kirk McKusick :
>>
>> > I am definitely not in a rush on this, so by all means take some time
>> > to look it over. The EBUSY unmount has been in its current state
>> > for several years, so I am fine with taking a few weeks to sort out
>> > the correct solution. Indeed, I am glad that Garrett has volunteered
>> > to do some more serious testing.
>> >
>> > If this general approach is not correct, I can put a hook in for just
>> > UFS so that it can have its historic behavior. As you have noted, the
>> > SU code has a lot of activity that gets done under the protection of
>> > vfs_busy. So it may be the only filesystem for which draining the
>> > vfs_busy lock during unmount is needed.
>>
>> Honestly, my first thought was exactly that -- an option that was
>> forcing the unmount sleep if SU is compiled in the kernel.
>
> The above would solve the problem at hand (NanoBSD builds). But I
> believe that this issue can arise with any filesystem that has
> background behavior such as ZFS and NFS. It is just most evident
> with UFS using SU.
>
>> When you mention 'historical behaviour' you mean the behaviour UFS had
>> even prior to the introduction of the 'complete' VFS layer or it was the
>> behaviour unmount(2) was expected to implement since the beginning?
>
> Synchronous unmount has existed since at least the UNIX versions
> that I used in the 1970's.
>
>> My guess is that recent SU improvement by you and Jeff may have led
>> to higher vfs_busy() contention, thus making this behaviour just more
>> visible.
>
> You are correct. SU has a lot of background activity. And as memories
> have grown bigger, the backlog has been able to get bigger and hence
> more noticeable. On a busy system I have measured over forty calls to
> "sync; sleep 1; umount" before the filesystem would finally unmount
> successfully.
>
>> BTW, I'm afraid the forced unmount case may have a possible deadlock.
>>
>> thread1 is doing whatever codepath it wants and thread2 is doing
>> unmount (forced right now):
>>
>> thread1::vfs_busy()
>> thread2::lock coveredvnode
>> thread1::contests coveredvnode
>> thread2::sleep because of thread1 vfs_busy
>
> I agree that the above deadlock is possible. See suggested solution below.
>
>> I think this deadlock was actually possible even with the old code, it
>> was just a LOR between a vnode lock and mount lock.
>
> Yes, this is a long-standing issue.
>
>> I'm not sure if there was any invariant I discussed with Kostik in the
>> past, preventing one way or another I'm forgetting about, but it seems
>> a possible deadlock to me.
>>
>> If you see this issue I'll make a patch for it.
>>
>> Attilio
>
> Here is my proposed fix. It does the unroll originally found in the
> non-FORCE case before sleeping waiting for the vfs_busy to clear.
> Is it acceptable to hold the mount mutex while calling VOP_UNLOCK?
> If not, then it needs to be released before the unlock, reacquired
> afterwards, and the check to see if the sleep is needed redone.

I thought about this approach when sending the e-mail, but there is a
problem: you need to handle the MNTK_UNMOUNT flag checking and
subsequent setting after coveredvnode is held, otherwise at the first
looping you will just return EBUSY.

> Index: vfs_mount.c
> ===================================================================
> --- vfs_mount.c	(revision 225881)
> +++ vfs_mount.c	(working copy)
> @@ -1187,6 +1187,7 @@
>
>  	mtx_assert(&Giant, MA_OWNED);
>
> +top:
>  	if ((coveredvp = mp->mnt_vnodecovered) != NULL) {
>  		mnt_gen_r = mp->mnt_gen;
>  		VI_LOCK(coveredvp);
> @@ -1227,21 +1228,19 @@
>  		mp->mnt_kern_flag |= MNTK_UNMOUNTF;
>  	error = 0;
>  	if (mp->mnt_lockref) {
> -		if ((flags & MNT_FORCE) == 0) {
> -			mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
> -			    MNTK_UNMOUNTF);
> -			if (mp->mnt_kern_flag & MNTK_MWAIT) {
> -				mp->mnt_kern_flag &= ~MNTK_MWAIT;
> -				wakeup(mp);
> -			}
> -			MNT_IUNLOCK(mp);
> -			if (coveredvp)
> -				VOP_UNLOCK(coveredvp, 0);
> -			return (EBUSY);
> +		mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
> +		    MNTK_UNMOUNTF);
> +		if (mp->mnt_kern_flag & MNTK_MWAIT) {
> +			mp->mnt_kern_flag &= ~MNTK_MWAIT;
> +			wakeup(mp);
>  		}
> +		if (coveredvp)
> +			VOP_UNLOCK(coveredvp, 0);
>  		mp->mnt_kern_flag |= MNTK_DRAINING;
>  		error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
>  		    "mount drain", 0);
> +		MNT_IUNLOCK(mp);
> +		goto top;

You can avoid the unlock by passing PVFS | PDROP.

Attilio

-- 
Peace can only be achieved by understanding - A.
Einstein From owner-freebsd-fs@FreeBSD.ORG Fri Sep 30 14:51:57 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 13951106566B; Fri, 30 Sep 2011 14:51:57 +0000 (UTC) (envelope-from jwd@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 04B518FC12; Fri, 30 Sep 2011 14:51:57 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p8UEpuTM048313; Fri, 30 Sep 2011 14:51:56 GMT (envelope-from jwd@freefall.freebsd.org) Received: (from jwd@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p8UEpuqv048312; Fri, 30 Sep 2011 14:51:56 GMT (envelope-from jwd) Date: Fri, 30 Sep 2011 14:51:56 +0000 From: John To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Message-ID: <20110930145156.GA2504@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: Subject: Create a gpart in a multipath container? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Sep 2011 14:51:57 -0000 Hi Folks, I'm trying to create a set of partitions inside of a multipath container. For this discussion, a drive in a shelf hanging off a pair of controllers: # camcontrol inquiry da0 pass2: Fixed Direct Access SCSI-5 device pass2: Serial Number 3TB1BKGX00009036W9EN pass2: 600.000MB/s transfers, Command Queueing Enabled # camcontrol inquiry da25 pass27: Fixed Direct Access SCSI-5 device pass27: Serial Number 3TB1BKGX00009036W9EN pass27: 600.000MB/s transfers, Command Queueing Enabled The multipath container: # gmultipath label Z0 da0 da25 # gmultipath list Geom name: Z0 Providers: 1. 
Name: multipath/Z0 Mediasize: 146815737344 (136G) Sectorsize: 512 Mode: r0w0e0 Consumers: 1. Name: da0 Mediasize: 146815737856 (136G) Sectorsize: 512 Mode: r0w0e0 2. Name: da25 Mediasize: 146815737856 (136G) Sectorsize: 512 Mode: r0w0e0 Note, at this point I can create multipath containers from the rest of the drives and create zfs volume with no problems. However, I'd now like to create a set of partitions inside of the multipath container. # gpart create -s gpt multipath/Z0 multipath/Z0 created # gpart add -s 1m -t freebsd-ufs -l Z0test multipath/Z0 multipath/Z0p1 added # gpart add -t freebsd-zfs -l Z0 multipath/Z0 The created partition looks correct, but two additional geoms have been created that are corrupt: Geom name: multipath/Z0 modified: false state: OK fwheads: 255 fwsectors: 63 last: 286749453 first: 34 entries: 128 scheme: GPT Providers: 1. Name: multipath/Z0p1 Mediasize: 1048576 (1.0M) Sectorsize: 512 Stripesize: 0 Stripeoffset: 17408 Mode: r0w0e0 rawuuid: f6a44058-eb72-11e0-8eb1-001e4f258317 rawtype: 516e7cb6-6ecf-11d6-8ff8-00022d09712b label: Z0test length: 1048576 offset: 17408 type: freebsd-ufs index: 1 end: 2081 start: 34 2. Name: multipath/Z0p2 Mediasize: 146814654464 (136G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 1065984 Mode: r0w0e0 rawuuid: 0a064a37-eb73-11e0-8eb1-001e4f258317 rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b label: Z0 length: 146814654464 offset: 1065984 type: freebsd-zfs index: 2 end: 286749453 start: 2082 Consumers: 1. Name: multipath/Z0 Mediasize: 146815737344 (136G) Sectorsize: 512 Mode: r0w0e0 These show up and are corrupt: Geom name: da0 modified: false state: CORRUPT fwheads: 255 fwsectors: 63 last: 286749453 first: 34 entries: 128 scheme: GPT Providers: 1. 
Name: da0p1 Mediasize: 1048576 (1.0M) Sectorsize: 512 Stripesize: 0 Stripeoffset: 17408 Mode: r0w0e0 rawuuid: f6a44058-eb72-11e0-8eb1-001e4f258317 rawtype: 516e7cb6-6ecf-11d6-8ff8-00022d09712b label: Z0test length: 1048576 offset: 17408 type: freebsd-ufs index: 1 end: 2081 start: 34 2. Name: da0p2 Mediasize: 146814654464 (136G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 1065984 Mode: r0w0e0 rawuuid: 0a064a37-eb73-11e0-8eb1-001e4f258317 rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b label: Z0 length: 146814654464 offset: 1065984 type: freebsd-zfs index: 2 end: 286749453 start: 2082 Consumers: 1. Name: da0 Mediasize: 146815737856 (136G) Sectorsize: 512 Mode: r0w0e0 And the same for da25. Finally, these messages show up: GEOM_MULTIPATH: adding da0 to Z0/ac33be7a-eb68-11e0-97dd-001e4f258317 GEOM_MULTIPATH: da0 now active path in Z0 GEOM: da0: the secondary GPT header is not in the last LBA. GEOM_MULTIPATH: adding da25 to Z0/ac33be7a-eb68-11e0-97dd-001e4f258317 GEOM: da25: the secondary GPT header is not in the last LBA. GEOM: da0: the secondary GPT header is not in the last LBA. GEOM: da25: the secondary GPT header is not in the last LBA. GEOM: da0: the secondary GPT header is not in the last LBA. And after a reboot, the multipath container is gone, and this is found for da0/da25. It can be recovered, but it seems to have taken over the multipath container. Geom name: da0 modified: false state: CORRUPT fwheads: 255 fwsectors: 63 last: 286749453 first: 34 entries: 128 scheme: GPT Providers: 1. Name: da0p1 Mediasize: 1048576 (1.0M) Sectorsize: 512 Stripesize: 0 Stripeoffset: 17408 Mode: r0w0e0 rawuuid: 227634d0-eb6a-11e0-97dd-001e4f258317 rawtype: 516e7cb6-6ecf-11d6-8ff8-00022d09712b label: Z0test length: 1048576 offset: 17408 type: freebsd-ufs index: 1 end: 2081 start: 34 2. 
Name: da0p2 Mediasize: 146814654464 (136G) Sectorsize: 512 Stripesize: 0 Stripeoffset: 1065984 Mode: r0w0e0 rawuuid: 6b285348-eb6a-11e0-97dd-001e4f258317 rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b label: Z0 length: 146814654464 offset: 1065984 type: freebsd-zfs index: 2 end: 286749453 start: 2082 Consumers: 1. Name: da0 Mediasize: 146815737856 (136G) Sectorsize: 512 Mode: r0w0e0 Apologies for the long-winded explanation. I hope it made sense. Is there a way to make this work? Is there a better way to configure this? I'd like the partitions to be protected by multipathing which lead me to try this. Am I missing something totally obvious? I've been looking at the code and I'm thinking there is an issue between a real physical disk container vs a partition and sizing. Any comments are appreciated. Thanks, John From owner-freebsd-fs@FreeBSD.ORG Fri Sep 30 15:07:43 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5E8BA106564A; Fri, 30 Sep 2011 15:07:43 +0000 (UTC) (envelope-from bu7cher@yandex.ru) Received: from forward8.mail.yandex.net (forward8.mail.yandex.net [IPv6:2a02:6b8:0:202::3]) by mx1.freebsd.org (Postfix) with ESMTP id C5C618FC12; Fri, 30 Sep 2011 15:07:42 +0000 (UTC) Received: from smtp8.mail.yandex.net (smtp8.mail.yandex.net [77.88.61.54]) by forward8.mail.yandex.net (Yandex) with ESMTP id 2C306F625F1; Fri, 30 Sep 2011 19:07:41 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1317395261; bh=LPjgcuG6h1nFxA3TM22MnnV3l87I6+sf6WwZPkAOxmA=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type; b=nH6Q2m0MbGbzMPuq1RQW9fUDbqW5FSZPLhxkFvUORsJOnAez17Z6W4ndMiHlWtnw6 IoONLZd0NGp75Rvwj4s4szHcDq3+h8+hRl1Tk1k0v4wcXeql3454jZuiXGHgL2AApJ biKQIl1ZYj2/uqeTWlLRUHjWxsdqRUmjUEnB2ZXA= Received: from smtp8.mail.yandex.net (localhost [127.0.0.1]) by smtp8.mail.yandex.net (Yandex) with 
ESMTP id 027B01B6008A; Fri, 30 Sep 2011 19:07:40 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1317395261; bh=LPjgcuG6h1nFxA3TM22MnnV3l87I6+sf6WwZPkAOxmA=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type; b=nH6Q2m0MbGbzMPuq1RQW9fUDbqW5FSZPLhxkFvUORsJOnAez17Z6W4ndMiHlWtnw6 IoONLZd0NGp75Rvwj4s4szHcDq3+h8+hRl1Tk1k0v4wcXeql3454jZuiXGHgL2AApJ biKQIl1ZYj2/uqeTWlLRUHjWxsdqRUmjUEnB2ZXA= Received: from dynamic-178-141-5-236.kirov.comstar-r.ru (dynamic-178-141-5-236.kirov.comstar-r.ru [178.141.5.236]) by smtp8.mail.yandex.net (nwsmtp/Yandex) with ESMTP id 7eeOu7Z5-7eeCKEAV; Fri, 30 Sep 2011 19:07:40 +0400 X-Yandex-Spam: 1 Message-ID: <4E85DB06.5070502@yandex.ru> Date: Fri, 30 Sep 2011 19:06:46 +0400 From: "Andrey V. Elsukov" User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.17) Gecko/20110429 Thunderbird/3.1.10 MIME-Version: 1.0 To: John References: <20110930145156.GA2504@FreeBSD.org> In-Reply-To: <20110930145156.GA2504@FreeBSD.org> X-Enigmail-Version: 1.1.2 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig8D8934D1F8478E602080DA4D" Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: Create a gpart in a multipath container? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Sep 2011 15:07:43 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig8D8934D1F8478E602080DA4D Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: quoted-printable

On 30.09.2011 18:51, John wrote:
> Apologies for the long-winded explanation. I hope it made sense.
>
> Is there a way to make this work? Is there a better way to configure
> this? I'd like the partitions to be protected by multipathing which
> lead me to try this. Am I missing something totally obvious?
>
> I've been looking at the code and I'm thinking there is an issue between
> a real physical disk container vs a partition and sizing.
>
> Any comments are appreciated.

Do you have the geom_multipath module loaded after the reboot?

-- 
WBR, Andrey V. Elsukov

--------------enig8D8934D1F8478E602080DA4D Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) iQEcBAEBAgAGBQJOhdsLAAoJEAHF6gQQyKF65qsIAMHAdX+kizk991xFCekV69Kq P4mG+EafZLJjBapo+66Z9PNiPjJ/f3tX9ahqyYI+j75b1OvCJqxyQWHWJpUaiO30 lmm6z/CYNDGgJryxU32ITWFVRifGVjkmac26v0cd57are7BDzdHgU/vs8dyUXHlh 4TXB0B5+JGBX5xCt7WK402dMH8yDYlTWiud/XAundFh6wCxVXHP8CrVdam44A/Cc gTVnB1uZWrEI05rJMbd7BHZTJFts/KBlzusUMXgR/yuJ+MQQPagYPx5KImHfY+vK iBjGV2GBzD8XClGmyN0cEmMSbPT5zJavedM61BzwCSN9E+xv4eYLYYizKnQceZk= =yHAK -----END PGP SIGNATURE----- --------------enig8D8934D1F8478E602080DA4D--
From owner-freebsd-fs@FreeBSD.ORG Fri Sep 30 18:20:27 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CFC6E1065740; Fri, 30 Sep 2011 18:20:27 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [70.36.157.235]) by mx1.freebsd.org (Postfix) with ESMTP id 917CE8FC15; Fri, 30 Sep 2011 18:20:27 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p8UIKSGj039445; Fri, 30 Sep 2011 11:20:29 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201109301820.p8UIKSGj039445@chez.mckusick.com> To: Attilio Rao In-reply-to: Date: Fri, 30 Sep 2011 11:20:28 -0700 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: Garrett Cooper , freebsd-fs@freebsd.org, Konstantin Belousov , Xin LI Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Sep 2011 18:20:27 -0000 > Date: Fri, 30 Sep 2011 15:31:56 +0200 > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? > From: Attilio Rao > To: Kirk McKusick > Cc: Konstantin Belousov , > Garrett Cooper , > freebsd-fs@freebsd.org, Xin LI > > 2011/9/30 Kirk McKusick : > > > Here is my proposed fix. It does the unroll originally found in the > > non-FORCE case before sleeping waiting for the vfs_busy to clear. > > Is it acceptable to hold the mount mutex while calling VOP_UNLOCK? > > If not, then it needs to be released before the unlock, reacquired > > afterwards, and the check to see if the sleep is needed redone. > > I thought about this approach when sending the e-mail, but there is a > problem: you need to handle the MNTK_UNMOUNT flag checking and > subsequent setting after coveredvnode is held, otherwise at the first > looping you will just return EBUSY. > > You can avoid the unlock by passing PVFS | PDROP. > > Attilio Problem noted. I have updated the patch to clear the MNTK_UNMOUNT (and other flags set above it) after it returns from the sleep. Which means I cannot use the PDROP flag now, but it is good to know about it for future reference. Still not clear to me if it is acceptable to hold the mount mutex while calling VOP_UNLOCK. Should I drop the mount mutex around the VOP_UNLOCK(coveredvp)? Other than that possible problem, this patch appears to solve the EBUSY problem and avoid possible deadlocks. 
Kirk McKusick Index: sys/kern/vfs_mount.c =================================================================== --- sys/kern/vfs_mount.c (revision 225884) +++ sys/kern/vfs_mount.c (working copy) @@ -1187,6 +1187,7 @@ mtx_assert(&Giant, MA_OWNED); +top: if ((coveredvp = mp->mnt_vnodecovered) != NULL) { mnt_gen_r = mp->mnt_gen; VI_LOCK(coveredvp); @@ -1227,21 +1228,19 @@ mp->mnt_kern_flag |= MNTK_UNMOUNTF; error = 0; if (mp->mnt_lockref) { - if ((flags & MNT_FORCE) == 0) { - mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ | - MNTK_UNMOUNTF); - if (mp->mnt_kern_flag & MNTK_MWAIT) { - mp->mnt_kern_flag &= ~MNTK_MWAIT; - wakeup(mp); - } - MNT_IUNLOCK(mp); - if (coveredvp) - VOP_UNLOCK(coveredvp, 0); - return (EBUSY); + if (mp->mnt_kern_flag & MNTK_MWAIT) { + mp->mnt_kern_flag &= ~MNTK_MWAIT; + wakeup(mp); } + if (coveredvp) + VOP_UNLOCK(coveredvp, 0); mp->mnt_kern_flag |= MNTK_DRAINING; error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS, "mount drain", 0); + mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ | + MNTK_UNMOUNTF); + MNT_IUNLOCK(mp); + goto top; } MNT_IUNLOCK(mp); KASSERT(mp->mnt_lockref == 0, From owner-freebsd-fs@FreeBSD.ORG Fri Sep 30 20:19:00 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 23705106566B; Fri, 30 Sep 2011 20:19:00 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id B10358FC0C; Fri, 30 Sep 2011 20:18:59 +0000 (UTC) Received: from alf.home (alf.kiev.zoral.com.ua [10.1.1.177]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p8UKIqoZ037219 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 30 Sep 2011 23:18:52 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from alf.home (kostik@localhost [127.0.0.1]) by alf.home (8.14.5/8.14.5) with ESMTP id p8UKIqfR058649; Fri, 30 Sep 2011 
23:18:52 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by alf.home (8.14.5/8.14.5/Submit) id p8UKIpHj058648; Fri, 30 Sep 2011 23:18:51 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: alf.home: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 30 Sep 2011 23:18:51 +0300 From: Kostik Belousov To: Kirk McKusick Message-ID: <20110930201851.GB1511@deviant.kiev.zoral.com.ua> References: <201109301820.p8UIKSGj039445@chez.mckusick.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="m4l0o/auqPKO2x83" Content-Disposition: inline In-Reply-To: <201109301820.p8UIKSGj039445@chez.mckusick.com> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: Attilio Rao , Garrett Cooper , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Sep 2011 20:19:00 -0000 --m4l0o/auqPKO2x83 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Sep 30, 2011 at 11:20:28AM -0700, Kirk McKusick wrote: > > Date: Fri, 30 Sep 2011 15:31:56 +0200 > > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? > > From: Attilio Rao > > To: Kirk McKusick > > Cc: Konstantin Belousov , > > Garrett Cooper , > > freebsd-fs@freebsd.org, Xin LI > >=20 > > 2011/9/30 Kirk McKusick : > >=20 > > > Here is my proposed fix. 
It does the unroll originally found in the > > > non-FORCE case before sleeping waiting for the vfs_busy to clear. > > > Is it acceptable to hold the mount mutex while calling VOP_UNLOCK? > > > If not, then it needs to be released before the unlock, reacquired > > > afterwards, and the check to see if the sleep is needed redone. > > > > I thought about this approach when sending the e-mail, but there is a > > problem: you need to handle the MNTK_UNMOUNT flag checking and > > subsequent setting after coveredvnode is held, otherwise at the first > > looping you will just return EBUSY. > > > > You can avoid the unlock by passing PVFS | PDROP. > > > > Attilio > > Problem noted. I have updated the patch to clear the MNTK_UNMOUNT > (and other flags set above it) after it returns from the sleep. > Which means I cannot use the PDROP flag now, but it is good to > know about it for future reference. > > Still not clear to me if it is acceptable to hold the mount mutex > while calling VOP_UNLOCK. Should I drop the mount mutex around the > VOP_UNLOCK(coveredvp)? Other than that possible problem, this patch > appears to solve the EBUSY problem and avoid possible deadlocks. I do not understand which deadlock is being discussed there. It seems that Attilio's concern was with acquiring the covered vnode lock after the mounted fs is busied, but this is prohibited. See r166167 for a more detailed description of the order.
> > Kirk McKusick > > Index: sys/kern/vfs_mount.c > =================================================================== > --- sys/kern/vfs_mount.c (revision 225884) > +++ sys/kern/vfs_mount.c (working copy) > @@ -1187,6 +1187,7 @@ > > mtx_assert(&Giant, MA_OWNED); > > +top: > if ((coveredvp = mp->mnt_vnodecovered) != NULL) { > mnt_gen_r = mp->mnt_gen; > VI_LOCK(coveredvp); > @@ -1227,21 +1228,19 @@ > mp->mnt_kern_flag |= MNTK_UNMOUNTF; > error = 0; > if (mp->mnt_lockref) { > - if ((flags & MNT_FORCE) == 0) { > - mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ | > - MNTK_UNMOUNTF); > - if (mp->mnt_kern_flag & MNTK_MWAIT) { > - mp->mnt_kern_flag &= ~MNTK_MWAIT; > - wakeup(mp); > - } > - MNT_IUNLOCK(mp); > - if (coveredvp) > - VOP_UNLOCK(coveredvp, 0); > - return (EBUSY); > + if (mp->mnt_kern_flag & MNTK_MWAIT) { > + mp->mnt_kern_flag &= ~MNTK_MWAIT; > + wakeup(mp); > } > + if (coveredvp) > + VOP_UNLOCK(coveredvp, 0); > mp->mnt_kern_flag |= MNTK_DRAINING; > error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS, > "mount drain", 0); > + mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ | > + MNTK_UNMOUNTF); > + MNT_IUNLOCK(mp); > + goto top; > } > MNT_IUNLOCK(mp); > KASSERT(mp->mnt_lockref == 0, --m4l0o/auqPKO2x83 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iEYEARECAAYFAk6GJCsACgkQC3+MBN1Mb4gjDQCghKC8OgO5SmPn3QAfwjbgBmiC 0yoAoM6YZsEQgGWARcYMPLFOWvCot3yj =UE5c -----END PGP SIGNATURE----- --m4l0o/auqPKO2x83-- From owner-freebsd-fs@FreeBSD.ORG Sat Oct 1 12:39:17 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6D1001065672; Sat, 1 Oct 2011 12:39:17 +0000 (UTC) (envelope-from
asmrookie@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id D13478FC12; Sat, 1 Oct 2011 12:39:16 +0000 (UTC) Received: by wwe3 with SMTP id 3so3594298wwe.31 for ; Sat, 01 Oct 2011 05:39:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=TDjmCvUHIX2T270snGmALyxWX1nzghKdwIN75dSsmIo=; b=P2Br2ZVapA3vC7VAbrIpx7WW2g3U0nGpxYjkK7PR6blz9rtEJGJVJ8FwMwLAMj9Ev3 BlfEt0xxL+R8n04NyiPvyhmC9obFWHrK0mcikUceMjvS6mvlndDOObyYQmzft61/X7yq apysJKIb5YH7mZiKAPpY6VCrnVQvLnSLi9ZYs= MIME-Version: 1.0 Received: by 10.216.229.134 with SMTP id h6mr2766307weq.42.1317472755183; Sat, 01 Oct 2011 05:39:15 -0700 (PDT) Sender: asmrookie@gmail.com Received: by 10.216.182.3 with HTTP; Sat, 1 Oct 2011 05:39:14 -0700 (PDT) In-Reply-To: <20110930201851.GB1511@deviant.kiev.zoral.com.ua> References: <201109301820.p8UIKSGj039445@chez.mckusick.com> <20110930201851.GB1511@deviant.kiev.zoral.com.ua> Date: Sat, 1 Oct 2011 14:39:14 +0200 X-Google-Sender-Auth: QULF849IdUsrX7u-X0-r-r27FYw Message-ID: From: Attilio Rao To: Kostik Belousov Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: Kirk McKusick , Garrett Cooper , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Oct 2011 12:39:17 -0000 2011/9/30 Kostik Belousov : > On Fri, Sep 30, 2011 at 11:20:28AM -0700, Kirk McKusick wrote: >> > Date: Fri, 30 Sep 2011 15:31:56 +0200 >> > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? 
>> > From: Attilio Rao >> > To: Kirk McKusick >> > Cc: Konstantin Belousov , >> > Garrett Cooper , >> > freebsd-fs@freebsd.org, Xin LI >> > >> > 2011/9/30 Kirk McKusick : >> > >> > > Here is my proposed fix. It does the unroll originally found in the >> > > non-FORCE case before sleeping waiting for the vfs_busy to clear. >> > > Is it acceptable to hold the mount mutex while calling VOP_UNLOCK? >> > > If not, then it needs to be released before the unlock, reacquired >> > > afterwards, and the check to see if the sleep is needed redone. >> > >> > I thought about this approach when sending the e-mail, but there is a >> > problem: you need to handle the MNTK_UNMOUNT flag checking and >> > subsequent setting after coveredvnode is held, otherwise at the first >> > looping you will just return EBUSY. >> > >> > You can avoid the unlock by passing PVFS | PDROP. >> > >> > Attilio >> >> Problem noted. I have updated the patch to clear the MNTK_UNMOUNT >> (and other flags set above it) after it returns from the sleep. >> Which means I cannot use the PDROP flag now, but it is good to >> know about it for future reference. >> >> Still not clear to me if it is acceptable to hold the mount mutex >> while calling VOP_UNLOCK. Should I drop the mount mutex around the >> VOP_UNLOCK(coveredvp)? Other than that possible problem, this patch >> appears to solve the EBUSY problem and avoid possible deadlocks. > I do not understand which deadlock is being discussed there. > It seems that Attilio's concern was with acquiring the covered vnode lock > after the mounted fs is busied, but this is prohibited. > > See r166167 for a more detailed description of the order. Ok, so that is the invariant I was forgetting, thanks Kostik. Kirk, you can make the 'forced unmount' behaviour the default as far as I am concerned, thanks.
It would be great to have a comment on top of vfs_busy() or dounmount() check of mnt_ref on why this deadlock cannot happen, likely squeezing some good words from tegge's description of r166167. Kirk may be the best person to do it, but I can have his backs if he doesn't have time right now. Attilio --=20 Peace can only be achieved by understanding - A. Einstein From owner-freebsd-fs@FreeBSD.ORG Sat Oct 1 14:11:01 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 15B5C1065670 for ; Sat, 1 Oct 2011 14:11:01 +0000 (UTC) (envelope-from kraduk@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id CA2A48FC12 for ; Sat, 1 Oct 2011 14:11:00 +0000 (UTC) Received: by gyf2 with SMTP id 2so2909379gyf.13 for ; Sat, 01 Oct 2011 07:11:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=QItz/kgPzQRoZPmacusO2uHjEVzt2ym3jSGB7cZ94qM=; b=KGWsFyUVxi/Awf0fgC0+TQM9B6qzpt3aWlVD3iMEWoxx68koYXuWYYepRGzAM73yqq azn67GN55bQBS8coEFtD0rtbWphfo0tQRqjeauhwCi8qT1J9g2oSVhCYFu7965LdZQvW a4vUGu1ARrduUnANEqiC1oQ+qIpJpnjMo3s/g= MIME-Version: 1.0 Received: by 10.236.183.170 with SMTP id q30mr79174378yhm.42.1317478260326; Sat, 01 Oct 2011 07:11:00 -0700 (PDT) Received: by 10.236.105.166 with HTTP; Sat, 1 Oct 2011 07:11:00 -0700 (PDT) In-Reply-To: <20110930104643.GB51227@server.vk2pj.dyndns.org> References: <20110930104643.GB51227@server.vk2pj.dyndns.org> Date: Sat, 1 Oct 2011 15:11:00 +0100 Message-ID: From: krad To: Peter Jeremy Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org, Patrick Dorion Subject: Re: 'kernel' not found - ZFS on GPT boot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 
Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Oct 2011 14:11:01 -0000 On 30 September 2011 11:46, Peter Jeremy wrote: > On 2011-Sep-29 14:22:45 -0400, Patrick Dorion > wrote: > >I would greatly appreciate any input that could shed light on my > >issue. Essentially, a seemingly well-configured ZFS system residing > >on a GPT partition is not detected by zfsloader. > > Your log shows you've correctly used gptzfsloader. > > >Fixit# cd /zpool/boot > >Fixit# ls -l > >drw------- 2 root 0 2 Feb 17 2011 zfs/ > > > >Filesystem was taken from /dist/ on the livefs. All of it is read- > >only except /boot/zfs. > > Do you have /boot/zfs/zpool.cache on zpool? That's the most obvious > item that I don't see in your log. > > I presume you're aware of http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot > If that's not the procedure you followed, you might like to cross- > check the steps you took against that page. > > -- > Peter Jeremy > If it isn't the zpool.cache issue, have you 4k-aligned the drives? If so, you will need a patched version of the boot loader and zfsloader.
You can grab binary versions here: http://people.freebsd.org/~pjd/zfsboot/ I'm not sure when it's due to get committed, but my clang build of 9-stable this morning didn't have it. From owner-freebsd-fs@FreeBSD.ORG Sat Oct 1 14:46:28 2011 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D5562106566C for ; Sat, 1 Oct 2011 14:46:28 +0000 (UTC) (envelope-from edhoprima@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 68DD18FC14 for ; Sat, 1 Oct 2011 14:46:28 +0000 (UTC) Received: by bkbzs8 with SMTP id zs8so3562658bkb.13 for ; Sat, 01 Oct 2011 07:46:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:content-type; bh=G2J5QlgOQOBw+e7W6jCJs20U44dbq/0MC8DIpG0pfsE=; b=IH0526sSzPBzIUK0HPUpawOD8g6r3tHke0kta+E8CdhGxVL5uS7HwJ7BJiVXTxVd3D y5vKyqs2TpUKqYDyC4qku6QXZ6awSjPQFyNy8kFy1UT22bRgzWPPYXN2dRMzLUbI7wBe pvfeikCItulveHjsrl+rN16/oC9JPyx+MzNH8= Received: by 10.204.138.211 with SMTP id b19mr8734161bku.257.1317478669531; Sat, 01 Oct 2011 07:17:49 -0700 (PDT) MIME-Version: 1.0 Sender: edhoprima@gmail.com Received: by 10.204.53.195 with HTTP; Sat, 1 Oct 2011 07:17:17 -0700 (PDT) In-Reply-To: References: <20110930104643.GB51227@server.vk2pj.dyndns.org> From: Edho Arief Date: Sat, 1 Oct 2011 23:17:17 +0900 X-Google-Sender-Auth: CXD90AYx5-S7tAOQjTrwExscsi8 Message-ID: To: krad , fs@freebsd.org Content-Type: text/plain; charset=UTF-8 Cc: Subject: Re: 'kernel' not found - ZFS on GPT boot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Oct 2011 14:46:28 -0000 On Sat, Oct 1, 2011 at 11:11 PM, krad wrote: > if it
isnt the zpool.cache issue have you 4k aligned the drives? If so you > will need a patched version of the boot loader and zfsloader. You can grab > binary versions here http://people.freebsd.org/~pjd/zfsboot/ > > I'm not sure when its due to get commited but, my clang build of 9-stable > this morning didnt have it I booted 8.2 with rootonzfs/ashift=12/raidz2 with pmbr, zfsloader and gptzfsboot taken from 9.0-BETA3 from ftp.freebsd. -- O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From owner-freebsd-fs@FreeBSD.ORG Sat Oct 1 15:13:31 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1ADA1106566B; Sat, 1 Oct 2011 15:13:31 +0000 (UTC) (envelope-from rmh.aybabtu@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id C2AB78FC08; Sat, 1 Oct 2011 15:13:30 +0000 (UTC) Received: by iadk27 with SMTP id k27so4500684iad.13 for ; Sat, 01 Oct 2011 08:13:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type; bh=OOAGT4KQo7JUaASeSAsGJTe6x+zfDBVAFzG9CI/d1UE=; b=tVxI87bR6XB9EUtTBabrjn6OoD0IElQucIH2S6RPBqw2Akjk9S2v9Nl8iNyWS8wUaZ qNQrHyjbj2CaraeemgTrVnmbYsgyKb22IHVh7e1YSeQtvtXjlic6altEuTMxs/M9+Va4 xQD8W3VAgtpDe7WW4b2H+AVeziO/HENktzi7I= MIME-Version: 1.0 Received: by 10.43.133.138 with SMTP id hy10mr4544026icc.184.1317482010056; Sat, 01 Oct 2011 08:13:30 -0700 (PDT) Sender: rmh.aybabtu@gmail.com Received: by 10.42.217.74 with HTTP; Sat, 1 Oct 2011 08:13:30 -0700 (PDT) Date: Sat, 1 Oct 2011 17:13:30 +0200 X-Google-Sender-Auth: And6dYSMEPu2Gbk51aRQABIFV-w Message-ID: From: Robert Millan To: delphij@freebsd.org, freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 Cc: Adrian Chadd Subject: is TMPFS still highly experimental? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Oct 2011 15:13:31 -0000 Hi, Is TMPFS still considered highly experimental? I notice a warning saying this was added in 2007: fs/tmpfs/tmpfs_vfsops.c: printf("WARNING: TMPFS is considered to be a highly experimental " Since it's very old, I wonder if it still applies. After 4 years and 54 commits, can someone tell if the maturity of this file system has improved significantly? From owner-freebsd-fs@FreeBSD.ORG Sat Oct 1 15:41:44 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 86C5F1065670; Sat, 1 Oct 2011 15:41:44 +0000 (UTC) (envelope-from kaduk@mit.edu) Received: from dmz-mailsec-scanner-1.mit.edu (DMZ-MAILSEC-SCANNER-1.MIT.EDU [18.9.25.12]) by mx1.freebsd.org (Postfix) with ESMTP id B4B828FC13; Sat, 1 Oct 2011 15:41:43 +0000 (UTC) X-AuditID: 1209190c-b7fd26d0000008df-e6-4e873131f4c2 Received: from mailhub-auth-1.mit.edu ( [18.9.21.35]) by dmz-mailsec-scanner-1.mit.edu (Symantec Messaging Gateway) with SMTP id 5D.A6.02271.131378E4; Sat, 1 Oct 2011 11:26:41 -0400 (EDT) Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103]) by mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id p91FQfQu016612; Sat, 1 Oct 2011 11:26:41 -0400 Received: from multics.mit.edu (MULTICS.MIT.EDU [18.187.1.73]) (authenticated bits=56) (User authenticated as kaduk@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id p91FQdpu023638 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Sat, 1 Oct 2011 11:26:40 -0400 (EDT) Received: (from kaduk@localhost) by multics.mit.edu (8.12.9.20060308) id p91FQcn8021666; Sat, 1 Oct 2011 11:26:38 -0400 (EDT) Date: Sat, 1 Oct 2011 11:26:38 -0400 (EDT) From: Benjamin Kaduk To: Robert Millan 
In-Reply-To: Message-ID: References: User-Agent: Alpine 1.10 (GSO 962 2008-03-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFvrIIsWRmVeSWpSXmKPExsUixCmqrGto2O5nsO2shMXerduZLF7f+Mdu cezxTzaLB4eeMjuweMz4NJ8lgDGKyyYlNSezLLVI3y6BK+PFvxXMBRvYK+YtPc3UwPiftYuR k0NCwETiQs9uZghbTOLCvfVsXYxcHEIC+xglVq9ezArhrGeUuLrvPzuEs59J4sKTLYwgLUIC 9RKNt7cwgdgsAloS89c/YgGx2QRUJGa+2cgGYosIKEucXDeLHcRmFoiWeH/+Jtg6YQEDiWur e8F6OQUCJa5u/wjUy8HBK2AvcWN2DsT4AImtrbvASkQFdCRW758CNp5XQFDi5MwnLBAjLSXO /bnONoFRcBaS1CwkqQWMTKsYZVNyq3RzEzNzilOTdYuTE/PyUot0DfVyM0v0UlNKNzGCQ1aS Zwfjm4NKhxgFOBiVeHg//Wv1E2JNLCuuzD3EKMnBpCTKm67f7ifEl5SfUpmRWJwRX1Sak1p8 iFGCg1lJhNf/OFA5b0piZVVqUT5MSpqDRUmc9+AOBz8hgfTEktTs1NSC1CKYrAwHh5IE7wwD oKGCRanpqRVpmTklCGkmDk6Q4TxAwyNBaniLCxJzizPTIfKnGBWlxHnLQBICIImM0jy4XlhK ecUoDvSKMO80kCoeYDqC634FNJgJaLCBNtjgkkSElFQDo/DdsFiPhDcsm8S2VPdtkQv98iVg cvXvlOv1N05LbEn3ncyuuu2ZgPOip5PYVtW1vQgT6Nq8tEH9rYTCrTdr/9+boGeoabaq4AnX yQflO3+z6sv+eJnv58uVtDfMJHFH6wkNU73W+uv+F+YUHnc4ybnoaZe5Y+307Gks/g++NL7y +vugVP3SASWW4oxEQy3mouJEAPOWAIYEAwAA Cc: freebsd-fs@freebsd.org, Adrian Chadd , delphij@freebsd.org Subject: Re: is TMPFS still highly experimental? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Oct 2011 15:41:44 -0000 On Sat, 1 Oct 2011, Robert Millan wrote: > Hi, > > Is TMPFS still considered highly experimental? I notice a warning > saying this was added in 2007: > > fs/tmpfs/tmpfs_vfsops.c: printf("WARNING: TMPFS is considered > to be a highly experimental " > > Since it's very old, I wonder if it still applies. After 4 years and > 54 commits, can someone tell if the maturity of this file system has > improved significantly? This thread: http://lists.freebsd.org/pipermail/freebsd-current/2011-June/025475.html has covered this topic somewhat. 
Peter Holm (pho) is known for running pretty intensive filesystem (and other) stress tests, and did not come up with a whole lot of crashes. Also, http://www.freebsd.org/cgi/query-pr-summary.cgi?&sort=none&text=tmpfs is not too big, showing only a couple of new reports. Mayhaps it is not "highly" experimental, but probably still experimental, at least. -Ben Kaduk From owner-freebsd-fs@FreeBSD.ORG Sat Oct 1 16:10:56 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C5E9D106566B for ; Sat, 1 Oct 2011 16:10:56 +0000 (UTC) (envelope-from mj@feral.com) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id 8F64A8FC16 for ; Sat, 1 Oct 2011 16:10:51 +0000 (UTC) Received: from [192.168.135.105] (c-24-7-47-62.hsd1.ca.comcast.net [24.7.47.62]) (authenticated bits=0) by ns1.feral.com (8.14.4/8.14.4) with ESMTP id p91FXWD1067871 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Sat, 1 Oct 2011 08:33:35 -0700 (PDT) (envelope-from mj@feral.com) Message-ID: <4E8732C7.9000800@feral.com> Date: Sat, 01 Oct 2011 08:33:27 -0700 From: Matthew Jacob Organization: Feral Software User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:6.0.2) Gecko/20110902 Thunderbird/6.0.2 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (ns1.feral.com [192.67.166.1]); Sat, 01 Oct 2011 08:33:36 -0700 (PDT) Subject: Re: is TMPFS still highly experimental? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Oct 2011 16:10:56 -0000 > Hi, > > Is TMPFS still considered highly experimental? 
I notice a warning > saying this was added in 2007: > > fs/tmpfs/tmpfs_vfsops.c: printf("WARNING: TMPFS is considered > to be a highly experimental " > > Since it's very old, I wonder if it still applies. After 4 years and > 54 commits, can someone tell if the maturity of this file system has > improved significantly? > _______________________________________________ > We went through this discussion a few months back, and as best as I can remember there were still some serious issues found. From owner-freebsd-fs@FreeBSD.ORG Sat Oct 1 16:14:41 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 66B8C106564A for ; Sat, 1 Oct 2011 16:14:41 +0000 (UTC) (envelope-from utisoft@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 2E8758FC12 for ; Sat, 1 Oct 2011 16:14:40 +0000 (UTC) Received: by iadk27 with SMTP id k27so4565688iad.13 for ; Sat, 01 Oct 2011 09:14:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=kq9vKFeyuS7un6ivfQcbX+1sPHhXcjwbmeN3b7iD8fA=; b=KHllF3ljNntjeDz7SgFiRpBd1GwxLyVFR1bvqa4A+l3DzzNBzJh4SvYy0XWxw40cjD Hu4swSy999tF1XTaxYM9hdJaeiZZDOuv3wuE1Wjk8paIghvg+E4zmicrVYC7+Mu1szRC wv1GguPEdFJ16El/aGkm655GRy4n8gjslLZBY= MIME-Version: 1.0 Received: by 10.231.65.73 with SMTP id h9mr7002658ibi.21.1317484122411; Sat, 01 Oct 2011 08:48:42 -0700 (PDT) Sender: utisoft@gmail.com Received: by 10.231.35.194 with HTTP; Sat, 1 Oct 2011 08:48:41 -0700 (PDT) Received: by 10.231.35.194 with HTTP; Sat, 1 Oct 2011 08:48:41 -0700 (PDT) In-Reply-To: References: Date: Sat, 1 Oct 2011 16:48:41 +0100 X-Google-Sender-Auth: lHdrXL6fYFoxCvEXWQML52Tc8Dg Message-ID: From: Chris Rees To: Benjamin Kaduk Content-Type: text/plain; 
charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org, Adrian Chadd , delphij@freebsd.org Subject: Re: is TMPFS still highly experimental? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Oct 2011 16:14:41 -0000 On 1 Oct 2011 16:41, "Benjamin Kaduk" wrote: > > On Sat, 1 Oct 2011, Robert Millan wrote: > >> Hi, >> >> Is TMPFS still considered highly experimental? I notice a warning >> saying this was added in 2007: >> >> fs/tmpfs/tmpfs_vfsops.c: printf("WARNING: TMPFS is considered >> to be a highly experimental " >> >> Since it's very old, I wonder if it still applies. After 4 years and >> 54 commits, can someone tell if the maturity of this file system has >> improved significantly? > > > This thread: > http://lists.freebsd.org/pipermail/freebsd-current/2011-June/025475.html > has covered this topic somewhat. Peter Holm (pho) is known for running pretty intensive filesystem (and other) stress tests, and did not come up with a whole lot of crashes. > Also, http://www.freebsd.org/cgi/query-pr-summary.cgi?&sort=none&text=tmpfs > is not too big, showing only a couple of new reports. > Mayhaps it is not "highly" experimental, but probably still experimental, at least. > I've also not heard of anyone using it with zfs successfully- it tends to shrink rapidly. 
Chris From owner-freebsd-fs@FreeBSD.ORG Sat Oct 1 19:44:05 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 687F0106566B; Sat, 1 Oct 2011 19:44:05 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id F17F68FC16; Sat, 1 Oct 2011 19:44:04 +0000 (UTC) Received: by qadz30 with SMTP id z30so1212829qad.13 for ; Sat, 01 Oct 2011 12:44:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=UWP9IV5p30gm43Lb8zIlr3ygNEORjCtrWPAdAzMRxt8=; b=F6t6NKZjyCNIy+BEnL9BnqTZ7Vkn/fB+OxX18mitAkP8JdaOWu8BRp9AAlTNjUJ4vj k0LWxTkCNrxp/CjYGTuZ6gO+0uL0yjsXrk/xgTuzwV6L8NcNlnM6J016brsVCqAkaIwQ MSN+fPxAUBQSDFTSDcB04jPkCma8RKY6PmIbs= MIME-Version: 1.0 Received: by 10.224.215.133 with SMTP id he5mr9861274qab.224.1317498244353; Sat, 01 Oct 2011 12:44:04 -0700 (PDT) Received: by 10.224.74.82 with HTTP; Sat, 1 Oct 2011 12:44:04 -0700 (PDT) In-Reply-To: References: <201109301820.p8UIKSGj039445@chez.mckusick.com> <20110930201851.GB1511@deviant.kiev.zoral.com.ua> Date: Sat, 1 Oct 2011 12:44:04 -0700 Message-ID: From: Garrett Cooper To: Attilio Rao Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: Kirk McKusick , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Oct 2011 19:44:05 -0000 On Sat, Oct 1, 2011 at 5:39 AM, Attilio Rao wrote: > 2011/9/30 Kostik Belousov : >> On Fri, Sep 30, 2011 at 11:20:28AM -0700, Kirk McKusick wrote: >>> > Date: Fri, 30 Sep 2011 15:31:56 +0200 >>> > Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? >>> > From: Attilio Rao >>> > To: Kirk McKusick >>> > Cc: Konstantin Belousov , >>> > Garrett Cooper , >>> > freebsd-fs@freebsd.org, Xin LI >>> > >>> > 2011/9/30 Kirk McKusick : >>> > >>> > > Here is my proposed fix. It does the unroll originally found in the >>> > > non-FORCE case before sleeping waiting for the vfs_busy to clear. >>> > > Is it acceptable to hold the mount mutex while calling VOP_UNLOCK? >>> > > If not, then it needs to be released before the unlock, reacquired >>> > > afterwards, and the check to see if the sleep is needed redone. >>> > >>> > I thought about this approach when sending the e-mail, but there is a >>> > problem: you need to handle the MNTK_UNMOUNT flag checking and >>> > subsequent setting after coveredvnode is held, otherwise at the first >>> > looping you will just return EBUSY. >>> > >>> > You can avoid the unlock by passing PVFS | PDROP. >>> > >>> > Attilio >>> >>> Problem noted. I have updated the patch to clear the MNTK_UNMOUNT >>> (and other flags set above it) after it returns from the sleep. >>> Which means I cannot use the PDROP flag now, but it is good to >>> know about it for future reference. >>> >>> Still not clear to me if it is acceptable to hold the mount mutex >>> while calling VOP_UNLOCK. Should I drop the mount mutex around the >>> VOP_UNLOCK(coveredvp)? Other than that possible problem, this patch >>> appears to solve the EBUSY problem and avoid possible deadlocks.
>> I do not understand which deadlock is talked about there.
>> It seems that Attilio's concern was with acquiring covered vnode lock
>> after mounted fs is busied, but this is prohibited.
>>
>> See r166167 for more detailed description of the order.
>
> Ok, so that is the invariant I was forgetting, thanks Kostik.
>
> Kirk, you can make the 'forced unmount' behaviour by default for me,
> now, thanks.
> It would be great to have a comment on top of vfs_busy() or
> dounmount() check of mnt_ref on why this deadlock cannot happen,
> likely squeezing some good words from tegge's description of r166167.
> Kirk may be the best person to do it, but I can have his back if he
> doesn't have time right now.

Ok. Now that I know this is the direction you guys want to go, I'll
start testing the change.
Thanks!
-Garrett

From owner-freebsd-fs@FreeBSD.ORG Sat Oct 1 21:37:06 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4D33C106564A; Sat, 1 Oct 2011 21:37:05 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [70.36.157.235]) by mx1.freebsd.org (Postfix) with ESMTP id E4CCE8FC15; Sat, 1 Oct 2011 21:37:04 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p91Lb6FI093841; Sat, 1 Oct 2011 14:37:06 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201110012137.p91Lb6FI093841@chez.mckusick.com> To: Garrett Cooper In-reply-to: Date: Sat, 01 Oct 2011 14:37:06 -0700 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: Attilio Rao , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Oct 2011 21:37:06 -0000

> Date: Sat, 1 Oct 2011 12:44:04 -0700
> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
> From: Garrett Cooper
> To: Attilio Rao
> Cc: Kostik Belousov ,
>     Kirk McKusick , freebsd-fs@freebsd.org,
>     Xin LI
>
> Ok. Now that I know this is the direction you guys want to go, I'll
> start testing the change.
> Thanks!
> -Garrett

Thanks for throwing some testing at this. Please test my latest
proposed change (included below so you do not have to dig through
earlier email) as I believe that it has the least likelihood of
problems and is what I am currently proposing to put in.

        Kirk McKusick

Index: sys/kern/vfs_mount.c
===================================================================
--- sys/kern/vfs_mount.c (revision 225903)
+++ sys/kern/vfs_mount.c (working copy)
@@ -1187,6 +1187,7 @@
 
         mtx_assert(&Giant, MA_OWNED);
 
+top:
         if ((coveredvp = mp->mnt_vnodecovered) != NULL) {
                 mnt_gen_r = mp->mnt_gen;
                 VI_LOCK(coveredvp);
@@ -1227,21 +1228,19 @@
                 mp->mnt_kern_flag |= MNTK_UNMOUNTF;
         error = 0;
         if (mp->mnt_lockref) {
-                if ((flags & MNT_FORCE) == 0) {
-                        mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
-                            MNTK_UNMOUNTF);
-                        if (mp->mnt_kern_flag & MNTK_MWAIT) {
-                                mp->mnt_kern_flag &= ~MNTK_MWAIT;
-                                wakeup(mp);
-                        }
-                        MNT_IUNLOCK(mp);
-                        if (coveredvp)
-                                VOP_UNLOCK(coveredvp, 0);
-                        return (EBUSY);
+                if (mp->mnt_kern_flag & MNTK_MWAIT) {
+                        mp->mnt_kern_flag &= ~MNTK_MWAIT;
+                        wakeup(mp);
                 }
+                if (coveredvp)
+                        VOP_UNLOCK(coveredvp, 0);
                 mp->mnt_kern_flag |= MNTK_DRAINING;
                 error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
                     "mount drain", 0);
+                mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
+                    MNTK_UNMOUNTF);
+                MNT_IUNLOCK(mp);
+                goto top;
         }
         MNT_IUNLOCK(mp);
         KASSERT(mp->mnt_lockref == 0,

From owner-freebsd-fs@FreeBSD.ORG Sat Oct 1
21:54:12 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1B374106564A; Sat, 1 Oct 2011 21:54:12 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-pz0-f44.google.com (mail-pz0-f44.google.com [209.85.210.44]) by mx1.freebsd.org (Postfix) with ESMTP id D7B338FC08; Sat, 1 Oct 2011 21:54:11 +0000 (UTC) Received: by pzk32 with SMTP id 32so16701810pzk.3 for ; Sat, 01 Oct 2011 14:54:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version:content-type; bh=FWE2GkCStzLG58nd9T+D1gx0GlIg6Gu++b1EeiYT1m0=; b=SqjYAhak83KhYlbTMaIWFDeN21Q+cp7bRXlk13haboHgDCakNcQZAUWW2WL1yqswPi g+Jpe6Rnql/NSpPRxJFhWoiTn3Hn+SIrZeV1N46PW0vE0y/4v747niOFPuXYl24FOtYP cna5OeUp/X9qs43MBM2g8fJ9gyoZy67tuBOro= Received: by 10.68.208.229 with SMTP id mh5mr68887269pbc.124.1317506050980; Sat, 01 Oct 2011 14:54:10 -0700 (PDT) Received: from c-24-6-49-154.hsd1.ca.comcast.net (c-24-6-49-154.hsd1.ca.comcast.net. [24.6.49.154]) by mx.google.com with ESMTPS id ji3sm34388214pbc.2.2011.10.01.14.54.09 (version=TLSv1/SSLv3 cipher=OTHER); Sat, 01 Oct 2011 14:54:09 -0700 (PDT) Date: Sat, 1 Oct 2011 14:54:04 -0700 (PDT) From: Garrett Cooper To: Kirk McKusick In-Reply-To: <201110012137.p91Lb6FI093841@chez.mckusick.com> Message-ID: References: <201110012137.p91Lb6FI093841@chez.mckusick.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII Cc: Garrett Cooper , Attilio Rao , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Oct 2011 21:54:12 -0000

On Sat, 1 Oct 2011, Kirk McKusick wrote:

>> Date: Sat, 1 Oct 2011 12:44:04 -0700
>> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
>> From: Garrett Cooper
>> To: Attilio Rao
>> Cc: Kostik Belousov ,
>>     Kirk McKusick , freebsd-fs@freebsd.org,
>>     Xin LI
>>
>> Ok. Now that I know this is the direction you guys want to go, I'll
>> start testing the change.
>> Thanks!
>> -Garrett
>
> Thanks for throwing some testing at this. Please test my latest
> proposed change (included below so you do not have to dig through
> earlier email) as I believe that it has the least likelihood of
> problems and is what I am currently proposing to put in.
>
>         Kirk McKusick
>
> Index: sys/kern/vfs_mount.c
> ===================================================================
> --- sys/kern/vfs_mount.c (revision 225903)
> +++ sys/kern/vfs_mount.c (working copy)
> @@ -1187,6 +1187,7 @@
>
>          mtx_assert(&Giant, MA_OWNED);
>
> +top:
>          if ((coveredvp = mp->mnt_vnodecovered) != NULL) {
>                  mnt_gen_r = mp->mnt_gen;
>                  VI_LOCK(coveredvp);
> @@ -1227,21 +1228,19 @@
>                  mp->mnt_kern_flag |= MNTK_UNMOUNTF;
>          error = 0;
>          if (mp->mnt_lockref) {
> -                if ((flags & MNT_FORCE) == 0) {
> -                        mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
> -                            MNTK_UNMOUNTF);
> -                        if (mp->mnt_kern_flag & MNTK_MWAIT) {
> -                                mp->mnt_kern_flag &= ~MNTK_MWAIT;
> -                                wakeup(mp);
> -                        }
> -                        MNT_IUNLOCK(mp);
> -                        if (coveredvp)
> -                                VOP_UNLOCK(coveredvp, 0);
> -                        return (EBUSY);
> +                if (mp->mnt_kern_flag & MNTK_MWAIT) {
> +                        mp->mnt_kern_flag &= ~MNTK_MWAIT;
> +                        wakeup(mp);
>                  }
> +                if (coveredvp)
> +                        VOP_UNLOCK(coveredvp, 0);
>                  mp->mnt_kern_flag |= MNTK_DRAINING;
>                  error = msleep(&mp->mnt_lockref, MNT_MTX(mp), PVFS,
>                      "mount drain", 0);
> +                mp->mnt_kern_flag &= ~(MNTK_UNMOUNT | MNTK_NOINSMNTQ |
> +                    MNTK_UNMOUNTF);
> +                MNT_IUNLOCK(mp);
> +                goto top;
>          }
>          MNT_IUNLOCK(mp);
>          KASSERT(mp->mnt_lockref == 0,

I'll run it through a few other filesystems (ntfs, smbfs, etc) just in
case. I should have results by either Monday or Tuesday.
Thanks,
-Garrett

From owner-freebsd-fs@FreeBSD.ORG Sat Oct 1 22:38:08 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 997BE1065780; Sat, 1 Oct 2011 22:38:08 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [70.36.157.235]) by mx1.freebsd.org (Postfix) with ESMTP id 76EA08FC0C; Sat, 1 Oct 2011 22:38:08 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p91Mc9uE007202; Sat, 1 Oct 2011 15:38:09 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201110012238.p91Mc9uE007202@chez.mckusick.com> To: Garrett Cooper In-reply-to: Date: Sat, 01 Oct 2011 15:38:09 -0700 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: Attilio Rao , Xin LI , freebsd-fs@freebsd.org Subject: Re: Need to force sync(2) before umounting UFS1 filesystems? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Oct 2011 22:38:08 -0000

> Date: Sat, 1 Oct 2011 14:54:04 -0700 (PDT)
> From: Garrett Cooper
> To: Kirk McKusick
> cc: Garrett Cooper , Attilio Rao ,
>     Kostik Belousov , freebsd-fs@freebsd.org,
>     Xin LI
> Subject: Re: Need to force sync(2) before umounting UFS1 filesystems?
>
> On Sat, 1 Oct 2011, Kirk McKusick wrote:
>
> > Thanks for throwing some testing at this. Please test my latest
> > proposed change (included below so you do not have to dig through
> > earlier email) as I believe that it has the least likelihood of
> > problems and is what I am currently proposing to put in.
> >
> >         Kirk McKusick
>
> I'll run it through a few other filesystems (ntfs, smbfs, etc) just in
> case. I should have results by either Monday or Tuesday.
> Thanks,
> -Garrett

The more filesystems you can throw at it the better. Most critical
are ZFS, UFS, and NFS.

        Kirk McKusick
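
The control-flow change in the proposed patch (stop returning EBUSY for a
busy mount; instead wait for the busy references to drain, unroll the
unmount flags, and retry from the top) can be modelled in userspace
roughly as below. This is only a sketch under stated assumptions, not
kernel code: struct model_mount, the MODEL_* flags, the retries counter
and the release_one_ref() callback are hypothetical names invented for
illustration. The callback stands in for the msleep() call and for
whatever thread drops the last busy reference, and no locking is
modelled at all.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bits; they only echo the MNTK_* names in the patch. */
enum {
    MODEL_UNMOUNT  = 1 << 0,    /* an unmount is in progress */
    MODEL_DRAINING = 1 << 1     /* unmount waits for references to drain */
};

/* Toy stand-in for struct mount: a busy count, a flag word, a state bit. */
struct model_mount {
    int  lockref;       /* outstanding busy references */
    int  kern_flag;     /* MODEL_* bits */
    bool mounted;       /* still mounted? */
    int  retries;       /* how many times the new path went back to top */
};

/* Assumed stand-in for msleep(): invoked whenever the unmount must wait. */
typedef void (*drain_fn)(struct model_mount *);

/* Drops one reference; clears DRAINING when the last one goes away. */
static void
release_one_ref(struct model_mount *mp)
{
    if (mp->lockref > 0 && --mp->lockref == 0)
        mp->kern_flag &= ~MODEL_DRAINING;
}

/* Pre-patch, non-forced path: unroll the flags and fail if busy. */
static int
unmount_old_nonforced(struct model_mount *mp)
{
    mp->kern_flag |= MODEL_UNMOUNT;
    if (mp->lockref > 0) {
        mp->kern_flag &= ~MODEL_UNMOUNT;
        return (EBUSY);
    }
    mp->mounted = false;
    return (0);
}

/* Patched path: wait for the drain, unroll the flags, retry from top. */
static int
unmount_new(struct model_mount *mp, drain_fn wait_for_drain)
{
top:
    mp->kern_flag |= MODEL_UNMOUNT;
    if (mp->lockref > 0) {
        mp->kern_flag |= MODEL_DRAINING;
        wait_for_drain(mp);                 /* msleep() in the patch */
        mp->kern_flag &= ~MODEL_UNMOUNT;    /* unroll before retrying */
        mp->retries++;
        goto top;
    }
    mp->mounted = false;
    return (0);
}
```

With two outstanding references, unmount_old_nonforced() fails with EBUSY
and leaves the mount untouched, while unmount_new() called with
release_one_ref loops until the count reaches zero and then completes.
The model terminates only because the callback always makes progress; in
the kernel that progress depends on the holders of the busy references
releasing them.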