From: Kevin Day
Date: Tue, 6 Apr 2010 12:34:10 -0500
To: Pawel Jakub Dawidek
Cc: freebsd-fs@freebsd.org
Subject: Re: iscsi over HAST backed storage partial success
Message-Id: <31E0B354-ADD6-412B-9599-5E33A5E27853@dragondata.com>
In-Reply-To: <20100310205711.GA1847@garage.freebsd.pl>
References: <7418ECC2-55C1-4A28-82EA-0972AFE745EF@dragondata.com> <20100310205711.GA1847@garage.freebsd.pl>

On Mar 10, 2010, at 2:57 PM, Pawel Jakub Dawidek wrote:

> On Tue, Mar 09, 2010 at 05:03:41PM -0600, Kevin Day wrote:
>>
>> I'm running istgt (iscsi target) using HAST backed storage. For the most part, it seems to work really well.
>> I have ucarp running to change the IP that istgt is bound to, and modified the ucarp scripts to start/stop istgt depending on which side is the master. If I shut down the primary, the secondary takes over and all seems well.
>>
>> However, if I reboot the secondary, the primary starts freezing up for long periods:
>>
>> Mar 9 22:46:27 cs04 hastd: [iscsi1] (primary) Unable to r: Socket is not connected.
>> Mar 9 22:46:27 cs04 hastd: [iscsi1] (primary) Unable to co: Connection refused.
>> Mar 9 22:46:42 cs04 last message repeated 3 times
>> Mar 9 22:46:53 cs04 istgt[14298]: ABORT_TASK
>> Mar 9 22:47:35 cs04 last message repeated 3 times
>> Mar 9 22:48:02 cs04 hastd: [iscsi1] (primary) Unable to co: Operation timed out.
>> Mar 9 22:48:02 cs04 istgt[14298]: CmdSN(45748), OP=0x2a, ElapsedTime=74 cleared
>> Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c: 640:istgt_iscsi_write_pdu: ***ERROR*** iscsi_write() failed (errno=32)
>> Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c:3327:istgt_iscsi_op_task: ***ERROR*** iscsi_write_pdu() failed
>> Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c:3867:istgt_iscsi_execute: ***ERROR*** iscsi_op_task() failed
>> Mar 9 22:48:02 cs04 istgt[14298]: istgt_iscsi.c:4337:worker: ***ERROR*** iscsi_execute() failed
>> Mar 9 22:48:02 cs04 istgt[14298]: CmdSN(490802), OP=0x2a, ElapsedTime=73 cleared
>> Mar 9 22:48:02 cs04 istgt[14298]: CmdSN(28387), OP=0x2a, ElapsedTime=73 cleared
>> Mar 9 22:48:14 cs04 istgt[14298]: ABORT_TASK
>> Mar 9 22:48:52 cs04 last message repeated 2 times
>> Mar 9 22:49:22 cs04 hastd: [iscsi1] (primary) Unable to co: Operation timed out.
>>
>> As soon as the secondary comes back online, everything starts behaving again and all is well.
>
> Could you try the following patch?
>
> http://people.freebsd.org/~pjd/patches/hastd_primary.c.patch
>

Sorry for the long delay. This does seem to fix that problem, yes. :)

-- Kevin