From: Pawel Jakub Dawidek
To: Mikolaj Golub
Cc: Artem Kajalainen, freebsd-stable@freebsd.org
Date: Sun, 5 Feb 2012 10:27:54 +0100
Subject: Re: problems with hast
Message-ID: <20120205092753.GA30033@garage.freebsd.pl>
In-Reply-To: <86ipjvbglk.fsf@kopusha.home.net>

On Sun, Jan 29, 2012 at 12:35:35AM +0200, Mikolaj Golub wrote:
> Investigating, it looks like after r226859, when 'async' mode was added, we have 2
> issues with synchronization from secondary to master (a rather rare case
> normally):
>
> 1) When the synchronization from secondary to master is running and primary
> gets READ request, the request should be sent to the secondary but actually it
> is lost. As a result READ operation gets stuck. After the synchronization is
> complete the following READ requests, which now can be served by primary, work
> ok.
>
> 2) In async mode, for synchronization requests, write_complete() function,
> which sends G_GATE_CMD_DONE command to ggate, is called twice and the second
> call fails.
>
> Artem, did you run async mode? If you did then I suppose you observed the
> second issue. Could you please try the attached patch?

The analysis and fixes look good to me, please go ahead and commit
(small nits below).

> Index: sbin/hastd/primary.c
> ===================================================================
> --- sbin/hastd/primary.c        (revision 230661)
> +++ sbin/hastd/primary.c        (working copy)
> @@ -1255,7 +1255,7 @@ ggate_recv_thread(void *arg)
>                 pjdlog_debug(2,
>                     "ggate_recv: (%p) Moving request to the send queues.", hio);
>                 refcount_init(&hio->hio_countdown, ncomps);
> -               for (ii = ncomp; ii < ncomps; ii++)
> +               for (ii = ncomp; ncomps != 0; ncomps--, ii++)

I'd prefer not to modify ncomps in the loop, maybe something like this:

                for (ii = ncomp; ii < ncomp + ncomps; ii++)

>                         QUEUE_INSERT1(hio, send, ii);
>         }
>         /* NOTREACHED */
> @@ -1326,7 +1326,7 @@ local_send_thread(void *arg)
>                 } else {
>                         hio->hio_errors[ncomp] = 0;
>                         if (hio->hio_replication ==
> -                           HAST_REPLICATION_ASYNC) {
> +                           HAST_REPLICATION_ASYNC && !ISSYNCREQ(hio)) {

Could you move this additional check to a separate line?

Thanks!

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!
http://tupytaj.pl
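
For reference, a minimal sketch of why the original loop bound can drop a request. It assumes that ncomp is the index of the first component to send to and ncomps is the number of components, which is what the refcount_init() call in the diff suggests. The helper names below are hypothetical and are not part of hastd; they only count how many times each loop form would reach QUEUE_INSERT1().

```c
/*
 * Hypothetical helpers, not hastd code: each one counts the iterations
 * of one loop form from the review, i.e. how many send queues would
 * receive the request.
 */

/* Original form: treats ncomps as an end index. */
static unsigned int
inserts_original(unsigned int ncomp, unsigned int ncomps)
{
	unsigned int ii, n = 0;

	for (ii = ncomp; ii < ncomps; ii++)
		n++;
	return (n);
}

/* Suggested form: treats ncomps as a count starting at ncomp. */
static unsigned int
inserts_suggested(unsigned int ncomp, unsigned int ncomps)
{
	unsigned int ii, n = 0;

	for (ii = ncomp; ii < ncomp + ncomps; ii++)
		n++;
	return (n);
}
```

With ncomp = 1 and ncomps = 1 (assumed here to correspond to a READ that must go only to the secondary), inserts_original() returns 0, so the request is never queued, while inserts_suggested() returns 1. With ncomp = 0 both forms agree, which would explain why the problem only shows up in the rare secondary-to-primary synchronization case.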