Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 23 Apr 2010 08:29:50 +0200
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Mikolaj Golub <to.my.trociny@gmail.com>
Cc:        freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: HAST: primary might get stuck when there are connectivity problems with secondary
Message-ID:  <20100423062950.GD1670@garage.freebsd.pl>
In-Reply-To: <86r5m9dvqf.fsf@zhuzha.ua1>
References:  <86r5m9dvqf.fsf@zhuzha.ua1>

next in thread | previous in thread | raw e-mail | index | archive | help

--YToU2i3Vx8H2dn7O
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Apr 21, 2010 at 12:02:16PM +0300, Mikolaj Golub wrote:
> Restoring network after this changes nothing. Primary gets stuck. No mess=
ages
> in the log and "dirty" in status output does not change:
>=20
> [root@hasta ~]# hastctl status
> storage:
>   role: primary
>   provname: storage
>   localpath: /dev/ad4
>   extentsize: 2097152
>   keepdirty: 64
>   remoteaddr: 172.20.66.202
>   replication: memsync
>   status: complete
>   dirty: 2097152 bytes
>=20
> On the secondary we have all this time:
>=20
> tcp4       0      0 172.20.66.202.8457     172.20.66.201.57596    ESTABLI=
SHED
> tcp4       0      0 172.20.66.202.8457     172.20.66.201.41841    ESTABLI=
SHED
>=20
> The last messages in log:
>=20
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411bc0)=
 Request received from the kernel: READ(13565952, 65536).
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411bc0)=
 Moving request to the send queue.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: Taking free =
request.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80)=
 Got free request.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80)=
 Waiting for request from the kernel.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) local_send: (0x28411bc0)=
 Got request.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) local_send: (0x28411bc0)=
 Moving request to the done queue.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) local_send: Taking reque=
st.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_send: (0x28411bc0)=
 Got request.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_send: (0x28411bc0)=
 Moving request to the free queue.
> Apr 21 10:50:21 hasta hastd: [storage] (primary) ggate_send: Taking reque=
st.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80)=
 Request received from the kernel: READ(1812529152, 65536).
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b80)=
 Moving request to the send queue.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: Taking free =
request.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b00)=
 Got free request.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_recv: (0x28411b00)=
 Waiting for request from the kernel.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) local_send: (0x28411b80)=
 Got request.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) local_send: (0x28411b80)=
 Moving request to the done queue.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) local_send: Taking reque=
st.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_send: (0x28411b80)=
 Got request.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_send: (0x28411b80)=
 Moving request to the free queue.
> Apr 21 10:51:00 hasta hastd: [storage] (primary) ggate_send: Taking reque=
st.
>=20
> The backtrace of gotten stuck hastd is in the attach.
>=20
> I interpret this in the following way. Although the network is down
> hast_proto_send() in remote_send_thread() returns success (sent data are
> stored in the kernel buffer). Then kernel tries to send data and eventual=
ly
> fails after timeout and close the socket. hastd is not aware about this,
> remote_send_thread() is blocked in "Taking request" at this time, sync th=
read
> is waiting for status from the secondary about sent data but secondary do=
es
> not send it because it did not receive any data.

"Taking request" only means that HAST is waiting for a request. In case
of ggate_send thread it means it is waiting for I/O request from the
kernel. It stops here, because there is no new request.

> Restarting hastd on the secondary usually helps. A workaround is to set
> net.inet.tcp.keepidle to some small value (e.g. 300 sec) on the
> secondary. Then the secondary will notice much earlier that the peer has
> closed the connection and will restart the worker itself:
>=20
> Apr 21 11:52:21 hastb hastd: [storage] (secondary) Unable to receive requ=
est header: Connection reset by peer.
> Apr 21 11:52:21 hastb hastd: [storage] (secondary) Worker process (pid=3D=
1398) exited ungracefully: status=3D19200.

What you are observing here is because currently there is no keep-alive
mechanism in hastd, so if there are no I/O requests, primary won't
notice that secondary is down until next I/O request arrives.
I don't consider it critical, but confusing. Note that when secondary
goes down and there are no I/O requests to primary, there will be
nothing to synchronize, so fast reconnect isn't crucial in this case.
We need one I/O request coming to primary to discover that secondary is
down and start reconnect loop.

Could you confirm this is what you are observing? If not, what FreeBSD
version do you use? If it is HEAD, do you have r205738?

--=20
Pawel Jakub Dawidek                       http://www.wheelsystems.com
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--YToU2i3Vx8H2dn7O
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (FreeBSD)

iEYEARECAAYFAkvRPl0ACgkQForvXbEpPzQ2BwCgz94Ij8Xsz+SdrF+OvT3IOfev
Q8IAoIIFvxvBuCU6/mPquHRcuaiQgChh
=Ab6v
-----END PGP SIGNATURE-----

--YToU2i3Vx8H2dn7O--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20100423062950.GD1670>