Date:      Mon, 11 Oct 2010 23:11:00 +0300
From:      Mikolaj Golub <to.my.trociny@gmail.com>
To:        "Michael W. Lucas" <mwlucas@blackhelicopters.org>
Cc:        fs@freebsd.org
Subject:   Re: hast crash
Message-ID:  <868w24cwx7.fsf@kopusha.home.net>
In-Reply-To: <20101011153051.GA15699@bewilderbeast.blackhelicopters.org> (Michael W. Lucas's message of "Mon, 11 Oct 2010 11:30:51 -0400")
References:  <20101011153051.GA15699@bewilderbeast.blackhelicopters.org>

On Mon, 11 Oct 2010 11:30:51 -0400 Michael W. Lucas wrote:

 MWL> Hi,

 MWL> I upgraded my HAST cluster to 8.1-stable on 6 October 2010, and am now
 MWL> experiencing crashes in hastd.  hastd debug output is showing:

 MWL> ...
 MWL> [DEBUG][2] [mirror] (secondary) recv: (0x8013ecc40) Got request header: WRITE(11752701952, 131072).
 MWL> [DEBUG][2] [mirror] (secondary) recv: (0x8013ecc40) Moving request to the disk queue.
 MWL> [DEBUG][2] [mirror] (secondary) disk: (0x8013ecc40) Got request: WRITE(11752701952, 131072).
 MWL> [DEBUG][2] [mirror] (secondary) recv: Taking free request.
 MWL> [DEBUG][2] [mirror] (secondary) recv: (0x8013ecbf0) Got request.
 MWL> [ERROR] [mirror] (secondary) Unable to receive request header: RPC version wrong.
 MWL> [DEBUG][1] Unable to receive event header: Socket is not connected.
 MWL> [DEBUG][1] Accepting connection to tcp4://0.0.0.0:8457.
 MWL> [INFO] Connection from tcp4://192.168.0.1:21493 to tcp4://192.168.0.2:8457.
 MWL> [DEBUG][2] tcp4://192.168.0.1:21493: resource=mirror
 MWL> [DEBUG][1] [mirror] (secondary) Initial connection from tcp4://192.168.0.1:21493.
 MWL> [DEBUG][1] [mirror] (secondary) Worker process exists (pid=8826), stopping it.
 MWL> [ERROR] [mirror] (secondary) Worker process exited ungracefully (pid=8826, exitcode=75).
 MWL> Assertion failed: (conn != NULL), function proto_close, file /usr/src/sbin/hastd/proto.c, line 287.

This assertion failure has been fixed in r213579.

 MWL> Abort (core dumped)

 MWL> Both machines are running on VMWare ESXi.  The second machine is a
 MWL> clone of the first.

 MWL> Any thoughts, folks?

I would recommend upgrading sbin/hastd to current, or waiting a couple of days
for the MFC :-).

-- 
Mikolaj Golub


