Date:      Sun, 18 Feb 2018 23:24:51 +0100 (CET)
From:      Trond Endrestøl <Trond.Endrestol@fagskolen.gjovik.no>
To:        FreeBSD Current <freebsd-current@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: amd64 head -r329465 (non-debug build, but with symbols): "panic: spin lock held too long" during make check-old, reported during a sys_vfork
Message-ID:  <alpine.BSF.2.21.1802182320440.24158@mail.fig.ol.no>
In-Reply-To: <CAGudoHHsHGZYo3Ke2MGhk517oti6gEz0Zk0-jSmXVbbk9Vx83g@mail.gmail.com>
References:  <DA76F62D-3373-47CA-AD95-DE9BA580772B@yahoo.com> <6907E068-C80A-44B8-A8AD-3EF27D52D127@yahoo.com> <20832C61-AA5D-41A6-8BF9-90CC87D17219@yahoo.com> <CAGudoHH5Yz6312QSADiOVx9kd17=WcatEB3qyNjTa5qh_hXASg@mail.gmail.com> <alpine.BSF.2.21.1802182023090.24158@mail.fig.ol.no> <6D47FEC0-7991-4F76-AC31-2CC1E8934521@yahoo.com> <alpine.BSF.2.21.1802182134140.24158@mail.fig.ol.no> <CAGudoHHsHGZYo3Ke2MGhk517oti6gEz0Zk0-jSmXVbbk9Vx83g@mail.gmail.com>

On Sun, 18 Feb 2018 22:33+0100, Mateusz Guzik wrote:

> On Sun, Feb 18, 2018 at 9:38 PM, Trond Endrestøl <
> Trond.Endrestol@fagskolen.gjovik.no> wrote:
> 
> > On Sun, 18 Feb 2018 11:51-0800, Mark Millard wrote:
> >
> > > Note: -r329448 was reverted in -r329461 : racy.
> >
> > True. I got a crash when compiling r329451 while running r329449.
> > I've now booted the r329422 ZFS BE and I'm attempting to build
> > r329529.
> >
> 
> Looking around strongly suggests r329448 is the culprit. If you can verify
> 329447 works fine we are mostly done here.

I noticed no errors in r329447. When r329529 is built and installed, 
I'll try to incrementally build and install r329531.

> Note the revision got reverted and different variant got in in r329531.
> 
> That said, if r329447 works then the issue should be already fixed and in
> particular fresh head should work fine.

-- 
Trond.

Date:      Sun, 18 Feb 2018 23:25:13 +0100
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Mark Millard <marklmi26-fbsd@yahoo.com>
Cc:        Trond Endrestøl <Trond.Endrestol@fagskolen.gjovik.no>, FreeBSD Hackers <freebsd-hackers@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: amd64 head -r329465 (non-debug build, but with symbols): "panic: spin lock held too long" during make check-old, reported during a sys_vfork
Message-ID:  <CAGudoHFkBySZwA__8Zs5o7AtOr-qMCYTE7PNfbjjH6WsrJYt3w@mail.gmail.com>
In-Reply-To: <24563F96-B1A3-48E6-ABE3-D77E0887FFEE@yahoo.com>
References:  <DA76F62D-3373-47CA-AD95-DE9BA580772B@yahoo.com> <6907E068-C80A-44B8-A8AD-3EF27D52D127@yahoo.com> <20832C61-AA5D-41A6-8BF9-90CC87D17219@yahoo.com> <CAGudoHH5Yz6312QSADiOVx9kd17=WcatEB3qyNjTa5qh_hXASg@mail.gmail.com> <alpine.BSF.2.21.1802182023090.24158@mail.fig.ol.no> <6D47FEC0-7991-4F76-AC31-2CC1E8934521@yahoo.com> <alpine.BSF.2.21.1802182134140.24158@mail.fig.ol.no> <CAGudoHHsHGZYo3Ke2MGhk517oti6gEz0Zk0-jSmXVbbk9Vx83g@mail.gmail.com> <E9A951DF-914D-4707-8A76-32E627F13EC0@yahoo.com> <24563F96-B1A3-48E6-ABE3-D77E0887FFEE@yahoo.com>

On Sun, Feb 18, 2018 at 10:50 PM, Mark Millard <marklmi26-fbsd@yahoo.com>
wrote:

>
>
> On 2018-Feb-18, at 1:46 PM, Mark Millard <marklmi26-fbsd@yahoo.com> wrote:
>
> > On 2018-Feb-18, at 1:33 PM, Mateusz Guzik <mjguzik@gmail.com> wrote:
> >
> >> On Sun, Feb 18, 2018 at 9:38 PM, Trond Endrestøl <
> >> Trond.Endrestol@fagskolen.gjovik.no> wrote:
> >>
> >>> On Sun, 18 Feb 2018 11:51-0800, Mark Millard wrote:
> >>>
> >>>> Note: -r329448 was reverted in -r329461 : racy.
> >>>
> >>> True. I got a crash when compiling r329451 while running r329449.
> >>> I've now booted the r329422 ZFS BE and I'm attempting to build
> >>> r329529.
> >>>
> >>
> >> Looking around strongly suggests r329448 is the culprit. If you can verify
> >> 329447 works fine we are mostly done here.
> >>
> >> Note the revision got reverted and different variant got in in r329531.
> >>
> >> That said, if r329447 works then the issue should be already fixed and in
> >> particular fresh head should work fine.
> >
> > My initial problem was with -r329465, which is after -r329461 reverted
> > -r329488 . Trond reported in one note that he had problems with
> > -r329464 , also after -r329488 was reverted. Trond has also reported
> > -r329449 failed.
>
> Dumb typos above: I meant -r329448 instead of -r329488 both times.
>
>
Ok, I think I see the bug:

exit1 does:
        PROC_SLOCK(p);
        p->p_state = PRS_ZOMBIE;
        /* work continues */

pre-patch proc_to_reap does an equivalent of:
        if (p->p_state == PRS_ZOMBIE) {
                PROC_SLOCK(p);
                PROC_SUNLOCK(p);
                .... reap;
        }

It is possible the exiting thread will be caught just after setting the
state to PRS_ZOMBIE, while it still holds the proc spin lock and has
more exit work to do.

With the slock/sunlock cycle we guarantee the reaping thread will
wait for it to finish.

Without the cycle we can end up reaping the still exiting thread.
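
To make the role of the lock/unlock cycle concrete, here is a minimal
user-space sketch. It is only an analogy, not the kernel code: a pthread
mutex stands in for PROC_SLOCK/PROC_SUNLOCK, and struct fake_proc, the
exit_done flag, exiting_thread(), reap() and run_once() are hypothetical
names made up for illustration. The exiting side sets the zombie state
early while holding the lock and finishes its work later; the reaping
side optionally performs the empty lock/unlock cycle before it proceeds.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

enum { PRS_NORMAL, PRS_ZOMBIE };

/* Hypothetical stand-in for the kernel's struct proc. */
struct fake_proc {
    pthread_mutex_t slock;     /* plays the role of the proc spin lock */
    _Atomic int     p_state;
    atomic_bool     exit_done; /* set once the remaining exit work is finished */
};

/* Analogue of exit1: mark the zombie state early, keep working under the lock. */
static void *exiting_thread(void *arg)
{
    struct fake_proc *p = arg;
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 20 * 1000 * 1000 };

    pthread_mutex_lock(&p->slock);
    atomic_store(&p->p_state, PRS_ZOMBIE);
    nanosleep(&ts, NULL);                 /* "work continues" */
    atomic_store(&p->exit_done, true);
    pthread_mutex_unlock(&p->slock);
    return NULL;
}

/*
 * Analogue of proc_to_reap: wait until the state reads PRS_ZOMBIE, then
 * optionally do the empty lock/unlock cycle. Returns whether the exiting
 * thread had really finished at the moment we would start reaping.
 */
static bool reap(struct fake_proc *p, bool use_lock_cycle)
{
    while (atomic_load(&p->p_state) != PRS_ZOMBIE)
        sched_yield();

    if (use_lock_cycle) {
        /* Blocks until the exiting thread drops the lock. */
        pthread_mutex_lock(&p->slock);
        pthread_mutex_unlock(&p->slock);
    }
    return atomic_load(&p->exit_done);
}

/* Run one exit/reap pair; true means the reap happened after exit finished. */
static bool run_once(bool use_lock_cycle)
{
    struct fake_proc p;
    pthread_t t;
    int rc;
    bool safe;

    pthread_mutex_init(&p.slock, NULL);
    atomic_init(&p.p_state, PRS_NORMAL);
    atomic_init(&p.exit_done, false);

    rc = pthread_create(&t, NULL, exiting_thread, &p);
    assert(rc == 0);
    (void)rc;

    safe = reap(&p, use_lock_cycle);
    pthread_join(t, NULL);
    pthread_mutex_destroy(&p.slock);
    return safe;
}
```

With use_lock_cycle set, run_once() always returns true, because the
state can only be observed as PRS_ZOMBIE while or after the exiting
thread holds the lock, so the reaper's lock attempt cannot succeed until
the exit work is done. With it cleared, the result depends on timing and
the reaper will usually see the exit still in progress.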

I'll fix it soon(tm).

-- 
Mateusz Guzik <mjguzik gmail.com>


