Date:      Fri, 16 Nov 2018 17:56:54 +0100
From:      Sylvain GALLIANO <sg@efficientip.com>
To:        markj@freebsd.org
Cc:        freebsd-current@freebsd.org
Subject:   Re: Panic on kern_event.c
Message-ID:  <CAHdyrku4OPRr1Ku0WF3XT3vK_dqvNzWN%2BMYz7pXTkiNJakfJGQ@mail.gmail.com>
In-Reply-To: <20181116154210.GB17379@raichu>
References:  <CAHdyrkvqGp8PGFaCSGgeDFC7wBhjnHK4eL99WM5fMO_yZ_u5KA@mail.gmail.com> <20181107043503.GB30861@raichu> <CAHdyrkt42cn8%2BKqhp-jQ9iZNnreypMT1qybNTcFtx8JivKggZA@mail.gmail.com> <20181115221019.GA2514@raichu> <CAHdyrksHLvzXDkjoy2PpiTgb%2BmEKHJ979rwcW3RJx32qdAyJzg@mail.gmail.com> <20181116154210.GB17379@raichu>

On Fri, 16 Nov 2018 at 16:42, Mark Johnston <markj@freebsd.org> wrote:

> On Fri, Nov 16, 2018 at 03:47:39PM +0100, Sylvain GALLIANO wrote:
> > On Thu, 15 Nov 2018 at 23:10, Mark Johnston <markj@freebsd.org> wrote:
> >
> > > On Thu, Nov 08, 2018 at 05:05:03PM +0100, Sylvain GALLIANO wrote:
> > > > Hi,
> > > >
> > > > I replaced
> > > > << printf("XXX knote %p already in tailq  status:%x kq_count:%d  [%p %p]  %u\n",kn,kn->kn_status,kq->kq_count,kn->kn_tqe.tqe_next,kn->kn_tqe.tqe_prev,__LINE__);
> > > > with
> > > > >> panic("XXX knote %p already in tailq  status:%x kq_count:%d  [%p %p]  %u\n",kn,kn->kn_status,kq->kq_count,kn->kn_tqe.tqe_next,kn->kn_tqe.tqe_prev,__LINE__);
> > > >
> > > > Here is the stack during panic:
> > > > panic: XXX knote 0xfffff801e1c6ddc0 already in tailq  status:1 kq_count:2  [0 0xfffff8000957a978]  2671
> > > >
> > > Could you please give the following patch a try?
> > >
> > > If possible, could you also ktrace one of the active syslog-ng
> > > processes for some time, perhaps 15 seconds, and share the kdump?
> > > I have been trying to reproduce the problem without any luck.
> > >
> > Unfortunately, the patched kernel is not stable:
> > - some processes run at 100% CPU (STOP state) and cannot be killed
> > - sometimes the system freezes completely (a hard reboot is needed)
> >
> > I cannot reproduce the issue while syslog-ng is under ktrace (even
> > after 10 GB of ktrace output).
> > When I stop ktrace, the issue comes back after a few minutes.
>
> That's ok, I'd like to see part of the ktrace even if the problem
> doesn't occur; this bug appears to be a race condition, so it's not
> surprising that ktrace might hide it.
>

I got lucky with ktrace this time; the issue occurred twice:

Nov 16 16:13:29 solid kernel: XXX knote 0xfffff8003282fb40 already in tailq  status:1 kq_count:1  [0 0xfffff80032883138]  2671
Nov 16 16:14:39 solid kernel: XXX knote 0xfffff8003282f3c0 already in tailq  status:1 kq_count:1  [0 0xfffff80032883138]  2671

ktrace.out.xz is available at:
https://drive.google.com/drive/folders/1MbqJQm12-KOYDbb4-9uNRTnAdsNqLaIP?usp=sharing


