Date:      Thu, 6 Oct 2016 13:40:14 +0300
From:      Slawa Olhovchenkov <slw@zxy.spb.ru>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        Eric van Gyzen <vangyzen@freebsd.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, Gleb Smirnoff <glebius@freebsd.org>, svn-src-head@freebsd.org
Subject:   Re: svn commit: r306346 - head/sys/kern
Message-ID:  <20161006104014.GE6177@zxy.spb.ru>
In-Reply-To: <20161006135042.R2235@besplex.bde.org>
References:  <201609261530.u8QFUUZd020174@repo.freebsd.org> <20161004205600.GN23123@FreeBSD.org> <20161005101932.U984@besplex.bde.org> <20161005204613.GD6177@zxy.spb.ru> <20161006135042.R2235@besplex.bde.org>

On Thu, Oct 06, 2016 at 02:08:46PM +1100, Bruce Evans wrote:

> On Wed, 5 Oct 2016, Slawa Olhovchenkov wrote:
> 
> > On Wed, Oct 05, 2016 at 11:19:10AM +1100, Bruce Evans wrote:
> >
> >> On Tue, 4 Oct 2016, Gleb Smirnoff wrote:
> >>
> >>> On Mon, Sep 26, 2016 at 03:30:30PM +0000, Eric van Gyzen wrote:
> >>> E> ...
> >>> E> Modified: head/sys/kern/kern_mutex.c
> >>> E> ==============================================================================
> >>> E> --- head/sys/kern/kern_mutex.c	Mon Sep 26 15:03:31 2016	(r306345)
> >>> E> +++ head/sys/kern/kern_mutex.c	Mon Sep 26 15:30:30 2016	(r306346)
> >>> E> @@ -924,7 +924,7 @@ __mtx_assert(const volatile uintptr_t *c
> >>> E>  {
> >>> E>  	const struct mtx *m;
> >>> E>
> >>> E> -	if (panicstr != NULL || dumping)
> >>> E> +	if (panicstr != NULL || dumping || SCHEDULER_STOPPED())
> >>> E>  		return;
> >>>
> >>> I wonder if all this disjunct can be reduced just to SCHEDULER_STOPPED()?
> >>> Positive panicstr and dumping imply scheduler stopped.
> >>
> >> 'dumping' doesn't imply SCHEDULER_STOPPED().
> >>
> >> Checking 'dumping' here seems to be just an old bug.  It just breaks
> >> __mtx_assert(), while all other mutex operations work normally for dumping
> >> without panicking.
> >
> > [...]
> >
> > Is this related to 11.0 halting (not rebooting) after ~^B and `panic`?
> 
> There might be related problems, but I don't see any here.
> 
> > What I see on serial console:
> > =====
> > db> panic
> > panic: from debugger
> 
> I wouldn't trust panic from the debugger, but it is safer than dump
> from the debugger (both are ddb commands, but this is another bug).
> 
> > cpuid = 1
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at 0xffffffff8031fadb = db_trace_self_wrapper+0x2b/frame 0xfffffe1f9e198120
> > vpanic() at 0xffffffff804a0302 = vpanic+0x182/frame 0xfffffe1f9e1981a0
> > panic() at 0xffffffff804a0383 = panic+0x43/frame 0xfffffe1f9e198200
> > db_panic() at 0xffffffff8031d987 = db_panic+0x17/frame 0xfffffe1f9e198210
> > db_command() at 0xffffffff8031d019 = db_command+0x299/frame 0xfffffe1f9e1982e0
> > db_command_loop() at 0xffffffff8031cd74 = db_command_loop+0x64/frame 0xfffffe1f9e1982f0
> > db_trap() at 0xffffffff8031fc1b = db_trap+0xdb/frame 0xfffffe1f9e198380
> > kdb_trap() at 0xffffffff804dd8c3 = kdb_trap+0x193/frame 0xfffffe1f9e198410
> > trap() at 0xffffffff806e3065 = trap+0x255/frame 0xfffffe1f9e198620
> > calltrap() at 0xffffffff806cafd1 = calltrap+0x8/frame 0xfffffe1f9e198620
> > --- trap 0x3, rip = 0xffffffff804dd11e, rsp = 0xfffffe1f9e1986f0, rbp = 0xfffffe1f9e198710 ---
> > kdb_alt_break_internal() at 0xffffffff804dd11e = kdb_alt_break_internal+0x18e/frame 0xfffffe1f9e198710
> > kdb_alt_break() at 0xffffffff804dcf8b = kdb_alt_break+0xb/frame 0xfffffe1f9e198720
> > uart_intr_rxready() at 0xffffffff803e38a8 = uart_intr_rxready+0x98/frame 0xfffffe1f9e198750
> > uart_intr() at 0xffffffff803e4621 = uart_intr+0x121/frame 0xfffffe1f9e198790
> > intr_event_handle() at 0xffffffff8046c74b = intr_event_handle+0x9b/frame 0xfffffe1f9e1987e0
> > intr_execute_handlers() at 0xffffffff8076d2d8 = intr_execute_handlers+0x48/frame 0xfffffe1f9e198810
> > lapic_handle_intr() at 0xffffffff8077163f = lapic_handle_intr+0x3f/frame 0xfffffe1f9e198830
> > Xapic_isr1() at 0xffffffff806cb6b7 = Xapic_isr1+0xb7/frame 0xfffffe1f9e198830
> > --- interrupt, rip = 0xffffffff8032fedf, rsp = 0xfffffe1f9e198900, rbp = 0xfffffe1f9e198940 ---
> > acpi_cpu_idle() at 0xffffffff8032fedf = acpi_cpu_idle+0x2af/frame 0xfffffe1f9e198940
> > cpu_idle_acpi() at 0xffffffff8076ad1f = cpu_idle_acpi+0x3f/frame 0xfffffe1f9e198960
> > cpu_idle() at 0xffffffff8076adc5 = cpu_idle+0x95/frame 0xfffffe1f9e198980
> > sched_idletd() at 0xffffffff804cbbe5 = sched_idletd+0x495/frame 0xfffffe1f9e198a70
> > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e198ab0
> > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame 0xfffffe1f9e198ab0
> > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> This looks like a normal kdb entry then a not so normal panic from ddb,
> but no problems.

Yes, I just captured all console output after the `panic` command.

> > Uptime: 1d4h53m19s
> > Dumping 12148 out of 131020 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> > Dump complete
> > mps2: Sending StopUnit: path (xpt0:mps2:0:14:ffffffff):  handle 12
> > mps2: Incrementing SSU count
> > mps2: Sending StopUnit: path (xpt0:mps2:0:18:ffffffff):  handle 9
> > mps2: Incrementing SSU count
> > =====
> >
> > This is a normal reboot (by /sbin/reboot):
> 
> Is the above just a hung dump from reboot, before going near ddb?  That
> case should work, but perhaps it needs to be more careful about waiting
> for the other CPUs.  Just stopping them is no good since it gives an
> even more fragile environment, like panicking or entering ddb.

The above is an attempt to collect a dump and reboot from KDB.
Similar output appears after an INVARIANTS panic:

====
panic: tcp_detach: INP_TIMEWAIT && INP_DROPPED && tp != NULL
cpuid = 4
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff8032467b = db_trace_self_wrapper+0x2b/frame 0xfffffe1f9e1f8730
vpanic() at 0xffffffff804b5672 = vpanic+0x182/frame 0xfffffe1f9e1f87b0
kassert_panic() at 0xffffffff804b54e6 = kassert_panic+0x126/frame 0xfffffe1f9e1f8820
tcp_usr_detach() at 0xffffffff806564dc = tcp_usr_detach+0x1bc/frame 0xfffffe1f9e1f8850
sofree() at 0xffffffff8053de66 = sofree+0x1a6/frame 0xfffffe1f9e1f8880
tcp_close() at 0xffffffff8064dd8e = tcp_close+0x11e/frame 0xfffffe1f9e1f88b0
tcp_timer_2msl() at 0xffffffff80653c28 = tcp_timer_2msl+0x278/frame 0xfffffe1f9e1f88e0
softclock_call_cc() at 0xffffffff804cbacc = softclock_call_cc+0x19c/frame 0xfffffe1f9e1f89c0
softclock() at 0xffffffff804cbec7 = softclock+0x47/frame 0xfffffe1f9e1f89e0
intr_event_execute_handlers() at 0xffffffff8047aa86 = intr_event_execute_handlers+0x96/frame 0xfffffe1f9e1f8a20
ithread_loop() at 0xffffffff8047b106 = ithread_loop+0xa6/frame 0xfffffe1f9e1f8a70
fork_exit() at 0xffffffff804781b4 = fork_exit+0x84/frame 0xfffffe1f9e1f8ab0
fork_trampoline() at 0xffffffff80713fce = fork_trampoline+0xe/frame 0xfffffe1f9e1f8ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Uptime: 54m39s
Dumping 7780 out of 131019 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
Dump complete
mps2: Sending StopUnit: path (xpt0:mps2:0:14:ffffffff):  handle 12
mps2: Incrementing SSU count
mps2: Sending StopUnit: path (xpt0:mps2:0:18:ffffffff):  handle 9
mps2: Incrementing SSU count
====

After this output the machine does not reboot and needs a power reset.
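
The visible difference between the hung logs and the normal reboot log
is that the hung ones show only "Incrementing SSU count" lines, while
the normal one also shows the matching "Decrementing SSU count" lines.
A rough sketch of how a captured console log could be checked for this
follows. The names ssu_tally, ssu_count() and ssu_outstanding() are
hypothetical helpers made up for illustration; they are not part of the
mps driver, and the sketch only counts the message strings that appear
in the logs quoted in this mail.

```c
#include <string.h>

/*
 * Hypothetical helper, not part of the mps driver: tallies the SSU
 * messages found in a captured console log.
 */
struct ssu_tally {
	int incremented;	/* "Incrementing SSU count" lines */
	int decremented;	/* "Decrementing SSU count" lines */
};

/* Count occurrences of both messages in a NUL-terminated log buffer. */
static struct ssu_tally
ssu_count(const char *log)
{
	struct ssu_tally t = { 0, 0 };
	const char *p;

	for (p = log; (p = strstr(p, "Incrementing SSU count")) != NULL; p++)
		t.incremented++;
	for (p = log; (p = strstr(p, "Decrementing SSU count")) != NULL; p++)
		t.decremented++;
	return (t);
}

/* Number of StopUnit requests that never logged a completion. */
static int
ssu_outstanding(const char *log)
{
	struct ssu_tally t = ssu_count(log);

	return (t.incremented - t.decremented);
}
```

A non-zero result from ssu_outstanding() only says that the log ends
with StopUnit requests still pending; it does not say why they never
completed.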

> >
> > ===
> > Sending StopUnit: path (xpt0:mps2:0:14:ffffffff):  handle 13
> > mps2: Incrementing SSU count
> > mps2: Sending StopUnit: path (xpt0:mps2:0:18:ffffffff):  handle 9
> > mps2: Incrementing SSU count
> > mps2: Decrementing SSU count.
> > mps2: Completing stop unit for (xpt0:mps2:0:18:ffffffff):
> > mps2: Decrementing SSU count.
> > mps2: Completing stop unit for (xpt0:mps2:0:14:ffffffff):
> > ===
> >
> > ====
> > mps2: lagg0: link state changed to DOWN
> > Sending StopUnit: path (xpt0:mps2:0:14:ffffffff):  handle 12
> > mps2: Incrementing SSU count
> > mps2: Sending StopUnit: path (xpt0:mps2:0:18:ffffffff):  handle 9
> > mps2: Incrementing SSU count
> > mps2: Decrementing SSU count.
> > mps2: Completing stop unit for (xpt0:mps2:0:18:ffffffff):
> > mps2: Decrementing SSU count.
> > mps2: Completing stop unit for (xpt0:mps2:0:14:ffffffff):
> > ====
> 
> Bruce


