From owner-freebsd-hackers@freebsd.org  Tue Nov  1 12:40:38 2016
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 44747C23641
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Tue,  1 Nov 2016 12:40:38 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
 (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id CA1DB171D;
 Tue,  1 Nov 2016 12:40:37 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from tom.home (kib@localhost [127.0.0.1])
 by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id uA1CeUAD063378
 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO);
 Tue, 1 Nov 2016 14:40:31 +0200 (EET)
 (envelope-from kostikbel@gmail.com)
DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua uA1CeUAD063378
Received: (from kostik@localhost)
 by tom.home (8.15.2/8.15.2/Submit) id uA1CeU7q063376;
 Tue, 1 Nov 2016 14:40:30 +0200 (EET)
 (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com
 using -f
Date: Tue, 1 Nov 2016 14:40:30 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Eric Badger <badger@FreeBSD.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: Crashes with 'reboot -d'
Message-ID: <20161101124030.GC54029@kib.kiev.ua>
References: <b2b9dd83-239d-57b9-0f7c-5bbd19b69dde@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <b2b9dd83-239d-57b9-0f7c-5bbd19b69dde@FreeBSD.org>
User-Agent: Mutt/1.7.1 (2016-10-04)
X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00,
 DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no
 autolearn_force=no version=3.4.1
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 01 Nov 2016 12:40:38 -0000

On Mon, Oct 31, 2016 at 05:07:25PM -0500, Eric Badger wrote:
> I've run into crashes when using 'reboot -d' (or a slightly tweaked
> version of it in our FreeBSD spin at work). The problem is that dump
> code is written to run in a panic/crash scenario, when all other CPUs
> are stopped. In the case of 'reboot -d', all other CPUs are not stopped.
> The code in xpt_polled_action runs what would normally be done by the
> interrupt handler, polling start_ccb->ccb_h.status to see when the
> operation has been completed. If the real interrupt handler is still
> running, however, polling start_ccb->ccb_h.status is not sufficient; the
> ccb may be placed in the cam kproc's doneq after start_ccb->ccb_h.status
> has been updated. The dumper will reuse the ccb's memory, but when the
> cam kproc processes that item in its doneq, it will twiddle bits and
> corrupt the now reused ccb memory.
> 
> I fixed this by shutting off other CPUs when doing a dump during reboot
> (patch below). This seems fine, but perhaps heavy handed. I also
> experimented with letting the normal interrupt handler and cam kproc do
> the work when we're not in a SCHEDULER_STOPPED() scenario. This seemed
> to reduce dump performance and make performance less consistent, but
> otherwise worked ok.
> 
> I'd appreciate any comments on things I may have failed to consider. If
> no objections are raised, I will proceed with the patch here.
> 
> Thanks,
> Eric
> 
> diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c
> index 79c4c30..bdc0182 100644
> --- a/sys/kern/kern_shutdown.c
> +++ b/sys/kern/kern_shutdown.c
> @@ -319,8 +319,9 @@ void
>  kern_reboot(int howto)
>  {
>         static int once = 0;
> +#ifdef SMP
> + cpuset_t other_cpus;
> 
> -#if defined(SMP)
>         /*
>          * Bind us to CPU 0 so that all shutdown code runs there.  Some
>          * systems don't shutdown properly (i.e., ACPI power off) if we
> @@ -362,8 +363,28 @@ kern_reboot(int howto)
>          */
>         EVENTHANDLER_INVOKE(shutdown_post_sync, howto);
> 
> -   if ((howto & (RB_HALT|RB_DUMP)) == RB_DUMP && !cold && !dumping)
> + if ((howto & (RB_HALT|RB_DUMP)) == RB_DUMP && !cold && !dumping) {
> +#ifdef SMP
> + /*
> +  * Dump code assumes that all other CPUs have stopped, and thus
> +  * handles disk interrupts manually. This assumption must be enforced,
> +  * as otherwise the real interrupt handler may race with the dumper.
> +  */
> + if (!SCHEDULER_STOPPED()) {
> +         spinlock_enter();
> +
> +         other_cpus = all_cpus;
> +         CPU_CLR(PCPU_GET(cpuid), &other_cpus);
> +         stop_cpus_hard(other_cpus);
> +
> +         curthread->td_stopsched = 1;
> +
> +         /* Module shutdown is no longer safe. */
> +         howto |= RB_NOSYNC;
> + }
> +#endif
>                 doadump(TRUE);
> + }
> 
>         /* Now that we're going to really halt the system... */
>         EVENTHANDLER_INVOKE(shutdown_final, howto);
> 
> 

Such stop heavily relies on the fact that other CPUs do not perform any
cam-related activity in parallel.  It might be not true, e.g. if some
notifications are supplied by an HBA to OS.  Also, other CPUs might
own some resources which are needed by the io subsystem still, e.g.
busdma or vm locks.  This would reduce the probability of working dump.

Could the dumper ccb marked by some flag to prevent doneq from freeing
or modifying the memory and to keep the dumper using it ?