Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 22 Mar 2018 08:57:47 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Bruce Evans <brde@optusnet.com.au>, Warner Losh <imp@freebsd.org>,  src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r331298 - head/sys/dev/syscons
Message-ID:  <CANCZdfpYApi85_%2BhucGy0hZibZ8enSSbNeg5gS2Zz5_%2BFw0K_g@mail.gmail.com>
In-Reply-To: <20180322114213.GR76926@kib.kiev.ua>
References:  <201803211447.w2LElDcK091988@repo.freebsd.org> <20180322024846.S4293@besplex.bde.org> <20180321202752.GO76926@kib.kiev.ua> <CANCZdfrY9PQ-FUApReGeFqwH%2BdoSUN5AtvF0ag1rD09sKYq6gg@mail.gmail.com> <20180322174025.Q1053@besplex.bde.org> <20180322114213.GR76926@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, Mar 22, 2018 at 5:42 AM, Konstantin Belousov <kostikbel@gmail.com>
wrote:

> On Thu, Mar 22, 2018 at 05:50:57PM +1100, Bruce Evans wrote:
> > On Wed, 21 Mar 2018, Warner Losh wrote:
> >
> > > On Wed, Mar 21, 2018 at 2:27 PM, Konstantin Belousov <
> kostikbel@gmail.com>
> > > wrote:
> > >> ...
> > >> Are you saying that fast interrupt handlers call shutdown_nice() ?
> This
> > >> is the quite serious bug on its own.  To fix it, shutdown_nice()
> should
> > >> use a fast taskqueue to schedule the task which would lock the process
> > >> and send the signal.
> > >
> > > Is there some way we know we're in a fast interrupt handler? If so, it
> > > should be simple to fix. If not, then there's an API change ahead of
> us...
> >
> > There is a td_intr_nesting_level flag that might work.  (I invented this,
> > but don't like its current use.)
> But why do we need to know this ?  We can always schedule a task in the
> fast taskqueue if init is present and can be scheduled.


Ah, good point. Seems like a poster child example of doing a fast task
queue deferment. That's always safe, except maybe in BDE's fast interrupt
world.

> > But bde is right: the system has to be in good enough shape to cope. I
> > > wonder if we should put that coping into kdb_reboot() instead. It's
> only an
> > > issue for <CR> TILDE ^R, which is a fairly edge case.
> >
> > shutdown_nice() is also called directly from syscons and vt for the
> reboot,
> > halt and poweroff keys.  This happens in normal interrupt handler
> context,
> > so there is only a problem when init is not running.
>
> diff --git a/sys/kern/kern_shutdown.c b/sys/kern/kern_shutdown.c
> index e5ea9644ad3..e7c6d4c92b2 100644
> --- a/sys/kern/kern_shutdown.c
> +++ b/sys/kern/kern_shutdown.c
> @@ -72,6 +72,7 @@ __FBSDID("$FreeBSD$");
>  #include <sys/smp.h>
>  #include <sys/sysctl.h>
>  #include <sys/sysproto.h>
> +#include <sys/taskqueue.h>
>  #include <sys/vnode.h>
>  #include <sys/watchdog.h>
>
> @@ -276,6 +277,28 @@ sys_reboot(struct thread *td, struct reboot_args *uap)
>         return (error);
>  }
>
> +static void
> +shutdown_nice_task_fn(void *arg, int pending __unused)
> +{
> +       int howto;
> +
> +       howto = (uintptr_t)arg;
> +       /* Send a signal to init(8) and have it shutdown the world. */
> +       PROC_LOCK(initproc);
> +       if (howto & RB_POWEROFF)
> +               kern_psignal(initproc, SIGUSR2);
> +       else if (howto & RB_POWERCYCLE)
> +               kern_psignal(initproc, SIGWINCH);
> +       else if (howto & RB_HALT)
> +               kern_psignal(initproc, SIGUSR1);
> +       else
> +               kern_psignal(initproc, SIGINT);
> +       PROC_UNLOCK(initproc);
> +}
> +
> +static struct task shutdown_nice_task = TASK_INITIALIZER(0,
> +    &shutdown_nice_task_fn, NULL);
> +
>  /*
>   * Called by events that want to shut down.. e.g  <CTL><ALT><DEL> on a PC
>   */
> @@ -283,20 +306,14 @@ void
>  shutdown_nice(int howto)
>  {
>
> -       if (initproc != NULL) {
> -               /* Send a signal to init(8) and have it shutdown the
> world. */
> -               PROC_LOCK(initproc);
> -               if (howto & RB_POWEROFF)
> -                       kern_psignal(initproc, SIGUSR2);
> -               else if (howto & RB_POWERCYCLE)
> -                       kern_psignal(initproc, SIGWINCH);
> -               else if (howto & RB_HALT)
> -                       kern_psignal(initproc, SIGUSR1);
> -               else
> -                       kern_psignal(initproc, SIGINT);
> -               PROC_UNLOCK(initproc);
> +       if (initproc != NULL && !SCHEDULER_STOPPED()) {
> +               shutdown_nice_task.ta_context = (void *)(uintptr_t)howto;
> +               taskqueue_enqueue(taskqueue_fast, &shutdown_nice_task);
>         } else {
> -               /* No init(8) running, so simply reboot. */
> +               /*
> +                * No init(8) running, or scheduler would not allow it
> +                * to run, so simply reboot.
> +                */
>                 kern_reboot(howto | RB_NOSYNC);
>         }
>  }
>

I like this elegance. I think we should commit this. While it would be nice
to support the super-fast interrupts that bde has done, we already don't
support it in a number of places. Conceptually, I like the notion of the
tilde escape machine in the tty layer, but I don't like the trade offs.
Here we only defer to reboot, which requires a fair amount of machinery
working to work right. it's not the dependencies I'd prefer, but usually it
doesn't matter. However, to do that, it would also mean that breaking to
the debugger would need to be done in the tty layer, which requires more of
the context switching mechanisms to be working, which would be a bigger
loss. So this is a good trade-off.

If kern_reboot() can't be called from a fast interrupt handler in bdeBSD,
it would be a simple matter to put a wrapper around it here so his kernel
can context switch. Though, if the scheduler isn't running, I'm not sure
what you can do in this situation. The system is already hung or in a
really bad way, so it would have to jump to somewhere late in the
kern_reboot() code, though I'm not sure. It's kinda hard to know what's
safe just from descriptions from bde. We're already not synching the data
to the disks, so I'm not sure what more can be done.

I think we should commit this change. It makes things better, but

Warner



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfpYApi85_%2BhucGy0hZibZ8enSSbNeg5gS2Zz5_%2BFw0K_g>