From owner-freebsd-arch@FreeBSD.ORG Fri Dec 7 14:25:37 2012 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8A524B49; Fri, 7 Dec 2012 14:25:37 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-we0-f182.google.com (mail-we0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id DFE698FC08; Fri, 7 Dec 2012 14:25:36 +0000 (UTC) Received: by mail-we0-f182.google.com with SMTP id u54so282495wey.13 for ; Fri, 07 Dec 2012 06:25:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=snEVFON7T9m19p2rfo6bveHgW0MIv5AtdLqJAaPDKSc=; b=lxw1PXrjhI+9eEepErl6WqvG/JavM3kvPo8VskzhKJqyLI0xn4QZOFoOYxNne1NMNH OcSkLcOFgmNFkesicIstqb6uuTjsG7Yh4ynxSN2PlGt9HgsAVfWjn9unQa7jDMCzJ6Ns oBb/h8XqYXAXpWyitpDFaf/4RcFM0TcDQTmwUkc4MMiXUj5Oz6cGwnyd3G414Pj4l/py e3eUj/RurC+yuwwr5LGdTFrOEnkc67TL71VvHbKRtP0rFlH/Cpostpw7+j45D1kmuHhg SjMXLLusSAHDnVTU3S3psCA/2NYtM5WSoGQgDAyGkcGdkXvf1IlFAD4b6mD+Huy+WaqA /jaA== MIME-Version: 1.0 Received: by 10.180.104.69 with SMTP id gc5mr15308785wib.13.1354890335533; Fri, 07 Dec 2012 06:25:35 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.217.57.9 with HTTP; Fri, 7 Dec 2012 06:25:35 -0800 (PST) In-Reply-To: <201212070825.qB78P9rY055786@svn.freebsd.org> References: <201212070825.qB78P9rY055786@svn.freebsd.org> Date: Fri, 7 Dec 2012 06:25:35 -0800 X-Google-Sender-Auth: 1XeWfSWxS9_Af2C4V2xD5orcuGg Message-ID: Subject: Re: svn commit: r243980 - in head/sys: kern sys From: Adrian Chadd To: Alfred Perlstein Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Dec 2012 14:25:37 -0000 Ok, so can we now please split 'panic during debugging but we do handle the case' and 'panic during live runs as it really does mean we're hosed' assert checks, so coders can choose which is appropriate? Or does that now exist? Adrian On 7 December 2012 00:25, Alfred Perlstein wrote: > Author: alfred > Date: Fri Dec 7 08:25:08 2012 > New Revision: 243980 > URL: http://svnweb.freebsd.org/changeset/base/243980 > > Log: > Allow KASSERT to log instead of panic. > > This is to allow debug images to be used without taking down the > system when non-fatal asserts are hit. > > The following sysctls are added: > > debug.kassert.warn_only: 1 = log, 0 = panic > > debug.kassert.do_ktr: set to a ktr mask for logging via KTR > > debug.kassert.do_log: 1 = log, 0 = quiet > > debug.kassert.warnings: stats, number of kasserts hit > > debug.kassert.log_panic_at: > number of kasserts before we actually panic, 0 = never > > debug.kassert.log_pps_limit: pps limit for log messages > > debug.kassert.log_mute_at: stop warning after N kasserts, 0 = never stop > > debug.kassert.kassert: set this sysctl to trigger a kassert > > Discussed with: scottl, gnn, marcel > Sponsored by: iXsystems > > Modified: > head/sys/kern/kern_shutdown.c > head/sys/sys/systm.h > > Modified: head/sys/kern/kern_shutdown.c > ============================================================================== > --- head/sys/kern/kern_shutdown.c Fri Dec 7 07:24:15 2012 (r243979) > +++ head/sys/kern/kern_shutdown.c Fri Dec 7 08:25:08 2012 (r243980) > @@ -55,6 +55,7 @@ __FBSDID("$FreeBSD$"); > #include > #include > #include > +#include > #include > #include > #include > @@ -150,6 +151,7 @@ static void poweroff_wait(void *, int); > static void shutdown_halt(void *junk, int howto); > static void shutdown_panic(void *junk, int howto); > static void shutdown_reset(void *junk, int howto); > +static void vpanic(const char *fmt, va_list ap) __dead2; > > /* register various local shutdown events */ > static void > @@ -538,6 +540,120 @@ shutdown_reset(void *junk, int howto) > /* NOTREACHED */ /* assuming reset worked */ > } > > +#ifdef INVARIANTS > +static int kassert_warn_only = 0; > +#ifdef KTR > +static int kassert_do_ktr = 0; > +#endif > +static int kassert_do_log = 1; > +static int kassert_log_pps_limit = 4; > +static int kassert_log_mute_at = 0; > +static int kassert_log_panic_at = 0; > +static int kassert_warnings = 0; > + > +SYSCTL_NODE(_debug, OID_AUTO, kassert, CTLFLAG_RW, NULL, "kassert options"); > + > +SYSCTL_INT(_debug_kassert, OID_AUTO, warn_only, CTLFLAG_RW | CTLFLAG_TUN, > + &kassert_warn_only, 0, > + "KASSERT triggers a panic (1) or just a warning (0)"); > +TUNABLE_INT("debug.kassert.warn_only", &kassert_warn_only); > + > +#ifdef KTR > +SYSCTL_UINT(_debug_kassert, OID_AUTO, do_ktr, CTLFLAG_RW | CTLFLAG_TUN, > + &kassert_do_ktr, 0, > + "KASSERT does a KTR, set this to the KTRMASK you want"); > +TUNABLE_INT("debug.kassert.do_ktr", &kassert_do_ktr); > +#endif > + > +SYSCTL_INT(_debug_kassert, OID_AUTO, do_log, CTLFLAG_RW | CTLFLAG_TUN, > + &kassert_do_log, 0, "KASSERT triggers a panic (1) or just a warning (0)"); > +TUNABLE_INT("debug.kassert.do_log", &kassert_do_log); > + > +SYSCTL_INT(_debug_kassert, OID_AUTO, warnings, CTLFLAG_RW | CTLFLAG_TUN, > + &kassert_warnings, 0, "number of KASSERTs that have been triggered"); > +TUNABLE_INT("debug.kassert.warnings", &kassert_warnings); > + > +SYSCTL_INT(_debug_kassert, OID_AUTO, log_panic_at, CTLFLAG_RW | CTLFLAG_TUN, > + &kassert_log_panic_at, 0, "max number of KASSERTS before we will panic"); > +TUNABLE_INT("debug.kassert.log_panic_at", &kassert_log_panic_at); > + > +SYSCTL_INT(_debug_kassert, OID_AUTO, log_pps_limit, CTLFLAG_RW | CTLFLAG_TUN, > + &kassert_log_pps_limit, 0, "limit number of log messages per second"); > +TUNABLE_INT("debug.kassert.log_pps_limit", &kassert_log_pps_limit); > + > +SYSCTL_INT(_debug_kassert, OID_AUTO, log_mute_at, CTLFLAG_RW | CTLFLAG_TUN, > + &kassert_log_mute_at, 0, "max number of KASSERTS to log"); > +TUNABLE_INT("debug.kassert.log_mute_at", &kassert_log_mute_at); > + > +static int kassert_sysctl_kassert(SYSCTL_HANDLER_ARGS); > + > +SYSCTL_PROC(_debug_kassert, OID_AUTO, kassert, > + CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_SECURE, NULL, 0, > + kassert_sysctl_kassert, "I", "set to trigger a test kassert"); > + > +static int > +kassert_sysctl_kassert(SYSCTL_HANDLER_ARGS) > +{ > + int error, i; > + > + error = sysctl_wire_old_buffer(req, sizeof(int)); > + if (error == 0) { > + i = 0; > + error = sysctl_handle_int(oidp, &i, 0, req); > + } > + if (error != 0 || req->newptr == NULL) > + return (error); > + KASSERT(0, ("kassert_sysctl_kassert triggered kassert %d", i)); > + return (0); > +} > + > +/* > + * Called by KASSERT, this decides if we will panic > + * or if we will log via printf and/or ktr. > + */ > +void > +kassert_panic(const char *fmt, ...) > +{ > + static char buf[256]; > + va_list ap; > + > + va_start(ap, fmt); > + (void)vsnprintf(buf, sizeof(buf), fmt, ap); > + va_end(ap); > + > + /* > + * panic if we're not just warning, or if we've exceeded > + * kassert_log_panic_at warnings. > + */ > + if (!kassert_warn_only || > + (kassert_log_panic_at > 0 && > + kassert_warnings >= kassert_log_panic_at)) { > + va_start(ap, fmt); > + vpanic(fmt, ap); > + /* NORETURN */ > + } > +#ifdef KTR > + if (kassert_do_ktr) > + CTR0(ktr_mask, buf); > +#endif /* KTR */ > + /* > + * log if we've not yet met the mute limit. > + */ > + if (kassert_do_log && > + (kassert_log_mute_at == 0 || > + kassert_warnings < kassert_log_mute_at)) { > + static struct timeval lasterr; > + static int curerr; > + > + if (ppsratecheck(&lasterr, &curerr, kassert_log_pps_limit)) { > + printf("KASSERT failed: %s\n", buf); > + kdb_backtrace(); > + } > + } > + atomic_add_int(&kassert_warnings, 1); > +} > +#endif > + > /* > * Panic is called on unresolvable fatal errors. It prints "panic: mesg", > * and then reboots. If we are called twice, then we avoid trying to sync > @@ -546,12 +662,20 @@ shutdown_reset(void *junk, int howto) > void > panic(const char *fmt, ...) > { > + va_list ap; > + > + va_start(ap, fmt); > + vpanic(fmt, ap); > +} > + > +static void > +vpanic(const char *fmt, va_list ap) > +{ > #ifdef SMP > cpuset_t other_cpus; > #endif > struct thread *td = curthread; > int bootopt, newpanic; > - va_list ap; > static char buf[256]; > > spinlock_enter(); > @@ -587,7 +711,6 @@ panic(const char *fmt, ...) > newpanic = 1; > } > > - va_start(ap, fmt); > if (newpanic) { > (void)vsnprintf(buf, sizeof(buf), fmt, ap); > panicstr = buf; > @@ -598,7 +721,6 @@ panic(const char *fmt, ...) > vprintf(fmt, ap); > printf("\n"); > } > - va_end(ap); > #ifdef SMP > printf("cpuid = %d\n", PCPU_GET(cpuid)); > #endif > > Modified: head/sys/sys/systm.h > ============================================================================== > --- head/sys/sys/systm.h Fri Dec 7 07:24:15 2012 (r243979) > +++ head/sys/sys/systm.h Fri Dec 7 08:25:08 2012 (r243980) > @@ -73,14 +73,16 @@ extern int vm_guest; /* Running as virt > enum VM_GUEST { VM_GUEST_NO = 0, VM_GUEST_VM, VM_GUEST_XEN }; > > #ifdef INVARIANTS /* The option is always available */ > +void kassert_panic(const char *fmt, ...); > + > #define KASSERT(exp,msg) do { \ > if (__predict_false(!(exp))) \ > - panic msg; \ > + kassert_panic msg; \ > } while (0) > #define VNASSERT(exp, vp, msg) do { \ > if (__predict_false(!(exp))) { \ > vn_printf(vp, "VNASSERT failed\n"); \ > - panic msg; \ > + kassert_panic msg; \ > } \ > } while (0) > #else