From: mdf@FreeBSD.org
To: Andriy Gapon
Cc: freebsd-arch@freebsd.org
Date: Tue, 23 Aug 2011 09:39:04 -0700
Subject: Re: common entry point for "software" and "hardware" "panics"
In-Reply-To: <4E53B47C.2000901@FreeBSD.org>

On Tue, Aug 23, 2011 at 7:09 AM, Andriy Gapon wrote:
> Too many
> quote signs in the subject line... let me try to explain.
>
> Currently we have two sources of detecting some trouble/inconsistency that
> requires a system panic/reset or debugging.  One source is various checks in the
> program code (e.g. KASSERTs) that call panic() when a fatal inconsistency is
> detected.  The other source is the hardware that generates a trap when something
> is wrong from its point of view.  In this case the trap need not be a fatal one,
> so the software (the kernel) checks a type of trap and decides whether the
> condition is fatal.  But let's distinguish the purely software source from the
> hardware+software source.
>
> Depending on the kernel options/configuration the kernel can also react in
> different ways to the fatal conditions.  One way is to call panic(9), the other
> way is to call kdb_trap.  But it's even a little bit more complicated than that.
>
> So, let's consider some possibilities.
>
> !KDB, software problem:
> panic -> kern_reboot
>
> !KDB, fatal trap:
> trap -> trap_fatal -> panic -> kern_reboot
>
> KDB, !KDB_UNATTENDED, software problem:
> panic -> kdb_enter -> breakpoint ~> trap -> kdb_trap
>
> KDB, !KDB_UNATTENDED, fatal trap:
> trap -> trap_fatal -> kdb_trap
>
> Also, kdb key from console:
> kdb_enter -> breakpoint ~> trap -> kdb_trap
>
> panic key from console:
> kdb_panic -> panic -> ...
>
> and also some code calls kdb_enter instead of panic in situations that require
> debugging:
> kdb_enter -> breakpoint ~> kdb_trap
>
> So, we can see in these examples that currently we do not have a function
> that would be called in all the cases.
> I think that it would be nice if we had some sort of a (semi-)universal front-end
> to panic and kdb_trap.  E.g. it could be useful for some common tasks like
> stopping other CPUs in SMP environment.  Then, it could be useful for printing
> some information useful in both cases like e.g. a stack trace.
> Or perhaps deciding whether KDB should be actually entered in a common place.
>
> Unfortunately, this is not a proposal, just sort of musings on the topic.
> Does anybody have some more concrete ideas here?
> Thank you!

I vote for the status quo. :-)  That is, it seems to me that the intents
behind kdb_enter() and panic() are very different.

With a software fault, panic() is usually the right thing (since we have
no way at the moment to, e.g., restart the VM subsystem).
debugger_on_panic then gets you a debugger if desired.

kdb_enter() or breakpoint() should not be in "production" code, since
there may be no debugger.  They seem useful to me only for intermediate
debugging, and any particular use should go away once the problem is
known and fixed.

Cheers,
matthew
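
Editorial illustration, not part of the original message: the quoted observation that no single function is called on every path can be checked mechanically. The following is a minimal sketch in Python (a model only, not kernel code). The path lists are transcribed from the quoted message; "->" and "~>" are both treated as plain edges, the elided tail ("...") of the panic-key path is left out rather than guessed, and the names CALL_PATHS, common_functions and paths_missing are hypothetical helpers introduced only for this sketch.

```python
# Call paths transcribed from the quoted message. Function names are the
# ones that appear there; the dictionary keys are descriptive labels only.
CALL_PATHS = {
    "!KDB, software problem": ["panic", "kern_reboot"],
    "!KDB, fatal trap": ["trap", "trap_fatal", "panic", "kern_reboot"],
    "KDB, !KDB_UNATTENDED, software problem": [
        "panic", "kdb_enter", "breakpoint", "trap", "kdb_trap",
    ],
    "KDB, !KDB_UNATTENDED, fatal trap": ["trap", "trap_fatal", "kdb_trap"],
    "kdb key from console": ["kdb_enter", "breakpoint", "trap", "kdb_trap"],
    # The original path ends in "..."; the elided part is not filled in.
    "panic key from console": ["kdb_panic", "panic"],
    "code calling kdb_enter directly": ["kdb_enter", "breakpoint", "kdb_trap"],
}


def common_functions(paths):
    """Return the set of functions that appear on every listed path."""
    sets = [set(path) for path in paths.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


def paths_missing(paths, func):
    """Return the labels of the paths on which func is never called."""
    return sorted(label for label, path in paths.items() if func not in path)


if __name__ == "__main__":
    print("called on every path:", common_functions(CALL_PATHS))
    for func in ("panic", "kdb_trap"):
        print(func, "is not reached from:", paths_missing(CALL_PATHS, func))
```

Under these assumptions the intersection is empty, and both panic and kdb_trap are each absent from three of the seven listed paths, which is consistent with treating them as separate entry points with different intents.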