From: Kostik Belousov
To: Giorgos Keramidas
Cc: freebsd-doc@freebsd.org
Date: Fri, 23 Jun 2006 16:58:37 +0300
Subject: Re: [patch] deadlock debugging
Message-ID: <20060623135837.GI5115@deviant.kiev.zoral.com.ua>
In-Reply-To: <20060623132558.GD7062@gothmog.pc>
References: <20060607084346.GA21391@deviant.kiev.zoral.com.ua> <20060623132558.GD7062@gothmog.pc>

On Fri, Jun 23, 2006 at 04:25:58PM +0300, Giorgos Keramidas wrote:
> On 2006-06-07 11:43, Kostik Belousov wrote:
> > Reports of deadlocks are a recurrent topic on the current-
> > and stable- lists.  Many of us have to repeat the instructions
> > on how to provide a useful initial bug report for them.
> >
> > Please comment on the proposed addition to the kernel debugging
> > chapter of the developer handbook.
>
> Hi Kostik,
>
> > Obviously, I am not a native English speaker.  Your corrections
> > for both factual material and grammar/style are very much
> > welcome!
> >
> > P.S. I'm not on the list, so do not remove the CC: to me when replying.
>
> Ok :)
>
> This seems like a useful addition to the developer's handbook,
> but I have some minor comments.
> See inline text below:
>
> > Index: en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml
> > ===================================================================
> > RCS file: /usr/local/arch/ncvs/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml,v
> > retrieving revision 1.64
> > diff -u -r1.64 chapter.sgml
> > --- en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml  5 Jan 2006 20:03:34 -0000  1.64
> > +++ en_US.ISO8859-1/books/developers-handbook/kerneldebug/chapter.sgml  7 Jun 2006 08:39:20 -0000
> > @@ -821,6 +821,41 @@
> >     on any configured console driver, including a serial
> >     console.
> >
> > +
> > +
> > +    Debugging the Deadlocks
>
> `Debugging Kernel Deadlocks' is probably a better title here, since
> the deadlocks this section covers occur in the kernel and `the
> Deadlocks' doesn't really make this as obvious as I'd want it to be.
>
> > +    You may experience so called deadlocks, the situation where
> > +    system stops doing useful work. To provide the useful bug report
> > +    in this situation, you shall use ddb as described above. Please,
> > +    include the output of ps and
> > +    trace for suspected processes in the
> > +    report.
>
> This paragraph has a few minor syntax buglets.  English is not my native
> language, but I would probably rewrite this as:
>
> | Modern &os; releases have been extended with support for
> | Symmetric Multiprocessing (SMP).  To support highly parallel
> | processing, the &os; kernel uses a lot of internal locking and
> | synchronization primitives, to allow multiple kernel threads
> | to run concurrently on systems that can support such a mode of
> | operation.
> | Bugs in the use of these internal locking
> | mechanisms can lead to a situation where two or more kernel
> | threads compete for the same resources and block
> | indefinitely waiting for each other.  When this happens, the
> | system may become unstable, and either crash or
> | appear to hang.  This hang is called a
> | deadlock.
> |
> | Debugging a deadlock can be tricky,
> | but &os; provides some tools that may assist you in tracking
> | down the problem or collecting information about the deadlock
> | when it occurs.
> |
> | One of these tools is the kernel debugger,
> | DDB, which you can use as described
> | in the previous sections to collect useful information for
> | such a bug.  The DDB commands that are
> | most useful when debugging a deadlock are:
> |
> |
> |   ps
> |   trace
> |
> |
> | Use the ps command to list all the
> | processes and then use trace on processes
> | that are suspected of having caused the deadlock.
> |
> | Other commands that can provide useful information for
> | tracking down the cause of a deadlock are:
> |
> |
> |   show allpcpu
> |   show alllocks
> |   show lockedvnods
> |
> |
> | Useful information about what each process was doing at
> | the time the deadlock occurred can be listed with:
> |
> |
> |   where PID
> |
> |
> | The output of the where command tends
> | to be very useful for the processes listed in the output of
> | the show commands.
> |
> | To obtain meaningful backtraces for threaded processes,
> | use thread thread-id first, to switch to
> | the correct thread, and then get a backtrace
> | with where.
>
> Does this version look ok to you?  I can handle the merging of this
> change with your initial diff/patch.
>
> > +    If possible, consider doing further investigation. Receipt
> > +    below is especially usefull if you suspect deadlock occurs in the
> > +    VFS layer.
> > +    Add the options
> > +      makeoptions DEBUG=-g
> > +      options INVARIANTS
> > +      options INVARIANT_SUPPORT
> > +      options WITNESS
> > +      options DEBUG_LOCKS
> > +      options DEBUG_VFS_LOCKS
> > +      options DIAGNOSTIC
> > +
> > +    to the kernel config. When deadlock occurs, in addition to the
> > +    output of the ps command, provide information
> > +    from the show allpcpu, show
> > +    alllocks and show
> > +    lockedvnods. More, please provide output of the
> > +    where pid for each process id mentioned in
> > +    the output of the show commands.
> > +
> > +
> > +    For threaded processes, to obtain meaningful backtraces, use
> > +    thread thread-id to switch to the thread
> > +    stack, and do backtrace with where.
> > +
> >
>
> This part is also nice, but IMHO it would be even nicer if we could
> expand it a bit more.  How about something like this?
>
> |
> |
> | Deadlocks are pretty nasty bugs, since they are not very
> | easy to reproduce.  Their occurrence depends on specific
> | timing, synchronization, system load and many more factors.
> | This makes it hard to reliably reproduce a deadlock bug.
> | Since reproducing a bug is sometimes a crucial part of
> | gathering all the necessary information, you may have to spend
> | some time investigating the deadlock.  Naturally, this is not
> | always possible for production systems, but if you can
> | reproduce the deadlock on a test system which can afford
> | staying off-line for extended periods of time, then consider
> | staying inside DDB while you are
> | investigating the deadlock further.
> |
> | A serial console can be extremely helpful in collecting
> | DDB output.
> |
> | If it's impossible to set up a serial console
> | (e.g. because you cannot find or afford a second system to
> | configure as a testbed), emulators like
> | emulators/qemu,
> | emulators/vmware2 or
> | emulators/bochs may prove a
> | very efficient way of debugging kernel issues, like a
> | deadlock.
>
> Part #2 ...
>
> |
> |
> | Apart from the usual kernel options that are useful for
> | debugging kernel problems, there are some options that are
> | particularly useful and targeted at debugging locking
> | problems.  These options are:
> |
> |   options INVARIANTS
> |   options INVARIANT_SUPPORT
> |   options WITNESS
> |   options DEBUG_LOCKS
> |   options DEBUG_VFS_LOCKS
> |   options DIAGNOSTIC
>
> Any help in expanding these parts (especially the second one) is more
> than welcome :-)

I like your changes; they provide useful context and give proper
exposure to the problems.

My intent for the addition was to have a place to point people to
when they ask "how do I debug deadlocks?"

Could your additions and my do-it guide coexist side by side?
For instance, by summarizing the information developers want to obtain
from the problem machine at the end of the section?
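
As a rough illustration of such an end-of-section summary, the DDB commands named in this thread could be collected into one sequence. This is only a sketch: the ordering is an assumption, <pid> and <thread-id> are hypothetical placeholders for values taken from the earlier output, and the commands assume a kernel built with the debugging options quoted above.

```
db> ps
db> show allpcpu
db> show alllocks
db> show lockedvnods
db> where <pid>
db> thread <thread-id>
db> where
```

The "where <pid>" step would be repeated for each process id mentioned in the output of the show commands, and the last two steps apply to threaded processes.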