From owner-freebsd-stable@FreeBSD.ORG Wed Nov 16 17:19:58 2005
Message-ID: <63732.213.236.228.129.1132161564.squirrel@mail.adventuras.no>
In-Reply-To: <20051116162421.GE76352@green.homeunix.org>
References: <20051115065740.GH39882@cirb503493.alcatel.com.au>
 <20051115100813.74195.qmail@web36214.mail.mud.yahoo.com>
 <20051115103821.GJ39882@cirb503493.alcatel.com.au>
 <54759.213.236.228.129.1132153296.squirrel@mail.adventuras.no>
 <20051116162421.GE76352@green.homeunix.org>
Date: Wed, 16 Nov 2005 18:19:24 +0100 (CET)
From: "Lars Kristiansen"
To: "Brian Fundakowski Feldman"
Cc: freebsd-stable@freebsd.org
Subject: Re: Swapfile problem in 6?
MIME-Version: 1.0
Content-Type: multipart/mixed;boundary="----=_20051116181924_41008"
List-Id: Production branch of FreeBSD source code

------=_20051116181924_41008
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit

> On Wed, Nov 16, 2005 at 04:01:36PM +0100, Lars Kristiansen wrote:
>> > On Tue, 2005-Nov-15 02:08:12 -0800, Rob wrote:
>> >>   makeoptions     DEBUG=-g
>> >>   options         INVARIANTS
>> >>   options         WITNESS
>> >>   options         WITNESS_KDB
>> >>   options         KDB
>> >>   options         DDB
>> >>   options         DDB_NUMSYM
>> >>   options         GDB
>> >>
>> >>Is that enough?
>> >
>> > If your system is headless, you probably want 'options BREAK_TO_DEBUGGER'
>> > as well.
>> >
>> > First question is:  Does the system still deadlock?  INVARIANTS and
>> > WITNESS will have added sanity checks which might have picked up the
>> > problem.
>> >
>> >>1) Can I debug a kernel that does not crash, but
>> >>   just hangs in a deadlock? Everything seems to
>> >>   be frozen, except pinging the PC....
>> >
>> > Have a look at
>> > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-online-ddb.html
>> > and ddb(4).  Unless you have another system handy, you might like to print
>> > out ddb(4) - it's difficult to read man pages when you're in the kernel
>> > debugger :-).
>> >
>> >>2) Is such debugging possible on a headless PC
>> >>   without a keyboard attached?
>> >>   I do have serial console access.
>> >
>> > Yes.  See above URL.  The advantage is that you can (hopefully)
>> > capture a log of your debug session.  Send a serial BREAK and you
>> > should get a DDB> prompt.
>> >
>> > Basically, wait until your system deadlocks.  BREAK into DDB.
>> > As a start, run 'show lockedvnods', 'ps'.  My guess is that you'll
>> > see a lock that has a number of waiters - which is probably the
>> > culprit.  Use 'panic' to get a crashdump and then you can use kgdb
>> > to rummage around once you reboot - see
>> > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-gdb.html
>> >
>> > If in doubt, post the output from the above commands here and someone
>> > will hopefully provide further input.
>>
>> Hello again, I am the "me too"-guy with console-access.
>> I am not a programmer and it is the first time I see debugging screen.
>>
>> It deadlocked again, and I did as advised above:
>> (ddb: show lockedvnods; ps ; panic)
>> but did not understand much of the output.
>> Looked maybe like syncer and swap_pager was locked?
>> Do i need to write all this down or can I get the output saved somewhere?
>>
>> I got a 32MB coredump but the same lack of understanding applies.
>>
>> Please tell me if I can be of any help! This is fun.
>
> Do you have the ability to connect another computer by RS-232?
> It's easy to get a serial terminal console going (err that is
> if you find the right guide as opposed to stabbing blindly and
> just referencing man pages as I like to do.)  The coredump
> should supply the same (and more) information, and someone
> can walk through with you doing a post-mortem gdb session.
>
> For example, try doing the following now that you have the coredump:
> # ps wwwauxlH -N /boot/kernel/kernel -M /var/crash/vmcore.whatever

Sure, I will get a serial terminal console going and try to
repeat this process from it.
In the meantime, here is the output from the above ps command,
provided as an attachment.

--
Lars

>
> --
> Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
>   <> green@FreeBSD.org                               \  The Power to Serve! \
>  Opinions expressed are my own.
>                                                       \,,,,,,,,,,,,,,,,,,,,,,\
>

------=_20051116181924_41008
Content-Type: text/plain; name="psfromdump.txt"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="psfromdump.txt"

USER     PID %CPU %MEM   VSZ  RSS TT STAT STARTED     TIME COMMAND           UID PPID CPU PRI NI MWCHAN
root       0  0.0  0.0     0    0 ?? DLs  1Jan70   0:00.09 [swapper]           0    0   0 -16  0 vmwait
root       1  0.0  0.0   748    0 ?? DLs  1Jan70   0:00.55 [init]              0    0  52   8  0 wait
root       2  0.0  0.0     0    0 ?? DL   1Jan70   0:02.31 [g_event]           0    0   0  -8  0 -
root       3  0.0  0.0     0    0 ?? DL   1Jan70   0:10.52 [g_up]              0    0   0  -8  0 -
root       4  0.0  0.0     0    0 ?? DL   1Jan70   0:09.21 [g_down]            0    0   0  -8  0 -
root       5  0.0  0.0     0    0 ?? DL   1Jan70   0:00.00 [kqueue taskq]      0    0   0   8  0 -
root       6  0.0  0.0     0    0 ?? DL   1Jan70   0:00.00 [thread taskq]      0    0   0   8  0 -
root       7  0.0  0.0     0    0 ?? DL   1Jan70   0:01.71 [fdc0]              0    0   0  -8  0 -
root       8  0.0  0.0     0    0 ?? DL   1Jan70   0:13.57 [pagedaemon]        0    0   1 -16  0 wswbuf
root       9  0.0  0.0     0    0 ?? DL   1Jan70   0:02.10 [vmdaemon]          0    0   0  20  0 psleep
root      10  0.0  0.0     0    0 ?? DL   1Jan70   0:00.00 [ktrace]            0    0   0 -16  0 ktrace
root      11  0.0  0.0     0    0 ?? RL   1Jan70  44:26.68 [idle]              0    0  67 171  0 -
root      12  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [irq0: clk]         0    0   0 -84  0 -
root      13  0.0  0.0     0    0 ?? RL   1Jan70   0:00.02 [irq1: atkbd0]      0    0   1 -60  0 -
root      14  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [irq3:]             0    0   0 -21  0 -
root      15  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [irq4: sio0]        0    0   0 -60  0 -
root      16  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [irq5:]             0    0   0 -21  0 -
root      17  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [irq6: fdc0]        0    0   0 -64  0 -
root      18  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [irq7: ppc0]        0    0   0 -60  0 -
root      19  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [irq8: rtc]         0    0   0 -84  0 -
root      20  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [irq9:]             0    0   0 -21  0 -
root      21  0.0  0.0     0    0 ?? WL   1Jan70   0:00.23 [irq10: ed0]        0    0   0 -68  0 -
root      22  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [irq11: xl0 uhci    0    0   0 -68  0 -
root      23  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [irq12:]            0    0   0 -21  0 -
root      24  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [irq13:]            0    0   0 -21  0 -
root      25  0.0  0.0     0    0 ?? WL   1Jan70   0:01.92 [irq14: ata0]       0    0   0 -64  0 -
root      26  0.0  0.0     0    0 ?? WL   1Jan70   0:00.15 [irq15: ata1]       0    0   0 -64  0 -
root      27  0.0  0.0     0    0 ?? WL   1Jan70   0:00.47 [swi1: net]         0    0   0 -44  0 -
root      28  0.0  0.0     0    0 ?? WL   1Jan70   0:32.11 [swi4: clock sio    0    0   2 -32  0 -
root      29  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [swi3: vm]          0    0   0 -36  0 -
root      30  0.0  0.0     0    0 ?? DL   1Jan70   0:01.95 [yarrow]            0    0   0 -16  0 -
root      31  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [swi2: cambio]      0    0   0 -40  0 -
root      32  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [swi6: task queu    0    0   0 -24  0 -
root      33  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [swi6:+]            0    0   0 -24  0 -
root      34  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [swi5:+]            0    0   0 -28  0 -
root      35  0.0  0.0     0    0 ?? DL   1Jan70   0:00.00 [usb0]              0    0   0   8  0 usbevt
root      36  0.0  0.0     0    0 ?? DL   1Jan70   0:00.00 [usbtask]           0    0   0   8  0 usbtsk
root      37  0.0  0.0     0    0 ?? WL   1Jan70   0:00.00 [swi0: sio]         0    0   0 -48  0 -
root      38  0.0  0.0     0    0 ?? DL   1Jan70   0:02.11 [pagezero]          0    0   0 171  0 pgzero
root      39  0.0  0.0     0    0 ?? DL   1Jan70   0:00.97 [bufdaemon]         0    0   0 -16  0 psleep
root      40  0.0  0.0     0    0 ?? DL   1Jan70   0:00.80 [syncer]            0    0   0 -16  0 vmwait
root      41  0.0  0.0     0    0 ?? DL   1Jan70   0:00.25 [vnlru]             0    0   0  -4  0 vlruwt
root      42  0.0  0.0     0    0 ?? DL   1Jan70   0:00.00 [nfsiod 0]          0    0   5   8  0 -
root      43  0.0  0.0     0    0 ?? DL   1Jan70   0:00.00 [nfsiod 1]          0    0   5   8  0 -
root      44  0.0  0.0     0    0 ?? DL   1Jan70   0:00.00 [nfsiod 2]          0    0   5   8  0 -
root      45  0.0  0.0     0    0 ?? DL   1Jan70   0:00.00 [nfsiod 3]          0    0   5   8  0 -
root      46  0.0  0.0     0    0 ?? DL   1Jan70   0:02.05 [schedcpu]          0    0   0 -32  0 -
root     175  0.0  0.0     0    0 ?? DL   1Jan70   0:14.88 [md0]               0    0   1 -16  0 vmwait
root     207  0.0  0.0  1436  480 ?? DW   -        0:00.00 [dhclient]          0    1 122 111  0 select
_dhcp    227  0.0  0.0  1436  500 ?? DWs  -        0:00.00 [dhclient]         65    1   0  96  0 select
root     267  0.0  0.0  1532  528 ?? DWs  -        0:00.00 [pflogd]            0    1  83   4  0 sbwait
_pflogd  270  0.0  0.0  1596    0 ?? DL   1Jan70   0:01.02 [pflogd]           64  267   0  96  0 pfault
root     310  0.0  0.0   508    8 ?? DWs  -        0:00.00 [devd]              0    1 123 111  0 select
root     343  0.0  0.0  1340  112 ?? WWs  -        0:00.00 [syslogd]           0    1   0  76  0 -
bind     403  0.0  0.0  4200  132 ?? WWs  -        0:00.00 [named]            53    1   0  96  0 -
root     444  0.0  0.0  1448   24 ?? WWs  -        0:00.00 [rpcbind]           0    1   0  96  0 -
root     455  0.0  0.0  1492   32 ?? WWs  -        0:00.00 [amd]               0    1   0  96  0 -
root     499  0.0  0.0  1256   60 ?? WWs  -        0:00.00 [usbd]              0    1   0  96  0 -
root     526  0.0  0.0  3028    0 ?? DLs  1Jan70   0:01.54 [ntpd]              0    1   0  96  0 pfault
root     544  0.0  0.0  3496  764 ?? DWs  -        0:00.00 [sshd]              0    1   0  96  0 select
root     550  0.0  1.6  3528  376 ?? DLs  1Jan70   0:00.49 [sendmail]          0    1   0  96  0 pfault
smmsp    554  0.0  0.0  3408  708 ?? WWs  -        0:00.00 [sendmail]         25    1 119  20  0 -
root     566  0.0  0.0  1364  104 ?? WWs  -        0:00.00 [cron]              0    1   0   8  0 -
dhcpd    594  0.0  0.0  2268  328 ?? DWs  -        0:00.00 [dhcpd]          1002    1 128 112  0 select
root     606  0.0  0.0  1556  452 ?? WW   -        0:00.00 [smartd]            0    1  58   8  0 -
root     623  0.0  0.0  1432  544 ?? DWs  -        0:00.00 [inetd]             0    1  93 107  0 select
root     639  0.0  0.0  1316  496 ?? WWs+ -        0:00.00 [getty]             0    1  92   5  0 -
root     640  0.0  0.0  1316  496 ?? DWs+ -        0:00.00 [getty]             0    1  95   5  0 ttyin
root     641  0.0  0.0  1316  496 ?? WWs+ -        0:00.00 [getty]             0    1  92   5  0 -
root     642  0.0  0.0  1316  496 ?? DWs+ -        0:00.00 [getty]             0    1  92   5  0 ttyin
root     643  0.0  0.0  1316  496 ?? DWs+ -        0:00.00 [getty]             0    1  91   5  0 ttyin
root     644  0.0  0.0  1316  496 ?? DWs+ -        0:00.00 [getty]             0    1  93   5  0 ttyin
root     645  0.0  0.0  1316  496 ?? DWs+ -        0:00.00 [getty]             0    1  94   5  0 ttyin
root     646  0.0  0.0  1316  496 ?? DWs+ -        0:00.00 [getty]             0    1  95   5  0 ttyin
root     652  0.0  0.0  6248  908 ?? DWs  -        0:00.00 [sshd]              0  544   3   4  0 sbwait
r        655  0.0  0.0  6224  308 ?? DW   -        0:00.00 [sshd]           1001  652   0  96  0 select
r        656  0.0  0.0  3288  620 ?? DWs  -        0:00.00 [bash]           1001  655   0   8  0 wait
root     659  0.0  0.0  1660  620 ?? DW   -        0:00.00 [su]                0  656   0   8  0 wait
root     660  0.0  0.0  4580  736 ?? DW   -        0:00.00 [csh]               0  659   0  20  0 pause
root     662  0.0  0.0  1996  108 ?? WW+  -        0:00.00 [screen]            0  660   0  20  0 -
root     663  0.0  0.0  2164  308 ?? DWs  -        0:00.00 [screen]            0  662   0  96  0 select
root     664  0.0  0.0  4924 1388 ?? DWs  -        0:00.00 [csh]               0  663   0  20  0 pause
root     665  0.0  0.0  6248 1404 ?? DWs  -        0:00.00 [sshd]              0  544   1   4  0 sbwait
r        668  0.0  0.0  6224  316 ?? WW   -        0:00.00 [sshd]           1001  665   0  96  0 -
r        669  0.0  0.0  3292  256 ?? DWs+ -        0:00.00 [bash]           1001  668   2   5  0 ttyin
root     696  0.0  3.8 15612  908 ?? DN+  1Jan70   0:32.96 [ruby18]            0  664  11  -8 19 piperd
root     759  0.0  0.0  1688    0 ?? DNL+ 1Jan70   0:00.22 [sh]                0  696   7 115 19 pfault

------=_20051116181924_41008--
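
One way to read a long listing like psfromdump.txt is to group the sleeping processes by wait channel (the MWCHAN column) and see which channels collect several of them. The Python below is only a rough sketch of that idea and is not part of the thread: the function name group_by_wait_channel is made up for illustration, the sample rows are copied from the attachment, and it assumes the 17-column layout shown there (ten fields before COMMAND, six after). It does not diagnose the deadlock; it only makes the grouping easier to see.

```python
from collections import defaultdict

# A few rows copied from psfromdump.txt (header plus six processes).
SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND UID PPID CPU PRI NI MWCHAN
root 0 0.0 0.0 0 0 ?? DLs 1Jan70 0:00.09 [swapper] 0 0 0 -16 0 vmwait
root 5 0.0 0.0 0 0 ?? DL 1Jan70 0:00.00 [kqueue taskq] 0 0 0 8 0 -
root 8 0.0 0.0 0 0 ?? DL 1Jan70 0:13.57 [pagedaemon] 0 0 1 -16 0 wswbuf
root 40 0.0 0.0 0 0 ?? DL 1Jan70 0:00.80 [syncer] 0 0 0 -16 0 vmwait
root 175 0.0 0.0 0 0 ?? DL 1Jan70 0:14.88 [md0] 0 0 1 -16 0 vmwait
root 526 0.0 0.0 3028 0 ?? DLs 1Jan70 0:01.54 [ntpd] 0 1 0 96 0 pfault
"""


def group_by_wait_channel(text):
    """Map each MWCHAN value to the (pid, command) pairs sleeping on it.

    Hypothetical helper: assumes the column layout of the attachment above.
    """
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip blank or short lines and the header row.
        if len(fields) < 17 or fields[0] == "USER":
            continue
        pid = int(fields[1])
        # COMMAND may contain spaces: ten fields precede it, six follow it.
        command = " ".join(fields[10:-6])
        mwchan = fields[-1]
        # "-" means the process is not sleeping on any wait channel.
        if mwchan != "-":
            groups[mwchan].append((pid, command))
    return dict(groups)


if __name__ == "__main__":
    for chan, procs in sorted(group_by_wait_channel(SAMPLE).items()):
        print(chan, procs)
```

On the sample rows this puts swapper, syncer and md0 together under vmwait, with pagedaemon alone under wswbuf and ntpd under pfault. Feeding it the whole attachment would be the natural next step before asking the list what those wait channels mean.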