Date:      Wed, 16 Nov 2005 18:19:24 +0100 (CET)
From:      "Lars Kristiansen" <lars+lister.freebsd@adventuras.no>
To:        "Brian Fundakowski Feldman" <green@freebsd.org>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: Swapfile problem in 6?
Message-ID:  <63732.213.236.228.129.1132161564.squirrel@mail.adventuras.no>
In-Reply-To: <20051116162421.GE76352@green.homeunix.org>
References:  <20051115065740.GH39882@cirb503493.alcatel.com.au> <20051115100813.74195.qmail@web36214.mail.mud.yahoo.com> <20051115103821.GJ39882@cirb503493.alcatel.com.au> <54759.213.236.228.129.1132153296.squirrel@mail.adventuras.no> <20051116162421.GE76352@green.homeunix.org>

------=_20051116181924_41008
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit

> On Wed, Nov 16, 2005 at 04:01:36PM +0100, Lars Kristiansen wrote:
>> > On Tue, 2005-Nov-15 02:08:12 -0800, Rob wrote:
>> >> makeoptions DEBUG=-g
>> >> options INVARIANTS
>> >> options WITNESS
>> >> options WITNESS_KDB
>> >> options KDB
>> >> options DDB
>> >> options DDB_NUMSYM
>> >> options GDB
>> >>
>> >>Is that enough?
>> >
>> > If your system is headless, you probably want
>> > 'options BREAK_TO_DEBUGGER' as well.
>> >
>> > First question is: Does the system still deadlock?  INVARIANTS and
>> > WITNESS will have added sanity checks which might have picked up the
>> > problem.
>> >
>> >>1) Can I debug a kernel that does not crash, but
>> >>   just hangs in a deadlock? Everything seems to
>> >>   be frozen, except pinging the PC....
>> >
>> > Have a look at
>> > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-online-ddb.html
>> > and ddb(4).  Unless you have another system handy, you might like to
>> > print out ddb(4) - it's difficult to read man pages when you're in
>> > the kernel debugger :-).
>> >
>> >>2) Is such debugging possible on a headless PC
>> >>   without a keyboard attached?
>> >>   I do have serial console access.
>> >
>> > Yes.  See above URL.  The advantage is that you can (hopefully)
>> > capture a log of your debug session.  Send a serial BREAK and you
>> > should get a DDB> prompt.
>> >
>> > Basically, wait until your system deadlocks.  BREAK into DDB.
>> > As a start, run 'show lockedvnods', 'ps'.  My guess is that you'll
>> > see a lock that has a number of waiters - which is probably the
>> > culprit.  Use 'panic' to get a crashdump and then you can use kgdb
>> > to rummage around once you reboot - see
>> > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-gdb.html
>> >
>> > If in doubt, post the output from the above commands here and someone
>> > will hopefully provide further input.
>>
>> Hello again, I am the "me too"-guy with console-access.
>> I am not a programmer and this is the first time I have seen a
>> debugging screen.
>>
>> It deadlocked again, and I did as advised above:
>> (ddb: show lockedvnods; ps ; panic)
>>  but did not understand much of the output.
>>  It looked as if maybe syncer and swap_pager were locked?
>> Do I need to write all this down, or can I get the output saved
>> somewhere?
>>
>> I got a 32MB coredump but the same lack of understanding applies.
>>
>> Please tell me if I can be of any help! This is fun.
>
> Do you have the ability to connect another computer by RS-232?
> It's easy to get a serial terminal console going (err that is
> if you find the right guide as opposed to stabbing blindly and
> just referencing man pages as I like to do.)  The coredump
> should supply the same (and more) information, and someone
> can walk through with you doing a post-mortem gdb session.
>
> For example, try doing the following now that you have the coredump:
> # ps wwwauxlH -N /boot/kernel/kernel -M /var/crash/vmcore.whatever

Sure, I will get a serial terminal console going and try to repeat this
process from it.
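
One way to keep a log of such a session is to run cu(1) under script(1)
on the second machine, so that everything the debugger prints ends up in
a file. This is only a sketch: the device name /dev/cuad0, the 9600 baud
speed and the log file name are assumptions, not details from this
thread, and have to match the real port and console settings.

```shell
#!/bin/sh
# Sketch: log a serial console session (including DDB output) to a file.
# PORT, SPEED and LOG are assumptions; adjust them to the real setup.
PORT=${PORT:-/dev/cuad0}
SPEED=${SPEED:-9600}
LOG=${LOG:-ddb-session.log}

# Build the command line: script(1) appends everything cu(1) prints to $LOG.
capture_cmd() {
    echo "script -a $LOG cu -l $PORT -s $SPEED"
}

# Print the command by default; run it only when RUN=1 is set.
if [ "${RUN:-0}" = "1" ]; then
    exec $(capture_cmd)
else
    capture_cmd
fi
```

Inside cu, typing ~# sends the serial BREAK that should bring up the
debugger prompt, and ~. ends the session.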

In the meantime, here is the output from the above ps command, provided
as an attachment.

--
Lars

>
> --
> Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
>   <> green@FreeBSD.org                               \  The Power to Serve! \
>  Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\
>

------=_20051116181924_41008
Content-Type: text/plain; name="psfromdump.txt"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="psfromdump.txt"

USER      PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND            UID  PPID CPU PRI NI MWCHAN
root        0  0.0  0.0     0     0  ??  DLs   1Jan70   0:00.09 [swapper]            0     0   0 -16  0 vmwait
root        1  0.0  0.0   748     0  ??  DLs   1Jan70   0:00.55 [init]               0     0  52   8  0 wait  
root        2  0.0  0.0     0     0  ??  DL    1Jan70   0:02.31 [g_event]            0     0   0  -8  0 -     
root        3  0.0  0.0     0     0  ??  DL    1Jan70   0:10.52 [g_up]               0     0   0  -8  0 -     
root        4  0.0  0.0     0     0  ??  DL    1Jan70   0:09.21 [g_down]             0     0   0  -8  0 -     
root        5  0.0  0.0     0     0  ??  DL    1Jan70   0:00.00 [kqueue taskq]       0     0   0   8  0 -     
root        6  0.0  0.0     0     0  ??  DL    1Jan70   0:00.00 [thread taskq]       0     0   0   8  0 -     
root        7  0.0  0.0     0     0  ??  DL    1Jan70   0:01.71 [fdc0]               0     0   0  -8  0 -     
root        8  0.0  0.0     0     0  ??  DL    1Jan70   0:13.57 [pagedaemon]         0     0   1 -16  0 wswbuf
root        9  0.0  0.0     0     0  ??  DL    1Jan70   0:02.10 [vmdaemon]           0     0   0  20  0 psleep
root       10  0.0  0.0     0     0  ??  DL    1Jan70   0:00.00 [ktrace]             0     0   0 -16  0 ktrace
root       11  0.0  0.0     0     0  ??  RL    1Jan70  44:26.68 [idle]               0     0  67 171  0 -     
root       12  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [irq0: clk]          0     0   0 -84  0 -     
root       13  0.0  0.0     0     0  ??  RL    1Jan70   0:00.02 [irq1: atkbd0]       0     0   1 -60  0 -     
root       14  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [irq3:]              0     0   0 -21  0 -     
root       15  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [irq4: sio0]         0     0   0 -60  0 -     
root       16  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [irq5:]              0     0   0 -21  0 -     
root       17  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [irq6: fdc0]         0     0   0 -64  0 -     
root       18  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [irq7: ppc0]         0     0   0 -60  0 -     
root       19  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [irq8: rtc]          0     0   0 -84  0 -     
root       20  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [irq9:]              0     0   0 -21  0 -     
root       21  0.0  0.0     0     0  ??  WL    1Jan70   0:00.23 [irq10: ed0]         0     0   0 -68  0 -     
root       22  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [irq11: xl0 uhci     0     0   0 -68  0 -     
root       23  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [irq12:]             0     0   0 -21  0 -     
root       24  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [irq13:]             0     0   0 -21  0 -     
root       25  0.0  0.0     0     0  ??  WL    1Jan70   0:01.92 [irq14: ata0]        0     0   0 -64  0 -     
root       26  0.0  0.0     0     0  ??  WL    1Jan70   0:00.15 [irq15: ata1]        0     0   0 -64  0 -     
root       27  0.0  0.0     0     0  ??  WL    1Jan70   0:00.47 [swi1: net]          0     0   0 -44  0 -     
root       28  0.0  0.0     0     0  ??  WL    1Jan70   0:32.11 [swi4: clock sio     0     0   2 -32  0 -     
root       29  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [swi3: vm]           0     0   0 -36  0 -     
root       30  0.0  0.0     0     0  ??  DL    1Jan70   0:01.95 [yarrow]             0     0   0 -16  0 -     
root       31  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [swi2: cambio]       0     0   0 -40  0 -     
root       32  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [swi6: task queu     0     0   0 -24  0 -     
root       33  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [swi6:+]             0     0   0 -24  0 -     
root       34  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [swi5:+]             0     0   0 -28  0 -     
root       35  0.0  0.0     0     0  ??  DL    1Jan70   0:00.00 [usb0]               0     0   0   8  0 usbevt
root       36  0.0  0.0     0     0  ??  DL    1Jan70   0:00.00 [usbtask]            0     0   0   8  0 usbtsk
root       37  0.0  0.0     0     0  ??  WL    1Jan70   0:00.00 [swi0: sio]          0     0   0 -48  0 -     
root       38  0.0  0.0     0     0  ??  DL    1Jan70   0:02.11 [pagezero]           0     0   0 171  0 pgzero
root       39  0.0  0.0     0     0  ??  DL    1Jan70   0:00.97 [bufdaemon]          0     0   0 -16  0 psleep
root       40  0.0  0.0     0     0  ??  DL    1Jan70   0:00.80 [syncer]             0     0   0 -16  0 vmwait
root       41  0.0  0.0     0     0  ??  DL    1Jan70   0:00.25 [vnlru]              0     0   0  -4  0 vlruwt
root       42  0.0  0.0     0     0  ??  DL    1Jan70   0:00.00 [nfsiod 0]           0     0   5   8  0 -     
root       43  0.0  0.0     0     0  ??  DL    1Jan70   0:00.00 [nfsiod 1]           0     0   5   8  0 -     
root       44  0.0  0.0     0     0  ??  DL    1Jan70   0:00.00 [nfsiod 2]           0     0   5   8  0 -     
root       45  0.0  0.0     0     0  ??  DL    1Jan70   0:00.00 [nfsiod 3]           0     0   5   8  0 -     
root       46  0.0  0.0     0     0  ??  DL    1Jan70   0:02.05 [schedcpu]           0     0   0 -32  0 -     
root      175  0.0  0.0     0     0  ??  DL    1Jan70   0:14.88 [md0]                0     0   1 -16  0 vmwait
root      207  0.0  0.0  1436   480  ??  DW   -         0:00.00 [dhclient]           0     1 122 111  0 select
_dhcp     227  0.0  0.0  1436   500  ??  DWs  -         0:00.00 [dhclient]          65     1   0  96  0 select
root      267  0.0  0.0  1532   528  ??  DWs  -         0:00.00 [pflogd]             0     1  83   4  0 sbwait
_pflogd   270  0.0  0.0  1596     0  ??  DL    1Jan70   0:01.02 [pflogd]            64   267   0  96  0 pfault
root      310  0.0  0.0   508     8  ??  DWs  -         0:00.00 [devd]               0     1 123 111  0 select
root      343  0.0  0.0  1340   112  ??  WWs  -         0:00.00 [syslogd]            0     1   0  76  0 -     
bind      403  0.0  0.0  4200   132  ??  WWs  -         0:00.00 [named]             53     1   0  96  0 -     
root      444  0.0  0.0  1448    24  ??  WWs  -         0:00.00 [rpcbind]            0     1   0  96  0 -     
root      455  0.0  0.0  1492    32  ??  WWs  -         0:00.00 [amd]                0     1   0  96  0 -     
root      499  0.0  0.0  1256    60  ??  WWs  -         0:00.00 [usbd]               0     1   0  96  0 -     
root      526  0.0  0.0  3028     0  ??  DLs   1Jan70   0:01.54 [ntpd]               0     1   0  96  0 pfault
root      544  0.0  0.0  3496   764  ??  DWs  -         0:00.00 [sshd]               0     1   0  96  0 select
root      550  0.0  1.6  3528   376  ??  DLs   1Jan70   0:00.49 [sendmail]           0     1   0  96  0 pfault
smmsp     554  0.0  0.0  3408   708  ??  WWs  -         0:00.00 [sendmail]          25     1 119  20  0 -     
root      566  0.0  0.0  1364   104  ??  WWs  -         0:00.00 [cron]               0     1   0   8  0 -     
dhcpd     594  0.0  0.0  2268   328  ??  DWs  -         0:00.00 [dhcpd]           1002     1 128 112  0 select
root      606  0.0  0.0  1556   452  ??  WW   -         0:00.00 [smartd]             0     1  58   8  0 -     
root      623  0.0  0.0  1432   544  ??  DWs  -         0:00.00 [inetd]              0     1  93 107  0 select
root      639  0.0  0.0  1316   496  ??  WWs+ -         0:00.00 [getty]              0     1  92   5  0 -     
root      640  0.0  0.0  1316   496  ??  DWs+ -         0:00.00 [getty]              0     1  95   5  0 ttyin 
root      641  0.0  0.0  1316   496  ??  WWs+ -         0:00.00 [getty]              0     1  92   5  0 -     
root      642  0.0  0.0  1316   496  ??  DWs+ -         0:00.00 [getty]              0     1  92   5  0 ttyin 
root      643  0.0  0.0  1316   496  ??  DWs+ -         0:00.00 [getty]              0     1  91   5  0 ttyin 
root      644  0.0  0.0  1316   496  ??  DWs+ -         0:00.00 [getty]              0     1  93   5  0 ttyin 
root      645  0.0  0.0  1316   496  ??  DWs+ -         0:00.00 [getty]              0     1  94   5  0 ttyin 
root      646  0.0  0.0  1316   496  ??  DWs+ -         0:00.00 [getty]              0     1  95   5  0 ttyin 
root      652  0.0  0.0  6248   908  ??  DWs  -         0:00.00 [sshd]               0   544   3   4  0 sbwait
r         655  0.0  0.0  6224   308  ??  DW   -         0:00.00 [sshd]            1001   652   0  96  0 select
r         656  0.0  0.0  3288   620  ??  DWs  -         0:00.00 [bash]            1001   655   0   8  0 wait  
root      659  0.0  0.0  1660   620  ??  DW   -         0:00.00 [su]                 0   656   0   8  0 wait  
root      660  0.0  0.0  4580   736  ??  DW   -         0:00.00 [csh]                0   659   0  20  0 pause 
root      662  0.0  0.0  1996   108  ??  WW+  -         0:00.00 [screen]             0   660   0  20  0 -     
root      663  0.0  0.0  2164   308  ??  DWs  -         0:00.00 [screen]             0   662   0  96  0 select
root      664  0.0  0.0  4924  1388  ??  DWs  -         0:00.00 [csh]                0   663   0  20  0 pause 
root      665  0.0  0.0  6248  1404  ??  DWs  -         0:00.00 [sshd]               0   544   1   4  0 sbwait
r         668  0.0  0.0  6224   316  ??  WW   -         0:00.00 [sshd]            1001   665   0  96  0 -     
r         669  0.0  0.0  3292   256  ??  DWs+ -         0:00.00 [bash]            1001   668   2   5  0 ttyin 
root      696  0.0  3.8 15612   908  ??  DN+   1Jan70   0:32.96 [ruby18]             0   664  11  -8 19 piperd
root      759  0.0  0.0  1688     0  ??  DNL+  1Jan70   0:00.22 [sh]                 0   696   7 115 19 pfault
------=_20051116181924_41008--
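
A listing like the attachment above is easier to scan when the processes
are tallied by wait channel (the last column, MWCHAN). The following is a
minimal sketch, assuming the attachment has been saved as psfromdump.txt
with only the ps output in it; the function name count_wchan is made up
for illustration.

```shell
#!/bin/sh
# Sketch: count processes per wait channel (last column, MWCHAN) in a
# saved "ps wwwauxlH" listing. count_wchan is a hypothetical helper name.
count_wchan() {
    awk '
        $1 == "USER" { next }          # skip the header line
        NF < 2       { next }          # skip blank or stray lines
        { n[$NF]++ }                   # last field is the wait channel
        END { for (w in n) printf "%d %s\n", n[w], w }
    ' "$1" | sort -k1,1nr -k2,2
}

# Only run when the saved listing is actually present.
if [ -f psfromdump.txt ]; then
    count_wchan psfromdump.txt
fi
```

For the listing above, the tally includes three processes in vmwait
(swapper, syncer and md0), four in pfault, and the pagedaemon in wswbuf.
Taking the last field avoids having to parse the COMMAND column, which
contains spaces and is truncated in places.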



