Date:      Wed, 16 Nov 2005 19:20:01 +0100 (CET)
From:      "Lars Kristiansen" <lars+lister.freebsd@adventuras.no>
To:        freebsd-stable@freebsd.org
Subject:   Re: Swapfile problem in 6?
Message-ID:  <64897.213.236.228.129.1132165201.squirrel@mail.adventuras.no>
In-Reply-To: <63732.213.236.228.129.1132161564.squirrel@mail.adventuras.no>
References:  <20051115065740.GH39882@cirb503493.alcatel.com.au> <20051115100813.74195.qmail@web36214.mail.mud.yahoo.com> <20051115103821.GJ39882@cirb503493.alcatel.com.au> <54759.213.236.228.129.1132153296.squirrel@mail.adventuras.no> <20051116162421.GE76352@green.homeunix.org> <63732.213.236.228.129.1132161564.squirrel@mail.adventuras.no>

------=_20051116192001_86947
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit

>> On Wed, Nov 16, 2005 at 04:01:36PM +0100, Lars Kristiansen wrote:
>>> > On Tue, 2005-Nov-15 02:08:12 -0800, Rob wrote:
>>> >> makeoptions DEBUG=-g
>>> >> options INVARIANTS
>>> >> options WITNESS
>>> >> options WITNESS_KDB
>>> >> options KDB
>>> >> options DDB
>>> >> options DDB_NUMSYM
>>> >> options GDB
>>> >>
>>> >>Is that enough?
>>> >
>>> > If your system is headless, you probably want
>>> > 'options BREAK_TO_DEBUGGER' as well.
>>> >
>>> > First question is: Does the system still deadlock?  INVARIANTS and
>>> > WITNESS will have added sanity checks which might have picked up the
>>> > problem.
>>> >
>>> >>1) Can I debug a kernel that does not crash, but
>>> >>   just hangs in a deadlock? Everything seems to
>>> >>   be frozen, except pinging the PC....
>>> >
>>> > Have a look at
>>> > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-online-ddb.html
>>> > and ddb(4).  Unless you have another system handy, you might like to
>>> > print out ddb(4) - it's difficult to read man pages when you're in
>>> > the kernel debugger :-).
>>> >
>>> >>2) Is such debugging possible on a headless PC
>>> >>   without a keyboard attached?
>>> >>   I do have serial console access.
>>> >
>>> > Yes.  See above URL.  The advantage is that you can (hopefully)
>>> > capture a log of your debug session.  Send a serial BREAK and you
>>> > should get a DDB> prompt.
>>> >
>>> > Basically, wait until your system deadlocks.  BREAK into DDB.
>>> > As a start, run 'show lockedvnods', 'ps'.  My guess is that you'll
>>> > see a lock that has a number of waiters - which is probably the
>>> > culprit.  Use 'panic' to get a crashdump and then you can use kgdb
>>> > to rummage around once you reboot - see
>>> > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-gdb.html
>>> >
>>> > If in doubt, post the output from the above commands here and someone
>>> > will hopefully provide further input.
>>>
>>> Hello again, I am the "me too" guy with console access.
>>> I am not a programmer, and this is the first time I have seen a
>>> debugger screen.
>>>
>>> It deadlocked again, and I did as advised above
>>> (ddb: show lockedvnods; ps; panic),
>>> but did not understand much of the output.
>>> It looked as though syncer and swap_pager might be locked?
>>> Do I need to write all this down, or can I get the output saved
>>> somewhere?
>>>
>>> I got a 32MB coredump but the same lack of understanding applies.
>>>
>>> Please tell me if I can be of any help! This is fun.
>>
>> Do you have the ability to connect another computer by RS-232?
>> It's easy to get a serial terminal console going (err that is
>> if you find the right guide as opposed to stabbing blindly and
>> just referencing man pages as I like to do.)  The coredump
>> should supply the same (and more) information, and someone
>> can walk through with you doing a post-mortem gdb session.
>>
>> For example, try doing the following now that you have the coredump:
>> # ps wwwauxlH -N /boot/kernel/kernel -M /var/crash/vmcore.whatever
>
> Sure, I will get a serial terminal console going and try to repeat this
> process from it.
>
> In the meantime, here is the output from the above ps command, provided
> as an attachment.
>
> --
> Lars

Yes, it deadlocked almost immediately.
A debug session is attached.
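
For anyone reading the attached "ps" listing, a rough way to spot a
shared wait channel is to group the sleeping rows by their wchan
address. The sketch below is only an illustration: the names SLPQ_ROW,
group_by_wchan and SAMPLE are made up here, the regular expression
assumes the column layout shown in the attachment, and SAMPLE holds a
few rows copied from it. It is not a FreeBSD tool.

```python
import re
from collections import defaultdict

# Matches sleeping-process rows of the DDB "ps" listing, e.g.
#   175 c12c8624    0     0     0 0000204 [SLPQ vmwait 0xc09b2a98][SLP] md0
# Rows without an [SLPQ ...] field (for example "[SWAP] cron" or
# "[IWAIT] irq14: ata0") do not match and are skipped.
SLPQ_ROW = re.compile(
    r"^\s*(?P<pid>\d+)\s+[0-9a-f]+\s+\d+\s+\d+\s+\d+\s+[0-9a-f]+\s+"
    r"\[SLPQ (?P<wmesg>\S+) (?P<wchan>0x[0-9a-f]+)\]\S*\s+(?P<cmd>.+)$"
)


def group_by_wchan(ps_text):
    """Return {wchan: [(pid, wmesg, cmd), ...]} for all sleeping rows."""
    groups = defaultdict(list)
    for line in ps_text.splitlines():
        m = SLPQ_ROW.match(line)
        if m:
            groups[m.group("wchan")].append(
                (int(m.group("pid")), m.group("wmesg"), m.group("cmd").strip())
            )
    return dict(groups)


# A few rows copied from the attached session (assumed layout).
SAMPLE = """\
  724 c1544624    0   675   675 0004002 [SLPQ pfault 0xc09b2a98][SLP] sh
  566 c1310000    0     1   566 0000000 [SWAP] cron
  526 c12c820c    0     1   526 0000000 [SLPQ pfault 0xc09b2a98][SLP] ntpd
  175 c12c8624    0     0     0 0000204 [SLPQ vmwait 0xc09b2a98][SLP] md0
   40 c126ac48    0     0     0 0000204 [SLPQ vmwait 0xc09b2a98][SLP] syncer
    8 c11dc20c    0     0     0 0000204 [SLPQ wswbuf0 0xc09b2514][SLP] pagedaemon
    0 c09571c0    0     0     0 0000200 [SLPQ vmwait 0xc09b2a98][SLP] swapper
"""

if __name__ == "__main__":
    # Print only the wait channels that more than one process sleeps on.
    for wchan, sleepers in group_by_wchan(SAMPLE).items():
        if len(sleepers) > 1:
            print(wchan, sleepers)
```

In the full attached listing, sh, ntpd and pflogd (pfault) and md0,
syncer and swapper (vmwait) all sleep on 0xc09b2a98, while pagedaemon
sleeps on wswbuf0.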

--
Lars

>
>>
>> --
>> Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
>>   <> green@FreeBSD.org                               \  The Power to Serve! \
>>  Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\
>>

------=_20051116192001_86947
Content-Type: text/plain; name="swapfile_ds.txt"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="swapfile_ds.txt"

KDB: enter: Line break on console
[thread pid 11 tid 100005 ]
Stopped at      0xc0671fcb = kdb_enter+0x2b:    nop
db> show lockedvnods
Locked vnodes

0xc12cdbb0: tag syncer, type VNON
    usecount 1, writecount 0, refcount 2 mountedhere 0
    flags ()
     lock type syncer: EXCL (count 1) by thread 0xc1143a80 (pid 40)

0xc12cdaa0: tag ufs, type VREG
    usecount 1, writecount 0, refcount 3 mountedhere 0
    flags (VV_SYSTEM)
     lock type snaplk: EXCL (count 1) by thread 0xc126d780 (pid 175)
        ino 231, on dev ad0s1f

0xc1301550: tag ufs, type VREG
    usecount 1, writecount 1, refcount 111 mountedhere 0
    flags ()
    v_object 0xc1306dec ref 0 pages 436
     lock type ufs: EXCL (count 1) by thread 0xc126d780 (pid 175)
        ino 8155, on dev ad0s1f
db> ps
  pid   proc     uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
  724 c1544624    0   675   675 0004002 [SLPQ pfault 0xc09b2a98][SLP] sh
  675 c1544a3c    0   648   675 0004002 [SLPQ piperd 0xc1296b28][SLP] ruby18
  648 c1544c48    0   647   648 0004002 [SLPQ pause 0xc1544c7c][SLP][SWAP] csh
  647 c1546000    0     1   647 0004102 [SLPQ wait 0xc1546000][SLP][SWAP] login
  646 c131020c    0     1   646 0004002 [SLPQ ttyin 0xc1226410][SLP][SWAP] getty
  645 c1310830    0     1   645 0004002 [SLPQ ttyin 0xc1226810][SLP][SWAP] getty
  644 c146420c    0     1   644 0004002 [SLPQ ttyin 0xc1226c10][SLP][SWAP] getty
  643 c1464830    0     1   643 0004002 [SLPQ ttyin 0xc1216010][SLP][SWAP] getty
  642 c1464c48    0     1   642 0004002 [SLPQ ttyin 0xc1212010][SLP][SWAP] getty
  641 c11ddc48    0     1   641 0004002 [SLPQ ttyin 0xc1212c10][SLP][SWAP] getty
  640 c1310418    0     1   640 0004002 [SLPQ ttyin 0xc1218010][SLP][SWAP] getty
  639 c1464418    0     1   639 0004002 [SLPQ ttyin 0xc1211010][SLP][SWAP] getty
  623 c1464624    0     1   623 0000000 [SLPQ select 0xc09a46e4][SLP][SWAP] inetd
  606 c1464000    0     1   605 0000000 [SLPQ nanslp 0xc0959bcc][SLP][SWAP] smartd
  594 c1310a3c 1002     1   594 0000100 [SLPQ select 0xc09a46e4][SLP][SWAP] dhcpd
  566 c1310000    0     1   566 0000000 [SWAP] cron
  554 c1464a3c   25     1   554 0000100 [SLPQ pause 0xc1464a70][SLP][SWAP] sendmail
  550 c126cc48    0     1   550 0000100 [SLPQ select 0xc09a46e4][SLP] sendmail
  544 c126c624    0     1   544 0000100 [SWAP] sshd
  526 c12c820c    0     1   526 0000000 [SLPQ pfault 0xc09b2a98][SLP] ntpd
  499 c126c418    0     1   499 0000000 [SWAP] usbd
  459 c126ca3c    0     1   459 0000000 [SWAP] amd
  444 c12c8418    0     1   444 0000000 [SWAP] rpcbind
  403 c12c8830   53     1   403 0000100 [SWAP] named
  343 c126c830    0     1   343 0000000 [SWAP] syslogd
  310 c11dd830    0     1   310 0000000 [SLPQ select 0xc09a46e4][SLP][SWAP] devd
  276 c11dd624   64   267   267 0000100 [SLPQ pfault 0xc09b2a98][SLP] pflogd
  267 c12c8a3c    0     1   267 0000000 [SLPQ sbwait 0xc13220a8][SLP][SWAP] pflogd
  227 c12c8c48   65     1   227 0000100 [SWAP] dhclient
  207 c11dda3c    0     1    47 0000002 [SLPQ select 0xc09a46e4][SLP][SWAP] dhclient
  175 c12c8624    0     0     0 0000204 [SLPQ vmwait 0xc09b2a98][SLP] md0
   46 c126a000    0     0     0 0000204 [SLPQ - 0xc4715d04][SLP] schedcpu
   45 c126a20c    0     0     0 0000204 [SLPQ - 0xc09acc6c][SLP] nfsiod 3
   44 c126a418    0     0     0 0000204 [SLPQ - 0xc09acc68][SLP] nfsiod 2
   43 c126a624    0     0     0 0000204 [SLPQ - 0xc09acc64][SLP] nfsiod 1
   42 c126a830    0     0     0 0000204 [SLPQ - 0xc09acc60][SLP] nfsiod 0
   41 c126aa3c    0     0     0 0000204 [SLPQ vlruwt 0xc126aa3c][SLP] vnlru
   40 c126ac48    0     0     0 0000204 [SLPQ vmwait 0xc09b2a98][SLP] syncer
   39 c126c000    0     0     0 0000204 [SLPQ psleep 0xc09a4c2c][SLP] bufdaemon
   38 c1142c48    0     0     0 000020c [SLPQ pgzero 0xc09b3224][SLP] pagezero
    9 c11dc000    0     0     0 0000204 [SLPQ psleep 0xc09b2d74][SLP] vmdaemon
    8 c11dc20c    0     0     0 0000204 [SLPQ wswbuf0 0xc09b2514][SLP] pagedaemon
   37 c11dc418    0     0     0 0000204 [IWAIT] swi0: sio
    7 c11dc624    0     0     0 0000204 [SLPQ - 0xc121023c][SLP] fdc0
   36 c11dc830    0     0     0 0000204 [SLPQ usbtsk 0xc0956884][SLP] usbtask
   35 c11dca3c    0     0     0 0000204 [SLPQ usbevt 0xc11e9210][SLP] usb0
   34 c11dcc48    0     0     0 0000204 [IWAIT] swi5:+
    6 c11dd000    0     0     0 0000204 [SLPQ - 0xc1117400][SLP] thread taskq
   33 c11dd20c    0     0     0 0000204 [IWAIT] swi6:+
   32 c11dd418    0     0     0 0000204 [IWAIT] swi6: task queue
   31 c1133624    0     0     0 0000204 [IWAIT] swi2: cambio
    5 c1133830    0     0     0 0000204 [SLPQ - 0xc1117800][SLP] kqueue taskq
   30 c1133a3c    0     0     0 0000204 [SLPQ - 0xc09545a0][SLP] yarrow
    4 c1133c48    0     0     0 0000204 [SLPQ - 0xc09570c8][SLP] g_down
    3 c1142000    0     0     0 0000204 [SLPQ - 0xc09570c4][SLP] g_up
    2 c114220c    0     0     0 0000204 [SLPQ - 0xc09570bc][SLP] g_event
   29 c1142418    0     0     0 0000204 [IWAIT] swi3: vm
   28 c1142624    0     0     0 000020c [IWAIT] swi4: clock sio
   27 c1142830    0     0     0 0000204 [IWAIT] swi1: net
   26 c1142a3c    0     0     0 0000204 [IWAIT] irq15: ata1
   25 c111e20c    0     0     0 0000204 [IWAIT] irq14: ata0
   24 c111e418    0     0     0 0000204 [IWAIT] irq13:
   23 c111e624    0     0     0 0000204 [IWAIT] irq12:
   22 c111e830    0     0     0 0000204 [IWAIT] irq11: xl0 uhci0
   21 c111ea3c    0     0     0 0000204 [IWAIT] irq10: ed0
   20 c111ec48    0     0     0 0000204 [IWAIT] irq9:
   19 c1133000    0     0     0 0000204 [IWAIT] irq8: rtc
   18 c113320c    0     0     0 0000204 [IWAIT] irq7: ppc0
   17 c1133418    0     0     0 0000204 [IWAIT] irq6: fdc0
   16 c1118000    0     0     0 0000204 [IWAIT] irq5:
   15 c111820c    0     0     0 0000204 [IWAIT] irq4: sio0
   14 c1118418    0     0     0 0000204 [IWAIT] irq3:
   13 c1118624    0     0     0 0000204 [IWAIT] irq1: atkbd0
   12 c1118830    0     0     0 0000204 [IWAIT] irq0: clk
   11 c1118a3c    0     0     0 000020c [CPU 0] idle
    1 c1118c48    0     0     1 0004200 [SLPQ wait 0xc1118c48][SLP] init
   10 c111e000    0     0     0 0000204 [SLPQ ktrace 0xc0957b18][SLP] ktrace
    0 c09571c0    0     0     0 0000200 [SLPQ vmwait 0xc09b2a98][SLP] swapper
db>
------=_20051116192001_86947--
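
The "show lockedvnods" part of the session above can be summarised by
listing which pid holds which vnode lock. The following is a minimal
sketch under stated assumptions: VNODE, LOCK, locks_by_pid and SAMPLE
are hypothetical names introduced here, the patterns assume the exact
output layout shown in the attachment, and SAMPLE is an abridged copy
of that output.

```python
import re
from collections import defaultdict

# "0xc12cdaa0: tag ufs, type VREG" starts a new vnode entry.
VNODE = re.compile(r"^(0x[0-9a-f]+): tag (\w+), type (\w+)")
# "lock type snaplk: EXCL (count 1) by thread 0xc126d780 (pid 175)"
LOCK = re.compile(
    r"^lock type (\w+): (\w+) \(count \d+\) by thread 0x[0-9a-f]+ \(pid (\d+)\)"
)


def locks_by_pid(text):
    """Return {pid: [(vnode_address, lock_type, mode), ...]}."""
    held = defaultdict(list)
    vnode = None
    for raw in text.splitlines():
        line = raw.strip()
        m = VNODE.match(line)
        if m:
            vnode = m.group(1)  # remember which vnode the next lock line belongs to
            continue
        m = LOCK.match(line)
        if m and vnode is not None:
            held[int(m.group(3))].append((vnode, m.group(1), m.group(2)))
    return dict(held)


# Abridged copy of the "show lockedvnods" output from the attachment.
SAMPLE = """\
0xc12cdbb0: tag syncer, type VNON
     lock type syncer: EXCL (count 1) by thread 0xc1143a80 (pid 40)

0xc12cdaa0: tag ufs, type VREG
     lock type snaplk: EXCL (count 1) by thread 0xc126d780 (pid 175)
        ino 231, on dev ad0s1f

0xc1301550: tag ufs, type VREG
     lock type ufs: EXCL (count 1) by thread 0xc126d780 (pid 175)
        ino 8155, on dev ad0s1f
"""

if __name__ == "__main__":
    for pid, locks in sorted(locks_by_pid(SAMPLE).items()):
        print(pid, locks)
```

Read together with the "ps" listing, this shows that pid 175 (md0)
holds two of the three locked vnodes and pid 40 (syncer) holds the
third, and that both are asleep in vmwait.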