Date:      Mon, 1 Oct 2012 08:07:44 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Norbert Aschendorff <norbert.aschendorff@yahoo.de>
Cc:        freebsd-stable@freebsd.org
Subject:   panic "Sleeping thread owns a non-sleepable lock" via cv_timedwait_signal, was "rsync over NFS"
Message-ID:  <509617515.1463700.1349093264849.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <506843B2.5060907@yahoo.de>

Norbert Aschendorff wrote:
> Hi,
> 
> my FreeBSD-9/stable machine (FreeBSD freebsd-tower.goebo.site
> 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #2 r241044M: Sat Sep 29 12:52:01
> CEST 2012 lbo@freebsd-tower.goebo.site:/usr/obj/usr/src/sys/GENERIC
> i386) crashes reproducibly when rsync-ing files to an NFSv4 share on
> the FreeBSD machine. The crash makes the system reboot. The crash
> creates files in /var/crash which may be obtained here: [1].
> 
> This problem is not limited to the self-compiled kernel/world (stable/9)
> but appears also on pre-compiled 9.1-PRERELEASE. I did not test
> 9.0-RELEASE.
> 
> If I do not use rsync on this NFS share, everything works completely
> fine.
> 
> Workaround: Use rsync over SSH.
> 
> --Norbert
> 
> [1] http://lbo.spheniscida.de/Files/nfs-rsync-crash.tgz (25K), vmcore of
> around 300M (90M gzipped, 64M LZMA'd) not included
> 
From a quick look, the panic is:
Sleeping thread (tid 100099, pid 1599) owns a non-sleepable lock
called from the server side krpc via cv_timedwait_sig().

I assume this means that another mutex or similar is held as well as
the one passed in as an argument to cv_timedwait_sig()?
(I'll keep looking, but I can't spot where another one might be held
 by the NFS or krpc code.)
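
As a rough userland analogy for the situation suspected above (this is
not the krpc code, and the names arg_lock, extra_lock and
locks_held_during_wait are made up for illustration), a condition-variable
wait releases only the lock it was handed, so any second lock stays owned
for the whole sleep:

```python
import threading


def locks_held_during_wait(timeout=0.2):
    """Report which locks stay held while a thread sleeps on a condition variable."""
    arg_lock = threading.Lock()    # hypothetical: the lock passed to the wait
    extra_lock = threading.Lock()  # hypothetical: a second lock held by mistake
    cv = threading.Condition(arg_lock)
    sleeping = threading.Event()
    observed = {}

    def sleeper():
        with extra_lock:           # stays owned for the whole sleep
            with cv:               # acquires arg_lock
                sleeping.set()
                cv.wait(timeout)   # drops arg_lock only, then reacquires it

    t = threading.Thread(target=sleeper)
    t.start()
    sleeping.wait()
    # This blocks until the sleeper is inside wait() and has dropped arg_lock,
    # so getting past it shows the argument lock really was released.
    with arg_lock:
        observed["arg_lock_released"] = True
        # Non-blocking probe: failure means the sleeper still owns extra_lock.
        got_extra = extra_lock.acquire(blocking=False)
        if got_extra:
            extra_lock.release()
        observed["extra_lock_still_held"] = not got_extra
    t.join()
    return observed


if __name__ == "__main__":
    print(locks_held_during_wait())
```

In userland this is merely bad practice. If the reading of the panic above
is right, the kernel equivalent (a thread asleep in cv_timedwait_sig()
while still owning some other non-sleepable lock) is the state the panic
message complains about.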

I'm not knowledgeable when it comes to gdb and crash dumps. Is there an
easy command Norbert can type to see all the locks held by
tid 100099, pid 1599?

Is the NFS client using Kerberos or AUTH_SYS for the mount? (And if you
are using Kerberos, have you tried the rsync with an AUTH_SYS mount?)

Does anyone happen to know of outstanding issues (or problems with WITNESS)
for cv_timedwait_sig() called with a locked mutex as the argument lock?
(The mutex will probably get locked by another thread related to the same
 pid, once sleepq_timedwait_sig() unlocks the argument mutex.)

Here's the backtrace from the crash info he referenced, in case someone
else can gain more insight from it:
Unread portion of the kernel message buffer:
Sleeping thread (tid 100099, pid 1599) owns a non-sleepable lock
KDB: stack backtrace of thread 100099:
#0 0xc0aae034 at mi_switch+0xe4
#1 0xc0ae3799 at sleepq_switch+0xd9
#2 0xc0ae3c06 at sleepq_catch_signals+0x3d6
#3 0xc0ae3d04 at sleepq_timedwait_sig+0x14
#4 0xc0a591ff at _cv_timedwait_sig+0x17f
#5 0xc0c87f7e at svc_run_internal+0x7ce
#6 0xc0c87706 at svc_run+0xc6
#7 0xc09f24c4 at nfsrvd_nfsd+0x1d4
#8 0xc0a00ad9 at nfssvc_nfsd+0x109
#9 0xc0c70c58 at sys_nfssvc+0x98
#10 0xc0dfd288 at syscall+0x378
#11 0xc0de64b1 at Xint0x80_syscall+0x21
panic: sleeping thread
cpuid = 0

rick


