Date:      Sun, 12 Oct 2014 00:41:23 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        Nick Sivo <nick@ycombinator.com>
Cc:        FreeBSD Questions <freebsd-questions@freebsd.org>
Subject:   Re: Next Steps to Debug ZFS Hang?
Message-ID:  <CAJ-VmomegEW2odtt+bnK7v7xKqgJB=90Z2vpyKSbb3sbQ7q76w@mail.gmail.com>
In-Reply-To: <1412732931033.813626ca@Nodemailer>
References:  <1412732931033.813626ca@Nodemailer>

Hi!

A number of ZFS hangs were found and fixed in FreeBSD-HEAD and, I -think-,
backported to FreeBSD-10.

I don't know if they've been backported to -9. Certainly not to 9.2; I
think I found / reported some after 9.2 was released.

In the kernel debugger (ddb), you can try "show allproc" to get a list of
processes. See

https://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-online-ddb.html

for more information.

Why can't you get crashdumps?
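
If it's only that dumps were never set up, here is a minimal sketch of the
usual /etc/rc.conf settings. I'm assuming a swap device big enough to hold
the dump and the default crash directory; adjust for your layout.

```shell
# /etc/rc.conf fragment (sketch; the values are the common defaults, not
# something taken from your system)
dumpdev="AUTO"        # let the boot scripts pick a swap device for kernel dumps
dumpdir="/var/crash"  # where savecore(8) stores the dump on the next boot
```

After the next panic and reboot, savecore(8) should leave the dump under the
dumpdir directory.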

Are you able to:

* update to freebsd-9-stable?
* .. if you can, update to freebsd-10-stable? 10.1 is about to be released.

You could try "procstat -ta" to see the threads running. Run it as root
to see all the threads. TDNAME is the thread name; WCHAN is the wait
channel, which is the important column for figuring out what a thread is
sleeping on.

I hope this helps!



-a


On 7 October 2014 18:48, Nick Sivo <nick@ycombinator.com> wrote:
> Hello,
>
>
> I've been having trouble with ZFS on my server. For the most part it works splendidly, but occasionally I'll experience permanent hangs.
>
>
> For example, right now on one of my ZFS filesystems (the others are fine), I can read, write, and stat files, but if I run ls in any directory, ls and the terminal will hang. CTRL-C, and kill -9 can't kill it:
>
>
> In top:
>   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>
>  5868 nsivo         1  20    0 14456K  1016K zfs     0   0:00  0.00% ls
>
>
> In ps:
> USER      PID  %CPU %MEM     VSZ     RSS TT  STAT STARTED        TIME COMMAND
>
> nsivo    5868   0.0  0.0   14456    1016  2- D+    2:35PM     0:00.00 ls
>
>
> Eventually the entire system hangs, and can't be shut down cleanly.
>
>
> What are the next steps to debug this? I'm a software developer, but am not familiar with kernel debugging. Is there a way to discover in which syscall ls is stuck? Ideally without requiring a crash dump?
>
>
> Thanks for reading,
> Nick
>
>
>
> -Nick
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"


