From owner-freebsd-current@FreeBSD.ORG Sat May 25 17:52:13 2013
Date: Sat, 25 May 2013 19:52:03 +0200
From: Roger Pau Monné <roger.pau@citrix.com>
To: John Baldwin
Cc: Konstantin Belousov, freebsd-current@freebsd.org, "current@freebsd.org"
Subject: Re: FreeBSD-HEAD gets stuck on vnode operations
Message-ID: <51A0FA43.2040503@citrix.com>
In-Reply-To: <201305201434.55406.jhb@freebsd.org>
References: <5190CBEC.5000704@citrix.com> <20130514163149.GS3047@kib.kiev.ua>
 <51927143.4080102@citrix.com> <201305201434.55406.jhb@freebsd.org>

On 20/05/13 20:34, John Baldwin wrote:
> On Tuesday, May 14, 2013 1:15:47 pm Roger Pau Monné wrote:
>> On 14/05/13 18:31, Konstantin Belousov wrote:
>>> On Tue, May 14, 2013 at 06:08:45PM +0200, Roger Pau Monné wrote:
>>>> On 13/05/13 17:00, Konstantin Belousov wrote:
>>>>> On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
>>>>>> On 13/05/13 13:18, Roger Pau Monné wrote:
>>>>
>>>> Thanks for taking a look.
>>>>
>>>>>> I would like to explain this a little bit more: the syncer process
>>>>>> doesn't get blocked on the _mtx_trylock_flags_ call, it just keeps
>>>>>> looping forever in what seems to be an endless loop around
>>>>>> mnt_vnode_next_active/ffs_sync. Also, while in this state there is
>>>>>> no noticeable disk activity, so I'm unsure what is happening.
>>>>> How many CPUs does your VM have?
>>>>
>>>> 7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.
>>>>
>>>>>
>>>>> The loop you are describing means that another thread owns the vnode
>>>>> interlock. Can you track down what that thread is doing? E.g. look at
>>>>> vp->v_interlock.mtx_lock, which is basically a pointer to the struct
>>>>> thread owning the mutex, with the low bits cleared as needed. Then
>>>>> you can inspect the thread and get a backtrace.
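
For reference, "clear the low bits" is what the kernel's mtx_owner() macro
does; here is a minimal sketch, assuming the MTX_UNOWNED and MTX_FLAGMASK
definitions from sys/mutex.h (interlock_owner() is just an illustrative
name, not an existing kernel function):

    #include <sys/param.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>

    /*
     * Recover the owning thread from a sleep mutex's lock word.  The
     * low bits of mtx_lock carry flags (MTX_RECURSED, MTX_CONTESTED),
     * so they must be masked off before the value can be used as a
     * struct thread pointer.
     */
    static struct thread *
    interlock_owner(struct mtx *m)
    {
            uintptr_t v = m->mtx_lock;

            if (v == MTX_UNOWNED)           /* nobody holds the mutex */
                    return (NULL);
            return ((struct thread *)(v & ~MTX_FLAGMASK));
    }
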
>>>> There are no other threads running; only syncer is running, on CPU 1
>>>> (see the ps in the previous email). All other CPUs are idle, and as
>>>> seen from the ps quite a lot of threads are blocked in vnode-related
>>>> operations, either "*Name Cac", "*vnode_fr" or "*vnode in". I've also
>>>> attached the output of "show alllocks" in the previous email.
>>> This is not useful. You need to look at the mutex which fails the
>>> trylock operation in mnt_vnode_next_active(), see who owns it, and
>>> then 'unwind' the locking dependencies from there.
>>
>> Sorry, now I get it; let's see if I can find the locked vnodes and the
>> thread that owns them...
>
> You can use 'show lock <address of vp->v_interlock>' to find the owning
> thread and then use 'show sleepchain <thread>'. If you are using kgdb
> on the live system (probably easier) then you can grab my scripts at
> www.freebsd.org/~jhb/gdb/ (do 'cd /path/to/scripts; source gdb6'). You
> can then find the offending thread and do 'mtx_owner &vp->v_interlock'
> and then 'sleepchain <thread>'.
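
For context, the loop under discussion retries each vnode's interlock with
a trylock and restarts on failure. Below is a simplified sketch of that
pattern, not the verbatim sys/kern/vfs_subr.c code (next_active_sketch()
is an illustrative name; vnode_free_list_mtx, v_actfreelist, VI_TRYLOCK
and kern_yield() are the kernel names of that era):

    /*
     * Walk the mount's active vnode list from the marker *mvp, taking
     * each vnode's interlock with a trylock.  On failure, drop the
     * list mutex, yield, and restart from the marker -- if the
     * interlock stays contended, the iterator restarts forever and the
     * syncer spins without sleeping and without issuing any I/O.
     */
    static struct vnode *
    next_active_sketch(struct vnode **mvp, struct mount *mp)
    {
            struct vnode *vp;

    restart:
            mtx_lock(&vnode_free_list_mtx);
            for (vp = TAILQ_NEXT(*mvp, v_actfreelist); vp != NULL;
                vp = TAILQ_NEXT(vp, v_actfreelist)) {
                    if (vp->v_type == VMARKER)
                            continue;       /* skip other iterators */
                    if (!VI_TRYLOCK(vp)) {
                            mtx_unlock(&vnode_free_list_mtx);
                            kern_yield(PRI_USER);
                            goto restart;   /* retry from the marker */
                    }
                    /* Got the interlock: return vp, still locked. */
                    mtx_unlock(&vnode_free_list_mtx);
                    return (vp);
            }
            mtx_unlock(&vnode_free_list_mtx);
            return (NULL);
    }
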
Hello,

Sorry for the delay in debugging this; the issue itself is hard to
reproduce, and lack of time prevented me from digging into it sooner.

It seems like the locked vnode is always the same one, or at least I
haven't been able to see this loop with more than one vnode locked (so
far):

db> show lock 0xfffffe0030cdf448
 class: sleep mutex
 name: vnode interlock
 flags: {DEF}
 state: {OWNED}
 owner: 0xfffffe008d1e9000 (tid 101020, pid 66630, "cc")
db> show sleepchain 0xfffffe008d1e9000
thread 101020 (pid 66630, cc) inhibited
db> tr 66630
Tracing pid 66630 tid 101020 td 0xfffffe008d1e9000
sched_switch() at sched_switch+0x482/frame 0xffffff8120ff3630
mi_switch() at mi_switch+0x179/frame 0xffffff8120ff3670
turnstile_wait() at turnstile_wait+0x3ac/frame 0xffffff8120ff36c0
__mtx_lock_sleep() at __mtx_lock_sleep+0x255/frame 0xffffff8120ff3740
__mtx_lock_flags() at __mtx_lock_flags+0xda/frame 0xffffff8120ff3780
vdropl() at vdropl+0x255/frame 0xffffff8120ff37b0
vputx() at vputx+0x27c/frame 0xffffff8120ff3810
namei() at namei+0x3dd/frame 0xffffff8120ff38c0
kern_statat_vnhook() at kern_statat_vnhook+0x99/frame 0xffffff8120ff3a40
sys_stat() at sys_stat+0x2d/frame 0xffffff8120ff3ae0
amd64_syscall() at amd64_syscall+0x265/frame 0xffffff8120ff3bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xffffff8120ff3bf0
--- syscall (188, FreeBSD ELF64, sys_stat), rip = 0x18280ea,
    rsp = 0x7fffffffa148, rbp = 0x7fffffffa170 ---
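
Reading the trace: vputx() enters vdropl() with vp->v_interlock held, and
__mtx_lock_flags() then blocks this "cc" thread on a second mutex inside
vdropl() (hence the turnstile_wait). So "cc" owns the interlock reported
by "show lock" above while itself waiting. Which mutex it is blocked on is
an assumption on my part, quite possibly the vnode free-list mutex that
vdropl() takes to move the vnode between lists; roughly:

    /* Lock order implied by the backtrace; the second mutex is an
     * assumption, not read from the dump. */
    VI_LOCK(vp);                        /* vnode interlock: held by "cc" */
    mtx_lock(&vnode_free_list_mtx);     /* blocks here -> turnstile_wait */
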
db> ps
  pid  ppid  pgrp   uid  state   wmesg     wchan               cmd
66630 66591 66588     0  L+      *vnode_fr 0xfffffe000bc9fa80  cc
66591 66590 66588     0  S+      wait      0xfffffe0030dbd4b8  cc
66590 66588 66588     0  S+      wait      0xfffffe008de20000  sh
66588 66572 66588     0  S+      wait      0xfffffe008d8a8970  sh
66572 49649 49649     0  S+      select    0xfffffe003029dac0  make
49649 49631 49649     0  S+      wait      0xfffffe000d801970  sh
49631 49629 49629     0  S+      select    0xfffffe00302b3cc0  make
49629 49614 49629     0  S+      wait      0xfffffe008dbfa970  sh
49614 45214 45214     0  S+      select    0xfffffe00303e3640  make
45214 45207 45214     0  S+      wait      0xfffffe008dbfa000  sh
45207 45205 45200     0  S+      select    0xfffffe000d66d3c0  make
45205 45200 45200     0  S+      wait      0xfffffe008da1b4b8  sh
45200   757 45200     0  S+      select    0xfffffe000d6093c0  make
30611   737 30611     0  S+      ttyin     0xfffffe000bd70ca8  bash
 2325   756  2325     0  Ss+     ttyin     0xfffffe008dd6eca8  bash
  757   756   757     0  Ss+     wait      0xfffffe000d7fd4b8  bash
  756   755   756     0  Ss      select    0xfffffe000d783a40  screen
  755   744   755     0  S+      pause     0xfffffe000d9560a8  screen
  744   743   744     0  S+      wait      0xfffffe000d4d6970  bash
  743   742   743  1001  S+      wait      0xfffffe000d4d5000  su
  742   741   742  1001  Ss+     wait      0xfffffe000d6674b8  sh
  741   738   738  1001  S       select    0xfffffe000d60a8c0  sshd
  738   680   738     0  Ss      select    0xfffffe000d60a940  sshd
  737     1   737     0  Ss+     wait      0xfffffe000d605970  login
  736     1   736     0  Ss+     ttyin     0xfffffe000bd718a8  getty
  735     1   735     0  Ss+     ttyin     0xfffffe000bd71ca8  getty
  734     1   734     0  Ss+     ttyin     0xfffffe000bfcc0a8  getty
  733     1   733     0  Ss+     ttyin     0xfffffe000bfcc4a8  getty
  732     1   732     0  Ss+     ttyin     0xfffffe000bfcc8a8  getty
  731     1   731     0  Ss+     ttyin     0xfffffe000bd700a8  getty
  730     1   730     0  Ss+     ttyin     0xfffffe000bd704a8  getty
  729     1   729     0  Ss+     ttyin     0xfffffe000bd708a8  getty
  690     1   690     0  Ss      nanslp    0xffffffff81343734  cron
  686     1   686    25  Ss      pause     0xfffffe000d77a560  sendmail
  683     1   683     0  Ss      select    0xfffffe000d7834c0  sendmail
  680     1   680     0  Ss      select    0xfffffe000d4a5e40  sshd
  580     1   580     0  Ss      select    0xfffffe000d66d040  syslogd
  486     1   486     0  Ss      select    0xfffffe000d783540  devd
  485     1   485    65  Ss      select    0xfffffe000d962040  dhclient
  445     1   445     0  Ss      select    0xfffffe000d66d1c0  dhclient
   18     0     0     0  DL      sdflush   0xffffffff8153675c  [softdepflush]
   17     0     0     0  DL      vlruwt    0xfffffe000d4d5970  [vnlru]
    9     0     0     0  RL      CPU 2                         [syncer]
    8     0     0     0  DL      psleep    0xffffffff8152f72c  [bufdaemon]
    7     0     0     0  DL      pgzero    0xffffffff81537e8c  [pagezero]
    6     0     0     0  DL      psleep    0xffffffff815370d0  [vmdaemon]
    5     0     0     0  DL      psleep    0xffffffff815b3304  [pagedaemon]
    4     0     0     0  DL      ccb_scan  0xffffffff813038b0  [xpt_thrd]
    3     0     0     0  DL      waiting_  0xffffffff815a73e0  [sctp_iterator]
    2     0     0     0  DL      balloon   0xffffffff8079d310  [balloon]
   16     0     0     0  SL      xbread    0xfffffe00feffc000  [xenstore_rcv]
   15     0     0     0  SL      waitev    0xffffffff81538098  [xenwatch]
   14     0     0     0  DL      -         0xffffffff8131f988  [yarrow]
   13     0     0     0  DL      (threaded)                    [geom]
100021                   D       -         0xffffffff8159bc70  [g_down]
100020                   D       -         0xffffffff8159bc68  [g_up]
100019                   D       -         0xffffffff8159bc60  [g_event]
   12     0     0     0  WL      (threaded)                    [intr]
100050                   I                                     [irq778: xn0]
100049                   I                                     [irq777: xbd1]
100048                   I                                     [irq776: xbd0]
100042                   I                                     [irq7: ppc0]
100041                   I                                     [swi0: uart]
100040                   I                                     [irq12: psm0]
100039                   I                                     [irq1: atkbd0]
100036                   I                                     [irq775: xenstore0]
100035                   I                                     [irq15: ata1]
100034                   I                                     [irq14: ata0]
100032                   I                                     [swi5: fast taskq]
100030                   I                                     [swi6: Giant taskq]
100029                   I                                     [swi6: task queue]
100024                   I                                     [swi2: cambio]
100018                   I                                     [swi3: vm]
100017                   I                                     [swi4: clock]
100016                   I                                     [swi4: clock]
100015                   I                                     [swi4: clock]
100014                   I                                     [swi4: clock]
100013                   I                                     [swi4: clock]
100012                   I                                     [swi4: clock]
100011                   I                                     [swi4: clock]
100010                   I                                     [swi1: netisr 0]
   11     0     0     0  RL      (threaded)                    [idle]
100009                   Run     CPU 6                         [idle: cpu6]
100008                   Run     CPU 5                         [idle: cpu5]
100007                   Run     CPU 4                         [idle: cpu4]
100006                   Run     CPU 3                         [idle: cpu3]
100005                   CanRun                                [idle: cpu2]
100004                   Run     CPU 1                         [idle: cpu1]
100003                   Run     CPU 0                         [idle: cpu0]
    1     0     1     0  SLs     wait      0xfffffe000bc77970  [init]
   10     0     0     0  DL      audit_wo  0xffffffff815aaae0  [audit]
    0     0     0     0  DLs     (threaded)                    [kernel]
100044                   D       -         0xffffffff81342593  [deadlkres]
100043                   D       -         0xfffffe000bfc5300  [mca taskq]
100033                   D       -         0xfffffe000bcaa600  [kqueue taskq]
100031                   D       -         0xfffffe000bcaaa00  [thread taskq]
100028                   D       -         0xfffffe000be5c100  [ffs_trim taskq]
100027                   D       -         0xfffffe000bca1300  [acpi_task_2]
100026                   D       -         0xfffffe000bca1300  [acpi_task_1]
100025                   D       -         0xfffffe000bca1300  [acpi_task_0]
100022                   D       -         0xfffffe000bc6e300  [firmware taskq]
100000                   D       sched     0xffffffff8159bd68  [swapper]