From owner-freebsd-fs@FreeBSD.ORG Wed Oct 26 22:27:05 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EB3511065673; Wed, 26 Oct 2011 22:27:05 +0000 (UTC) (envelope-from pluknet@gmail.com) Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 8C36E8FC14; Wed, 26 Oct 2011 22:27:05 +0000 (UTC) Received: by vws11 with SMTP id 11so2900478vws.13 for ; Wed, 26 Oct 2011 15:27:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=eYX/A1DDpxXoits2a23if+JhfBsUwGu0GRaGe0lFCJk=; b=H3rD00y8MNK+xsP+XDZWHKBihHCT8W+qB8PbX2WRGvc27IDZJWejcIJNeeovyNPNlG HYrA0J1MAavv2jeN2tLaF+/+dQIWld5UH/GigBSFg3RULMYcZU2a0gb6Sa5ejQL1OscD j8Itta51LicMxEjdG2dg1Y5ww/EpiysPSsgLI= MIME-Version: 1.0 Received: by 10.182.59.5 with SMTP id v5mr2956460obq.78.1319666562689; Wed, 26 Oct 2011 15:02:42 -0700 (PDT) Received: by 10.182.160.97 with HTTP; Wed, 26 Oct 2011 15:02:42 -0700 (PDT) In-Reply-To: References: Date: Thu, 27 Oct 2011 02:02:42 +0400 Message-ID: From: Sergey Kandaurov To: Subbsd Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org Subject: Re: FreeBSD 9.0-RC1 freeze after swapoff/swapon procedure on md/vnode-backend file X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Oct 2011 22:27:06 -0000 On 26 October 2011 23:31, Subbsd wrote: > Hi > > I get easy reproducible =A0a hang-up servers that use the file-based > swap file after swapoff / swapon procedure (in this case, some of the > data must be swapped). For example: > > 1) dd if=3D/dev/zero of=3D/usr/swp1 bs=3D1m count=3D100 > 2) mdconfig -a -t vnode -f /usr/swp1 > 3) swapon /dev/md0 > 4) begin to allocated memory, for example by simple: > tail /dev/zero > > 5) after a filling of some percent, do swapoff /dev/md0, then swapon > /dev/md0. you can try this procedure again. > > The system may stop responding to commands and freezes or locks up > after some time. From the outside - the core lives (icmp response > goes) but the disk system is not available. > > PS: one of my server to my mind is frozen without swapoff/on - just > had three swapfile, a day after he crashed. Something interesting while trying to reproduce your problem: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1048970, size: 4096 swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1057174, size: 4096 panic: swapoff: failed to locate 16056 swap blocks cpuid =3D 1 KDB: stack backtrace: db_trace_self_wrapper() at 0xffffffff802e009a =3D db_trace_self_wrapper+0x2= a kdb_backtrace() at 0xffffffff80486d87 =3D kdb_backtrace+0x37 panic() at 0xffffffff8044f6ee =3D panic+0x2ee swapoff_one() at 0xffffffff80687425 =3D swapoff_one+0x475 sys_swapoff() at 0xffffffff8068789b =3D sys_swapoff+0x1bb amd64_syscall() at 0xffffffff806c60c9 =3D amd64_syscall+0x299 Xfast_syscall() at 0xffffffff806b1467 =3D Xfast_syscall+0xf7 --- syscall (424, FreeBSD ELF64, sys_swapoff), rip =3D 0x800ab307c, rsp =3D 0x7fffffffd9c8, rbp =3D 0 --- KDB: enter: panic [ thread pid 63255 tid 100211 ] Stopped at 0xffffffff8048696b =3D kdb_enter+0x3b: movq $0,0x735f72(%rip) Below is a trace for a process on another CPU that's doing intensive malloc+bzero in userland: db> bt 63066 Tracing pid 63066 tid 100199 td 0xfffffe000e89f000 cpustop_handler() at 0xffffffff806bb46b =3D cpustop_handler+0x2b ipi_nmi_handler() at 0xffffffff806bb540 =3D ipi_nmi_handler+0x50 trap() at 0xffffffff806c7035 =3D trap+0x2a5 nmi_calltrap() at 0xffffffff806b15bf =3D nmi_calltrap+0x8 --- trap 0x13, rip =3D 0xffffffff8043e0d0, rsp =3D 0xffffffff80dc4dc0, rbp =3D 0xffffff80908de750 --- _mtx_unlock_flags() at 0xffffffff8043e0d0 =3D _mtx_unlock_flags+0x170 swp_pager_meta_ctl() at 0xffffffff806841aa =3D swp_pager_meta_ctl+0xea swap_pager_haspage() at 0xffffffff80684272 =3D swap_pager_haspage+0x42 vm_fault_hold() at 0xffffffff8068e379 =3D vm_fault_hold+0x599 trap_pfault() at 0xffffffff806c6c26 =3D trap_pfault+0xe6 trap() at 0xffffffff806c733f =3D trap+0x5af calltrap() at 0xffffffff806b1183 =3D calltrap+0x8 --- trap 0xc, rip =3D 0x4006ed, rsp =3D 0x7fffffffdad0, rbp =3D 0x7fffffffd= ae0 --- That corresponds to kgdb: #9 0xffffffff8044f6e4 in panic (fmt=3DVariable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:624 #10 0xffffffff80687425 in swapoff_one (sp=3D0xfffffe000dc3dc00, cred=3D0xffffff808e5bf340) at /usr/src/sys/vm/swap_pager.c:1774 #11 0xffffffff8068789b in sys_swapoff (td=3D0xfffffe000e90f000, uap=3DVariable "uap" is not available. ) at /usr/src/sys/vm/swap_pager.c:2236 #12 0xffffffff806c60c9 in amd64_syscall (td=3D0xfffffe000e90f000, traced=3D= 0) at subr_syscall.c:131 ---Type to continue, or q to quit--- #13 0xffffffff806b1467 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:387 and (kgdb) thr 108 [Switching to thread 108 (Thread 100199)]#0 cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1394 1394 CPU_SET_ATOMIC(cpu, &stopped_cpus); (kgdb) bt #0 cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1394 #1 0xffffffff806bb540 in ipi_nmi_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1376 #2 0xffffffff806c7035 in trap (frame=3D0xffffffff80dc4d10) at /usr/src/sys/amd64/amd64/trap.c:200 #3 0xffffffff806b15bf in nmi_calltrap () at /usr/src/sys/amd64/amd64/exception.S:501 #4 0xffffffff8043e0d0 in _mtx_unlock_flags (m=3D0xffffffff80d8af80, opts= =3D0, file=3D0xffffffff80791e48 "/usr/src/sys/vm/swap_pager.c", line=3D2040) at /usr/src/sys/kern/kern_mutex.c:221 [smth. wrong there -- no further stack: swap_pager_*, etc] Here both swap_pager_swapoff() and swp_pager_meta_ctl() contend on swhash_mtx. Or rather that's due to low limit set on retries counter? Let's see again for another crash: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1062783, size: 4096 panic: swapoff: failed to locate 22133 swap blocks cpuid =3D 2 KDB: stack backtrace: db_trace_self_wrapper() at 0xffffffff802e009a =3D db_trace_self_wrapper+0x2= a kdb_backtrace() at 0xffffffff80486d87 =3D kdb_backtrace+0x37 panic() at 0xffffffff8044f6ee =3D panic+0x2ee swapoff_one() at 0xffffffff80687425 =3D swapoff_one+0x475 sys_swapoff() at 0xffffffff8068789b =3D sys_swapoff+0x1bb amd64_syscall() at 0xffffffff806c60c9 =3D amd64_syscall+0x299 Xfast_syscall() at 0xffffffff806b1467 =3D Xfast_syscall+0xf7 --- syscall (424, FreeBSD ELF64, sys_swapoff), rip =3D 0x800ab307c, rsp =3D 0x7fffffffd9c8, rbp =3D 0 --- KDB: enter: panic [ thread pid 2807 tid 100149 ] Stopped at 0xffffffff8048696b =3D kdb_enter+0x3b: movq $0,0x735f72(%rip) Hmm... now the remain all 3 cpus are on idle. The hungry process is not on CPU now and just sleeping, nothing interesting there: Tracing command hungry pid 2707 tid 100077 td 0xfffffe00029a4900 sched_switch() at 0xffffffff804798f7 =3D sched_switch+0x167 mi_switch() at 0xffffffff80458faa =3D mi_switch+0x1ea turnstile_wait() at 0xffffffff8049872d =3D turnstile_wait+0x27d _mtx_lock_sleep() at 0xffffffff8043e500 =3D _mtx_lock_sleep+0x130 _mtx_lock_flags() at 0xffffffff8043e8ac =3D _mtx_lock_flags+0x16c pmap_enter() at 0xffffffff806c3867 =3D pmap_enter+0x87 vm_fault_hold() at 0xffffffff8068f39c =3D vm_fault_hold+0x15bc trap_pfault() at 0xffffffff806c6c26 =3D trap_pfault+0xe6 trap() at 0xffffffff806c733f =3D trap+0x5af calltrap() at 0xffffffff806b1183 =3D calltrap+0x8 --- trap 0xc, rip =3D 0x4006ed, rsp =3D 0x7fffffffdad0, rbp =3D 0x7fffffffd= ae0 --- However, a pagedaemon kernel process now stuck at the same place, although it's at _mtx_lock_sleep unlike _mtx_unlock_flags from a previous crash. Tracing command pagedaemon pid 4 tid 100059 td 0xfffffe00026b5900 sched_switch() at 0xffffffff804798f7 =3D sched_switch+0x167 mi_switch() at 0xffffffff80458faa =3D mi_switch+0x1ea turnstile_wait() at 0xffffffff8049872d =3D turnstile_wait+0x27d _mtx_lock_sleep() at 0xffffffff8043e500 =3D _mtx_lock_sleep+0x130 _mtx_lock_flags() at 0xffffffff8043e8ac =3D _mtx_lock_flags+0x16c swp_pager_meta_ctl() at 0xffffffff8068413a =3D swp_pager_meta_ctl+0x7a swap_pager_haspage() at 0xffffffff80684272 =3D swap_pager_haspage+0x42 vm_page_cache() at 0xffffffff806a133a =3D vm_page_cache+0x2ca vm_pageout() at 0xffffffff806a40a1 =3D vm_pageout+0x1201 fork_exit() at 0xffffffff8041f965 =3D fork_exit+0x135 fork_trampoline() at 0xffffffff806b16ae =3D fork_trampoline+0xe --- trap 0, rip =3D 0, rsp =3D 0xffffff80003c5d00, rbp =3D 0 --- --=20 wbr, pluknet