From owner-freebsd-fs@FreeBSD.ORG  Wed Oct 26 22:27:05 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EB3511065673;
	Wed, 26 Oct 2011 22:27:05 +0000 (UTC)
	(envelope-from pluknet@gmail.com)
Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com
	[209.85.212.54])
	by mx1.freebsd.org (Postfix) with ESMTP id 8C36E8FC14;
	Wed, 26 Oct 2011 22:27:05 +0000 (UTC)
Received: by vws11 with SMTP id 11so2900478vws.13
	for <multiple recipients>; Wed, 26 Oct 2011 15:27:04 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	bh=eYX/A1DDpxXoits2a23if+JhfBsUwGu0GRaGe0lFCJk=;
	b=H3rD00y8MNK+xsP+XDZWHKBihHCT8W+qB8PbX2WRGvc27IDZJWejcIJNeeovyNPNlG
	HYrA0J1MAavv2jeN2tLaF+/+dQIWld5UH/GigBSFg3RULMYcZU2a0gb6Sa5ejQL1OscD
	j8Itta51LicMxEjdG2dg1Y5ww/EpiysPSsgLI=
MIME-Version: 1.0
Received: by 10.182.59.5 with SMTP id v5mr2956460obq.78.1319666562689; Wed, 26
	Oct 2011 15:02:42 -0700 (PDT)
Received: by 10.182.160.97 with HTTP; Wed, 26 Oct 2011 15:02:42 -0700 (PDT)
In-Reply-To: <CAFt_eMqo7uX5Y1Fp+RQKvauGa_qO12bHYuH-4+UWgnmDJ2owXw@mail.gmail.com>
References: <CAFt_eMqo7uX5Y1Fp+RQKvauGa_qO12bHYuH-4+UWgnmDJ2owXw@mail.gmail.com>
Date: Thu, 27 Oct 2011 02:02:42 +0400
Message-ID: <CAE-mSOJcuhi_O=T1jUmr9WbWFJua8WJPc+BWEnXoQWXWZgE3sg@mail.gmail.com>
From: Sergey Kandaurov <pluknet@gmail.com>
To: Subbsd <subbsd@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: FreeBSD 9.0-RC1 freeze after swapoff/swapon procedure on
 md/vnode-backend file
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Oct 2011 22:27:06 -0000

On 26 October 2011 23:31, Subbsd <subbsd@gmail.com> wrote:
> Hi
>
> I get easy reproducible =A0a hang-up servers that use the file-based
> swap file after swapoff / swapon procedure (in this case, some of the
> data must be swapped). For example:
>
> 1) dd if=3D/dev/zero of=3D/usr/swp1 bs=3D1m count=3D100
> 2) mdconfig -a -t vnode -f /usr/swp1
> 3) swapon /dev/md0
> 4) begin to allocated memory, for example by simple:
> tail /dev/zero
>
> 5) after a filling of some percent, do swapoff /dev/md0, then swapon
> /dev/md0. you can try this procedure again.
>
> The system may stop responding to commands and freezes or locks up
> after some time. From the outside - the core lives (icmp response
> goes) but the disk system is not available.
>
> PS: one of my server to my mind is frozen without swapoff/on - just
> had three swapfile, a day after he crashed.

Something interesting while trying to reproduce your problem:

swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1048970, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1057174, size: 4096
panic: swapoff: failed to locate 16056 swap blocks
cpuid =3D 1
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff802e009a =3D db_trace_self_wrapper+0x2=
a
kdb_backtrace() at 0xffffffff80486d87 =3D kdb_backtrace+0x37
panic() at 0xffffffff8044f6ee =3D panic+0x2ee
swapoff_one() at 0xffffffff80687425 =3D swapoff_one+0x475
sys_swapoff() at 0xffffffff8068789b =3D sys_swapoff+0x1bb
amd64_syscall() at 0xffffffff806c60c9 =3D amd64_syscall+0x299
Xfast_syscall() at 0xffffffff806b1467 =3D Xfast_syscall+0xf7
--- syscall (424, FreeBSD ELF64, sys_swapoff), rip =3D 0x800ab307c, rsp
=3D 0x7fffffffd9c8, rbp =3D 0 ---
KDB: enter: panic
[ thread pid 63255 tid 100211 ]
Stopped at      0xffffffff8048696b =3D kdb_enter+0x3b:    movq
$0,0x735f72(%rip)

Below is a trace for a process on another CPU that's doing intensive
malloc+bzero in userland:

db> bt 63066
Tracing pid 63066 tid 100199 td 0xfffffe000e89f000
cpustop_handler() at 0xffffffff806bb46b =3D cpustop_handler+0x2b
ipi_nmi_handler() at 0xffffffff806bb540 =3D ipi_nmi_handler+0x50
trap() at 0xffffffff806c7035 =3D trap+0x2a5
nmi_calltrap() at 0xffffffff806b15bf =3D nmi_calltrap+0x8
--- trap 0x13, rip =3D 0xffffffff8043e0d0, rsp =3D 0xffffffff80dc4dc0, rbp
=3D 0xffffff80908de750 ---
_mtx_unlock_flags() at 0xffffffff8043e0d0 =3D _mtx_unlock_flags+0x170
swp_pager_meta_ctl() at 0xffffffff806841aa =3D swp_pager_meta_ctl+0xea
swap_pager_haspage() at 0xffffffff80684272 =3D swap_pager_haspage+0x42
vm_fault_hold() at 0xffffffff8068e379 =3D vm_fault_hold+0x599
trap_pfault() at 0xffffffff806c6c26 =3D trap_pfault+0xe6
trap() at 0xffffffff806c733f =3D trap+0x5af
calltrap() at 0xffffffff806b1183 =3D calltrap+0x8
--- trap 0xc, rip =3D 0x4006ed, rsp =3D 0x7fffffffdad0, rbp =3D 0x7fffffffd=
ae0 ---

That corresponds to kgdb:

#9  0xffffffff8044f6e4 in panic (fmt=3DVariable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:624
#10 0xffffffff80687425 in swapoff_one (sp=3D0xfffffe000dc3dc00,
    cred=3D0xffffff808e5bf340) at /usr/src/sys/vm/swap_pager.c:1774
#11 0xffffffff8068789b in sys_swapoff (td=3D0xfffffe000e90f000,
uap=3DVariable "uap" is not available.
)
    at /usr/src/sys/vm/swap_pager.c:2236
#12 0xffffffff806c60c9 in amd64_syscall (td=3D0xfffffe000e90f000, traced=3D=
0)
    at subr_syscall.c:131
---Type <return> to continue, or q <return> to quit---
#13 0xffffffff806b1467 in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:387

and

(kgdb) thr 108
[Switching to thread 108 (Thread 100199)]#0  cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1394
1394            CPU_SET_ATOMIC(cpu, &stopped_cpus);
(kgdb) bt
#0  cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1394
#1  0xffffffff806bb540 in ipi_nmi_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1376
#2  0xffffffff806c7035 in trap (frame=3D0xffffffff80dc4d10)
    at /usr/src/sys/amd64/amd64/trap.c:200
#3  0xffffffff806b15bf in nmi_calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:501
#4  0xffffffff8043e0d0 in _mtx_unlock_flags (m=3D0xffffffff80d8af80, opts=
=3D0,
    file=3D0xffffffff80791e48 "/usr/src/sys/vm/swap_pager.c", line=3D2040)
    at /usr/src/sys/kern/kern_mutex.c:221
[smth. wrong there -- no further stack: swap_pager_*,  etc]

Here both swap_pager_swapoff() and swp_pager_meta_ctl() contend on
swhash_mtx. Or rather that's due to low limit set on retries counter?


Let's see again for another crash:

swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1062783, size: 4096
panic: swapoff: failed to locate 22133 swap blocks
cpuid =3D 2
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff802e009a =3D db_trace_self_wrapper+0x2=
a
kdb_backtrace() at 0xffffffff80486d87 =3D kdb_backtrace+0x37
panic() at 0xffffffff8044f6ee =3D panic+0x2ee
swapoff_one() at 0xffffffff80687425 =3D swapoff_one+0x475
sys_swapoff() at 0xffffffff8068789b =3D sys_swapoff+0x1bb
amd64_syscall() at 0xffffffff806c60c9 =3D amd64_syscall+0x299
Xfast_syscall() at 0xffffffff806b1467 =3D Xfast_syscall+0xf7
--- syscall (424, FreeBSD ELF64, sys_swapoff), rip =3D 0x800ab307c, rsp
=3D 0x7fffffffd9c8, rbp =3D 0 ---
KDB: enter: panic
[ thread pid 2807 tid 100149 ]
Stopped at      0xffffffff8048696b =3D kdb_enter+0x3b:    movq
$0,0x735f72(%rip)

Hmm... now the remain all 3 cpus are on idle.

The hungry process is not on CPU now and just sleeping, nothing
interesting there:

Tracing command hungry pid 2707 tid 100077 td 0xfffffe00029a4900
sched_switch() at 0xffffffff804798f7 =3D sched_switch+0x167
mi_switch() at 0xffffffff80458faa =3D mi_switch+0x1ea
turnstile_wait() at 0xffffffff8049872d =3D turnstile_wait+0x27d
_mtx_lock_sleep() at 0xffffffff8043e500 =3D _mtx_lock_sleep+0x130
_mtx_lock_flags() at 0xffffffff8043e8ac =3D _mtx_lock_flags+0x16c
pmap_enter() at 0xffffffff806c3867 =3D pmap_enter+0x87
vm_fault_hold() at 0xffffffff8068f39c =3D vm_fault_hold+0x15bc
trap_pfault() at 0xffffffff806c6c26 =3D trap_pfault+0xe6
trap() at 0xffffffff806c733f =3D trap+0x5af
calltrap() at 0xffffffff806b1183 =3D calltrap+0x8
--- trap 0xc, rip =3D 0x4006ed, rsp =3D 0x7fffffffdad0, rbp =3D 0x7fffffffd=
ae0 ---

However, a pagedaemon kernel process now stuck at the same place, although
it's at _mtx_lock_sleep unlike _mtx_unlock_flags from a previous crash.

Tracing command pagedaemon pid 4 tid 100059 td 0xfffffe00026b5900
sched_switch() at 0xffffffff804798f7 =3D sched_switch+0x167
mi_switch() at 0xffffffff80458faa =3D mi_switch+0x1ea
turnstile_wait() at 0xffffffff8049872d =3D turnstile_wait+0x27d
_mtx_lock_sleep() at 0xffffffff8043e500 =3D _mtx_lock_sleep+0x130
_mtx_lock_flags() at 0xffffffff8043e8ac =3D _mtx_lock_flags+0x16c
swp_pager_meta_ctl() at 0xffffffff8068413a =3D swp_pager_meta_ctl+0x7a
swap_pager_haspage() at 0xffffffff80684272 =3D swap_pager_haspage+0x42
vm_page_cache() at 0xffffffff806a133a =3D vm_page_cache+0x2ca
vm_pageout() at 0xffffffff806a40a1 =3D vm_pageout+0x1201
fork_exit() at 0xffffffff8041f965 =3D fork_exit+0x135
fork_trampoline() at 0xffffffff806b16ae =3D fork_trampoline+0xe
--- trap 0, rip =3D 0, rsp =3D 0xffffff80003c5d00, rbp =3D 0 ---



--=20
wbr,
pluknet