From: "Michael Ross"
To: freebsd-stable@freebsd.org, "Michael Ross"
Date: Sat, 07 Jul 2012 08:38:56 +0200
Subject: Re: Trouble with gmirror and device ada

I've got to correct and update myself:

On 06.07.2012 at 19:19, Michael Ross wrote:

> Hello,
>
> I rented a new machine a couple of days ago,
> and this happens:
>
> Test: Transfer some 5GB of files to the machine
>
> Works fine as long as I use one of the drives individually.
>
> If I gmirror the drives
> gmirror label gm0 ada0
> gmirror insert gm0 ada1
> ...wait for rebuild
>
> the machine reliably locks up on the file transfer,
> with a frozen systat screen showing both drives at 100% busy:

OK, it doesn't actually lock up; the drives just stay at 100% busy for a
(long) time.
On the last attempt I managed to transfer 690KB in 8 files before the
machine stalled, so I interrupted the transfer. That was about 10 minutes
ago. The system has not yet recovered: drive load keeps jumping to 100% on
an idle system, load 0,0,0. The mirror is synchronized.

20 minutes on, still not recovered (as in, launching any program takes the
better part of 5 minutes).

Rebooted and transferred ~2.5GB before the stall.

I have no problems with buildworld and installing a bunch of bigger ports.

dmesg: http://pastebin.com/GWWbLrL2

Systat looks as before/below; here's a vmstat -i:

interrupt                          total       rate
irq1: atkbd0                          14          0
irq16: re0                        531857        191
irq20: atapci0                      9188          3
cpu0:timer                        322709        116
cpu1:timer                         79970         28
Total                             943738        339

origin> ps auxwww
USER   PID  %CPU %MEM   VSZ  RSS TT  STAT STARTED      TIME COMMAND
root    11 199,0  0,0     0   32 ??  RL    7:21am 101:49,04 [idle]
root     0   0,0  0,0     0  144 ??  DLs   7:21am   0:00,00 [kernel]
root     1   0,0  0,0  6276  592 ??  ILs   7:21am   0:00,01 /sbin/init --
root     2   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [ctl_thrd]
root     3   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [fdc0]
root     4   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [sctp_iterator]
root     5   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [xpt_thrd]
root     6   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [pagedaemon]
root     7   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [vmdaemon]
root     8   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [pagezero]
root     9   0,0  0,0     0   16 ??  DL    7:21am   0:00,01 [bufdaemon]
root    10   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [audit]
root    12   0,0  0,0     0  240 ??  WL    7:21am   0:05,98 [intr]
root    13   0,0  0,0     0   48 ??  DL    7:21am   0:00,44 [geom]
root    14   0,0  0,0     0   16 ??  DL    7:21am   0:00,12 [yarrow]
root    15   0,0  0,0     0  320 ??  DL    7:21am   0:00,03 [usb]
root    16   0,0  0,0     0   16 ??  DL    7:21am   0:00,01 [vnlru]
root    17   0,0  0,0     0   16 ??  DL    7:21am   0:00,03 [syncer]
root    18   0,0  0,0     0   16 ??  DL    7:21am   0:00,10 [softdepflush]
root    19   0,0  0,0     0   16 ??  DL    7:21am   0:00,11 [g_mirror gm0]
root   887   0,0  0,2 10376 3496 ??  Is    7:21am   0:00,00 /sbin/devd
root  1033   0,0  0,1 12052 1692 ??  Is    7:21am   0:00,01 /usr/sbin/syslogd -s -s
root  1119   0,0  0,1 12024 1856 ??  Is    7:21am   0:00,00 ntpd: [priv] (ntpd)
_ntp  1120   0,0  0,1 12024 1904 ??  S     7:21am   0:00,03 ntpd: ntp engine (ntpd)
_ntp  1122   0,0  0,1 12024 1884 ??  I     7:21am   0:00,00 ntpd: dns engine (ntpd)
root  1131   0,0  0,2 46748 4712 ??  Is    7:21am   0:00,01 /usr/sbin/sshd
root  1145   0,0  0,1 14128 1828 ??  Ss    7:21am   0:00,01 /usr/sbin/cron -s
root  1192   0,0  0,3 67888 5524 ??  Ss    7:21am   0:00,08 sshd: root@pts/0 (sshd)
root  1197   0,0  0,3 67888 5564 ??  Ss    7:21am   0:00,10 sshd: root@pts/1 (sshd)
root  1277   0,0  0,1 22688 2164 ??  Is    7:40am   0:00,01 /usr/libexec/ftpd -D
root  1176   0,0  0,1 12052 1644 v0  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv0
root  1177   0,0  0,1 12052 1644 v1  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv1
root  1178   0,0  0,1 12052 1644 v2  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv2
root  1179   0,0  0,1 12052 1644 v3  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv3
root  1180   0,0  0,1 12052 1644 v4  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv4
root  1181   0,0  0,1 12052 1644 v5  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv5
root  1182   0,0  0,1 12052 1644 v6  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv6
root  1183   0,0  0,1 12052 1644 v7  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv7
root  1195   0,0  0,2 17464 3968 0   Ss    7:21am   0:00,05 -csh (csh)
root  1403   0,0  0,1 14188 1820 0   R+    8:12am   0:00,00 ps auxwww
root  1200   0,0  0,2 17464 3380 1   Is    7:21am   0:00,01 -csh (csh)
root  1231   0,0  0,2 18680 3692 1   S+    7:22am   0:02,07 systat -vms 1

> 10 users Load 0,41 0,44 0,20 6 Jul 18:47
>
> Mem:KB    REAL            VIRTUAL                    VN PAGER  SWAP PAGER
>         Tot   Share      Tot    Share    Free        in  out    in  out
> Act   23496    6036   600772    12252 1361840  count
> All   71680    6632 1074428k    28264          pages
> Proc: Interrupts
> r p d s w Csw Trp Sys Int Sof Flt cow 121 total
> 28 199 2 121 4 67 zfod atkbd0 1
> ozfod 4 re0 16
> 0,4%Sys 0,0%Intr 0,0%User 0,0%Nice 99,6%Idle %ozfod atapci0 20
> | | | | | | | | | | | daefr 94 cpu0:timer
> prcfr 23 cpu1:timer
> 1333 dtbuf 4 totfr
> Namei Name-cache Dir-cache 111358 desvn react
> Calls hits % hits % 1009 numvn pdwak
> 3 3 100 32 frevn pdpgs
> intrn
> Disks  ada0  ada1 pass0 pass1     302680 wire
> KB/t  16,00 16,00  0,00  0,00      14716 act
> tps       1     1     0     0     334260 inact
> MB/s   0,02  0,02  0,00  0,00            cache
> %busy   100   100     0     0    1361840 free
>                                   217488 buf
>
> While the network stays responsive, i.e. I can ping the machine and
> _connect_ via ssh,
> I can't actually log in (or, in an already open shell, execute anything).
> System requires a hardware reset. Nothing in the logs whatsoever (no
> surprise here).
>
> I have no KVM access to this system.
>
> OS is generic 9.0 stable from two days ago.
>
> I run 8.2-R on an identical machine without trouble.
> I run 9.0 stable as of May 4th on a similar (other CPU and NIC)
> machine without trouble.
> On both machines, the drives are recognized as ``ad''.
> (Why btw? ``man ada'' says ``device ada'', but there is no such option
> in the GENERIC config.
> Do I get ``ada'' with ``device ATA_CAM''? I'm going to try this next,
> kick ata_cam from the kernel, see if drives are ``ad'' and system
> doesn't crash.)

Right, should have remembered the release notes.
Still, the other machine doesn't show ``ada'' in spite of running 9.0-STABLE.

> I'd appreciate suggestions on what I could do.
>
> Thanks,
>
> Michael