From: Mark Millard
To: Ian Lepore
Cc: freebsd-arm
Subject: Re: FYI: various 11.0-CURRENT -r293227 (and older) hangs on arm (rpi2): a description of sorts
Date: Thu, 7 Jan 2016 11:24:09 -0800
In-Reply-To: <1452183170.1215.4.camel@freebsd.org>
References: <1452183170.1215.4.camel@freebsd.org>
List-Id: "Porting FreeBSD to ARM processors."

On 2016-Jan-7, at 8:12 AM, Ian Lepore wrote:

> On Thu, 2016-01-07 at 02:19 -0800, Mark Millard wrote:
>> I've had various hangs when the rpi2 was busy over longish periods,
>> both debug buildkernel/buildworld builds of the arm and non-debug
>> variants. No log files or console messages produced.
>>
>> I've not had any analogous issues with powerpc64 (PowerMac G5) or
>> with amd64 (Virtual Box used on Mac OS X).
>>
>> I've finally discovered that if I have, say, top running on the rpi2
>> serial console that top continues to update its display so long as I
>> leave it alone during the hang. (Otherwise it hangs too.) So I
>> finally have a little window for seeing some of what is happening.
>>
>> An example top display showed after the hang:
>>
>> Mem: 764M Active 12M Inact 141M Wired 98M Buf 8k free
>> Swap: 2048M Total 29M Used 2019 Free 1% in use
>>
>> (Yep: Just 8K free Mem.)
>>
>
> That's not a problem.
>
>> The unusual STATEs for processes seemed to be (for the specific
>> hang):
>>
>> STATE   COMMANDs
>> pfault  [ld] [ld] /usr/sbin/syslogd
>> vmwait  [ld] [md0] [kernel]
>> wswbuf  [pagedaemon]
>>
>> Those same 3 states seem to always be involved. Some of the processes
>> vary from one hang to the next: the prior hang had build/genautoma ,
>> /usr/sbin/moused , and /usr/sbin/ntpd instead of 3 [ld]'s.
>>
>> /usr/sbin/syslogd, [md0], [kernel], and [pagedaemon] and their states
>> do not seem to vary (so far).
>>
>>
>
> Everything is backed up waiting for slow sdcard IO. You can get an
> amd64 system with many cores and gigabytes of ram into the same state
> with an sdcard (or any other storage device that takes literally
> seconds for any individual IO to complete).
> All the available buffers get queued up to the one slow device, then you
> can't do anything that requires IO (even launch tools to try to figure
> out what's going on).
>
> -- Ian

The root file system is not on the (or an) sdcard: it is on a fast,
400GB+ SSD that is USB 3.0 capable (not that the rpi2 uses it that way).
Note the "da0" below, along with the size and such (everything other than
/boot/msdos):

ugen0.5: at usbus0
umass0: on usbus0
umass0: SCSI over Bulk-Only; quirks = 0x0100
umass0:0:0: Attached to scbus0
da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
da0: Fixed Direct Access SPC-4 SCSI device
da0: Serial Number XXXXXXXXXXXX
Release APs
da0: 40.000MB/s transfers
da0: 457862MB (937703088 512 byte sectors)
da0: quirks=0x2
Trying to mount root from ufs:/dev/ufs/RPI2rootfs [rw,noatime]...
. . .
Starting file system checks:
/dev/ufs/RPI2rootfs: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/ufs/RPI2rootfs: clean, 109711666 free (14002 frags, 13712208 blocks, 0.0% fragmentation)
Mounting local file systems:.
. . .

> Filesystem          1M-blocks  Used  Avail Capacity  Mounted on
> /dev/ufs/RPI2rootfs    443473 16791 391203     4%    /
> devfs                       0     0      0   100%    /dev
> /dev/mmcsd0s1              49     7     42    15%    /boot/msdos

In USB 3.0 contexts I have never observed an IO taking seconds on these
types of SSDs, and I use them that way extensively. Nor have I seen it in
USB 2.0 use, though that is not as common a context for me. Nor have I had
any problems with this type of USB 3.0 capable hub messing up IO.

I use this type of SSD to hold the Virtual Box virtual machine(s) in which
I run amd64 FreeBSD on Mac OS X. No problems there. But it is true that
I've never directly booted amd64 FreeBSD from one of these SSDs in a
non-virtual amd64 context.

Ignoring that for a moment: is this acceptable/expected FreeBSD behavior
when a "disk" device is slow? Interesting. I've let it sit for hours and
the hang does not clear: it is effectively deadlocked for overall usage.
The rpi2 will never be able to buildworld, buildkernel, build ports, etc.
reliably if this is the sort of behavior that results.

Back to this context: Is there a way for me to confirm the queuing of
buffers to the SSD? Or at least to get some detail about its buffer usage?
Can I get some information from ddb that would confirm/deny/provide insight?

===
Mark Millard
markmi at dsl-only.net
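
Since top keeps updating on the serial console during a hang, one way to line up several hangs is to post-process a captured console log and pick out the samples where free memory has collapsed. The following is only a minimal sketch under stated assumptions: it assumes the console output was saved to a text log (nothing in this thread says it was), and parse_top_mem and low_free_samples are hypothetical helper names, not part of top or FreeBSD. It parses only the "Mem:" line format quoted above.

```python
import re

# Size suffixes as top(1) prints them; values are multipliers in bytes.
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

# Matches "<number><optional K/M/G> <label>", e.g. "764M Active" or "8k free".
FIELD = re.compile(r"(\d+)([KMG]?)\s+([A-Za-z]+)", re.IGNORECASE)


def parse_top_mem(line):
    """Return {label: bytes} for one top(1) 'Mem:' line (hypothetical helper)."""
    return {
        label.lower(): int(number) * UNITS[unit.upper()]
        for number, unit, label in FIELD.findall(line)
    }


def low_free_samples(log_lines, threshold_bytes=1024 ** 2):
    """Yield (index, free_bytes) for 'Mem:' lines with free memory below the threshold."""
    for index, line in enumerate(log_lines):
        if not line.startswith("Mem:"):
            continue  # only the memory summary line is examined
        free = parse_top_mem(line).get("free")
        if free is not None and free < threshold_bytes:
            yield index, free


# The 'Mem:' and 'Swap:' lines quoted in this thread, used as sample input.
sample = [
    "Mem: 764M Active 12M Inact 141M Wired 98M Buf 8k free",
    "Swap: 2048M Total 29M Used 2019 Free 1% in use",
]
print(list(low_free_samples(sample)))
```

As noted in the quoted reply, a low free figure is not by itself the problem, so this only helps mark when each hang started; the pfault/vmwait/wswbuf wait states are the more telling evidence.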