From owner-freebsd-current@freebsd.org  Wed Aug 15 06:31:08 2018
Return-Path: <owner-freebsd-current@freebsd.org>
Delivered-To: freebsd-current@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id CEBE9107776E
 for <freebsd-current@mailman.ysv.freebsd.org>;
 Wed, 15 Aug 2018 06:31:07 +0000 (UTC) (envelope-from tsoome@me.com)
Received: from st13p35im-asmtp002.me.com (st13p35im-asmtp002.me.com
 [17.164.199.65])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 8ED0A7CDF7
 for <freebsd-current@freebsd.org>; Wed, 15 Aug 2018 06:31:07 +0000 (UTC)
 (envelope-from tsoome@me.com)
Received: from process-dkim-sign-daemon.st13p35im-asmtp002.me.com by
 st13p35im-asmtp002.me.com
 (Oracle Communications Messaging Server 8.0.2.2.20180531 64bit (built May 31
 2018)) id <0PDH00K00ONVQB00@st13p35im-asmtp002.me.com> for
 freebsd-current@freebsd.org; Wed, 15 Aug 2018 06:31:06 +0000 (GMT)
Received: from icloud.com ([127.0.0.1]) by st13p35im-asmtp002.me.com
 (Oracle Communications Messaging Server 8.0.2.2.20180531 64bit (built May 31
 2018)) with ESMTPSA id <0PDH004VNORREN40@st13p35im-asmtp002.me.com>; Wed,
 15 Aug 2018 06:31:06 +0000 (GMT)
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0
 clxscore=1015 suspectscore=2 malwarescore=0 phishscore=0 adultscore=0
 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=8.0.1-1707230000 definitions=main-1808150070
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,,
 definitions=2018-08-15_03:,, signatures=0
From: Toomas Soome <tsoome@me.com>
Message-id: <C3AC526B-56DD-4273-A3FB-7BBB472563E5@me.com>
MIME-version: 1.0 (Mac OS X Mail 11.5 \(3445.9.1\))
Subject: Re: boot errors since upgrading to 12-current
Date: Wed, 15 Aug 2018 09:31:03 +0300
In-reply-to: <24f2e3f5-67b3-a5ac-8394-a7b5ecd0ce39@zyxst.net>
Cc: freebsd-current@freebsd.org
To: tech-lists <tech-lists@zyxst.net>
References: <f3cb9196-0e89-6c4e-5e8f-d3c4e48e16dc@zyxst.net>
 <22F5A9FD-3167-4029-8CFF-B4096E9E69BB@me.com>
 <24f2e3f5-67b3-a5ac-8394-a7b5ecd0ce39@zyxst.net>
X-Mailer: Apple Mail (2.3445.9.1)
Content-Type: text/plain;
	charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.27
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.27
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
 <freebsd-current.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current/>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 15 Aug 2018 06:31:08 -0000


> On 15 Aug 2018, at 06:06, tech-lists <tech-lists@zyxst.net> wrote:
>
> On 14/08/2018 21:16, Toomas Soome wrote:
>>> On 14 Aug 2018, at 22:37, tech-lists <tech-lists@zyxst.net> wrote:
>>> Hello,
>>> context: amd64, FreeBSD 12.0-ALPHA1 #0 r337682, ZFS. The system is
>>> *not* root-on-zfs. It boots to an SSD. The three disks indicated
>>> below are spinning rust.
>>> NAME        STATE     READ WRITE CKSUM
>>> storage     ONLINE       0     0     0
>>>   raidz1-0  ONLINE       0     0     0
>>>     ada1    ONLINE       0     0     0
>>>     ada2    ONLINE       0     0     0
>>>     ada3    ONLINE       0     0     0
>>> This machine was running 11.2 up until about a month ago.
>>> Recently I've seen this flash up on the screen before getting to
>>> the beastie screen:
>>> BIOS drive C: is disk0
>>> BIOS drive D: is disk1
>>> BIOS drive E: is disk2
>>> BIOS drive F: is disk3
>>> BIOS drive G: is disk4
>>> BIOS drive H: is disk5
>>> BIOS drive I: is disk6
>>> BIOS drive J: is disk7
>>> [the above is normal and has always been seen on every boot]
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> the above has been happening since upgrading to -current a month
>>> ago
>>> ZFS: i/o error - all block copies unavailable
>>> ZFS: can't read MOS of pool storage
>>> the above is alarming and has been happening for the past couple of
>>> days, since upgrading to r337682 on the 12th August.
>>> The beastie screen then loads and it boots normally.
>>> Should I be concerned? Is the output indicative of a problem?
>> Not immediately and yes. In the BIOS loader, we do all disk IO with
>> INT13, and error 0x31 often hints at missing media or some other
>> controller-related error. Could you paste the output of lsdev -v
>> from the loader?
>> The drive list appears as a result of probing the disks in
>> biosdisk.c. The read errors are from an attempt to read 1 sector
>> from sector 0 (that is, to read the partition table from the disk).
>> Why this ends with an error would be interesting to know;
>> unfortunately, that error does not tell us which disk was probed.
>
> Hi Toomas, thanks for looking at this.
>
> lsdev -v looks like this:
>
> OK lsdev -v
> disk devices:
> 	disk0: BIOS drive C (16514064 X 512):
> 	disk0s1: FreeBSD          111GB
> 	disk0s1a: FreeBSD UFS     108GB
> 	disk0s1b: FreeBSD swap    3881MB
>
> 	disk1: BIOS drive D (16514064 X 512):
> 	disk2: BIOS drive E (16514064 X 512):
> 	disk3: BIOS drive F (16514064 X 512):
> 	disk4: BIOS drive G (2880 X 512):
> read 1 from 0 to 0xcbde0a20, error 0x31
> 	disk5: BIOS drive D (2880 X 512):
> read 1 from 0 to 0xcbde0a20, error 0x31
> 	disk6: BIOS drive D (2880 X 512):
> read 1 from 0 to 0xcbde0a20, error 0x31
> 	disk7: BIOS drive D (2880 X 512):
> read 1 from 0 to 0xcbde0a20, error 0x31
> OK
>
> disk4 to disk7 correspond to da0 to da3, which are sd/mmc devices
> without any media in. What made me notice it is that it never showed
> the "read 1 from 0 to $random_value" on 11-stable. The system runs
> 12-current now.

Yeah, it's not about a random value, but the rework to handle missing
media is on its way to current, stay tuned :)
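
As a rough illustration of how the 0x31 in those loader lines can be read, here is a small Python sketch. The table is a partial list of INT 13h status codes as they are commonly documented for BIOS disk services, not something taken from the loader source, and the names INT13_STATUS, parse_error and describe_int13_status are made up for this example.

```python
import re

# Partial table of INT 13h status codes as commonly documented for the
# BIOS disk services. Assumption: not exhaustive, not from loader source.
INT13_STATUS = {
    0x00: "successful completion",
    0x01: "invalid function or parameter",
    0x04: "sector not found / read error",
    0x31: "no media in drive",
    0x80: "timeout (drive not ready)",
    0xAA: "drive not ready",
}

# Matches both "error: 0x31" (boot messages) and "error 0x31" (lsdev -v).
ERROR_RE = re.compile(r"error:?\s*(0x[0-9a-fA-F]+)")


def parse_error(line):
    """Return the status code from a loader read-error line, or None."""
    match = ERROR_RE.search(line)
    return int(match.group(1), 16) if match else None


def describe_int13_status(code):
    """Map a status code to a short description, if it is in the table."""
    return INT13_STATUS.get(code, "unknown status 0x%02x" % code)


line = "read 1 from 0 to 0xcbdb1330, error: 0x31"
print(describe_int13_status(parse_error(line)))
```

With the line from the boot messages this prints "no media in drive", which fits the empty sd/mmc slots.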

>
> disk1 to disk3 are the hard drives making up ZFS. These are 4TB
> Western Digital SATA-3 WDC WD4001FAEX.

Well, that does explain the problem, if you look at the sizes reported…
Your BIOS is reporting wrong sizes and is unable to access the whole 4TB
space, so the zfs reader is not getting the correct data from the disks
and fails with errors. That's why you get the errors from the ‘storage’
pool, and yes, this is harmless for boot because you have a separate
(small) disk for the boot.
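
To make the size mismatch concrete, here is a minimal Python sketch of the arithmetic, using the sector counts from the lsdev -v output above. Reading 16514064 as the legacy CHS ceiling (16383 x 16 x 63) is my assumption about where the number comes from, the function name is made up for illustration, and "4TB" is taken as 4 * 10**12 bytes.

```python
SECTOR_SIZE = 512  # bytes per sector, as printed by lsdev -v


def reported_bytes(sectors, sector_size=SECTOR_SIZE):
    """Capacity implied by an 'N X 512' entry in the lsdev -v output."""
    return sectors * sector_size


bios_sectors = 16514064          # reported for disk0..disk3
bios_bytes = reported_bytes(bios_sectors)

# Assumption: the figure is the legacy CHS ceiling
# (16383 cylinders, 16 heads, 63 sectors per track).
chs_ceiling = 16383 * 16 * 63

nominal_bytes = 4 * 10**12       # a "4TB" drive in decimal terabytes

print(bios_bytes)                # 8455200768, about 8.4 GB
print(bios_sectors == chs_ceiling)
print(100.0 * bios_bytes / nominal_bytes)  # percent of the disk visible

# The empty card-reader slots report 2880 sectors, a 1.44 MB floppy size.
print(reported_bytes(2880))      # 1474560
```

So the loader only sees roughly 0.2% of each 4TB disk, and anything ZFS needs beyond that point cannot be read through the BIOS.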

rgds,
toomas

>
>>> Since you are getting errors from data pool ‘storage’, it does not
>>> affect the boot. Why the pool storage is unreadable - it likely has
>>> to do with the errors above, but I cannot tell for sure based on the
>>> data presented here….
>
> Thing is, the data pool works fine when boot completes, i.e. it loads
> read/write and behaves normally.
>
> thanks,
> -- 
> J.