From: "Teske, Devin"
To: Johan Broman
Subject: Re: [CFT] Patch to bsdinstall to support root-on-ZFS and GELI
Date: Sat, 19 Oct 2013 18:52:39 +0000
Message-ID: <13CA24D6AB415D428143D44749F57D720FC7F4AD@LTCFISWMSGMB21.FNFIS.com>
In-Reply-To: <5262BC56.4050004@bridgenet.se>
Cc: Devin Teske, "Current", "Teske, Devin"
List-Id: Discussions about the use of FreeBSD-current

On Oct 19, 2013, at 10:07 AM, Johan Broman wrote:

>
> On 19/10/13 18:27, Allan Jude wrote:
>> On 2013-10-19 11:55, Teske, Devin wrote:
>>> On Oct 19, 2013, at 8:43 AM, Teske, Devin wrote:
>>>
>>>> On Oct 19, 2013, at 8:34 AM, Allan Jude wrote:
>>>>
>>>>> On 2013-10-19 11:31, Johan Broman wrote:
>>>>>>
>>>>>> On 19/10/13 17:23, Allan Jude wrote:
>>>>>>> On 2013-10-19 10:56, Johan Broman wrote:
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> Just tested the root-on-ZFS install option using FreeBSD 10 beta 1. I
>>>>>>>> have 4 SATA drives in my server. I select all four of them in a RAIDZ1
>>>>>>>> setup. I hit enter to continue the installation and the zpool is
>>>>>>>> created, but I'm then returned to the zpool selection screen again. It
>>>>>>>> turned out that two of the drives had previously been used in a
>>>>>>>> (Linux) software mirror setup and because of this they got activated
>>>>>>>> in /dev/raid/r0. Because of this I ended up in an endless bsdinstall
>>>>>>>> loop.
>>>>>>>>
>>>>>>>> Removing the raid device using the graid command resolved the
>>>>>>>> situation.
>>>>>>>>
>>>>>>>> Now maybe this is working as designed, but there was no warning/alert
>>>>>>>> to the fact that the devices couldn't be used. Perhaps a warning
>>>>>>>> should be raised in this situation?
>>>>>>>>
>>>>>>>> Thanks for all the great work on the new installer, really looking
>>>>>>>> forward to FreeBSD 10!
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> Johan
>>>>>>>> _______________________________________________
>>>>>>>> freebsd-current@freebsd.org mailing list
>>>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>>>>>>>> To unsubscribe, send any mail to
>>>>>>>> "freebsd-current-unsubscribe@freebsd.org"
>>>>>>> Errors like that normally generate a msgbox dialog with the error output
>>>>>>> from whichever command failed. I'll have to dig into it and see where
>>>>>>> that problem is. I've seen other people have problems creating ZFS
>>>>>>> arrays after graid, but in that case it was an incomplete graid label
>>>>>>> causing a device to be locked but not appear in the graid status output.
>>>>>>>
>>>>>> Ah ok. A msgbox did appear but the drives that had the problem (ada2
>>>>>> and ada3) weren't visible in the output. (not sure if the box itself
>>>>>> has a size limit or maybe I was just unable to scroll down and see the
>>>>>> errors?). The only visible output was that it was able to create
>>>>>> labels on ada0 and ada1.
>>>>>>
>>>>>> /Johan
>>>>> Ahh yes, you have to press 'page-down' to scroll the msgbox. I tried to
>>>>> add a scrollbar but turns out that is not possible.
>>>>>
>>>>> The only indication that there is more message to read, is a small 'xx%'
>>>>> in the bottom right. We might have to look at breaking that output up or
>>>>> something.
>>>>>
>>>>
>>>> The only reason for a msgbox widget to scroll is if it is displayed at
>>>> maximum height or width of the screen and it *still* has more data
>>>> to display than can be presented at-once.
>>>>
>>> I should clarify...
>>>
>>> The zfsboot script doesn't use dialog(1) directly. It uses the bsdconfig API.
>>> That being said, msgbox widgets automatically scale their size to fit the
>>> content being displayed. So whenever a msgbox is thrown up using this
>>> API... the widget will never scroll unless the box can't be made big enough
>>> to hold the entire content (either the screen resolution or terminal size is
>>> too small; we maxed out the size of the widget; and there's still hidden
>>> content).
>>>
>>> But...
>>>
>>> While all of bsdconfig uses this API, hardly any of bsdinstall uses this API.
>>>
>>>> If... however... the msgbox widget is *not* full-height or full-width
>>>> yet... it is requiring you to scroll -- then we've found a bug.
>>>>
>>>> Can we get a screen shot?
>>> So we really need to nail down precisely which error box this is so that
>>> we can address whether the issue is in-fact an instance of using the old
>>> error-box handling instead of the auto-sizing API.
>>>
>>> So...
>>>
>>> With this described API, you should never have to scroll a box unless it
>>> can't fit all the data *and* you should be able to immediately identify when
>>> that becomes the case...
>>>
>>> 1. The widget spans the entire width of the screen.
>>>
>>> 2. The widget spans the entire height of the screen.
>>>
>>> 3. Both 1 and 2.
>>>
>>> It's in *those* cases that you should then *EXPECT* to find that the
>>> region can scroll with cursor keys and page up/down (look for the
>>> scroll percentage in the widget as Allan suggested).
>>>
>>> I don't want to see the scroll percentage doohickey *unless* the widget
>>> is auto-sized to full-width or full-height. Meaning, there's either a bug in
>>> the API or someone fell into a trap (there are a couple).
>>
>> the error output msgbox is huge, probably 100+ lines (the screen is
>> what, 24 lines high, and with the ok button, top and bottom reserved
>> space etc, can display maybe 18 lines at once)
>>
>> It contains all the shell output from everything we do, creating the
>> gparts, setting up gnop, all of the redundant destroys etc.
>>
>> I don't think the TINY little % in the bottom right is really enough of
>> an indicator to the user that they CAN scroll, let alone HOW to scroll
>> (IIRC the arrow keys do not work, must use page down)
>>
>
> I recreated the graid mirror on ada2 and ada3 and reran the installation.
> I'm unable to scroll the msgbox using PgDn or arrow keys. There is no
> indication that the action failed and I'm returned to the ZFS setup
> screen if I hit OK.
>
> I have screen shots (taken with my phone) of the msgbox and "ps auxwww"
> output. Let me know what kind of debug info you would like. I've put the
> screen shots here:
>
> http://212.181.212.146/bsdinstall
>

It looks like one of the commands that is used to partition the disks is
producing an error status on exit but ... no error message?

Double-check me on this, but...

1. It looks to me like this is what you're seeing (code-wise):

From http://svnweb.freebsd.org/base/head/usr.sbin/bsdinstall/scripts/zfsboot?revision=256553&view=markup

	if ! error=$( zfs_create_boot "$ZFSBOOT_POOL_NAME" \
	              "$vdev_type" $real_disks 2>&1 )
	then
		f_dialog_msgbox "$error"
		f_interactive || f_die
		continue
	fi

2. That looks like our guy; f_dialog_msgbox() will use the currently-active
dialog title.

NOTE: This should probably be changed to be more clear in several ways.
First, drop stdout to /dev/null keeping only stderr. Second, probably
use ${error:-Unknown error has occurred} so that if some program
returns error but doesn't produce an error message...
we have some sensible fallback. Last, but not least, change the title
to "Error" and put some prefix before the error text (with the
aforementioned fallback).

3. Diving into the "zfs_create_boot" function... and further, the
"zfs_create_diskpart" function...

There are any number of reasons why you would get thrown back to the
ZFS Configuration menu. A few are listed below:

	Inability to write to $BSDINSTALL_TMPETC/fstab

	You've specified an invalid swapsize.
	NB: Therein lies another problem ... we don't catch the error
	from f_expand_number and then tell you why a custom swap
	size is perhaps invalid.

	"gpart create -s gpt $disk" failed

	"gpart destroy -F $disk" failed
	NB: This is irrespective of whether you chose MBR or GPT;
	it's a bug that should be fixed (we shouldn't return failure on
	the precursory destruction of existing data).

	etc...

So thank you!! Looks like I've got some patching to do to improve
the debugging.
-- 
Cheers,
Devin
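
The msgbox changes proposed in the NOTE above (keep only stderr, fall back
to a generic message when stderr is empty, prefix the text) could look
roughly like the sketch below. This is only an illustration under stated
assumptions: noisy_step and describe_failure are hypothetical names, not
part of zfsboot or the bsdconfig API, and the message is printed rather
than handed to f_dialog_msgbox so the example runs on its own.

```sh
#!/bin/sh
# Hypothetical stand-in for a partitioning step (not part of zfsboot):
# chatty on stdout, optionally prints $1 on stderr, and always fails.
noisy_step() {
	echo "progress chatter that should stay out of the error box"
	[ -n "$1" ] && echo "$1" >&2
	return 1
}

# Hypothetical helper: run "$@", keeping only its stderr.
# The order of redirections matters: "2>&1" first points stderr at the
# command-substitution pipe, then ">/dev/null" discards stdout.
# On failure, print a prefixed message (with a fallback when stderr was
# empty) and return the command's exit status.
describe_failure() {
	error=$( "$@" 2>&1 >/dev/null )
	status=$?
	[ "$status" -eq 0 ] && return 0
	printf 'Error: %s\n' "${error:-Unknown error has occurred}"
	return "$status"
}

# Prints "Error: gpart: Device busy"
describe_failure noisy_step "gpart: Device busy" || :
# Prints "Error: Unknown error has occurred"
describe_failure noisy_step "" || :
```

In the installer itself the resulting text would presumably be passed to
f_dialog_msgbox under an "Error" title instead of being printed.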