Date:      Fri, 20 Nov 2020 19:27:44 +0100
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        mike tancsa <mike@sentex.net>
Cc:        Philip Paeps <philip@freebsd.org>, "Bjoern A. Zeeb" <bz@freebsd.org>, netperf-admin@freebsd.org,  netperf-users@freebsd.org, Allan Jude <allanjude@freebsd.org>
Subject:   Re: zoo reboot Friday Nov 20 14:00 UTC
Message-ID:  <CAGudoHELFz7KyzQmRN8pCbgLQXPgCdHyDAQ4pzFLF+YswcP87A@mail.gmail.com>
In-Reply-To: <a716e874-d736-d8d5-9c45-c481f6b3dee7@sentex.net>
References:  <1f8e49ff-e3da-8d24-57f1-11f17389aa84@sentex.net> <2691e1fd-5a27-4dd0-2ef7-b1c06fd4e751@sentex.net> <A3934CD4-57C1-4215-99F2-9500CB9EDC7C@neville-neil.com> <5A5094BC-D417-4BA6-97E2-7CB522B51368@FreeBSD.org> <4ec6ed6f-b3b4-22ae-e1ec-93a46f3d88ea@sentex.net> <d2ffd0f1-1dd8-dc6b-9975-93f20d7974a4@sentex.net> <dc8fed75-0262-c614-3292-6b8ce5addcfc@sentex.net> <0ddec867-32b5-f667-d617-0ddc71726d09@sentex.net> <CAGudoHHNN8ZcgdkRSy0cSaPA6J9ZHVf+BQFiBcThrtQ0AMP+Ow@mail.gmail.com> <5549CA9F-BCF4-4043-BA2F-A2C41D13D955@freebsd.org> <ad81b5f3-f6de-b908-c00f-fb8d6ac2a0b8@sentex.net> <CAGudoHETJZ0f_YjmCcUjb-Wcf1tKhSF719kXxXUB3p4RB0uuRQ@mail.gmail.com> <CAGudoHH=H4Xok5HG3Hbw7S=6ggdsi+N4zHirW50cmLGsLnhd4g@mail.gmail.com> <270b65c0-8085-fe2f-cf4f-7a2e4c17a2e8@sentex.net> <CAGudoHFLy2dxBMGd2AJZ6q6zBsU+n8uLXLSiFZ1QGi_qibySVg@mail.gmail.com> <a716e874-d736-d8d5-9c45-c481f6b3dee7@sentex.net>

Swap and boot partitions have been resized; the ada0p3 partition was
removed from the pool and inserted back, and it is rebuilding now:

root@zoo2:~ # zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Nov 20 23:13:28 2020
	459G scanned at 1.00G/s, 291G issued at 650M/s, 3.47T total
	0B resilvered, 8.17% done, 01:25:48 to go
config:

	NAME                                            STATE     READ WRITE CKSUM
	zroot                                           DEGRADED     0     0     0
	  mirror-0                                      DEGRADED     0     0     0
	    replacing-0                                 DEGRADED     0     0     0
	      1517819109053923011                       OFFLINE      0     0     0  was /dev/ada0p3/old
	      ada0p3                                    ONLINE       0     0     0
	    ada1                                        ONLINE       0     0     0
	  mirror-1                                      ONLINE       0     0     0
	    ada3p3                                      ONLINE       0     0     0
	    ada4p3                                      ONLINE       0     0     0
	  mirror-2                                      ONLINE       0     0     0
	    ada5p3                                      ONLINE       0     0     0
	    ada6p3                                      ONLINE       0     0     0
	special	
	  mirror-3                                      ONLINE       0     0     0
	    gptid/db15e826-1a9c-11eb-8d25-0cc47a1f2fa0  ONLINE       0     0     0
	    mfid1p2                                     ONLINE       0     0     0

errors: No known data errors
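
(For the record, the resize/reinsert dance was roughly along these
lines; sizes and flags are illustrative, not necessarily the exact
one-liners Allan posted:

root@zoo2:~ # zpool offline zroot ada0p3
root@zoo2:~ # gpart delete -i 2 ada0                # drop the old 1.5G swap
root@zoo2:~ # gpart resize -i 1 -s 236k ada0        # grow freebsd-boot
root@zoo2:~ # gpart add -t freebsd-swap -i 2 ada0   # re-add swap in the remaining gap
root@zoo2:~ # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
root@zoo2:~ # zpool replace zroot ada0p3
)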

One pickle: I did 'zpool export zroot' to replace the drive, as zfs
protested otherwise. The subsequent 'zpool import' was done slightly
carelessly and mounted over /, meaning I lost access to the original
ufs root. Should there be a need to boot from it again, someone will
have to boot single user and comment out the swap entry in /etc/fstab,
or we will have to replace the drive again.

That said, as I understand it, we are in a position to take out the
ufs drive and reboot to be back in business.

The ufs drive will have to be mounted somewhere to sort out that swap
entry.
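
Something along these lines should do it (the device name is a guess;
check dmesg/gpart first):

root@zoo2:~ # mount /dev/ada7p2 /mnt   # the ufs root, wherever it posts
root@zoo2:~ # vi /mnt/etc/fstab        # comment out the ada0p3 swap line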

On 11/20/20, mike tancsa <mike@sentex.net> wrote:
> On 11/20/2020 1:00 PM, Mateusz Guzik wrote:
>> So this happened after boot:
>>
>> root@zoo2:/home/mjg # swapinfo
>> Device          1K-blocks     Used    Avail Capacity
>> /dev/ada0p3     2928730500        0 2928730500     0%
>>
>> which I presume might have corrupted some of it.
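>>
>> (Presumably the old fstab still had a swap entry along these lines,
>> which swapon then applied to the zfs member in this box:
>>
>> /dev/ada0p3	none	swap	sw	0	0
>> )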
>
> Oh, that makes sense now. When it was installed in the back, the drive
> posted as ada0. When we put it in zoo, it was on a port farther down,
> hence it came up as ada7. I had to manually mount / off ada7p2.  I have
> already updated fstab so as not to do that again.  That mystery is solved.
>
>     ---Mike
>
>
>> Allan pasted some one-liners to resize the boot and swap partition.
>>
>> With your permission I would like to run them and then offline/online
>> the disk to have it rebuild.
>>
>> As for longer plans what to do with it i think that's a different
>> subject, whatever new drives end up being used I'm sure the FreeBSD
>> Foundation can reimburse you with no difficulty.
>>
>>
>> On 11/20/20, mike tancsa <mike@sentex.net> wrote:
>>> The current state of zoo is a bit of an evolutionary mess.  I wonder if
>>> we are better off installing the base OS fresh on a pair of SSD
>>> drives and leaving all the user data on the current "zroot"...
>>> Considering 240G SSDs are $35 CDN, it might be easier to just install
>>> fresh and not have to worry about resizing etc.
>>>
>>>     ---Mike
>>>
>>> On 11/20/2020 12:49 PM, Mateusz Guzik wrote:
>>>> On 11/20/20, Mateusz Guzik <mjguzik@gmail.com> wrote:
>>>>> CC'ing Allan Jude
>>>>>
>>>>> So:
>>>>>
>>>>>   pool: zroot
>>>>>  state: DEGRADED
>>>>> status: One or more devices could not be opened.  Sufficient replicas exist for
>>>>> 	the pool to continue functioning in a degraded state.
>>>>> action: Attach the missing device and online it using 'zpool online'.
>>>>>    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
>>>>>   scan: scrub repaired 0B in 05:17:02 with 0 errors on Tue Aug 18 15:19:00 2020
>>>>> config:
>>>>>
>>>>> 	NAME                                            STATE     READ WRITE CKSUM
>>>>> 	zroot                                           DEGRADED     0     0     0
>>>>> 	  mirror-0                                      DEGRADED     0     0     0
>>>>> 	    1517819109053923011                         UNAVAIL      0     0     0  was /dev/ada0p3
>>>>> 	    ada1                                        ONLINE       0     0     0
>>>>> 	  mirror-1                                      ONLINE       0     0     0
>>>>> 	    ada3p3                                      ONLINE       0     0     0
>>>>> 	    ada4p3                                      ONLINE       0     0     0
>>>>> 	  mirror-2                                      ONLINE       0     0     0
>>>>> 	    ada5p3                                      ONLINE       0     0     0
>>>>> 	    ada6p3                                      ONLINE       0     0     0
>>>>> 	special
>>>>> 	  mirror-3                                      ONLINE       0     0     0
>>>>> 	    gptid/db15e826-1a9c-11eb-8d25-0cc47a1f2fa0  ONLINE       0     0     0
>>>>> 	    mfid1p2                                     ONLINE       0     0     0
>>>>>
>>>>> errors: No known data errors
>>>>>
>>>>> # dmesg | grep ada0
>>>>> Trying to mount root from ufs:/dev/ada0p2 [rw]...
>>>>> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
>>>>> ada0: <WDC WD3003FZEX-00Z4SA0 01.01A01> ACS-2 ATA SATA 3.x device
>>>>> ada0: Serial Number WD-WCC137TALF5K
>>>>> ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
>>>>> ada0: Command Queueing enabled
>>>>> ada0: 2861588MB (5860533168 512 byte sectors)
>>>>> ada0: quirks=0x1<4K>
>>>>> Mounting from ufs:/dev/ada0p2 failed with error 2; retrying for 3 more seconds
>>>>> Mounting from ufs:/dev/ada0p2 failed with error 2.
>>>>>   vfs.root.mountfrom=ufs:/dev/ada0p2
>>>>> GEOM_PART: Partition 'ada0p3' not suitable for kernel dumps (wrong type?)
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>>
>>>>> # gpart show ada0
>>>>> =>        34  5860533101  ada0  GPT  (2.7T)
>>>>>           34           6        - free -  (3.0K)
>>>>>           40          88     1  freebsd-boot  (44K)
>>>>>          128     3072000     2  freebsd-swap  (1.5G)
>>>>>      3072128  5857461000     3  freebsd-zfs  (2.7T)
>>>>>   5860533128           7        - free -  (3.5K)
>>>>>
>>>>> Running a naive dd if=/dev/ada0p3 works, so I don't know what zfs
>>>>> is complaining about.
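>>>>>
>>>>> (If it happens again, dumping the vdev labels might show what zfs
>>>>> is unhappy about, e.g.:
>>>>>
>>>>> # zdb -l /dev/ada0p3
>>>>> )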
>>>>>
>>>> Also note Philip's point: the boot partition is 44k. Is that too small now?
>>>>
>>>>> On 11/20/20, mike tancsa <mike@sentex.net> wrote:
>>>>>> On 11/20/2020 11:40 AM, Philip Paeps wrote:
>>>>>>> On 2020-11-21 00:04:19 (+0800), Mateusz Guzik wrote:
>>>>>>>
>>>>>>>> Oh, that's a bummer. I wonder if there is a regression in the boot
>>>>>>>> loader though.
>>>>>>>>
>>>>>>>> Does the pool mount if you boot the system from a cd/over the
>>>>>>>> network/whatever?
>>>>>>> It's worth checking if the freebsd-boot partition is large enough.
>>>>>>> I
>>>>>>> noticed during the cluster refresh that we often use 108k for
>>>>>>> freebsd-boot but recent head wants 117k.  I've been bumping the
>>>>>>> bootblocks to 236k.
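>>>>>>>
>>>>>>> (A quick sanity check, for instance:
>>>>>>>
>>>>>>> # ls -l /boot/gptzfsboot   # size the bootcode needs
>>>>>>> # gpart show ada0          # size freebsd-boot provides
>>>>>>> )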
>>>>>>>
>>>>>>> So far, all the cluster machines I've upgraded booted though... so...
>>>>>>> I might be talking ex recto. :)
>>>>>>>
>>>>>> I put in an SSD drive and booted from it. One of the drives might
>>>>>> have gotten loose or died in the power cycles, but there is still
>>>>>> redundancy and I was able to mount the pool. Not sure why it can't
>>>>>> find the file?
>>>>>>
>>>>>> root@zoo2:~ # diff /boot/lua/loader.lua /mnt/boot/lua/loader.lua
>>>>>> 29c29
>>>>>> < -- $FreeBSD$
>>>>>> ---
>>>>>>> -- $FreeBSD: head/stand/lua/loader.lua 359371 2020-03-27 17:37:31Z
>>>>>> freqlabs $
>>>>>> root@zoo2:~ #
>>>>>>
>>>>>>
>>>>>>  % ls -l /mnt/boot/lua/
>>>>>> total 110
>>>>>> -r--r--r--  1 root  wheel   4300 Nov 20 08:41 cli.lua
>>>>>> -r--r--r--  1 root  wheel   3288 Nov 20 08:41 color.lua
>>>>>> -r--r--r--  1 root  wheel  18538 Nov 20 08:41 config.lua
>>>>>> -r--r--r--  1 root  wheel  12610 Nov 20 08:41 core.lua
>>>>>> -r--r--r--  1 root  wheel  11707 Nov 20 08:41 drawer.lua
>>>>>> -r--r--r--  1 root  wheel   2456 Nov 20 08:41 gfx-beastie.lua
>>>>>> -r--r--r--  1 root  wheel   2235 Nov 20 08:41 gfx-beastiebw.lua
>>>>>> -r--r--r--  1 root  wheel   1958 Nov 20 08:41 gfx-fbsdbw.lua
>>>>>> -r--r--r--  1 root  wheel   2413 Nov 20 08:41 gfx-orb.lua
>>>>>> -r--r--r--  1 root  wheel   2140 Nov 20 08:41 gfx-orbbw.lua
>>>>>> -r--r--r--  1 root  wheel   3324 Nov 20 08:41 hook.lua
>>>>>> -r--r--r--  1 root  wheel   2395 Nov 20 08:41 loader.lua
>>>>>> -r--r--r--  1 root  wheel   2429 Sep 24 09:09 logo-beastie.lua
>>>>>> -r--r--r--  1 root  wheel   2203 Sep 24 09:09 logo-beastiebw.lua
>>>>>> -r--r--r--  1 root  wheel   1958 Sep 24 09:09 logo-fbsdbw.lua
>>>>>> -r--r--r--  1 root  wheel   2397 Sep 24 09:09 logo-orb.lua
>>>>>> -r--r--r--  1 root  wheel   2119 Sep 24 09:09 logo-orbbw.lua
>>>>>> -r--r--r--  1 root  wheel  14201 Nov 20 08:41 menu.lua
>>>>>> -r--r--r--  1 root  wheel   4299 Nov 20 08:41 password.lua
>>>>>> -r--r--r--  1 root  wheel   2227 Nov 20 08:41 screen.lua
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>>> Mateusz Guzik <mjguzik gmail.com>
>>>>>
>>
>


-- 
Mateusz Guzik <mjguzik gmail.com>


