Date:      Wed, 19 Jul 2017 01:18:50 +0200
From:      Mark Martinec <Mark.Martinec+freebsd@ijs.si>
To:        freebsd-stable@freebsd.org
Cc:        Mark Johnston <markj@freebsd.org>
Subject:   Re: The 11.1-RC3 can only boot and attach disks in "Safe mode", otherwise gets stuck attaching
Message-ID:  <c8140f430fb2af93a6bc70a3df8cdadc@ijs.si>
In-Reply-To: <20170717232434.GB21048@wkstn-mjohnston.west.isilon.com>
References:  <e4acc16980fe65751325333870bf2b68@ijs.si> <20170717232434.GB21048@wkstn-mjohnston.west.isilon.com>

2017-07-18 01:24, Mark Johnston wrote:
> Are you able to break into the debugger at this point? Try setting
> debug.kdb.break_to_debugger=1 and debug.kdb.alt_break_to_debugger=1 at
> the loader prompt, and hit the break key, or the key sequence
> <CR> ~ ctrl-b once the hang occurs. At the debugger prompt, try
> "bt" and "show allpcpu" to start.

Thank you for the prompt and helpful suggestion! I spent an afternoon
fiddling with the machine, with mixed results. Your suggestion to
break into the debugger did not work: there was no reaction to <break>
or to <CR> ~ ctrl-b.

So I embarked on rebuilding the RC3 kernel with
   options KDB
   options DDB
   options BREAK_TO_DEBUGGER
   options ALT_BREAK_TO_DEBUGGER
   options INVARIANTS
   options INVARIANT_SUPPORT
   options WITNESS
   options WITNESS_SKIPSPIN
but then I realized that the <debug> key is mapped to alt ctrl <esc>,
which now does break into the debugger, but not early enough to
catch the point where the holdup occurs.
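
For reference, such a config could look roughly like the sketch below.
The file name DEBUGKERN, the ident and the amd64 path are placeholders
chosen for illustration, not necessarily what was used on this machine,
and config(8) may warn about options that GENERIC already sets:

```
# Hypothetical file: /usr/src/sys/amd64/conf/DEBUGKERN
# (file name, ident and architecture are assumptions for this sketch)
include GENERIC
ident   DEBUGKERN

options KDB                    # kernel debugger framework
options DDB                    # interactive in-kernel debugger backend
options BREAK_TO_DEBUGGER      # enter the debugger on a console break
options ALT_BREAK_TO_DEBUGGER  # enter the debugger on the alternate break sequence
options INVARIANTS             # extra run-time consistency checks
options INVARIANT_SUPPORT      # support code needed by INVARIANTS
options WITNESS                # lock order verification
options WITNESS_SKIPSPIN       # do not run WITNESS on spin mutexes

# Build and install from /usr/src, assuming the name above:
#   make buildkernel KERNCONF=DEBUGKERN
#   make installkernel KERNCONF=DEBUGKERN
```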

WITNESS produced some LOR (lock order reversal) warnings, but that
is probably OK. I came across a trace just before the problem area,
but it scrolls by so fast on a vt console that only the last 40 or
so lines remain on the screen (I have a photo), and those do not
seem to reveal much. Unfortunately this machine does not have a
serial interface.

So in my last attempt I rebuilt a kernel with INVARIANTS but
without WITNESS, and now I cannot reproduce the problem, with
or without "safe mode". What is interesting is that now the
da0..da3 disks are attached first, and only then the ada
disks, and even within the group of disks on the same
controller the order has been shuffled. I have no idea what could
have caused this, but it may be what avoided the problem.

Will play some more with this tomorrow...

   Mark


> On Tue, Jul 18, 2017 at 01:01:16AM +0200, Mark Martinec wrote:
>> Upgrading 11.0-RELEASE-p11 to 11.1-RC3 using the usual freebsd-update
>> upgrade method I ended up with a system which gets stuck while trying
>> to attach the second set of disks. This happened already after the
>> first phase of the upgrade procedure (installing and re-booting with
>> a new kernel).
>>
>> The first set of disks (ada0 .. ada2) are attached successfully, also
>> a cd0, but then when the first of the set of four (a regular spinning
>> disk) on an LSI controller is to be attached, the boot procedure just
>> gets stuck there:
>>    kernel: ada1: 300.000MB/s transfers (SATA 2.x, PIO4, PIO 8192bytes)
>>    kernel: ada1: Command Queueing enabled
>>    kernel: ada1: 305245MB (625142448 512 byte sectors)
>>    kernel: ada2 at ahcich6 bus 0 scbus8 target 0 lun 0
>>    kernel: ada2: <OCZ-VERTEX3 2.25> ATA8-ACS SATA 3.x device
>>    kernel: ada2: Serial Number OCZ-O1L6RF591R09Z5C8
>>    kernel: ada2: 300.000MB/s transfers (SATA 2.x, PIO4, PIO 8192bytes)
>>    kernel: ada2: Command Queueing enabled
>>    kernel: ada2: 114473MB (234441648 512 byte sectors)
>>    kernel: ada2: quirks=0x1<4K>
>>    kernel: da0 at mps0 bus 0 scbus0 target 2 lun 0
>> 
>> (stuck here, keyboard not responding, fans raising their pitch,
>>   presumably CPU is spinning)
[...]


