Date: Wed, 19 Jul 2017 01:18:50 +0200
From: Mark Martinec
To: freebsd-stable@freebsd.org
Cc: Mark Johnston
Subject: Re: The 11.1-RC3 can only boot and attach disks in "Safe mode", otherwise gets stuck attaching
Organization: Jozef Stefan Institute
In-Reply-To: <20170717232434.GB21048@wkstn-mjohnston.west.isilon.com>
References: <20170717232434.GB21048@wkstn-mjohnston.west.isilon.com>

2017-07-18 01:24, Mark Johnston wrote:

> Are you able to break into the debugger at this point? Try setting
> debug.kdb.break_to_debugger=1 and debug.kdb.alt_break_to_debugger=1 at
> the loader prompt, and hit the break key, or the key sequence
> ~ ctrl-b once the hang occurs. At the debugger prompt, try
> "bt" and "show allpcpu" to start.

Thank you for a prompt and good suggestion!

I spent an afternoon fiddling with the machine, with mixed results.

Your suggestion to break into the debugger did not work: there was no
reaction to the break key or to ~ ctrl-b. So I embarked on rebuilding
the RC3 kernel with

  options KDB
  options DDB
  options BREAK_TO_DEBUGGER
  options ALT_BREAK_TO_DEBUGGER
  options INVARIANTS
  options INVARIANT_SUPPORT
  options WITNESS
  options WITNESS_SKIPSPIN

but then I realized the key is mapped-to by: alt ctrl , which
now does break into the debugger - but not as early as where the
holdup occurs. WITNESS produced some LOR warnings, but that is
probably OK.
I came across a trace just before the problem area, but it flows by
so fast on a vt console that only the last 40 or so lines remain on
the screen (I have a photo), and those do not look very revealing.
Unfortunately this machine does not have a serial interface.

So in my last attempt I rebuilt a kernel with INVARIANTS but without
WITNESS - and now I cannot reproduce the problem, with or without
"safe mode".

What is interesting here is that the da0..da3 disks are now attached
first, and only then the ada disks - and even within the group of
disks on the same controller their order has been shuffled. I have no
idea what could have caused it, but it may have avoided the problem
by doing so.

Will play some more with this tomorrow...

  Mark


> On Tue, Jul 18, 2017 at 01:01:16AM +0200, Mark Martinec wrote:
>> Upgrading 11.0-RELEASE-p11 to 11.1-RC3 using the usual freebsd-update
>> upgrade method I ended up with a system which gets stuck while trying
>> to attach the second set of disks. This happened already after the
>> first phase of the upgrade procedure (installing and re-booting with
>> a new kernel).
>>
>> The first set of disks (ada0 .. ada2) are attached successfully, also
>> a cd0, but then when the first of the set of four (a regular spinning
>> disk) on an LSI controller is to be attached, the boot procedure just
>> gets stuck there:
>>
>> kernel: ada1: 300.000MB/s transfers (SATA 2.x, PIO4, PIO 8192bytes)
>> kernel: ada1: Command Queueing enabled
>> kernel: ada1: 305245MB (625142448 512 byte sectors)
>> kernel: ada2 at ahcich6 bus 0 scbus8 target 0 lun 0
>> kernel: ada2: ATA8-ACS SATA 3.x device
>> kernel: ada2: Serial Number OCZ-O1L6RF591R09Z5C8
>> kernel: ada2: 300.000MB/s transfers (SATA 2.x, PIO4, PIO 8192bytes)
>> kernel: ada2: Command Queueing enabled
>> kernel: ada2: 114473MB (234441648 512 byte sectors)
>> kernel: ada2: quirks=0x1<4K>
>> kernel: da0 at mps0 bus 0 scbus0 target 2 lun 0
>>
>> (stuck here, keyboard not responding, fans rising their pitch,
>> presumably CPU is spinning)
[...]
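
For comparing the disk attach order between two boots (for example the
boot that hangs and the one that does not), a small script over saved
console or dmesg output might help. The following is only a sketch: the
function name attach_order is made up for illustration, and the line
format is assumed to match the "ada2 at ahcich6 bus 0 scbus8 target 0
lun 0" style lines quoted above, with an optional "kernel: " prefix and
optional ">" quoting in front.

```python
import re

# Matches CAM attach lines such as
#   "kernel: ada2 at ahcich6 bus 0 scbus8 target 0 lun 0"
# Leading mail quoting (">", ">>") and the "kernel: " prefix are optional.
ATTACH_RE = re.compile(
    r"^[>\s]*(?:kernel: )?"
    r"(?P<dev>[a-z]+\d+) at (?P<ctrl>[a-z]+\d+) "
    r"bus \d+ scbus\d+ target \d+ lun \d+"
)


def attach_order(log_text):
    """Return [(device, controller), ...] in the order the lines appear."""
    order = []
    for line in log_text.splitlines():
        m = ATTACH_RE.match(line)
        if m:
            order.append((m.group("dev"), m.group("ctrl")))
    return order


if __name__ == "__main__":
    # Sample lines taken from the boot log quoted in this thread.
    sample = (
        "kernel: ada2 at ahcich6 bus 0 scbus8 target 0 lun 0\n"
        "kernel: ada2: Command Queueing enabled\n"
        "kernel: da0 at mps0 bus 0 scbus0 target 2 lun 0\n"
    )
    for dev, ctrl in attach_order(sample):
        print(dev, "on", ctrl)
```

Running attach_order over the logs of two boots and comparing the two
lists would show whether the da disks were attached before or after the
ada disks, and whether the order within one controller changed.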