From: John Baldwin
To: gljennjohn@gmail.com
Cc: current@freebsd.org
Subject: Re: EARLY_AP_STARTUP hangs during boot
Date: Fri, 27 May 2016 07:43:43 -0700
Message-ID: <5082784.kA81xcoze3@ralph.baldwin.cx>
In-Reply-To: <20160527095005.0e0dc1be@ernst.home>
References: <20160516122242.39249a54@ernst.home> <2245981.CzRHAP1AJo@ralph.baldwin.cx> <20160527095005.0e0dc1be@ernst.home>
List-Id: Discussions about the use of FreeBSD-current

On Friday, May 27, 2016 09:50:05 AM Gary Jennejohn wrote:
> On Thu, 26 May 2016 16:54:35 -0700
> John Baldwin wrote:
>
> > On Tuesday, May 17, 2016 06:47:41 PM Gary Jennejohn wrote:
> > > On Mon, 16 May 2016 10:54:19 -0700
> > > John Baldwin wrote:
> > >
> > > > On Monday, May 16, 2016 12:22:42 PM Gary Jennejohn wrote:
> > > > > I tried out EARLY_AP_STARTUP, but the kernel hangs and I can't
> > > > > break into DDB.
> > > > >
> > > > > I did a verbose boot and the last lines I see are related to routing
> > > > > MSI-X to various local APIC vectors. I copied the last few lines and
> > > > > they look like this:
> > > > >
> > > > > msi: routing MSI-X IRQ 256 to local APIC 2 vector 48
> > > > > msi: routing MSI-X IRQ 257 to local APIC 3 vector 48
> > > > > msi: routing MSI-X IRQ 258 to local APIC 4 vector 48
> > > > > msi: routing MSI-X IRQ 256 to local APIC 0 vector 49
> > >          ^^^^^^^ Assigning
> > > > >
> > > > > I tried disabling msi and msix in /boot/loader.conf, but the settings
> > > > > were ignored (probably too early).
> > > >
> > > > No, those settings are not too early. However, the routing to different
> > > > CPUs now happens earlier than it used to. What is the line before the
> > > > MSI lines? You can take a picture with your phone/camera if that's simplest.
> > > >
> > >
> > > Here a few lines before the MSI routing happens:
> > >
> > > hpet0: iomem 0xfed00000-0xfed003ff irq 0,8 on acpi0
> > > hpet0: vendor 0x4353, rev 0x1, 14318180 Hz, 3 timers, legacy route
> > > hpet0: t0 : irqs 0x00c0ff (0), MSI, periodic
> > > hpet0: t1 : irqs 0x00c0ff (0), MSI, periodic
> > > hpet0: t2 : irqs 0x00c0ff (0), MSI, periodic
> > > Timecounter "HPET" frequency 14318180 Hz quality 950
> >
> > The assigning message means it is in the loop using
> > bus_bind_intr() to setup per-CPU timers. Can you please try
> > setting 'hint.hpet.0.per_cpu=0' at the loader prompt to see if
> > disabling the use of per-CPU timers allows you to boot?
> >
>
> Something has changed since the last time I generated a kernel with
> this option.
>
> Now I get a NULL-pointer dereference in the kernel, doesn't matter
> whether I set the hint or not.
>
> No crash dump is created.
>
> Here some trace copied from the console:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address = 0x1818
> fault code = supervisor write data, page not present
> instruction pointer = 0x20:0xffffffff805492ef
> [some stack trace]
> taskgroup_adjust() at taskgroup_adjust+0x2f; frame 0xffffffff8196c90
> mi_startup() at mi_startup+0x118; frame 0xffffffff8196fcb0

Yeah, I have the same on my laptop here.  I'll fix that and get back to you.

-- 
John Baldwin
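
For reference, a rough sketch of how the tunables mentioned in this thread
could be set. The hpet hint is quoted from the message above. The
hw.pci.enable_msi and hw.pci.enable_msix names are an assumption about which
settings "disabling msi and msix" referred to, since the thread does not
spell them out.

    # One-off test at the loader prompt (nothing is saved):
    set hint.hpet.0.per_cpu=0
    boot

    # Persistent form in /boot/loader.conf:
    hint.hpet.0.per_cpu="0"     # do not use per-CPU HPET timers
    hw.pci.enable_msi="0"       # assumed tunable: disable MSI
    hw.pci.enable_msix="0"      # assumed tunable: disable MSI-X

Setting the hint at the loader prompt first makes it easy to back out if the
kernel still fails to boot.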