From owner-freebsd-current@FreeBSD.ORG Thu Sep 30 15:44:10 2010
From: David Naylor
Organization: Private
To: Alexander Motin
Date: Thu, 30 Sep 2010 17:43:01 +0200
User-Agent: KMail/1.13.5 (FreeBSD/9.0-CURRENT; KDE/4.4.5; amd64; ; )
References:
 <201009291207.53146.naylor.b.david@gmail.com>
 <201009300755.46989.naylor.b.david@gmail.com>
 <4CA433B7.9010306@FreeBSD.org>
In-Reply-To: <4CA433B7.9010306@FreeBSD.org>
MIME-Version: 1.0
Content-Type: multipart/signed;
 boundary="nextPart2414250.FMb40JKo8v";
 protocol="application/pgp-signature";
 micalg=pgp-sha1
Content-Transfer-Encoding: 7bit
Message-Id: <201009301743.36007.naylor.b.david@gmail.com>
Cc: freebsd-current@freebsd.org, Andriy Gapon
Subject: Re: Safe-mode on amd64 broken
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Thu, 30 Sep 2010 15:44:10 -0000

--nextPart2414250.FMb40JKo8v
Content-Type: Text/Plain;
 charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

On Thursday 30 September 2010 08:52:39 Alexander Motin wrote:
> David Naylor wrote:
> > On Thursday 30 September 2010 07:23:34 Alexander Motin wrote:
> >> David Naylor wrote:
> >>> On Wednesday 29 September 2010 18:25:13 Alexander Motin wrote:
> >>>> David Naylor wrote:
> >>>>> On Wednesday 29 September 2010 16:19:08 Andriy Gapon wrote:
> >>>>>> What do you try to actually achieve?
> >>>>>
> >>>>> I was trying to boot a system and it was panicking due to stray
> >>>>> interrupts. It turned out to be caused by HPET. I found
> >>>>> `hint.hpet.0.clock=0' which fixed the problem.
> >>>>>
> >>>>> This means HPET does not work on any of my machines. The other one's
> >>>>> symptoms are hda losing interrupts after a period of up-time.
> >>>>
> >>>> What chipset do you use? Nvidia MCP5x? Could you send me your verbose
> >>>> dmesg?
> >>>
> >>> Yes, the one is a MCP51, the other is a ICH8M.
> >>>
> >>> The desktop is a Gigabyte N650SLI-DS4L. Its symptom is hda losing
> >>> interrupts after a period of time.
> >>
> >> There are too many reports about different lost interrupts problems on
> >> different controllers of MCP5x. I don't know the reason. Attached patch
> >> should disable using regular HPET interrupts on NVidia chipsets. I hope
> >> it will work as workaround. May be it is too aggressive, but better to
> >> be safe then sorry. I assume that legacy_route mode may still work fine
> >> there. It would be nice to test it.
> >
> > I assume you mean hint.hpet.0.legacy_route=1? I'll give that a try later
> > today on both machines.
>
> Make sure that both attimer and atrtc disabled, as mentioned in hpet(4).

legacy_route worked on the desktop but not on the laptop (boot stalled).

Here is vmstat using default settings for the desktop:
interrupt                          total       rate
irq1: atkbd0                          64          0
irq12: psm0                          756          3
irq14: ata0                         1255          5
irq16: vgapci0                     13576         54
irq17: dc0                          1546          6
irq18: hpet0                      456756       1834
irq20: atapci2                     11557         46
irq21: hdac0 ohci0                 17038         68
irq23: atapci1                     11534         46
Total                             514082       2064

I moved hpet to irq22 (allowed_irqs="0x400000") and that also worked for the
desktop.

> > Is your patch the same as hint.hpet.0.clock=0?
>
> By default - effectively yes. But it still allows to configure
> legacy_route, which is, for example, default for Linux.
>
> >>> The laptop is a Acer 2920. Its symptom for a GENERIC is a panic saying
> >>> stray interrupt (irq7), with a custom kernel booting stalls.
> >>
> >> This is strange, as my Acer with the same ICH8M works fine in all
> >> possible modes. Also IMHO stray interrupts are not a reason to panic.
> >> Could you show what it looks like?
> >
> > See http://markmail.org/message/smxnofrdmmkxyvnd for my previous email
> > that includes the backtrace from that panic. When I booted in i386 safe
> > mode the kernel reported stray interrupts on irq7. vmstat -i shows irq7
> > as "stray irq7".
>
> I am not sure "stray irq7" related here.
> Instead more suspicious looks
> probable irq20 interrupt sharing between HPET and uhci0 and the fact
> that system panicked during interrupt handler registration by uhci0. I
> can't be sure what IRQ was used by HPET there, as in only present dmesg
> it was disabled, but as soon as HPET registered early, I think it
> grabbed first possible - irq20. On my system HPET also uses irq20, but
> uhci0 lives on irq16 and so irq20 is not shared.

On the laptop uhci0 and ehci0 live on irq20.

> To collect more data you may try to hint HPET driver to avoid irq20 by
> setting hint.hpet.0.allowed_irqs=0x00e00000 or other values. I've tried
> same recipy to create sharing on my system, but still found no problem.

This fixes the problem for the laptop. This also allows one-shot timing to
work. Moving hpet to irq22 also worked. Here is the vmstat -i using the
above hint:
interrupt                          total       rate
irq1: atkbd0                         407          0
irq9: acpi0                         1857          2
irq12: psm0                         1005          1
irq14: ata0                         1870          2
irq18: uhci4                        2183          2
irq20: uhci0 ehci0                  2421          3
irq21: hpet0 uhci1                502330        667
irq23: uhci2 ehci1                     3          0
irq256: vgapci0                    25023         33
irq257: hdac0                        236          0
irq258: bge0                          79          0
irq259: ahci0                      27356         36
Total                             564770        750

--nextPart2414250.FMb40JKo8v
Content-Type: application/pgp-signature; name=signature.asc
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.15 (FreeBSD)

iEUEABECAAYFAkyksCcACgkQUaaFgP9pFrJt5gCYs1WK5VPIEg5+HLyZTNIgHtC/
wACcCQjBrPbunKWXajfwEFBK7RmI1RE=
=JmLK
-----END PGP SIGNATURE-----

--nextPart2414250.FMb40JKo8v--
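
For reference, the two allowed_irqs values used in this thread can be read
as bitmasks. The Python sketch below assumes that bit N of the mask stands
for IRQ N; that reading is consistent with the results reported above
(0x400000 put hpet0 on irq22, and 0x00e00000 put it on irq21), but it is an
assumption and not taken from the driver source. The helper names
irqs_to_mask and mask_to_irqs are made up for illustration and are not part
of FreeBSD.

```python
def irqs_to_mask(irqs):
    """Build a bitmask with bit N set for every IRQ number N in irqs."""
    mask = 0
    for irq in irqs:
        mask |= 1 << irq
    return mask


def mask_to_irqs(mask):
    """Return the sorted IRQ numbers whose bits are set in mask."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


if __name__ == "__main__":
    # Mask values quoted in the thread; the bit-to-IRQ mapping is assumed.
    for value in (0x400000, 0x00e00000):
        print(f"{value:#010x} -> IRQs {mask_to_irqs(value)}")
    # Build a mask that leaves out irq20 but allows irq21 through irq23,
    # formatted like the hint quoted in the thread.
    print(f'hint.hpet.0.allowed_irqs="{irqs_to_mask([21, 22, 23]):#010x}"')
```

Under that assumption, a mask that keeps HPET away from a shared line can be
worked out from the vmstat -i listing by leaving out the bits of the IRQs
already in use by other devices.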