From: Peter Holm
To: John Baldwin
Cc: freebsd-acpi@freebsd.org
Date: Tue, 5 Jun 2007 20:32:17 +0200
Subject: Re: Possible ACPI relared panic with Tyan S2720
Message-ID: <20070605183216.GA23211@peter.osted.lan>
In-Reply-To: <200706051326.22581.jhb@freebsd.org>
References: <20070604183419.GA73268@peter.osted.lan> <200706051027.29879.jhb@freebsd.org> <20070605164402.GA18091@peter.osted.lan> <200706051326.22581.jhb@freebsd.org>

On Tue, Jun 05, 2007 at 01:26:22PM -0400, John Baldwin wrote:
> On Tuesday 05 June 2007 12:44:02 pm Peter Holm wrote:
> > On Tue, Jun 05, 2007 at 10:27:29AM -0400, John Baldwin wrote:
> > > On Tuesday 05 June 2007 04:44:54 am Nate Lawson wrote:
> > > > Peter Holm wrote:
> > > > > On Mon, Jun 04, 2007 at 12:45:23PM -0700, Nate Lawson wrote:
> > > > >> This is a really confusing issue.  All the trace you have shows is that
> > > > >> it occurs while transitioning the system from legacy to ACPI mode.
> > > > >> Unfortunately, the details of what is going on are hidden in the BIOS
> > > > >> since that write to a port triggers an SMI and the BIOS does the rest.
> > > > >>
> > > > >> However, it seems like the BIOS is reserving more memory, using memory
> > > > >> it didn't reserve, or FreeBSD is using memory we shouldn't.  John, any
> > > > >> insight on the SMAP output?
> > > > >>
> > > > >>> SMAP type=01 base=0000000000000000 len=000000000009fc00
> > > > >>> SMAP type=02 base=000000000009fc00 len=0000000000000400
> > > > >>> SMAP type=02 base=00000000000e0000 len=0000000000020000
> > > > >>> SMAP type=01 base=0000000000100000 len=000000003fef0000
> > > > >>> SMAP type=03 base=000000003fff0000 len=000000000000f000
> > > > >>> SMAP type=04 base=000000003ffff000 len=0000000000001000
> > > > >>> SMAP type=02 base=00000000fec00000 len=0000000000100000
> > > > >>> SMAP type=02 base=00000000fee00000 len=0000000000001000
> > > > >>> SMAP type=02 base=00000000fff80000 len=0000000000080000
> > > > >> Peter, can you figure out what phys address is getting overwritten?
> > > > >> Seems like it's the loader that sets up the module list and the loader's
> > > > >> allocator may be using RAM it shouldn't.
> > > > >>
> > > > >
> > > > > If I did it right (I used a vtophys() on the address):
> > > > >
> > > > > Address of mod->name(if_tun): 0xc3eed5ec, phys: 0x985ec
> > > >
> > > > So it's somewhere near 620K and the first region goes to 640K - 1 K.
> > > > The last 1 K is type 2 (reserved).
> > > > Nothing seems to show why switching to acpi mode results in an
> > > > overwrite of data at 620K.  I'm not sure where to look.
> > > >
> > > > There should be some way to write a guard pattern to that area but I'll
> > > > have to think about it a bit first.  Can you see if a BIOS update is
> > > > available and try it out?  What about seeing if you can pre-alloc (by
> > > > hacking loader's SMAP code to reserve more of the first 640 K) and
> > > > writing a pattern there, then verifying it at various points during boot
> > > > to be sure we know exactly where the BIOS is writing?
> > >
> > > Err, the loader should not be storing modules that low.  Did you kldload the
> > > module or load it via the loader?
> > >
> >
> > I did not load the module. It's loaded automatically by the loader.
> >
> > This is my /boot/loader.conf
> >
> > kernel_options="-D"
> > machdep.hyperthreading_allowed=1
> > hw.ata.atapi_dma=0
>
> Are you sure it isn't loaded by ifconfig during boot and thus via an implicit
> kldload?  The loader only loads modules into memory > KERNLOAD (2MB for PAE,
> 4MB for non-PAE).
>

No, I'm not sure at all!

I have tried to manually load acpi.ko at the loader prompt and also
to add acpi_load="YES" to /boot/loader.conf. This still overwrites the
if_tun entry in the modules list.

Typing unset acpi_load at the loader prompt works and I can then
later load acpi:

$ kldstat
Id Refs Address    Size     Name
 1    1 0xc0400000 889124   kernel
$ kldload acpi.ko
$ kldstat
Id Refs Address    Size     Name
 1    3 0xc0400000 889124   kernel
 2    1 0xc48af000 57000    acpi.ko

Just to summarize the problem:

The memory corruption comes and goes depending on the kernel config
file. I first identified the "cause" to be files committed by scottl
at 2007/05/14 21:48, which just introduces new malloc types.
Right now GENERIC works fine again, but if I remove the newly added:

nodevice	fwip		# IP over FireWire (RFC 2734,3146)
nodevice	dcons		# Dumb console driver
nodevice	dcons_crom	# Configuration ROM for dcons

the problem pops up again.

--
Peter
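
For reference, the question of which SMAP region holds physical address
0x985ec can be checked with a short script. This is only a sketch: the
table is copied from the boot log quoted in this thread, the type names
follow the usual BIOS E820 convention, and smap_lookup() and describe()
are hypothetical helper names, not anything from the FreeBSD tree.

```python
# SMAP entries copied from the quoted boot log: (type, base, length).
SMAP = [
    (1, 0x0000000000000000, 0x000000000009fc00),
    (2, 0x000000000009fc00, 0x0000000000000400),
    (2, 0x00000000000e0000, 0x0000000000020000),
    (1, 0x0000000000100000, 0x000000003fef0000),
    (3, 0x000000003fff0000, 0x000000000000f000),
    (4, 0x000000003ffff000, 0x0000000000001000),
    (2, 0x00000000fec00000, 0x0000000000100000),
    (2, 0x00000000fee00000, 0x0000000000001000),
    (2, 0x00000000fff80000, 0x0000000000080000),
]

# Conventional meaning of the E820 type codes (assumed, not from the log).
SMAP_TYPES = {
    1: "usable RAM",
    2: "reserved",
    3: "ACPI reclaimable",
    4: "ACPI NVS",
}


def smap_lookup(pa):
    """Return the (type, base, length) entry covering pa, or None."""
    for entry in SMAP:
        _smap_type, base, length = entry
        if base <= pa < base + length:
            return entry
    return None


def describe(pa):
    """Format a one-line summary of where pa falls in the SMAP."""
    entry = smap_lookup(pa)
    if entry is None:
        return "0x%x (%dK): not covered by any SMAP entry" % (pa, pa // 1024)
    smap_type, base, length = entry
    name = SMAP_TYPES.get(smap_type, "unknown")
    return "0x%x (%dK): type %02d (%s), base=0x%x, end=0x%x" % (
        pa, pa // 1024, smap_type, name, base, base + length)


if __name__ == "__main__":
    # Physical address of mod->name(if_tun) as reported via vtophys().
    print(describe(0x985ec))
```

For 0x985ec the lookup lands in the first type-1 region, about 609K
into memory and below 0x9fc00 (639K), where the reserved 1K begins. So
the overwritten address is in RAM the BIOS reports as usable.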