From: Don Bowman
To: 'Matthew Dillon'
cc: Julian Elischer
cc: "'current@freebsd.org'"
Date: Thu, 17 Jun 2004 19:58:31 -0400
Subject: RE: STI, HLT in acpi_cpu_idle_c1
List-Id: Discussions about the use of FreeBSD-current

From: Matthew Dillon [mailto:dillon@apollo.backplane.com]
> Probably not P72.. that would result in weird, inconsistent panics
> rather then consistent hangs. To make sure, just cool your cpu down
> a little (open the case and point a big fan at it). If nothing
> changes then it isn't P72.

It's definitely not hot: plenty of blowers, in an air-conditioned room,
and it has been qualified in an environmental chamber.

>
> The STI; HLT sequence is definitely working properly... operating
> systems have depended on that code sequence forever. Going down
> that path is a red herring.
>
> If NMI can't stop the other processors w/ IPI STOP then the PC for
> those cpus that you see in the dump is not necessarily going to be
> where they are actually hung.

It's not that they're hung; the emulator allows me to see the current PC, registers, etc.
They really are sitting with interrupts locked off. In the case where I
modified the db to time out on the stop IPI, I can believe that the
stacks weren't necessarily consistent, although they seemed to be. In
the case where I'm using the emulator, it seems correct.

>
> It kinda sounds like ACPI has bokered the other cpus. I'm not sure
> why one would even *want* to use ACPI to idle down Xeon's in an MP
> system, actually :-)

It's not so much that I want to use ACPI; it's that the machine doesn't
boot without it, and it can't be disabled later. You do want the HLT on
idle, like the sysctl enabled on releng_4, otherwise the performance
goes down and the power goes up.

I will keep digging, thanks muchly for the input. The other option I
will pursue is whether the APIC structure has been altered somehow,
something changed in there, etc.

--don