From owner-freebsd-stable@FreeBSD.ORG Wed May 24 14:57:00 2006
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 5AEB016A7FC;
	Wed, 24 May 2006 14:57:00 +0000 (UTC)
	(envelope-from gavin.atkinson@ury.york.ac.uk)
Received: from mail-gw4.york.ac.uk (mail-gw4.york.ac.uk [144.32.128.249])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 4A6DB43D48;
	Wed, 24 May 2006 14:56:56 +0000 (GMT)
	(envelope-from gavin.atkinson@ury.york.ac.uk)
Received: from buffy.york.ac.uk (buffy-128.york.ac.uk [144.32.128.160])
	by mail-gw4.york.ac.uk (8.13.6/8.13.6) with ESMTP id k4OEtubQ023145;
	Wed, 24 May 2006 15:55:56 +0100 (BST)
Received: from buffy.york.ac.uk (localhost [127.0.0.1])
	by buffy.york.ac.uk (8.13.6/8.13.6) with ESMTP id k4OEtuhF035550;
	Wed, 24 May 2006 15:55:56 +0100 (BST)
	(envelope-from gavin.atkinson@ury.york.ac.uk)
Received: (from ga9@localhost)
	by buffy.york.ac.uk (8.13.6/8.13.6/Submit) id k4OEtu3r035549
	for freebsd-stable@freebsd.org; Wed, 24 May 2006 15:55:56 +0100 (BST)
	(envelope-from gavin.atkinson@ury.york.ac.uk)
X-Authentication-Warning: buffy.york.ac.uk: ga9 set sender to
	gavin.atkinson@ury.york.ac.uk using -f
From: Gavin Atkinson
To: freebsd-stable@freebsd.org
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Date: Wed, 24 May 2006 15:55:55 +0100
Message-Id: <1148482556.35287.18.camel@buffy.york.ac.uk>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.1 FreeBSD GNOME Team Port
X-York-MailScanner: Found to be clean
X-York-MailScanner-From: gavin.atkinson@ury.york.ac.uk
Subject: 6.1-RELEASE amd64 panic: bad pte
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Wed, 24 May 2006 14:57:01 -0000

Hi,

I've just seen the following panic
on a dual-CPU amd64 box:

FreeBSD 6.1-RELEASE FreeBSD 6.1-RELEASE #0: Sun May 7 04:15:57 UTC 2006
    root@bloom.cse.buffalo.edu:/usr/obj/usr/src/sys/SMP amd64

The panic happened during
"cd /usr/ports/databases/mysql4-server && make install" - the box was
otherwise idle.

TPTE at 0xffff8000040028c8 IS ZERO @ VA 800519000
panic: bad pte
cpuid = 1
Uptime: 7d0h12m25s
Dumping 2047 MB (2 chunks)
  chunk 0: 1MB (150 pages) ... ok
  chunk 1: 2047MB (523888 pages)
    2031 2015 1999 1983 1967 1951 1935 1919 1903 1887 1871 1855 1839
    1823 1807 1791 1775 1759 1743 1727 1711 1695 1679 1663 1647 1631
    1615 1599 1583 1567 1551 1535 1519 1503 1487 1471 1455 1439 1423
    1407 1391 1375 1359 1343 1327 1311 1295 1279 1263 1247 1231 1215
    1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039 1023 1007
    991 975 959 943 927 911 895 879 863 847 831 815 799 783 767 751
    735 719 703 687 671 655 639 623 607 591 575 559 543 527 511 495
    479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239
    223 207 191 175 159 143 127 111 95 79 63 47 31 15 ... ok

Dump complete

As this is a -RELEASE kernel, no kernel.debug exists. As an aside, what
happened to the idea of generating these and putting them on the CD or
the FTP site?

(kgdb) bt
#0  0xffffffff803d204d in doadump ()
#1  0xffffffff803d2074 in doadump ()
#2  0x0000000000000004 in ?? ()
#3  0xffffffff803d2677 in boot ()
#4  0x0000000000000019 in ?? ()
#5  0x000000000fd2cf65 in ?? ()
#6  0xffffff005d789260 in ?? ()
#7  0x0000000000000104 in ?? ()
#8  0x0000000000000000 in ?? ()
Previous frame identical to this frame (corrupt stack?)
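
The two addresses in the panic printf can be cross-checked against each other. The following is only a sketch, assuming the page tables are reachable through a linear (recursive) mapping with 4 KB pages, 8-byte entries and 48-bit virtual addresses; the helper name pte_offset and the constants are illustrative and not taken from the kernel source. Under those assumptions the PTE for a VA sits at a fixed offset from the start of the mapping window, so subtracting that offset from the TPTE address should give a round base address.

```python
# Sketch only: the layout constants below are assumptions, not read from
# the 6.1 kernel source.
PAGE_SHIFT = 12   # assumed 4 KB pages
PTE_SIZE = 8      # assumed 8-byte page-table entries
VA_BITS = 48      # assumed 48-bit virtual addresses


def pte_offset(va: int) -> int:
    """Byte offset of the PTE for `va` inside a linear page-table window."""
    va &= (1 << VA_BITS) - 1
    return (va >> PAGE_SHIFT) * PTE_SIZE


tpte = 0xffff8000040028c8   # "TPTE at ..." from the panic printf
va = 0x800519000            # "@ VA ..." from the panic printf

offset = pte_offset(va)
base = tpte - offset        # implied start of the page-table window

print(hex(offset))   # 0x40028c8
print(hex(base))     # 0xffff800000000000
```

The implied base comes out as a round number, so the TPTE address and the VA in the message are at least consistent with each other under this layout.
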
# this is the TPTE pointer from the panic printf
(kgdb) x/20 0xffff8000040028c8
0xffff8000040028c8:     Cannot access memory at address 0xffff8000040028c8

# This address appears in the backtrace
(kgdb) x/40 0xffffff005d789260
0xffffff005d789260:     0x5bf62000      0xffffff00      0x7b9867e0      0xffffff00
0xffffff005d789270:     0x00000000      0x00000000      0x5bf62020      0xffffff00
0xffffff005d789280:     0x00000000      0x00000000      0x7b9867f8      0xffffff00
0xffffff005d789290:     0x00000000      0x00000000      0x62981a80      0xffffff00
0xffffff005d7892a0:     0x00000000      0x00000000      0xb420b890      0xffffffff
0xffffff005d7892b0:     0x00000000      0x00000000      0x00000000      0x00000000
0xffffff005d7892c0:     0x00000000      0x00000000      0x5d7892c0      0xffffff00
0xffffff005d7892d0:     0x62981a80      0xffffff00      0x628d8500      0xffffff00
0xffffff005d7892e0:     0x60e06b80      0xffffff00      0x000186f4      0x05010002
0xffffff005d7892f0:     0x00000000      0x00000000      0x00000000      0x00000000

Pointers to this structure are littered throughout the memory following
the stack pointer - it seems to be a thread pointer. The first value
(0xffffff005bf62000) appears to be a pointer to struct proc, but I'm not
sure if there's anything useful that can be found from that either. The
process was "grep", pid 84450, p_flags are P_EXEC|P_WEXIT|P_CONTROLT -
so it happened while the process was exiting, which makes sense given
the panic is called from pmap_remove_pages().

I'm happy to do more digging if there are any lines of investigation
anyone can suggest.

Possibly-relevant bits of dmesg:

Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE #0: Sun May 7 04:15:57 UTC 2006
    root@bloom.cse.buffalo.edu:/usr/obj/usr/src/sys/SMP
ACPI APIC Table:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Opteron(tm) Processor 248 (2193.76-MHz K8-class CPU)
  Origin = "AuthenticAMD" Id = 0xf5a Stepping = 10
  Features=0x78bfbff
  AMD Features=0xe0500800
real memory = 2146893824 (2047 MB)
avail memory = 2062053376 (1966 MB)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-27 on motherboard
ioapic2 irqs 28-31 on motherboard

I have the coredump available for further analysis if anyone wants it,
but without a debug kernel I appreciate it may not be useful. I'm going
to compile one up, in case it happens again.

Gavin
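
For reading the x/40 output above: kgdb printed 32-bit words, so each 64-bit pointer is split across two columns with the low half first. A minimal sketch of joining them back together follows; the helper name words_to_pointers is made up for illustration, and the input row is copied from the first line of the memory dump.

```python
import struct


def words_to_pointers(words):
    """Join consecutive 32-bit words (low word first) into 64-bit values."""
    if len(words) % 2:
        raise ValueError("need an even number of 32-bit words")
    # Pack as little-endian 32-bit words, then reread as 64-bit values.
    raw = struct.pack("<%dI" % len(words), *words)
    return list(struct.unpack("<%dQ" % (len(words) // 2), raw))


# First row of the dump at 0xffffff005d789260.
row = [0x5bf62000, 0xffffff00, 0x7b9867e0, 0xffffff00]

for value in words_to_pointers(row):
    print(hex(value))
# 0xffffff005bf62000
# 0xffffff007b9867e0
```

The first result is the value described above as a probable struct proc pointer. gdb's x command also accepts a g size letter (for example x/20gx), which should print 8-byte values directly if the kgdb build supports it.
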