Date:      Sun, 14 Nov 2010 20:11:18 GMT
From:      Loic Pefferkorn <loic-freebsd@loicp.eu>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/152250: [patch] Kernel panic when hw.ciss.expose_hidden_physical is set
Message-ID:  <201011142011.oAEKBIAH018826@www.freebsd.org>
Resent-Message-ID: <201011142020.oAEKK83C027504@freefall.freebsd.org>


>Number:         152250
>Category:       kern
>Synopsis:       [patch] Kernel panic when hw.ciss.expose_hidden_physical is set
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Nov 14 20:20:08 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Loic Pefferkorn
>Release:        7.2-RELEASE
>Organization:
>Environment:
FreeBSD squeak.estat 7.2-STABLE FreeBSD 7.2-STABLE #5: Sun Nov 14 20:35:21 CET 2010     root@squeak.estat:/usr/obj/usr/src/sys/GENERIC  amd64
>Description:
HP ProLiant DL360 G6 server with an HP StorageWorks MSL4048 Tape Library

# grep ciss /boot/loader.conf 
hw.ciss.expose_hidden_physical=1


When the tunable hw.ciss.expose_hidden_physical is set at boot time, the kernel panics:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x8
fault code		= supervisor read data, page not present
instruction pointer	= 0x8:0xffffffff80201686
stack pointer	        = 0x10:0xffffff807c6ab930
frame pointer	        = 0x10:0x400
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 77 (sysctl)
trap number		= 12
panic: page fault
cpuid = 0
Uptime: 6s
Physical memory: 4073 MB
Dumping 1230 MB:

Backtrace from the core dump:

(kgdb) bt
#0  doadump () at pcpu.h:195
#1  0x0000000000000004 in ?? ()
#2  0xffffffff8054cff9 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:418
#3  0xffffffff8054d402 in panic (fmt=0x104 <Address 0x104 out of bounds>)
    at /usr/src/sys/kern/kern_shutdown.c:574
#4  0xffffffff80812563 in trap_fatal (frame=0xffffff0003eb4390, eva=Variable "eva" is not available.
)
    at /usr/src/sys/amd64/amd64/trap.c:756
#5  0xffffffff80812935 in trap_pfault (frame=0xffffff807c6ab880, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:672
#6  0xffffffff80813274 in trap (frame=0xffffff807c6ab880)
    at /usr/src/sys/amd64/amd64/trap.c:443
#7  0xffffffff807fd2ce in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:218
#8  0xffffffff80201686 in acpi_child_pnpinfo_str_method (cbdev=Variable "cbdev" is not available.
)
    at /usr/src/sys/dev/acpica/acpi.c:850
#9  0xffffffff805753c9 in device_sysctl_handler (oidp=Variable "oidp" is not available.
)
    at /usr/src/sys/kern/subr_bus.c:260
#10 0xffffffff8055654f in sysctl_root (oidp=Variable "oidp" is not available.
)
    at /usr/src/sys/kern/kern_sysctl.c:1419
#11 0xffffffff805578c5 in userland_sysctl (td=0x0, name=0xffffff807c6abac0, 
    namelen=4, old=0x0, oldlenp=Variable "oldlenp" is not available.
) at /usr/src/sys/kern/kern_sysctl.c:1522
#12 0xffffffff80557ad2 in __sysctl (td=0xffffff0003eb4390, 
    uap=0xffffff807c6abbf0) at /usr/src/sys/kern/kern_sysctl.c:1449
#13 0xffffffff80812bb7 in syscall (frame=0xffffff807c6abc80)
    at /usr/src/sys/amd64/amd64/trap.c:899
#14 0xffffffff807fd4db in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:339
#15 0x0000000800719cac in ?? ()
Previous frame inner to this frame (corrupt stack?)

Faulty instruction:
(kgdb) x/i 0xffffffff80201686
0xffffffff80201686 <acpi_child_pnpinfo_str_method+70>:  mov    0x8(%rbx),%edx

>How-To-Repeat:
With the same hardware, put hw.ciss.expose_hidden_physical=1 in loader.conf and reboot.
>Fix:
The last function called before the fault is acpi_child_pnpinfo_str_method, in sys/dev/acpica/acpi.c:

static int
acpi_child_pnpinfo_str_method(device_t cbdev, device_t child, char *buf,
    size_t buflen)
{
    ACPI_BUFFER adbuf = {ACPI_ALLOCATE_BUFFER, NULL};
    ACPI_DEVICE_INFO *adinfo;
    struct acpi_device *dinfo = device_get_ivars(child);
    char *end;
    int error;

    error = AcpiGetObjectInfo(dinfo->ad_handle, &adbuf);
    adinfo = (ACPI_DEVICE_INFO *) adbuf.Pointer;
    if (error)
        snprintf(buf, buflen, "unknown");
    else
        snprintf(buf, buflen, "_HID=%s _UID=%lu",
                 (adinfo->Valid & ACPI_VALID_HID) ?
                 adinfo->HardwareId.Value : "none",
                 (adinfo->Valid & ACPI_VALID_UID) ?
                 strtoul(adinfo->UniqueId.Value, &end, 10) : 0);
    if (adinfo)
        AcpiOsFree(adinfo);

    return (0);
}

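buf is filled in according to the value of "error".

I found adbuf.Pointer set to 0x0 even though "error" was zero. The else branch is therefore taken, and the references to the adinfo struct in snprintf use 0x0 as their base. This is consistent with the fault virtual address 0x8 and the faulting instruction mov 0x8(%rbx),%edx.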
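
The failure mode can be illustrated outside the kernel with a small stand-alone C sketch. Everything in it is hypothetical (struct fake_info, fake_get_info and describe are made-up names, and the struct is not the real ACPI_DEVICE_INFO layout). It only shows a caller that treats a NULL result like an error, so that a lookup reporting success with a NULL pointer ends up as "unknown" rather than a NULL dereference. This is not the fix proposed in this report, just a way to see the problem.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for ACPI_DEVICE_INFO; not the real layout. */
struct fake_info {
    unsigned int valid;
    const char *hid;
};

/*
 * Hypothetical lookup that mimics the behaviour reported here:
 * it returns 0 ("success") but leaves *out set to NULL.
 */
static int
fake_get_info(struct fake_info **out)
{
    *out = NULL;
    return (0);
}

/* Format a description, treating a NULL result like an error. */
static void
describe(char *buf, size_t buflen)
{
    struct fake_info *info = NULL;
    int error;

    error = fake_get_info(&info);
    if (error != 0 || info == NULL)
        snprintf(buf, buflen, "unknown");
    else
        snprintf(buf, buflen, "_HID=%s",
            info->valid ? info->hid : "none");
}
```

Without the extra info == NULL test, describe() would read info->valid through a NULL pointer, which is the same pattern as in the backtrace.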

So the value of "error" is not set correctly. The reason is in AcpiGetObjectInfo, in sys/contrib/dev/acpica/nsxfname.c:

Node = AcpiNsMapHandleToNode (Handle);
if (!Node)
{
    (void) AcpiUtReleaseMutex (ACPI_MTX_NAMESPACE);
    goto Cleanup;
}
(...)
Cleanup:
    ACPI_FREE (Info);
    if (CidList)
    {
        ACPI_FREE (CidList);
    }
    return (Status);

If AcpiNsMapHandleToNode fails, we release the mutex and go to Cleanup:, which does not update Status before returning.
Status therefore still holds the value returned by the earlier call to AcpiUtAcquireMutex, which is wrong: the caller sees success although no object info was returned.

Setting Status to AE_BAD_PARAMETER before going to Cleanup fixes the issue (I found that AE_BAD_PARAMETER is used elsewhere in the kernel in similar flows where AcpiNsMapHandleToNode is called).
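
To make the stale status easier to see, here is a minimal user-space sketch of the same control flow. All names (enum status, acquire_lock, release_lock, map_handle, get_info_buggy, get_info_fixed) are hypothetical and the status values are not the real ACPICA ones. It only mirrors the shape of the code quoted from nsxfname.c, under the assumption that the lock acquisition succeeded before the handle lookup failed.

```c
#include <stddef.h>

/* Hypothetical status codes; the real ACPICA values differ. */
enum status { ST_OK = 0, ST_BAD_PARAMETER = 1 };

/* Hypothetical helpers standing in for the mutex and the handle lookup. */
static enum status acquire_lock(void) { return (ST_OK); }
static void release_lock(void) { }
static void *map_handle(void *handle) { return (handle); }

/* Buggy flow: a failed lookup leaves the status from acquire_lock(). */
static enum status
get_info_buggy(void *handle, void **out)
{
    enum status st;
    void *node;

    *out = NULL;
    st = acquire_lock();
    if (st != ST_OK)
        return (st);
    node = map_handle(handle);
    if (node == NULL) {
        release_lock();
        goto cleanup;               /* st is still ST_OK here */
    }
    release_lock();
    *out = node;
cleanup:
    return (st);
}

/* Fixed flow: set an error status before jumping to cleanup. */
static enum status
get_info_fixed(void *handle, void **out)
{
    enum status st;
    void *node;

    *out = NULL;
    st = acquire_lock();
    if (st != ST_OK)
        return (st);
    node = map_handle(handle);
    if (node == NULL) {
        release_lock();
        st = ST_BAD_PARAMETER;      /* same idea as the attached patch */
        goto cleanup;
    }
    release_lock();
    *out = node;
cleanup:
    return (st);
}
```

With a NULL handle, get_info_buggy() reports ST_OK while leaving *out NULL, whereas get_info_fixed() reports ST_BAD_PARAMETER, so a caller that only checks the status takes its error path.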

Releases 7.0 through 7.3 are affected; a patch is attached.

Hope I'm right :)

Patch attached with submission follows:

--- src/sys/contrib/dev/acpica/nsxfname.c.orig  2010-11-14 20:51:57.000000000 +0100
+++ src/sys/contrib/dev/acpica/nsxfname.c       2010-11-14 20:50:46.000000000 +0100
@@ -361,6 +361,7 @@
     if (!Node)
     {
         (void) AcpiUtReleaseMutex (ACPI_MTX_NAMESPACE);
+        Status = AE_BAD_PARAMETER;
         goto Cleanup;
     }
 


>Release-Note:
>Audit-Trail:
>Unformatted:


