Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 14 Jun 2004 15:11:52 GMT
From:      Andriy Gapon <avg@icyb.net.ua>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/67940: acpi kernel trap on 4.10-release
Message-ID:  <200406141511.i5EFBp75056031@www.freebsd.org>
Resent-Message-ID: <200406141520.i5EFKJWV006127@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help

>Number:         67940
>Category:       kern
>Synopsis:       acpi kernel trap on 4.10-release
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jun 14 15:20:19 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Andriy Gapon
>Release:        4.10-RELEASE
>Organization:
>Environment:
FreeBSD 4.10-RELEASE i386
device acpica 
>Description:
With device acpica in kernel I experience kernel traps approx. 30 sec after boot. This happens 100% of time both in multi-user and single-user modes. Crash dump follows:

IdlePTD at physical address 0x0041f000
initial pcb at physical address 0x00333100
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x70
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0193f30
stack pointer           = 0x10:0xc030b6d0
frame pointer           = 0x10:0xc030b6f4
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = net tty bio cam
trap number             = 12
panic: page fault

syncing disks...
done
Uptime: 34s

dumping to dev #ad/0x20021, offset 295923
dump ata2: resetting devices .. ata2: mask=03 ostat0=50 ostat2=00
ad4: ATAPI 00 00
ata2-slave: ATAPI 00 00
ata2: mask=03 stat0=50 stat1=00
ad4: ATA 01 a5
ata2: devices=01
ad4: success setting PIO4 on generic chip
done
255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238
237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220
219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202
201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184
183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166
165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148
147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130
129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112
111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92
91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68
67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44
43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20
19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

#0  dumpsys () at /export/src/sys/kern/kern_shutdown.c:487
487             if (dumping++) {
(kgdb) bt
#0  dumpsys () at /export/src/sys/kern/kern_shutdown.c:487
#1  0xc0190dd8 in boot (howto=256) at
/export/src/sys/kern/kern_shutdown.c:316
#2  0xc019120c in poweroff_wait (junk=0xc03037ac, howto=-1070583121) at
/export/src/sys/kern/kern_shutdown.c:595
#3  0xc02ac1eb in trap_fatal (frame=0xc030b690, eva=112) at
/export/src/sys/i386/i386/trap.c:974
#4  0xc02abea5 in trap_pfault (frame=0xc030b690, usermode=0, eva=112) at
/export/src/sys/i386/i386/trap.c:867
#5  0xc02aba7b in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16,
tf_edi = 256, tf_esi = -1058817664, tf_ebp = -1070549260, tf_isp =
-1070549316,
      tf_ebx = 0, tf_edx = 765930, tf_ecx = -1070549208, tf_eax = -1,
tf_trapno = 12, tf_err = 0, tf_eip = -1072087248, tf_cs = 8, tf_eflags =
66118,
      tf_esp = -1058817664, tf_ss = 0}) at
/export/src/sys/i386/i386/trap.c:466
#6  0xc0193f30 in tsleep (ident=0xc0e3b980, priority=256,
wmesg=0xc02cf8f3 "acsem", timo=0) at /export/src/sys/kern/kern_synch.c:436
#7  0xc015cb13 in AcpiOsWaitSemaphore (Handle=0xc0e3b980, Units=1,
Timeout=65535) at /export/src/sys/dev/acpica/Osd/OsdSynch.c:240
#8  0xc014f167 in AcpiUtAcquireMutex (MutexId=0) at
/export/src/sys/contrib/dev/acpica/utmisc.c:844
#9  0xc0143fbb in AcpiExEnterInterpreter () at
/export/src/sys/contrib/dev/acpica/exutils.c:166
#10 0xc0145de9 in AcpiNsExecuteControlMethod (MethodNode=0xc266d5a0,
Params=0x0, ReturnObjDesc=0xc030b790) at
/export/src/sys/contrib/dev/acpica/nseval.c:521
#11 0xc0145d84 in AcpiNsEvaluateByHandle (Handle=0xc266d5a0, Params=0x0,
ReturnObject=0xc030b814) at /export/src/sys/contrib/dev/acpica/nseval.c:409
#12 0xc0145c55 in AcpiNsEvaluateRelative (Handle=0xc266d5e0,
Pathname=0xc030b89c "_AC0", Params=0x0, ReturnObject=0xc030b814)
    at /export/src/sys/contrib/dev/acpica/nseval.c:225
#13 0xc0147321 in AcpiEvaluateObject (Handle=0xc266d5e0,
Pathname=0xc030b89c "_AC0", ExternalParams=0x0, ReturnBuffer=0xc030b83c)
    at /export/src/sys/contrib/dev/acpica/nsxfeval.c:357
#14 0xc0156081 in acpi_EvaluateInteger (handle=0xc266d5e0,
path=0xc030b89c "_AC0", number=0xc266ad30) at
/export/src/sys/dev/acpica/acpi.c:1119
#15 0xc015b9c8 in acpi_tz_getparam (sc=0xc266ad00, node=0xc030b89c
"_AC0", data=0xc266ad30) at /export/src/sys/dev/acpica/acpi_thermal.c:636
#16 0xc015b273 in acpi_tz_establish (sc=0xc266ad00) at
/export/src/sys/dev/acpica/acpi_thermal.c:311
#17 0xc015c8a0 in AcpiOsExecuteQueue (arg=0xc2729e40, pending=1) at
/export/src/sys/dev/acpica/Osd/OsdSchedule.c:234
#18 0xc01a004a in taskqueue_run (queue=0xc0e3b9e0) at
/export/src/sys/kern/subr_taskqueue.c:186
#19 0xc01a0086 in taskqueue_swi_run () at
/export/src/sys/kern/subr_taskqueue.c:202
#20 0xc02a14f0 in splz_swi ()

(kgdb) fr 6
#6  0xc0193f30 in tsleep (ident=0xc0e3b980, priority=256,
wmesg=0xc02cf8f3 "acsem", timo=0) at /export/src/sys/kern/kern_synch.c:436
436                     return (0);
(kgdb) i loc
ident = (void *) 0xc0e3b980
wmesg = 0x0
p = (struct proc *) 0xffffffff
s = -1
sig = -1058817664
catch = 256
thandle = {callout = 0x0}
(kgdb) fr 7
#7  0xc015cb13 in AcpiOsWaitSemaphore (Handle=0xc0e3b980, Units=1,
Timeout=65535) at /export/src/sys/dev/acpica/Osd/OsdSynch.c:240
240             rv = msleep(as, &as->as_mtx, PCATCH, "acsem", tmo);
(kgdb) i loc
Timeout = 65535
as = (struct acpi_semaphore *) 0xc0e3b980
result = 0
rv = 0
tmo = 0
timeouttv = {tv_sec = 1086868951, tv_usec = 765930}
currenttv = {tv_sec = 1086868885, tv_usec = 765930}
timelefttv = {tv_sec = -1058817856, tv_usec = 6292480}
s = 6292480

>From this debug information it seems that trap actually
occured at kern_synch.c:444
	if (p->p_wchan != NULL)
because p assigned from curpoc was equal to NULL - curpoc is equal to
NULL in the dump and p_wchan field has offset 0x70 in struct proc.
I believe that cuproc being NULL is a correct thing because in 4.10 ACPI
tasks are executed from SWI task queue i.e. from software interrupt
handler, as a part of kernel. But I may be mistaken here.
Then, I think, the call to tsleep i.e the need to wait on semaphore
resulted from the fact that some ACPI Thermal routines are executed both
from ACPI taskqueue and a separate ACPI Thermal kernel thread, so
probabalistically there can be "parallel" access to the same resources.
>How-To-Repeat:
      it happens here 100% of time, may depend on your particular hardware system e.g. if/when it issues thermal notification and parameters of thermal polling.
>Fix:
      I see 3 obvious ways of making this work:
1. change tsleep() to handle curproc == NULL
2. run ACPI tasks on kernel thread taskqueue instead of SWI taskqueue,
so that there would be kernel process to run them on
3. somehow make sure that there is no parallel execution of ACPI tasks,
so that tsleep() is never called (eliminate ACPI Thermal kthread)

Having no knowledge in ACPI internals and very little knowledge in
kernel internals I can not dare suggest what would be most appropriate
to do. Nevertheless, it seems that (1) would be too intrusive and
global; (3) is complex and might not be too reliable; (2) seems to be
the easiest, one line change from "taskqueue_swi" to "taskqueue_thread"
in OsdSchedule.c:
--- OsdSchedule.c.orig  Mon Jun 14 17:59:42 2004
+++ OsdSchedule.c       Mon Jun 14 18:00:07 2004
@@ -192,7 +192,7 @@
     TASK_INIT(&at->at_task, pri, AcpiOsExecuteQueue, at);
 
 #if __FreeBSD_version < 500000
-    taskqueue_enqueue(taskqueue_swi, (struct task *)at);
+    taskqueue_enqueue(taskqueue_thread, (struct task *)at);
 #else
     taskqueue_enqueue(taskqueue_acpi, (struct task *)at);
 #endif

The above patch fixed the problem described for me and does not seem to have introduced any problems.
>Release-Note:
>Audit-Trail:
>Unformatted:



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200406141511.i5EFBp75056031>