Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 10 Jan 2008 07:53:16 -0800
From:      "Moore, Robert" <robert.moore@intel.com>
To:        "Yousif Hassan" <yousif@alumni.jmu.edu>, "Nate Lawson" <nate@root.org>, "Alexey Starikovskiy" <astarikovskiy@suse.de>
Cc:        freebsd-acpi@FreeBSD.org
Subject:   RE: GPE handler livelock [fixed?]
Message-ID:  <B28E9812BAF6E2498B7EC5C427F293A403FE5D1C@orsmsx415.amr.corp.intel.com>
In-Reply-To: <F32E1CFEDB514314B86828FDDB1300B4@kamino>
References:  <200712311556.lBVFuVZf030567@freefall.freebsd.org><477916E0.2090702@root.org><200712311243.18123.jhb@freebsd.org><47802510.3040203@root.org><B2D86EF24E7E4BEF90BA00C945189463@kamino><B28E9812BAF6E2498B7EC5C427F293A403F607B4@orsmsx415.amr.corp.intel.com><47826AAA.6040101@root.org><B28E9812BAF6E2498B7EC5C427F293A403F6087A@orsmsx415.amr.corp.intel.com><47827291.60405@suse.de> <478272C3.5080704@suse.de><47829C03.3010802@root.org> <4782A051.3050107@suse.de><4782BA24.8030703@root.org> <F32E1CFEDB514314B86828FDDB1300B4@kamino>

next in thread | previous in thread | raw e-mail | index | archive | help
This is great news.

Special thanks to Alexey. We've been going around and around for almost
2 years attempting to solve these kinds of notify/GPE issues. It sounds
like we may have solved the issue(s).

We will of course integrate the changes that were made to the core
ACPICA code.

Bob


>-----Original Message-----
>From: owner-freebsd-acpi@freebsd.org [mailto:owner-freebsd-
>acpi@freebsd.org] On Behalf Of Yousif Hassan
>Sent: Thursday, January 10, 2008 4:57 AM
>To: Nate Lawson; Alexey Starikovskiy
>Cc: freebsd-acpi@FreeBSD.org; Moore, Robert
>Subject: Re: GPE handler livelock [fixed?]
>
>Nate, everyone:
>
>Well guess what?  The patch appears to WORK!!!!
>
>I tested it first with a normal startup with ACPI enabled, and it
didn't
>freeze.
>Not satisfied, I turned on verbose booting to watch the thermal change
>messages,
>and in fact, for the first time, 7.0 successfully changed AC states and
TZ
>zones
>on this laptop.
>
>Still not satisfied, I began some tasks that generate lots of heat
(say,
>compiling the kernel), and watched as the temperature rose.  No freeze.
>
>So, I'd like to give a huge thank-you to Nate and the gang, and
cautiously,
>I'd like to say congratulations, too.
>
>I'm happy to run any additional tests you need to confirm.
>
>Is this fix too late for 7.0 or is it a candidate for an RC?  I'd
>definitely
>be
>willing to test any new release w/ this fix to make sure it works
>out-of-the-box.
>
>--Yousif
>
>----- Original Message -----
>From: "Nate Lawson" <nate@root.org>
>To: "Alexey Starikovskiy" <astarikovskiy@suse.de>
>Cc: "Moore, Robert" <robert.moore@intel.com>; "Yousif Hassan"
><yousif@alumni.jmu.edu>; <freebsd-acpi@FreeBSD.org>
>Sent: Monday, January 07, 2008 6:47 PM
>Subject: Re: GPE handler livelock
>
>
>> This makes sense.  For the _L00 method in question, it runs 2
notifies
>> before it completes.  So in our queue, that would be:
>>
>> T1: GPE:_L00
>>    [_LOO method runs, GPE removed]
>>    [_L00 queues 2 more Notifies]
>> T2: Notify, Notify
>>    [_L00 completes]
>>
>> Right now, there is nothing in our code to synchronize _L00's
completion
>> with completion of the Notifys.  In fact, _L00 finishes before the
>> Notifys run.  Also, if another GPE comes in while this GPE is
running,
>> it will be queued in front of the Notifys and so would execute first.
>>
>> So I think with just your evgpe.c change, we will defer re-enabling
the
>> GPE until after the queued Notifys complete since the re-enable task
>> will be queued at the same priority.
>>
>> The only remaining concern I have is if another GPE comes in before
one
>> or both Notifys run, it will be inserted at the head of the queue.
So
>> as a hack, I'm setting the priority of these to be equal.
>>
>> Yousif, please try the attached patch and don't load a custom ASL.
>>
>> Thanks,
>> Nate
>>
>> Alexey Starikovskiy wrote:
>>> Nate,
>>>
>>> There are no debugger events in Linux, and all other deferred
execution
>>> happens
>>> in kacpid (GPE, EC and GL included). Notify events used to be
executed
>>> on this same
>>> queue until we noticed deadlocks with HP machines. All events are
not
>>> prioritized in
>>> any way -- it is a simple FIFO. To avoid deadlock, we moved notify
>>> events to separate queue,
>>> but it had a drawback of enabling level GPE too early, thus I
inserted a
>>> reschedule call to each completion on first queue, giving the notify
>>> queue chance to complete.
>>> Later, I was reminded that this approach is not bulletproof, so
enabling
>>> of the level events was
>>> moved to notify queue as well. As it happens after all notify events
for
>>> the gpe event were called,
>>> (but, probably, not executed), enabling of GPE will be deferred
until
>>> these notify events have chance to
>>> complete.
>>>
>>> So, essentially, we had no priority for any event, but now notify
event
>>> could preempt execution of
>>> any other event, and level GPE event does a flush of notify queue.
>>>
>>> Hope this helps,
>>> Alex.
>>>
>>> Nate Lawson wrote:
>>>> Alex,
>>>>
>>>> I had one question about your approach.  It maintains two
single-thread
>>>> task queues (kacpid and kacpi_notify).  It inserts each type of
event
>on
>>>> its own queue.  So there is no strict ordering of handling notifies
in
>>>> priority to other acpi tasks unless you're assuming something about
the
>>>> linux task priority model.  Do you have any expectation that notify
>>>> tasks run before other tasks (perhaps by a special priority
assigned to
>>>> the kacpi_notify work queue)?
>>>>
>>>> In FreeBSD, we have a single task queue.  However, we prioritize
events
>>>> in the queue in the following order (highest to lowest priority):
>>>>
>>>> * GPE
>>>> * EC/global lock
>>>> * Notify
>>>> * Debugger
>>>>
>>>> If an event is inserted on the queue with a higher priority and a
>>>> previous event has not started executing yet, this priority
determines
>>>> the order of insertion.  Thus if GPEs keep arriving, the Notify
won't
>be
>>>> executed until they're done.
>>>>
>>>> Thanks,
>>>> Nate
>>>>
>>>> Alexey Starikovskiy wrote:
>>>>> Here is the patch...
>>>>> Alexey Starikovskiy wrote:
>>>>>> I proposed this patch some time ago, it moves _Lxx enabling to
the
>end
>>>>>> of Notify queue, thus all notifies must complete before event
becomes
>>>>>> enabled again.
>>>>>> Hope it is readable to non-Linux people...
>>>>>>
>>>>>> Regards,
>>>>>> Alex.
>>>>>>
>>>>>> Moore, Robert wrote:
>>>>>>> No changes that I know of before 20070508.
>>>>>>>
>>>>>>> You'll need to figure out why you are getting another GPE before
the
>>>>>>> _Lxx method completes. There was something like this on Linux
with
>>>>>>> an HP
>>>>>>> machine, perhaps Alexey can help.
>>>>>>>
>>>>>>> As I recall, there was something nasty happening where the TZ
trip
>>>>>>> points had to be reset before the Notify() handler completed,
but
>>>>>>> this
>>>>>>> ended up causing another GPE, etc. etc.
>>>>>>>
>>>>>>> Bob
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Nate Lawson [mailto:nate@root.org]
>>>>>>>> Sent: Monday, January 07, 2008 10:09 AM
>>>>>>>> To: Moore, Robert
>>>>>>>> Cc: Yousif Hassan; freebsd-acpi@FreeBSD.org
>>>>>>>> Subject: Re: GPE handler livelock
>>>>>>>>
>>>>>>>> Bob, thanks for the reply.  That's exactly what my
investigation is
>>>>>>>> showing also.  It appears we're still on 20070320 so I'm not
sure
>>>>>>>> why
>>>>>>>> this would affect us though.  Perhaps a similar change was
already
>>>>>>>> present?  In any case, we should see if an import fixes this.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Nate
>>>>>>>>
>>>>>>>> Moore, Robert wrote:
>>>>>>>>> This sounds suspiciously like the changes we made to the
Notify()
>>>>>>>>> handling last year. We attempted to make the notify handler
run
>>>>>>>>> synchronously with the caller to Notify(), but this created
more
>>>>>>>>> problems than it solved. We ended up returning the behavior of
>>>>>>>>> Notify
>>>>>>>>> handlers to be asynchronous:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 19 October 2007. Summary of changes for version 20071019:
>>>>>>>>>
>>>>>>>>> 1) ACPI CA Core Subsystem:
>>>>>>>>>
>>>>>>>>> Reverted a change to Notify handling that was introduced in
>version
>>>>>>>>> 20070508. This version changed the Notify handling from
>>>>>>>>> asynchronous
>>>>>>> to
>>>>>>>>> fully synchronous (Device driver Notify handling with respect
to
>>>>>>>>> the
>>>>>>>>> Notify
>>>>>>>>> ASL operator). It was found that this change caused more
problems
>>>>>>> than
>>>>>>>>> it
>>>>>>>>> solved and was removed by most users.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: owner-freebsd-acpi@freebsd.org [mailto:owner-freebsd-
>>>>>>>>>> acpi@freebsd.org] On Behalf Of Yousif Hassan
>>>>>>>>>> Sent: Sunday, January 06, 2008 12:18 PM
>>>>>>>>>> To: Nate Lawson
>>>>>>>>>> Cc: freebsd-acpi@FreeBSD.org
>>>>>>>>>> Subject: Re: GPE handler livelock
>>>>>>>>>>
>>>>>>>>>> Nate wrote:
>>>>>>>>>>> Thanks for digging into this.  I reviewed this and am trying
to
>>>>>>>>> figure
>>>>>>>>>>> out why the _L00 handler never completes.  It keeps getting
>>>>>>> preempted
>>>>>>>>> by
>>>>>>>>>>> the next one.  To help track this down, try removing these
two
>>>>>>> lines
>>>>>>>>>>> from the _L00 method and recompile your ASL:
>>>>>>>>>>>
>>>>>>>>>>>    Acquire (\_TZ.C173, 0xFFFF)
>>>>>>>>>>>    ...
>>>>>>>>>>>    Release (\_TZ.C173)
>>>>>>>>>>>
>>>>>>>>>>> For others who have this problem, instructions on how to
>>>>>>>>>>> recompile
>>>>>>>>> and
>>>>>>>>>>> load your custom ASL can be found here (11.16.4 and 5):
>>>>>>>>>>>
>>>>>>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/acpi-
>debug.htm
>>>>>>>
>>>>>>>>> l
>>>>
>>
>>
>
>
>-----------------------------------------------------------------------
----
>-----
>
>
>> Index: sys/dev/acpica/Osd/OsdSchedule.c
>> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>> RCS file: /home/ncvs/src/sys/dev/acpica/Osd/OsdSchedule.c,v
>> retrieving revision 1.39
>> diff -u -r1.39 OsdSchedule.c
>> --- sys/dev/acpica/Osd/OsdSchedule.c 22 Mar 2007 18:16:41 -0000 1.39
>> +++ sys/dev/acpica/Osd/OsdSchedule.c 7 Jan 2008 23:46:47 -0000
>> @@ -114,7 +114,7 @@
>>  pri =3D 5;
>>  break;
>>     case OSL_NOTIFY_HANDLER:
>> - pri =3D 3;
>> + pri =3D 10; // XXX testing
>>  break;
>>     case OSL_DEBUGGER_THREAD:
>>  pri =3D 0;
>> Index: sys/contrib/dev/acpica/evgpe.c
>> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>> RCS file: /home/ncvs/src/sys/contrib/dev/acpica/evgpe.c,v
>> retrieving revision 1.1.1.12
>> diff -u -r1.1.1.12 evgpe.c
>> --- sys/contrib/dev/acpica/evgpe.c 22 Mar 2007 17:23:30 -0000
1.1.1.12
>> +++ sys/contrib/dev/acpica/evgpe.c 7 Jan 2008 21:17:28 -0000
>> @@ -123,6 +123,10 @@
>>
>> /* Local prototypes */
>>
>> +static void
>> +AcpiEvAsynchEnableGpe (
>> +    void                    *Context);
>> +
>> static void ACPI_SYSTEM_XFACE
>> AcpiEvAsynchExecuteGpeMethod (
>>     void                    *Context);
>> @@ -684,14 +688,26 @@
>>         }
>>     }
>>
>> -    if ((LocalGpeEventInfo.Flags & ACPI_GPE_XRUPT_TYPE_MASK) =3D=3D
>> +    /* Defer enabling of GPE until all notify handlers are done */
>> +    AcpiOsExecute(OSL_NOTIFY_HANDLER, AcpiEvAsynchEnableGpe,
>> GpeEventInfo);
>> +    return_VOID;
>> +}
>> +
>> +static void
>> +AcpiEvAsynchEnableGpe (
>> +    void                    *Context)
>> +{
>> +    ACPI_GPE_EVENT_INFO     *GpeEventInfo =3D (void *) Context;
>> +    ACPI_STATUS             Status;
>> +
>> +    if ((GpeEventInfo->Flags & ACPI_GPE_XRUPT_TYPE_MASK) =3D=3D
>>             ACPI_GPE_LEVEL_TRIGGERED)
>>     {
>>         /*
>>          * GPE is level-triggered, we clear the GPE status bit after
>>          * handling the event.
>>          */
>> -        Status =3D AcpiHwClearGpe (&LocalGpeEventInfo);
>> +        Status =3D AcpiHwClearGpe (GpeEventInfo);
>>         if (ACPI_FAILURE (Status))
>>         {
>>             return_VOID;
>> @@ -700,7 +716,7 @@
>>
>>     /* Enable this GPE */
>>
>> -    (void) AcpiHwWriteGpeEnableReg (&LocalGpeEventInfo);
>> +    (void) AcpiHwWriteGpeEnableReg (GpeEventInfo);
>>     return_VOID;
>> }
>>
>>
>
>_______________________________________________
>freebsd-acpi@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/freebsd-acpi
>To unsubscribe, send any mail to "freebsd-acpi-unsubscribe@freebsd.org"



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?B28E9812BAF6E2498B7EC5C427F293A403FE5D1C>