Date:      Wed, 1 Sep 2010 12:53:04 -0500
From:      Brandon Gooch <jamesbrandongooch@gmail.com>
To:        Alexander Motin <mav@freebsd.org>
Cc:        freebsd-hackers@freebsd.org, FreeBSD-Current <freebsd-current@freebsd.org>
Subject:   Re: One-shot-oriented event timers management
Message-ID:  <AANLkTikcHn-SXk7Rx6NcSQKtBh-Df-g2pzeU%2BzHor9Vy@mail.gmail.com>
In-Reply-To: <4C7E2E8A.3030709@FreeBSD.org>
References:  <4C7A5C28.1090904@FreeBSD.org> <20100830110932.23425932@ernst.jennejohn.org> <4C7B82EA.2040104@FreeBSD.org> <20100830121148.11926306@ernst.jennejohn.org> <20100831102918.4f5404cc@ernst.jennejohn.org> <4C7CC1DE.1080907@FreeBSD.org> <4C7E2E8A.3030709@FreeBSD.org>

On Wed, Sep 1, 2010 at 5:44 AM, Alexander Motin <mav@freebsd.org> wrote:
> Alexander Motin wrote:
>> Gary Jennejohn wrote:
>>> On Mon, 30 Aug 2010 12:11:48 +0200
>>> OK, this is purely anecdotal, but I'll report it anyway.
>>>
>>> I was running pretty much all day with the patched kernel and things
>>> seemed to be working quite well.
>>>
>>> Then, after about 7 hours, everything just stopped.
>>>
>>> I had gkrellm running and noticed that it updated only when I moved the
>>> mouse.
>>>
>>> This behavior leads me to suspect that the timer interrupts had stopped
>>> working and the mouse interrupts were causing processes to get scheduled.
>>>
>>> Unfortunately, I wasn't able to get a dump and had to hit reset to
>>> recover.
>>>
>>> As I wrote above, this is only anecdotal, but I've never seen anything
>>> like this before applying the patches.
>>
>> One-shot timers have one weak point: if for some reason a timer interrupt
>> gets lost, there will be nobody to reload the timer. Such cases
>> will probably require special attention. The same funny situation with a
>> mouse-driven scheduler also happens if the LAPIC timer dies when a
>> pre-Core-iX CPU goes to the C3 state.
>
> I have reproduced the problem locally. It happens more often when ticks
> are not stopped on idle, like in your original case (or if explicitly
> enabled by kern.eventtimer.idletick sysctl).
>
> I've made some changes to the HPET driver which, I hope, should fix
> interrupt losses there.
>
> Updated patch: http://people.freebsd.org/~mav/timers_oneshot6.patch
>
> Patch also includes some optimizations to reduce lock contention.
>
> Thanks for testing.
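
To make the failure mode described in the quoted text concrete, here is a toy model of it. This is only a sketch under stated assumptions, not kernel code: `simulate`, `lost_at`, and `external_events` are hypothetical names, time is in arbitrary integer units, and the only behavior modeled is that a periodic timer reloads itself in hardware while a one-shot timer is re-armed by its own handler (or, once dead, by whatever other interrupt happens to run next, such as the mouse).

```python
def simulate(mode, period, horizon, lost_at, external_events=()):
    """Return the times at which the timer handler actually ran.

    mode            -- "periodic" or "oneshot" (hypothetical labels)
    period          -- interval between timer events
    horizon         -- stop simulating after this time
    lost_at         -- set of times at which the timer interrupt is lost
    external_events -- times of unrelated interrupts (e.g. mouse input)
    """
    handled = []
    pending_external = sorted(external_events)
    next_fire = period  # the timer is armed once at t = 0

    while True:
        next_ext = pending_external[0] if pending_external else None
        candidates = [t for t in (next_fire, next_ext)
                      if t is not None and t <= horizon]
        if not candidates:
            break
        now = min(candidates)

        if now == next_fire:
            delivered = now not in lost_at
            if delivered:
                handled.append(now)
            if mode == "periodic" or delivered:
                # Periodic hardware reloads itself; a one-shot timer is
                # re-armed only because its handler ran.
                next_fire = now + period
            else:
                # One-shot and the interrupt was lost: nobody reloads it.
                next_fire = None

        if now == next_ext:
            pending_external.pop(0)
            if mode == "oneshot" and next_fire is None:
                # Some other interrupt wakes the system and re-arms the timer.
                next_fire = now + period

    return handled


print(simulate("periodic", 10, 100, {30}))       # [10, 20, 40, 50, 60, 70, 80, 90, 100]
print(simulate("oneshot", 10, 100, {30}))        # [10, 20]
print(simulate("oneshot", 10, 100, {30}, [55]))  # [10, 20, 65, 75, 85, 95]
```

In this model the periodic timer merely skips one tick, while the one-shot timer goes silent after the lost interrupt and only resumes once an external event arrives. That matches the "updates only when I move the mouse" symptom reported earlier in the thread.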

This latest patch causes an interrupt storm with the HPET timer on my
system. The machine took about 8 minutes to boot and bring me to a
login prompt. System interactivity (i.e. input from keyboard, output
on console) was fine, but after checking the output of
`systat -vmstat 1`, I saw the interrupt rate on each HPET entry was over 120k!

Can I provide any useful detail? Of course, test patches are always welcome :)

-Brandon


