Date:      Fri, 04 Nov 2005 03:59:32 +0600
From:      Victor Snezhko <snezhko@indorsoft.ru>
To:        freebsd-current@freebsd.org
Cc:        Max Laier <max@love2party.net>
Subject:   Re: CURRENT + amd64 + user-ppp = panic
Message-ID:  <uk6fptmt7.fsf@indorsoft.ru>
In-Reply-To: <200511031500.00839.jhb@freebsd.org> (John Baldwin's message of "Thu, 3 Nov 2005 14:59:59 -0500")
References:  <20051027022313.R675@kushnir1.kiev.ua> <200511030059.05946.max@love2party.net> <uek5xv8d4.fsf@indorsoft.ru> <200511031500.00839.jhb@freebsd.org>

John Baldwin <jhb@freebsd.org> writes:

>> (kgdb) up 11
>> #11 0xc066e0c2 in softclock (dummy=0x0) at /usr/src/sys/kern/kern_timeout.c:220
>> 220				if (c->c_time != curticks) {
>> (kgdb) list
>> 215			curticks = softticks;
>> 216			bucket = &callwheel[curticks & callwheelmask];
>> 217			c = TAILQ_FIRST(bucket);
>> 218			while (c) {
>> 219				depth++;
>> 220				if (c->c_time != curticks) {
>> 221					c = TAILQ_NEXT(c, c_links.tqe);
>> 222					++steps;
>> 223					if (steps >= MAX_SOFTCLOCK_STEPS) {
>> 224						nextsoftcheck = c;
>> (kgdb) print *bucket
>> $1 = {tqh_first = 0xc1891d80, tqh_last = 0xc1891d80}
>> (kgdb) print c
>> $2 = (struct callout *) 0xdeadc0de
>> (kgdb) print *(bucket->tqh_first)
>> $3 = {c_links = {sle = {sle_next = 0xdeadc0de}, tqe = {tqe_next =
>> 0xdeadc0de, tqe_prev = 0xdeadc0de}}, c_time = -559038242, c_arg =
>> 0xdeadc0de, c_func = 0xdeadc0de, c_mtx = 0xdeadc0de, c_flags = -559038242}
>> (kgdb) print steps
>> $4 = 1
>
> Well, from this it seems that a callout was freed while it was still on the 
> list.  Perhaps there is a case where callout_stop() isn't called.  Also, 
> callout_drain() should really be used.  If the callout function is rearming, 
> then it might have been running when callout_stop() is called, and it could 
> have rearmed itself and then been overwritten when it was freed.  In fact, 
> that is likely your problem.  You can try this patch, but there might be lock 
> order problems that would require the callout_drain() to happen later when 
> locks aren't held:
>
> Index: nd6.c
> ===================================================================
> RCS file: /usr/cvs/src/sys/netinet6/nd6.c,v
> retrieving revision 1.62
> diff -u -r1.62 nd6.c
> --- nd6.c       22 Oct 2005 05:07:16 -0000      1.62
> +++ nd6.c       3 Nov 2005 19:56:42 -0000
> @@ -398,7 +398,7 @@
>         if (tick < 0) {
>                 ln->ln_expire = 0;
>                 ln->ln_ntick = 0;
> -               callout_stop(&ln->ln_timer_ch);
> +               callout_drain(&ln->ln_timer_ch);
>         } else {
>                 ln->ln_expire = time_second + tick / hz;
>                 if (tick > INT_MAX) {

Hmmm, no, this patch didn't change anything for me. The same trap, the
same bucket full of garbage.
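
To make the symptom concrete, here is a minimal user-space sketch of how a
bucket ends up pointing at 0xdeadc0de. Everything named toy_* is hypothetical
and is not the kernel's callout API: the "free" just fills a static struct
with the 0xdeadc0de pattern (my assumption about what the debug allocator
does), and the rearming handler is simulated sequentially rather than on
another CPU. It only illustrates the failure mode; it does not tell us which
path in nd6 is responsible.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/queue.h>

/* Toy stand-in for struct callout (hypothetical, user space only). */
struct toy_callout {
	TAILQ_ENTRY(toy_callout) c_links;
	int32_t	c_time;
	int32_t	c_armed;
};
TAILQ_HEAD(toy_bucket, toy_callout);

/*
 * Mimic a debug allocator that trashes freed memory with 0xdeadc0de.
 * The storage stays valid, so looking at it afterwards is safe here.
 */
static void
toy_trash(struct toy_callout *c)
{
	uint32_t pat = 0xdeadc0de;
	unsigned char *p = (unsigned char *)c;
	size_t i;

	for (i = 0; i + sizeof(pat) <= sizeof(*c); i += sizeof(pat))
		memcpy(p + i, &pat, sizeof(pat));
}

/* Link the callout into the bucket, as arming a timer would. */
static void
toy_arm(struct toy_bucket *b, struct toy_callout *c, int32_t when)
{
	c->c_time = when;
	c->c_armed = 1;
	TAILQ_INSERT_TAIL(b, c, c_links);
}

/* Unlink the callout if it is currently armed. */
static void
toy_stop(struct toy_bucket *b, struct toy_callout *c)
{
	if (c->c_armed) {
		TAILQ_REMOVE(b, c, c_links);
		c->c_armed = 0;
	}
}

/* c_time of the first callout in the bucket, or 0 if the bucket is empty. */
static int32_t
toy_head_time(struct toy_bucket *b)
{
	struct toy_callout *c = TAILQ_FIRST(b);

	return (c != NULL ? c->c_time : 0);
}

/* Case 1: freed while still linked, so the bucket head is garbage. */
int32_t
toy_free_while_linked(void)
{
	static struct toy_callout c;
	struct toy_bucket b = TAILQ_HEAD_INITIALIZER(b);

	toy_arm(&b, &c, 100);
	toy_trash(&c);
	return (toy_head_time(&b));
}

/* Case 2: stopped, then freed, so the bucket is empty. */
int32_t
toy_stop_then_free(void)
{
	static struct toy_callout c;
	struct toy_bucket b = TAILQ_HEAD_INITIALIZER(b);

	toy_arm(&b, &c, 100);
	toy_stop(&b, &c);
	toy_trash(&c);
	return (toy_head_time(&b));
}

/* Case 3: stopped, handler rearms afterwards, then freed: garbage again. */
int32_t
toy_stop_rearm_free(void)
{
	static struct toy_callout c;
	struct toy_bucket b = TAILQ_HEAD_INITIALIZER(b);

	toy_arm(&b, &c, 100);
	toy_stop(&b, &c);
	toy_arm(&b, &c, 200);	/* simulated rearm from the handler */
	toy_trash(&c);
	return (toy_head_time(&b));
}
```

Cases 1 and 3 leave the bucket head with c_time = -559038242, which is
0xdeadc0de read as a signed 32-bit value, the same number kgdb printed above.
Case 2 leaves the bucket empty. Case 3 is the rearm scenario John described,
modelled in a single thread.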

Tomorrow I'll try to trace all callout-related operations in nd6
and/or the whole netinet6. 

If there are more ideas, I'll be happy to test them.

-- 
WBR, Victor V. Snezhko
EMail: snezhko@indorsoft.ru