Date:      Thu, 11 Jul 2002 10:21:27 -0400 (EDT)
From:      Mark Hennessy <mark@cloud9.net>
To:        <freebsd-stable@FreeBSD.ORG>
Subject:   Re: Cistron 1.6.6 freeze-ups (fwd)
Message-ID:  <20020711101547.A89350-100000@earl-grey.cloud9.net>

Hi all.  I'm using FreeBSD 4.5-RELEASE, and I was wondering whether the
following message makes sense to anyone here.

Basically, I'm running Cistron RADIUS 1.6.6, among other things, on my
FreeBSD box, and lately I've run into a problem where the
authentication process seems to die for no apparent reason.  This has
been happening every few hours for a few days now, and I was wondering if
anyone has advice on how to troubleshoot or resolve the problem further.
The only way to bring the process back to life is to kill radiusd and
restart it fresh.  The people on the Cistron list seem to think that it's
a UDP problem in FreeBSD but have no other ideas.

Any comments or ideas would be appreciated.

--
 Mark P. Hennessy					      mark@cloud9.net

---------- Forwarded message ----------
Date: Tue, 9 Jul 2002 18:55:44 +0200 (CEST)
From: Emile van Bergen <emile-cistron@evbergen.xs4all.nl>
Reply-To: cistron-radius@lists.cistron.nl
To: "cistron-radius@lists.cistron.nl" <cistron-radius@lists.cistron.nl>
Subject: Re: Cistron 1.6.6 freeze-ups

On Tue, 9 Jul 2002, Mark Hennessy wrote:

> > > Still nothing from ktrace.out (strace doesn't compile, so I use the
> > > included FreeBSD ktrace)
> >
> > What does ps show as the syscall the processes are sleeping on (look at
> > the 'wchan' column)? What's the process state? Are they stopped for
> > whatever reason, e.g. because of a failed trace attempt (state T)?
>
>
> UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
>   0 25033     1   0   2  0  1904 1052 sbwait Is    ??    0:10.64 /usr/local/sbin/radiusd <snip>
>   0 25034 25033   0   2  0  1776  996 select S     ??    0:00.68 /usr/local/sbin/radiusd <snip>

Hmm, it's in sbwait, which seems to be a kernel routine that is called
when recvfrom() blocks. But if the server were blocked there, it should
become runnable again as soon as something comes in. Packets do come in
(see below), yet the process never wakes up.
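
For comparison, the expected behaviour can be sketched in a few lines of
Python. This is only an illustration on the loopback interface, not
Cistron code; the function name datagram_wakes_reader is made up for the
example. On a healthy system the socket becomes readable as soon as a
datagram is queued, and recvfrom() returns it.

```python
import select
import socket


def datagram_wakes_reader(payload: bytes = b"ping", timeout: float = 2.0) -> bytes:
    """Queue one UDP datagram on loopback and read it back.

    Hypothetical helper for illustration: shows that a reader waiting on
    a UDP socket is woken once data sits in the socket's receive buffer.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Port 0 lets the kernel pick a free port for the test.
        receiver.bind(("127.0.0.1", 0))
        sender.sendto(payload, receiver.getsockname())

        # select() should report the socket readable once the datagram
        # has been queued; a timeout here would mirror the hang above.
        readable, _, _ = select.select([receiver], [], [], timeout)
        if not readable:
            raise TimeoutError("datagram sent but socket never became readable")

        data, _addr = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(datagram_wakes_reader(b"radius-test"))
```

If the receive queue fills while the reader stays asleep, as in the
output quoted here, something between the socket buffer and the process
wakeup is not working as this sketch assumes.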

I don't know if it's related, but there is a bug in NetBSD (you use
FreeBSD, I know) that had something to do with the kernel looping in
sbwait; see http://www.geocrawler.com/archives/3/495/2000/5/50/3799097/

> The server hasn't been bound to any specific outside interface on the
> command line.
>
> netstat -a | head -2 ; netstat -a | grep rad
>
> Active Internet connections (including servers)
> Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
> udp4       0      0  *.radacct              *.*
> udp4   14238      0  *.radius               *.*
         ^^^^^

This is an interesting number. Apparently the kernel receives the
packets all right (14238 bytes are sitting in the receive queue of the
radius socket), but doesn't pass them on to Cistron for whatever
reason.
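
To keep an eye on this between freezes, the Recv-Q column can be checked
mechanically. The following is a minimal sketch, assuming netstat output
in the five-column layout quoted above; udp_backlog and SAMPLE are names
invented for the example, and the sample text is copied from the output
in this message.

```python
def udp_backlog(netstat_output: str) -> dict:
    """Return {local address: Recv-Q} for UDP sockets with unread bytes.

    Assumes lines of the form:
        Proto Recv-Q Send-Q Local-Address Foreign-Address
    Header lines and non-UDP sockets are skipped.
    """
    backlog = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 5 or not fields[0].startswith("udp"):
            continue
        if not fields[1].isdigit():
            continue
        recv_q = int(fields[1])
        if recv_q > 0:
            backlog[fields[3]] = recv_q
    return backlog


# Sample taken from the netstat output quoted in this thread.
SAMPLE = """\
Active Internet connections (including servers)
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
udp4       0      0  *.radacct              *.*
udp4   14238      0  *.radius               *.*
"""

if __name__ == "__main__":
    print(udp_backlog(SAMPLE))
```

A Recv-Q that stays non-zero and keeps growing on *.radius while radiusd
sits in sbwait would match the symptom described here.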

It looks like you have a FreeBSD bug...?

Cheers,


Emile.

--
E-Advies / Emile van Bergen   |   e-advies@evbergen.xs4all.nl
tel. +31 (0)70 3906153        |   http://www.e-advies.info


