Date:      Thu, 21 Jul 2005 13:02:01 +0200
From:      Eirik Øverby <ltning@anduin.net>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        stable@FreeBSD.org, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: Serious issue with serial console in 5.4
Message-ID:  <86DDD9F6-A086-48E2-A5C5-1F5EA1C49354@anduin.net>
In-Reply-To: <20050721110222.U97888@fledge.watson.org>
References:  <20050721050048.GU22430@xor.obsecurity.org> <00DD4399-4317-4579-82C4-5B64AC3F800B@anduin.net> <20050721110222.U97888@fledge.watson.org>


On Jul 21, 2005, at 12:16 PM, Robert Watson wrote:

>
> On Thu, 21 Jul 2005, Eirik Øverby wrote:
>
>
>>>> The above panic will show up occasionally when logging out from a
>>>> serial console (i.e. ctrl-D, logout, exit, whatever). This is
>>>> EXTREMELY BAD, as it will crash an otherwise perfectly healthy
>>>> box at random - and renders the serial console useless.
>>>> Robert Watson confirmed this to be an issue on the 10th of April.
>>>>
>>> You might have to wait until 6.0-R since fixing it seems to
>>> require infrastructure changes that cannot easily be backported
>>> to 5.x.
>>>
>>
>> With all due respect - if this is (and I'm assuming it is, because
>> it happens on all the servers I'm serial-controlling) an
>> omnipresent problem on 5.x, I daresay it should warrant some more
>> attention. Having unsafe serial terminal support that can bring
>> down your system like that defeats much of the point of having
>> serial terminal support in the first place.
>>
>> However, since I seem to be the only one who has noticed this,
>> perhaps I'm the last person on earth to routinely use serial
>> terminal switches instead of KVM switches to do my admin work?
>>
>
> The concern about the 5.x backport is that it will break parts of
> the device driver ABI, and is a significant change that involves a
> lot of risk.
>
> Regarding the general prevalence of the problem -- I've seen a
> small number of people reporting it's a big problem.  Since I know
> of a great many people running with serial consoles (other than a
> workstation, I never run FreeBSD boxes any other way), this leads
> me to believe it's something that shows up in fairly specific
> conditions -- perhaps relating to precise timing of a race
> condition.  This means that if we introduce a generally
> destabilizing change, it may impact more people than the problem as
> it exists (a nasty trade-off).
>
> I've only seen the issue when logging out of a serial console
> session, and had previously hypothesized that it had to do with the
> simultaneous timing of a console message from syslog and the
> opening/closing of the console's tty due to logging out and getty
> restarting, resulting in a reference count improperly hitting zero.

I did indeed make some changes to my syslog configuration after
getting the serial consoles online. Your theory might not be entirely
off. Let me know if I should post my syslog.conf file or anything else
here or elsewhere...

Thanks,
/Eirik


> I thought Doug White had come up with a work-around patch that
> prevented the reference count from being allowed to hit 0 for the
> console by artificially elevating it, which would prevent the
> panic, so either (a) the work around wasn't committed, or (b) it
> didn't work.
>
> I can attempt to take another look at this problem in a week or so,
> but have a number of things I need to finish up for FreeBSD 6.0
> before then that will be occupying my time.
>
> Robert N M Watson
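
To make the reference-count idea in the quoted text concrete, here is a
minimal userland sketch in C. It is an illustration under stated
assumptions only: it is not FreeBSD kernel code and not Doug White's
patch, and every name in it (con_tty, con_hold, con_rele, con_pin,
logout_leaves_tty_usable) is hypothetical. It is also single-threaded, so
it shows only the end state the hypothesis describes (the count reaching
zero at logout), not the race itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a console tty; not a FreeBSD kernel structure. */
struct con_tty {
    int  refs;       /* number of current holders */
    bool torn_down;  /* set once the last reference has been dropped */
};

static void con_init(struct con_tty *t)
{
    t->refs = 0;
    t->torn_down = false;
}

/* Take a reference, e.g. a login session opening the tty. */
static void con_hold(struct con_tty *t)
{
    assert(!t->torn_down);  /* using a torn-down tty is the bug */
    t->refs++;
}

/* Drop a reference; dropping the last one tears the tty down. */
static void con_rele(struct con_tty *t)
{
    assert(t->refs > 0);
    if (--t->refs == 0)
        t->torn_down = true;
}

/* Assumed work-around: the console keeps one extra, permanent reference. */
static void con_pin(struct con_tty *t)
{
    con_hold(t);
}

/*
 * Simulate one login/logout cycle and report whether the tty is still
 * usable afterwards, i.e. whether a later console message could be
 * written without touching a torn-down object.
 */
static bool logout_leaves_tty_usable(bool pinned)
{
    struct con_tty t;

    con_init(&t);
    if (pinned)
        con_pin(&t);   /* count starts at 1 and never returns to 0 */
    con_hold(&t);      /* session opens the tty */
    con_rele(&t);      /* logout closes it */
    return !t.torn_down;
}
```

With the extra reference held, the logout drops the count from 2 to 1 and
the tty survives; without it, the count reaches 0 and the object is torn
down before getty reopens it.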



