Date:      Mon, 26 Feb 96 08:46 WET
From:      uhclem@nemesis.lonestar.org (Frank Durda IV)
To:        freebsd-bugs@freefall.freebsd.org
Subject:   Re: i386/1042: Warning from sio driver reports wrong device	FDIV045
Message-ID:  <m0tr4Bj-000DDbC@nemesis.lonestar.org>


[1]From: Bruce Evans <bde@zeta.org.au>
[1]The following reply was made to PR i386/1042; it has been noted by GNATS.

Well, I guess this additional input won't be in there...   :-)



[0]Ports sio0, sio2, sio3 connected to modems, sio1 not connected to anything
[0]                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[0]Feb 25 19:48:00 nemesis /kernel: sio1: 247 more tty-level buffer overflows (total 3100)
 
[0]Note that the system reports the problem on sio1, when there is nothing
[0]connected to that port.  The actual overrun probably occurred on sio0
[0]or sio3.
 
[1]This may be caused by sio1 picking up radiation from the other ports.
[1]It shouldn't occur if sio1 isn't open, however (then the UART may be
[1]kept busy by the radiation but the driver ignores it).  The radiation
[1]problem can usually be fixed by connecting the port to something (even
[1]something inactive).

This is not the case.  One of the WorldBlazers was connected to
sio1 but was not taking calls when the problem was noticed.  To simplify
the error report, the modems were moved off the ports to see if the
problem moved.   The error message did not follow the modems.
No matter what modem or modems were busy, the error message always
said sio1.
 

[1]The verbose error reporting can take long enough to interfere with the
[1]reception of further data :-(.  Errors were once reported every clock
[1]tick (the rc driver still does this) and slow machines take more than
[1]one clock tick to report an error so the first error triggered an
[1]endless cascade of errors.

The messages usually appear at the end of a call (including successful
calls), so that also seems unlikely.
 

[0]Another interesting thing is that the Cardinal modem is V.34 and receives
[0]compressed news at rates up to 3100CPS, but never appears to cause
[0]these overruns.  The Telebits (Turbo PEP or PEP) only manage between
[0]1600 and 2100 CPS and they do experience these overruns when the DTE
[0]is set to 57600.  There are no overruns when the Worldblazers are fixed
[0]at 38400.
 
[1]Do the Telebits honour flow control?

Yes.  Several types are supported.  The current configuration is:

WorldBlazer - SA - Version LA7.05C- Active Configuration
 B1  E0  L1  M0  Q2  T   V1  X2  Y0 
&C1 &D3 &G0 &J0 &L0 &Q1 &R3 &S1 &T4 &X0 
S000:1   S001=0   S002:128 S003=13  S004=10  S005=8   S006=2   S007=60 
S008=2   S009=6   S010=14  S011=70  S012=50  S018=0   S025=5   S026=1  
S038=0   S041:10  S045=0   S046=0   S047=4   S048:1   S050:255 S051:6  
S056=17  S057=19  S058:2   S059:15  S060=0   S061:0   S062=15  S063=0  
S064:1   S068=255 S069=0   S090=0   S092:1   S093=8   S094=1   S100=0  
S104=0   S105=1   S111:0   S112=1   S113=126 S114=0   S115=0   S116=0  
S119=0   S151=4   S155=0   S180=2   S181=1   S183=25  S190=1   S191=7  
S253=10  S254=255 S255=255 
OK

S68 == 255, which says to use the setting in S58.
S58 == 2, which selects full-duplex RTS/CTS flow control.  When RTS is
off, the modem does not send data to the local DTE.  When RTS is on,
the modem sends data to the local DTE.  When CTS is on, the modem can
accept data; when CTS is off, the modem cannot accept data.
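
As a rough illustration of the register check described above, the sketch
below pulls S-register values out of a fragment of the dump and tests only
the one combination stated in this message (S68 == 255 deferring to S58,
and S58 == 2 meaning full-duplex RTS/CTS).  The helper names
parse_registers and uses_rts_cts are hypothetical, and treating ':' and
'=' as equivalent separators is an assumption made for parsing purposes.

```python
import re

# Fragment of the register dump shown above (two rows are enough here).
DUMP = """\
S056=17  S057=19  S058:2   S059:15  S060=0   S061:0   S062=15  S063=0
S064:1   S068=255 S069=0   S090=0   S092:1   S093=8   S094=1   S100=0
"""

# Assumption: ':' and '=' both simply separate register number and value.
REGISTER = re.compile(r"S(\d{3})[:=](\d+)")


def parse_registers(text):
    """Return a dict mapping register number to its integer value."""
    return {int(num): int(val) for num, val in REGISTER.findall(text)}


def uses_rts_cts(regs):
    """True only for the case described in the message:
    S68 == 255 (defer to S58) and S58 == 2 (full-duplex RTS/CTS).
    Other register values are deliberately not interpreted."""
    return regs.get(68) == 255 and regs.get(58) == 2


regs = parse_registers(DUMP)
print("RTS/CTS selected:", uses_rts_cts(regs))
```

This checks only what the modem is configured to do.  It says nothing
about how quickly either side reacts to a change in RTS or CTS.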

 
[0]So the problems appear to be:
[0]1.	Faulty reporting of the guilty device in the kernel warning message.
[0]	It seems to always blame sio1 regardless of what lines are active.
 
[1]Probably not.

See above.  This is happening when a device is present but inactive.
No line noise.
 

[0]2.	There doesn't appear to be any documentation on what the kernel
[0]	error message is trying to report.
 
[1]See the sio man page.

OK, how do I "fix the application" (uucico), as the man page cryptically
suggests, when the application runs fine on other ports running at 57600
with faster DCE data rates (28.8 vs 24)?  If we have the equivalent of
TTYHOG or CLIST limits, where are they controlled?  (I have already
looked at the LINT kernel.)
 

[0]	Reducing the FIFO interrupt trigger did not help, implying a
[0]	different type of overrun in the kernel instead of a hardware FIFO
[0]	overrun.  Because PEP tends to return data in bursts of 64 bytes,
[0]	perhaps some software-based buffer is being overrun.
 
[1]The raw queue has a size of only 1024 at all baud rates so it is quite
[1]easy to overrun at high baud rates.  At 115200 bps, 1024 bytes may arrive
[1]in less than one process scheduling quantum (100 msec) so there the buffer
[1]is too small if there are 2 hog processes.  Flow control had better work.

The problem was repeatedly demonstrated with only one modem active.  No
other serial activity was seen.
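
The arithmetic behind the quoted 1024-byte raw queue figure can be
sketched as follows.  This is only a back-of-the-envelope calculation: it
assumes 8N1 framing (10 bits on the wire per byte) and a line that is
fully saturated, and the function name fill_time_ms is hypothetical.  The
1024-byte queue and 100 msec quantum are the figures quoted above.

```python
BITS_PER_CHAR = 10       # assumption: 8N1 framing (start + 8 data + stop)
RAW_QUEUE_BYTES = 1024   # raw queue size quoted in the thread
QUANTUM_MS = 100         # scheduling quantum quoted in the thread


def fill_time_ms(baud, queue_bytes=RAW_QUEUE_BYTES):
    """Milliseconds a saturated line needs to fill the queue."""
    chars_per_sec = baud / BITS_PER_CHAR
    return 1000.0 * queue_bytes / chars_per_sec


# DTE rates mentioned in this thread.
for baud in (38400, 57600, 115200):
    t = fill_time_ms(baud)
    print(f"{baud:>6} bps: {t:6.1f} ms to fill ({t / QUANTUM_MS:.2f} quanta)")
```

Under these assumptions the queue fills in roughly 89 ms at 115200 bps,
which is less than one quantum, and in roughly 178 ms at 57600 bps, which
is less than two.  At the observed 1600 to 2100 CPS the sustained rate is
far lower, so any overrun at 57600 would have to come from bursts rather
than from the average throughput.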

 
[0]	Since there appears to be code in sio.c that would detect overruns
[0]	in the hardware FIFO, report this  and lower the trigger value
[0]	automatically, either this code isn't working or this isn't the
[0]	type of overrun the kernel is trying to report.  Again, no
[0]	documentation.
 
[1]That code has almost always been disabled and doesn't exist in -current.
[1]It tended to drop the trigger level to 1 for transient errors.

Great. :-(  It was one of the few parts of the driver that wasn't
goofy.  The figure-out-what-type-of-UART-you-have code is a bad idea and
can freak out on many internal modems.  "Reserved" and "0" for
apparently-unused bits mean different things to different silicon vendors.

 
[0]3.	When the kernel message is displayed, it usually is displayed three
[0]	times in a row, all with the same timestamp.  It only appears once
[0]	in /var/log/messages.
 
[1]Messages are normally repeated for each root login.
 

Ok, all that said, flow control is working (although it may not respond
as fast as our take-it-to-the-edge scheme can stand), but it still sounds
like the serial buffer resources are too limited.   Geez, I was running
four modems like this on an 8MHz 68000 system years ago and not having this
problem.  


Frank Durda IV <uhclem@nemesis.lonestar.org>|"The Knights who say "LETNi"
or uhclem%nemesis@rwsystr.nkn.net           | demand...  A SEGMENT REGISTER!!!"
					    |"A what?"
or ...letni!rwsys!nemesis!uhclem	    |"LETNi! LETNi! LETNi!"  - 1983



