Date:      Wed, 09 Apr 1997 14:26:58 +1000
From:      Adrian Carter <adrian@apic.net>
To:        questions@freebsd.org
Subject:   Weird Network Behaviour 
Message-ID:  <3.0.1.32.19970409142658.00792100@mail.apic.net>

G'Day all.

Got a bit of a weird one here. We are an ISP running pretty much as a
FreeBSD house, on a mixture of 2.1.5 (only one machine, because it is
offsite and I haven't had the time to go onsite and upgrade it) and 2.1.7.

In the last 4 days the machine running 2.1.5 has been sporadically
'dropping data'. By that I mean, as an example, if you try to send a
message via that machine's SMTP port, communication works fine for the HELO,
MAIL FROM, and RCPT TO. The client then sends the DATA command, and
according to the sendmail logs, the machine replies with a 354 message and
awaits the data. However, the end-user program never sees the 354 message,
and eventually times out. If you use the same client machine, with the same
e-mail message, and a different SMTP host, it works fine.
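
For anyone who wants to reproduce this without a mail client in the way, the
exchange above can be walked step by step. What follows is only a minimal
sketch in Python, not anything we run in production: the function names
(read_reply, probe_smtp), the example.org addresses and the default timeout
are all made up for illustration. It sends HELO, MAIL FROM, RCPT TO and DATA,
and records the reply code for each step, or None if the reply never arrives
before the timeout.

```python
import socket


def read_reply(sock_file):
    """Read one SMTP reply (possibly multi-line) and return its numeric code."""
    while True:
        line = sock_file.readline()
        if not line:
            raise ConnectionError("server closed the connection")
        # Continuation lines look like "250-text"; the final line is "250 text".
        if line[3:4] != b"-":
            return int(line[:3])


def probe_smtp(host, port=25, timeout=30.0,
               sender="probe@example.org", rcpt="postmaster@example.org"):
    """Walk the SMTP dialogue up to DATA.

    Returns a list of (step, reply_code) pairs; reply_code is None for the
    step whose reply was never seen. Addresses are hypothetical placeholders.
    """
    steps = [
        ("banner", None),
        ("HELO", b"HELO probe.example.org\r\n"),
        ("MAIL FROM", b"MAIL FROM:<" + sender.encode("ascii") + b">\r\n"),
        ("RCPT TO", b"RCPT TO:<" + rcpt.encode("ascii") + b">\r\n"),
        ("DATA", b"DATA\r\n"),
    ]
    results = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock_file = sock.makefile("rb")
        for name, command in steps:
            if command is not None:
                sock.sendall(command)
            try:
                results.append((name, read_reply(sock_file)))
            except socket.timeout:
                # The command went out but no reply came back in time.
                results.append((name, None))
                break
    return results
```

On a healthy server the last entry would be ("DATA", 354); the symptom
described here would show up as ("DATA", None) while the sendmail log on the
server still claims the 354 was sent.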

So, it's a sendmail problem, methinks... so I recompile, reconfigure,
cajole, coax, and abuse sendmail for 2 days, all to no avail. During the
course of all this, it's discovered that rcp'ing any large file to the
machine also times out. In 90% of cases, small files and small e-mails
go through fine, but I don't see how size relates, because the CLIENT is
not seeing the server's 354, which occurs BEFORE it sends the data to the
server, so at the stage where it is 'breaking' the server is unaware of the
size of the message.

So it's not sendmail, me says. So on I go with a kernel recompile. No change.
Shut down every service on the machine except qpop and sendmail 8.8.5. No
change. Now the weirdest bit of all is, reboot the machine and it works OK
for a while, then 10 - 15 hours after rebooting, it starts doing it all again.

To throw another spanner into the works, it appears that another machine,
running 2.1.7-RELEASE, is doing a similar thing. Users are occasionally
reporting that they FTP in as a real user and get authenticated, but then
get a timeout when ls'ing. Similar things are happening with the web
services running on this machine. Access them directly on the LAN, not a
problem; however, remote users report that a lot of the time they get
broken pipe or timeout messages.

Whether the two problems are related I am not sure. They seem similar, but
the symptoms are just too diverse. I'm no un*x newbie, but it's got me
buggered as to what the problem is. The network cards have been
replaced, the drop cable replaced and even the RAM increased, and there is
only about a .1% collision rate on the ethernet segment. It's just so *WEIRD*
that it is so intermittent and appears to be related only to certain
services (qpop on the 2.1.5 machine with the sendmail problem works without
problems, even weeks after a reboot).

Any suggestions or ideas, however trivial, would be appreciated, as I have
pretty much exhausted all the tests I can think of, and the problem still
exists.

Thanks all

Adrian Carter
Sys Admin
The Asia Pacific Internet Company
--
*************************************************************************
*Adrian Carter                                	Email: adrian@apic.net	*
*Systems Administrator	 	              URL: http://www.apic.net/	*
*The Asia Pacific Internet Company Pty Ltd	Autoresp: info@apic.net	*
*        Internet Access, Web Housing, Mailing List Management		*
*           Phone: (+612) 9419-5133   Fax: (+612) 9419-5155			*
*************************************************************************


