Date: Mon, 14 Jul 2003 17:29:36 -0700
From: "Andrew Kinney" <andykinney@advantagecom.net>
To: John Fox <jjf@mind.net>, freebsd-hackers@freebsd.org
Subject: Re: Kernel panic when moving lots of data over network
Message-ID: <3F12E880.3304.39F3E27D@localhost>
In-Reply-To: <20030709175336.GF5200@mind.net>
On 9 Jul 2003, at 10:53, John Fox wrote:

> Strange problem on a new server we're setting up. It's very stable,
> except when moving a large amount of data onto it via the network. I
> begin moving approx 4GB of data onto it, and before the xfer can
> complete, the system panics and reboots. (I am generally able to get
> from 1 to 2 GB transferred before the panic occurs.)
>

I'm not really a kernel hacker, but I've solved lots of our own kernel problems on 4.5-RELEASE, 4.7-RELEASE, and 4.8-RELEASE with the help of others on this list.

We haven't had any problems exactly like what you described, but I seem to remember some open PRs relating to SSH and/or the xl network driver causing panics. You might want to browse through them and see if any match your situation.

FWIW, though, we run 4.8-RELEASE, SSH, and the xl driver (3Com 905C-TX, I believe) on one of our heavily used dual-CPU machines and don't have any problems, so I'd be surprised if any of those PRs had any bearing on this. We don't do any large file transfers over SSH, though. We usually use rsync for that since we deal with lots of little files that get out of sync easily.

> #6 0xc021745f in xl_newbuf ()
> #7 0xc021761e in xl_rxeof ()
> #8 0xc0219296 in xl_watchdog ()
> #9 0xc01b662f in if_slowtimo ()
> #10 0xc0180799 in softclock ()

Here are some slightly educated guesses that you'll want to eliminate until you isolate the trouble:

1. My experience is that a lot of "trap 12" panics seem to come from running out of some hard-limited kernel resource. Try logging the output of sysctl vm.zone once a minute through cron to see if you're bumping any of those limits. You'll also want to try logging sysctl kvm_free in the same manner to make sure you're not running out of KVA or KVM. Our system is set up with 2GB KVA (default is 1GB), which solved all the trap 12 issues our system was having due to running out of KVA/KVM.

2. Check your RAM. Bad RAM caused us innumerable headaches from seemingly random trap 12 problems on one of our other systems. The panics usually hit on some buffer allocation, especially when that was the primary activity in RAM. SSH is especially sensitive to bad RAM. We could usually trigger a panic on a system with bad RAM just by exercising SSH a bit.

3. Some unknown or known problem with the xl driver and long file transfers over SSH. Check those PRs (sorry, don't know the numbers offhand).

Sincerely,
Andrew Kinney
President and Chief Technology Officer
Advantagecom Networks, Inc.
http://www.advantagecom.net
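
The once-a-minute logging suggested in item 1 could be sketched roughly as follows. This is only an illustration, not anything from the original message: it assumes a Python 3 interpreter is installed, the function names are made up for the example, and the sysctl names are copied as written above (check `sysctl -a` for the exact names your kernel exports). The script prints one timestamped snapshot to standard output, so cron can append it to a log file.

```python
#!/usr/bin/env python3
"""Print one timestamped snapshot of selected sysctl values (sketch)."""
import subprocess
import time

# Names as written in the message above. Assumption: adjust these to the
# exact names reported by `sysctl -a` on your own system.
NAMES = ["vm.zone", "kvm_free"]


def read_sysctl(name):
    """Return what `sysctl <name>` prints, or a short note if it fails."""
    try:
        result = subprocess.run(["sysctl", name],
                                capture_output=True, text=True)
    except OSError as exc:
        return f"{name}: could not run sysctl ({exc})"
    if result.returncode != 0:
        return f"{name}: sysctl failed: {result.stderr.strip()}"
    return result.stdout.rstrip()


def snapshot(names, reader=read_sysctl, now=None):
    """Build one log record: a timestamp header, then one entry per name.

    `reader` and `now` are parameters only so the function can be
    exercised without a real sysctl binary or the current clock.
    """
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    lines = [f"=== {stamp} ==="]
    for name in names:
        lines.append(reader(name))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write to stdout; let cron handle appending to a file.
    print(snapshot(NAMES), end="")
```

A crontab entry along the lines of `* * * * * /usr/local/bin/python3 /root/kmem_watch.py >> /var/log/kmem_watch.log 2>&1` would run it every minute; both paths there are hypothetical. After a panic, the last few records in the log show which values were closest to their limits.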