From owner-freebsd-questions@FreeBSD.ORG Wed Apr 30 18:43:31 2003
Return-Path:
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP
    id 1394237B401; Wed, 30 Apr 2003 18:43:31 -0700 (PDT)
Received: from net2.dinoex.sub.org (net2.dinoex.de [212.184.201.182])
    by mx1.FreeBSD.org (Postfix) with ESMTP
    id E7AD043F3F; Wed, 30 Apr 2003 18:43:28 -0700 (PDT)
    (envelope-from pmc@citylink.dinoex.sub.org)
Received: from net2.dinoex.sub.org (uucp@net2.dinoex.de [212.184.201.182])
    by net2.dinoex.sub.org (8.12.9/8.12.9) with ESMTP
    id h411hKgx017127; Thu, 1 May 2003 03:43:22 +0200 (CEST)
    (envelope-from pmc@citylink.dinoex.sub.org)
X-Authentication-Warning: net2.dinoex.sub.org: Host uucp@net2.dinoex.de [212.184.201.182] claimed to be net2.dinoex.sub.org
Received: from citylink.dinoex.sub.org (uucp@localhost)
    h411hJYY017126; Thu, 1 May 2003 03:43:19 +0200 (CEST)
    (envelope-from pmc@citylink.dinoex.sub.org)
Received: from citylink.dinoex.sub.de
    by citylink.dinoex.sub.org (8.8.5/PMuch-B3b) with ESMTP
    id CAA05101; Thu, 1 May 2003 02:07:32 +0200 (CEST)
Received: from gate.oper.dinoex.org (localhost [127.0.0.1])
    h410Ei1i001544; Thu, 1 May 2003 02:14:45 +0200 (CEST)
    (envelope-from pmc@disp.oper.dinoex.org)
Received: from disp.oper.dinoex.org (disp-e [192.168.98.5])
    by gate.oper.dinoex.org (8.12.6/8.12.6) with ESMTP
    id h410DWAj001533; Thu, 1 May 2003 02:13:38 +0200 (CEST)
    (envelope-from pmc@disp.oper.dinoex.org)
Received: (from pmc@localhost)
    by disp.oper.dinoex.org (8.11.6/8.11.6)
    id h410BQw20930; Thu, 1 May 2003 02:11:26 +0200 (CEST)
    (envelope-from pmc)
From: Peter Much
Message-Id: <200305010011.h410BQw20930@disp.oper.dinoex.org>
To: freebsd-questions@freebsd.org, freebsd-net@freebsd.org
Date: Thu, 1 May 2003 02:11:25 +0200 (CEST)
X-Mailer: ELM [version 2.5 PL5]
MIME-Version: 1.0
Content-Type: text/plain; charset=DISPLAY
Content-Transfer-Encoding: 8bit
Subject: rsh crashes 4.7 kernel, help needed
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: User questions
X-List-Received-Date: Thu, 01 May 2003 01:43:31 -0000

Hi all,

I am stuck with this one:

My gateway machine, a 486dx66 running PPPoE/natd/ipfw on a
4.7.0-RELEASE installation, runs totally stable for weeks, except
when moving large amounts of data out of the machine via rsh (my
backup routines do that). Then, not too often, but repeatably,
kernel crashes do happen from rsh.

It seems that the crashes happen especially if the data flow is
repeatedly interrupted for longer times (during tape movement etc.),
but that is only a supposition. The stack trace always shows the
same functions, ending in m_copym().

While it may be true that there is no need to do full backups from
a gateway machine, I am still unhappy about this effect, and would
rather like to fix it.

Therefore, I made room for crashdumps, built a debug kernel, and
activated INVARIANTS for m_copym(). Now gdb shows the attached
output.

It does not seem to me that m_copym() should be called with 0 as
the first parameter, but when looking into tcp_output(), I quit:
this is too large and complicated for me to understand.

The network card that would transfer the data is the following one,
and AFAIK there are no known issues with it:

ed0 at port 0x300-0x31f iomem 0xd8000-0xdbfff irq 10 on isa0
ed0: address 00:00:c0:30:b7:2f, type WD8013EPC (16 bit)

So my question is what to do now. Input is very much appreciated.

rgds,
PMc

----------------------------------------------------------------

initial pcb at physical address 0x003b7b20
panicstr: m_copym, length > size of mbuf chain
panic messages:
---
panic: m_copym, length > size of mbuf chain

syncing disks...
4 done
Uptime: 3d17h17m38s

dumping to dev #da/1, offset 480
dump 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0  dumpsys () at ../../kern/kern_shutdown.c:487
487             if (dumping++) {
(kgdb) bt full
#0  dumpsys () at ../../kern/kern_shutdown.c:487
        error = 0
#1  0xc01a56a7 in boot (howto=256) at ../../kern/kern_shutdown.c:316
        howto = 256
#2  0xc01a5ae9 in panic (fmt=0xc032f6a0 "m_copym, length > size of mbuf chain")
    at ../../kern/kern_shutdown.c:595
        fmt = 0xc032f6a0 "m_copym, length > size of mbuf chain"
        bootopt = 256
        buf = "m_copym, length > size of mbuf chain", '\000'
#3  0xc01c1b1d in m_copym (m=0x0, off0=1448, len=920, wait=1)
    at ../../kern/uipc_mbuf.c:806
        n = (struct mbuf *) 0xc06a8c00
        np = (struct mbuf **) 0xc06a8c00
        off = 0
        top = (struct mbuf *) 0xc06a8c00
        copyhdr = 0
#4  0xc0221cd5 in tcp_output (tp=0xc471ee40) at ../../netinet/tcp_output.c:612
        tp = (struct tcpcb *) 0xc471ee40
        so = (struct socket *) 0xc46d0cc0
        len = 1448
        win = 57920
        off = 1448
        flags = 16
        error = 0
        m = (struct mbuf *) 0xc06ec400
        ip = (struct ip *) 0x0
        ip6 = (struct ip6_hdr *) 0x0
        th = (struct tcphdr *) 0x5a8
        opt = "\001\001\b\n\001éå\004\0026JS\000\000\000\000äýÍÄGd\034À0\rmÄ\000+lÀ\000UnÀ\020þÍÄ"
        ipoptlen = 1448
        optlen = 12
        hdrlen = 52
        idle = 0
        sendalot = 1
        taop = (struct rmxp_tao *) 0x5a8
        tao_noncached = {tao_cc = 1, tao_ccsent = 0, tao_mssopt = 32}
        isipv6 = 0
#5  0xc0226445 in tcp_usr_send (so=0xc46d0cc0, flags=0, m=0xc06c2b00, nam=0x0,
    control=0x0, p=0xc4c4f100) at ../../netinet/tcp_usrreq.c:578
        m = (struct mbuf *) 0xc06c2b00
        control = (struct mbuf *) 0x0
        s = 6422528
        error = 0
        inp = (struct inpcb *) 0x0
        tp = (struct tcpcb *) 0xc471ee40
        isipv6 = 0
        ostate = 4
#6  0xc01c4403 in sosend (so=0xc46d0cc0, addr=0x0, uio=0xc4cdfed4,
    top=0xc06c2b00, control=0x0, flags=0, p=0xc4c4f100)
    at ../../kern/uipc_socket.c:609
        mp = (struct mbuf **) 0xc06c2b00
        m = (struct mbuf *) 0xc06c2b00
        space = 29880
        len = 0
        resid = 0
        clen = -1066652928
        error = -999486272
        s = 0
        dontroute = 0
        mlen = 2048
        atomic = 0
#7  0xc01b7a70 in soo_write (fp=0xc0c4cbc0, uio=0xc4cdfed4, cred=0xc0b87780,
    flags=0, p=0xc4c4f100) at ../../kern/sys_socket.c:81
        fp = (struct file *) 0x0
        uio = (struct uio *) 0x0
        so = (struct socket *) 0x0
#8  0xc01b4701 in dofilewrite (p=0xc4c4f100, fp=0xc0c4cbc0, fd=3,
    buf=0xbfbff5bc, nbyte=1024, offset=-1, flags=0) at ../../sys/file.h:162
        error = -993726208
        fp = (struct file *) 0xc0c4cbc0
        cred = (struct ucred *) 0x0
        p = (struct proc *) 0xc4c4f100
        fp = (struct file *) 0xc0c4cbc0
        offset = 0
        auio = {uio_iov = 0xc4cdfeac, uio_iovcnt = 1, uio_offset = 1023,
          uio_resid = 0, uio_segflg = UIO_USERSPACE, uio_rw = UIO_WRITE,
          uio_procp = 0xc4c4f100}
        aiov = {iov_base = 0xbfbff9bc "\b", iov_len = 0}
        cnt = 1024
        error = -993726208
        ktriov = {iov_base = 0xc4cdfed8 "\001", iov_len = 3224186978}
        ktruio = {uio_iov = 0x0, uio_iovcnt = 1,
          uio_offset = -4268021561307097088, uio_resid = 0,
          uio_segflg = 3217029564, uio_rw = UIO_READ, uio_procp = 0xc0951800}
        didktr = 0
#9  0xc01b45ba in write (p=0xc4c4f100, uap=0xc4cdff80)
    at ../../kern/sys_generic.c:329
        p = (struct proc *) 0xc4c4f100
        uap = (struct write_args *) 0xc4cdff80
        fp = (struct file *) 0xc0c4cbc0
        error = -993132672
#10 0xc02eaa99 in syscall2 (frame={tf_fs = -1078001617, tf_es = -993198033,
      tf_ds = -1078001617, tf_edi = -1077938756, tf_esi = 1024,
      tf_ebp = -1077937348, tf_isp = -993132588, tf_ebx = -1077937732,
      tf_edx = 3, tf_ecx = 3, tf_eax = 4, tf_trapno = 7, tf_err = 2,
      tf_eip = 672005944, tf_cs = 31, tf_eflags = 663, tf_esp = -1077938816,
      tf_ss = 47}) at ../../i386/i386/trap.c:1175
        params = 0xbfbff584 "\003"
        i = 0
        callp = (struct sysent *) 0xc038d7a0
        p = (struct proc *) 0xc4c4f100
        orig_tf_eflags = 663
        sticks = 60359
        error = 0
        narg = 3
        args = {3, -1077938756, 1024, 0, 0, 0, 0, 0}
        have_mplock = 1
        code = 4
#11 0xc02de205 in Xint0x80_syscall ()
No symbol table info available.
#12 0x8048f84 in ?? ()
No symbol table info available.
#13 0x8048ae9 in ?? ()
No symbol table info available.
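
For readers following the trace: the panic message in frame #3 is about
a copy request that runs past the end of the mbuf chain. The following
is a minimal user-space sketch of that length check, not the kernel's
m_copym(). The struct sketch_mbuf type and the function
sketch_missing_bytes() are hypothetical names invented for illustration.
The sketch also assumes that the real routine advances its m argument
while walking the chain, in which case m=0x0 in the frame could be the
walked pointer at panic time rather than the value originally passed in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct mbuf. */
struct sketch_mbuf {
    struct sketch_mbuf *next;  /* next buffer in the chain, or NULL */
    int len;                   /* bytes of data held in this buffer */
};

/*
 * Return how many of the `len` bytes requested at offset `off` are
 * NOT present in the chain. A result of 0 means the request fits;
 * anything else is the situation the panic message describes.
 */
static int
sketch_missing_bytes(const struct sketch_mbuf *m, int off, int len)
{
    assert(off >= 0 && len >= 0);

    /* Skip buffers that lie entirely before the requested offset. */
    while (m != NULL && off >= m->len) {
        off -= m->len;
        m = m->next;
    }

    /* Consume data until the request is satisfied or the chain ends. */
    while (m != NULL && len > 0) {
        int avail = m->len - off;

        len -= (avail < len) ? avail : len;
        off = 0;
        m = m->next;
    }

    return len;
}
```

With a hypothetical two-buffer chain of 1448 and 528 bytes,
sketch_missing_bytes(chain, 1448, 1448) returns 920: the walk ends with
no buffer left, offset 0, and 920 bytes still wanted, which resembles
m=0x0, off=0 and len=920 in frame #3. The buffer sizes are made up to
reproduce those numbers; the dump does not show the real chain layout.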