From: Bill Paul
Message-Id: <199902050448.XAA10060@skynet.ctr.columbia.edu>
Subject: Re: Seen fxp or mbuf problems?
To: julian@whistle.com (Julian Elischer)
Date: Thu, 4 Feb 1999 23:48:11 -0500 (EST)
Cc: hackers@FreeBSD.ORG
In-Reply-To: <36BA4603.1CFBAE39@whistle.com> from "Julian Elischer" at Feb 4, 99 05:14:43 pm

Of all the gin joints in all the towns in all the world, Julian Elischer
had to walk into mine and say:

> Anyone seen bugs in fxp driver or mbuf related code recently?
>
> Here is a crash dump from a system about 10 days old (3.x series)
>
> We are willing to believe that we've done this (we do enough
> networking stuff but I'm just looking to see if there
> is anyone else that has seen this.
>
> julian
>
> #5  0xf015f320 in fxp_add_rfabuf (sc=0xf059ce00, oldm=0xf0390400)
>     at ../../pci/if_fxp.c:1535

[...]

>         MCLGET(m, M_DONTWAIT);  <-------- error here.

[...]

> (kgdb) p mclfree
> $3 = (union mcluster *) 0xa0225000
>
>

*cough*

This means that either some code in the kernel has a stale pointer to an
mbuf cluster and modified it after it was released, or the Intel chip
itself was given the address of this cluster buffer and DMA'ed data into
it after it had been released.
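To make that failure mode concrete, here's a rough user-space model of
it. This is only a sketch, not the kernel code: the pool, the 2048-byte
CLBYTES value and the cl_alloc()/cl_free()/stale_write_demo() names are
all made up for illustration, and the memset() stands in for whatever
driver bug or late DMA does the damage. The one thing it borrows from
the real allocator is that a free cluster keeps the pointer to the next
free cluster in its own first bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NCLUSTERS 4
#define CLBYTES   2048          /* assumed cluster size for this model */

union cluster {
    union cluster *next;        /* meaningful only while the cluster is free */
    char           buf[CLBYTES];
};

static union cluster  pool[NCLUSTERS];
static union cluster *freelist; /* plays the role of mclfree */

static void cl_free(union cluster *c)
{
    c->next = freelist;
    freelist = c;
}

static union cluster *cl_alloc(void)
{
    union cluster *c = freelist;

    if (c != NULL)
        freelist = c->next;     /* new head is read out of the buffer itself */
    return c;
}

static int in_pool(const union cluster *c)
{
    uintptr_t p = (uintptr_t)c, lo = (uintptr_t)pool;

    return p >= lo && p < lo + sizeof(pool);
}

/* Returns 1 if the list head ends up pointing outside the pool. */
static int stale_write_demo(void)
{
    union cluster *stale, *victim;
    int i;

    freelist = NULL;
    for (i = 0; i < NCLUSTERS; i++)
        cl_free(&pool[i]);

    stale = cl_alloc();
    assert(stale != NULL);
    cl_free(stale);                                 /* released, pointer kept */
    memset(stale->buf, 0xA0, sizeof(stale->next));  /* the late write */

    victim = cl_alloc();        /* this allocation still looks fine... */
    assert(victim == stale);
    return !in_pool(freelist);  /* ...but the head is garbage now */
}
```

Note that the allocation right after the stray write succeeds. It's the
one after that which would chase the garbage head, and this model stops
short of making it.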
Unfortunately, the trashed buffer has already been reallocated by a call
to MCLGET() immediately prior to this one. When you pull a cluster
buffer off the free list, its first 4 bytes contain the address of the
next buffer in the free list. (Well... I suppose it's 8 bytes on the
alpha.) This address gets saved in mclfree and then the buffer gets
handed out. If a buffer is trashed while it's sitting on the free list
and then it gets reallocated, mclfree will be loaded with garbage, and
the next time you call MCLGET(), hijinx will ensue.

If you can reproduce the crash reliably, you might be able to catch
mclfree getting clobbered by modifying the MCLGET() macro to test for
'reasonably sane' values when updating mclfree and then panic()ing if it
spots an insane one.

-Bill

--
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work: wpaul@ctr.columbia.edu         | Center for Telecommunications Research
Home: wpaul@skynet.ctr.columbia.edu  | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================
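For what a 'reasonably sane' test might look like, here's a minimal
user-space sketch. It is not the real MCLGET() macro, which lives in the
kernel's mbuf header and differs in detail. The pool, the cl_sane(),
cl_alloc_checked() and cl_scribble() names, the 2048-byte cluster size
and the return codes are all assumptions made for illustration; in the
kernel the -2 case is where you'd call panic(). Here "sane" is taken to
mean NULL or a cluster-aligned address inside the pool.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NCLUSTERS 8
#define CLBYTES   2048          /* assumed cluster size for this model */

union cluster {
    union cluster *next;        /* meaningful only while the cluster is free */
    char           buf[CLBYTES];
};

static union cluster  pool[NCLUSTERS];
static union cluster *freelist; /* plays the role of mclfree */

/* Sane means: NULL (end of list) or a cluster-aligned address in the pool. */
static int cl_sane(const union cluster *c)
{
    uintptr_t p = (uintptr_t)c, lo = (uintptr_t)pool;

    if (c == NULL)
        return 1;
    return p >= lo && p < lo + sizeof(pool) &&
        (p - lo) % sizeof(union cluster) == 0;
}

static void cl_free(union cluster *c)
{
    c->next = freelist;
    freelist = c;
}

static void cl_init(void)
{
    int i;

    freelist = NULL;
    for (i = 0; i < NCLUSTERS; i++)
        cl_free(&pool[i]);
}

/*
 * Returns 0 on success, -1 if the list is empty, and -2 if the value
 * about to be loaded into the list head is insane (kernel: panic()).
 */
static int cl_alloc_checked(union cluster **out)
{
    union cluster *c = freelist;

    if (c == NULL)
        return -1;
    if (!cl_sane(c->next))
        return -2;              /* head is left untouched */
    freelist = c->next;
    *out = c;
    return 0;
}

/* Stand-in for a stale-pointer write or a late DMA into a freed cluster. */
static void cl_scribble(union cluster *c)
{
    memset(c->buf, 0xA0, sizeof(c->next));
}
```

Checking the value at the moment the head is updated trips one
allocation earlier than the crash would, while the trashed cluster is
still at hand to look at. It won't tell you who did the scribbling.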