From owner-cvs-src@FreeBSD.ORG Tue Oct 17 17:57:54 2006 Return-Path: X-Original-To: cvs-src@FreeBSD.org Delivered-To: cvs-src@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 28A7A16A412; Tue, 17 Oct 2006 17:57:54 +0000 (UTC) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (gate.funkthat.com [69.17.45.168]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5A26143D46; Tue, 17 Oct 2006 17:57:53 +0000 (GMT) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (3957ujberr5fq1yg@localhost.funkthat.com [127.0.0.1]) by hydrogen.funkthat.com (8.13.6/8.13.3) with ESMTP id k9HHvqQ4063124; Tue, 17 Oct 2006 10:57:52 -0700 (PDT) (envelope-from jmg@hydrogen.funkthat.com) Received: (from jmg@localhost) by hydrogen.funkthat.com (8.13.6/8.13.3/Submit) id k9HHvqh1063123; Tue, 17 Oct 2006 10:57:52 -0700 (PDT) (envelope-from jmg) Date: Tue, 17 Oct 2006 10:57:52 -0700 From: John-Mark Gurney To: Oleg Bulyzhin Message-ID: <20061017175752.GI23971@funkthat.com> References: <200610091615.k99GFuPD054744@repoman.freebsd.org> <20061016081442.GA344@lath.rinet.ru> <20061016170517.GF23971@funkthat.com> <20061017105330.GC20789@lath.rinet.ru> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="PmA2V3Z32TCmWXqI" Content-Disposition: inline In-Reply-To: <20061017105330.GC20789@lath.rinet.ru> User-Agent: Mutt/1.4.2.1i X-Operating-System: FreeBSD 5.4-RELEASE-p6 i386 X-PGP-Fingerprint: B7 EC EF F8 AE ED A7 31 96 7A 22 B3 D8 56 36 F4 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org Subject: Re: cvs commit: src/sys/dev/pci pci.c pci_if.m pci_private.h pcivar.h src/sys/dev/sk if_sk.c if_skreg.h X-BeenThere: cvs-src@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: John-Mark Gurney List-Id: CVS commit messages for the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 Oct 2006 17:57:54 -0000 --PmA2V3Z32TCmWXqI Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Oleg Bulyzhin wrote this message on Tue, Oct 17, 2006 at 14:53 +0400: > On Mon, Oct 16, 2006 at 10:05:17AM -0700, John-Mark Gurney wrote: > > Oleg Bulyzhin wrote this message on Mon, Oct 16, 2006 at 12:14 +0400: > > > On Mon, Oct 09, 2006 at 04:15:56PM +0000, John-Mark Gurney wrote: > > > > jmg 2006-10-09 16:15:56 UTC > > > > > > > > FreeBSD src repository > > > > > > > > Modified files: > > > > sys/dev/pci pci.c pci_if.m pci_private.h pcivar.h > > > > sys/dev/sk if_sk.c if_skreg.h > > > > Log: > > > > provide routines to access VPD data at the PCI layer... > > > > > > > > remove sk's own implementation, and use the new calls to get the data... > > > > > > > > Reviewed by: -arch > > > > > > > > Revision Changes Path > > > > 1.315 +339 -3 src/sys/dev/pci/pci.c > > > > 1.9 +13 -0 src/sys/dev/pci/pci_if.m > > > > 1.18 +4 -0 src/sys/dev/pci/pci_private.h > > > > 1.71 +34 -0 src/sys/dev/pci/pcivar.h > > > > 1.131 +7 -148 src/sys/dev/sk/if_sk.c > > > > 1.39 +0 -31 src/sys/dev/sk/if_skreg.h > > > > > > I have problem with my test machine since this commit: > > > kernel is panicing on boot if i have my pci bge(4) NIC plugged in. > > > > > > Last kernel messages are: > > > pci1: physical bus=1 > > > pci1:2:0: bad VPD cksum, remain 244 > > > > > > Invoking ddb after panic gives this backtrace: > > > [skipped] > > > pci_read_vpd > > > pci_read_extcap > > > pci_read_device > > > pci_add_children > > > acpi_pci_attach > > > device_attach > > > [skipped] > > > > > > (i'm unable to get crashdump) > > > > > > If i unplug bge, kernel boots just fine. > > > > > > P.S. i can provide any additional info needed and can test patches. > > > > Can you get a line number from pci_read_vpd? Even if you can't get a > > crash dump, you can use addr2line (or kgdb) w/ the ip of the panic... > > That would help.. > > > > Looks like some manufacturers aren't following the PCI standard.. :( > > > > -- > > John-Mark Gurney Voice: +1 415 225 5579 > > > > "All that I will do, has been done, All that I have, has not." > > Okay, there is some details: > > When kernel boots with bge plugged in: > 1) message about bad VPD cksum > 2) then kernel seems to hang for about 30 seconds (i'm not able to break into > DDB) > 3) then i get panic. > > Panic message is about trap inside 'swapper' process (backtrace shows 2 traps > first inside pci_read_vpd, next one in swapper). > > 1st trap is at pci_read_vpd+0x2c6 it is /usr/src/sys/dev/pci/pci.c:665 > I've added debug printf there: > case 3: /* VPD-R Keyword Value */ > printf("off: %d i: %d\n", off, i); > cfg->vpd.vpd_ros[off].value[i++] = byte; > > It seems that 30 seconds delay is memory rewriting (off is constant, while i > is incrementing). I get panic message after: > off:5 i:15407 I just realized that if the card puts a length of zero there, we will end up wrapping dflen.. This should not happened, definately not for RV which is required to be at least 1 byte in length... This will allow non-RV's to be zero bytes in length, but if an RV has zero bytes in length, we'll treat it as a checksum failure and remove all read-only VPD data, and not continue on processing read-write data... Here is an updated patch that should handle your card properly... -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." --PmA2V3Z32TCmWXqI Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="vpd.patch" Index: pci.c =================================================================== RCS file: /home/ncvs/src/sys/dev/pci/pci.c,v retrieving revision 1.315 diff -u -r1.315 pci.c --- pci.c 9 Oct 2006 16:15:55 -0000 1.315 +++ pci.c 17 Oct 2006 17:56:15 -0000 @@ -653,12 +653,36 @@ cfg->vpd.vpd_ros[off].keyword[0] = byte; cfg->vpd.vpd_ros[off].keyword[1] = vpd_nextbyte(&vrs); dflen = vpd_nextbyte(&vrs); - cfg->vpd.vpd_ros[off].value = malloc((dflen + 1) * - sizeof *cfg->vpd.vpd_ros[off].value, - M_DEVBUF, M_WAITOK); + if (dflen == 0 && + strncmp(vpd.vpd_ros[off], "RV", 2) == 0) { + /* + * if this happens, we can't trust the rest + * of the VPD. + */ + printf("pci%d:%d:%d: bad keyword length: %d\n", + cfg->bus, cfg->slot, cfg->func, dflen); + cksumvalid = 0; + end = 1; + break; + } else if (dflen == 0) { + cfg->vpd.vpd_ros[off].value = malloc(1 * + sizeof *cfg->vpd.vpd_ros[off].value, + M_DEVBUF, M_WAITOK) + cfg->vpd.vpd_ros[off].value[0] = '\x00'; + } else + cfg->vpd.vpd_ros[off].value = malloc( + (dflen + 1) * + sizeof *cfg->vpd.vpd_ros[off].value, + M_DEVBUF, M_WAITOK); remain -= 3; i = 0; - state = 3; + /* keep in sync w/ state 3's transistions */ + if (dflen == 0 && remain == 0) + state = 0; + else if (dflen == 0) + state = 2; + else + state = 3; break; case 3: /* VPD-R Keyword Value */ @@ -673,10 +697,13 @@ cfg->bus, cfg->slot, cfg->func, vrs.cksum); cksumvalid = 0; + end = 1; + break; } } dflen--; remain--; + /* keep in sync w/ state 2's transistions */ if (dflen == 0) cfg->vpd.vpd_ros[off++].value[i++] = '\0'; if (dflen == 0 && remain == 0) { @@ -736,6 +763,15 @@ break; } } + + if (cksumvalid == 0) { + /* read-only data bad, clean up */ + for (; off; off--) + free(cfg->vpd.vpd_ros[off].value, M_DEVBUF); + + free(cfg->vpd.vpd_ros, M_DEVBUF); + cfg->vpd.vpd_ros = NULL; + } #undef REG } --PmA2V3Z32TCmWXqI--