Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 18 Oct 2006 13:49:46 +0400
From:      Oleg Bulyzhin <oleg@FreeBSD.org>
To:        John-Mark Gurney <gurney_j@resnet.uoregon.edu>
Cc:        cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/sys/dev/pci pci.c pci_if.m pci_private.h pcivar.h src/sys/dev/sk if_sk.c if_skreg.h
Message-ID:  <20061018094946.GA37107@lath.rinet.ru>
In-Reply-To: <20061017175752.GI23971@funkthat.com>
References:  <200610091615.k99GFuPD054744@repoman.freebsd.org> <20061016081442.GA344@lath.rinet.ru> <20061016170517.GF23971@funkthat.com> <20061017105330.GC20789@lath.rinet.ru> <20061017175752.GI23971@funkthat.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Oct 17, 2006 at 10:57:52AM -0700, John-Mark Gurney wrote:
> Oleg Bulyzhin wrote this message on Tue, Oct 17, 2006 at 14:53 +0400:
> > On Mon, Oct 16, 2006 at 10:05:17AM -0700, John-Mark Gurney wrote:
> > > Oleg Bulyzhin wrote this message on Mon, Oct 16, 2006 at 12:14 +0400:
> > > > On Mon, Oct 09, 2006 at 04:15:56PM +0000, John-Mark Gurney wrote:
> > > > > jmg         2006-10-09 16:15:56 UTC
> > > > > 
> > > > >   FreeBSD src repository
> > > > > 
> > > > >   Modified files:
> > > > >     sys/dev/pci          pci.c pci_if.m pci_private.h pcivar.h 
> > > > >     sys/dev/sk           if_sk.c if_skreg.h 
> > > > >   Log:
> > > > >   provide routines to access VPD data at the PCI layer...
> > > > >   
> > > > >   remove sk's own implementation, and use the new calls to get the data...
> > > > >   
> > > > >   Reviewed by:    -arch
> > > > >   
> > > > >   Revision  Changes    Path
> > > > >   1.315     +339 -3    src/sys/dev/pci/pci.c
> > > > >   1.9       +13 -0     src/sys/dev/pci/pci_if.m
> > > > >   1.18      +4 -0      src/sys/dev/pci/pci_private.h
> > > > >   1.71      +34 -0     src/sys/dev/pci/pcivar.h
> > > > >   1.131     +7 -148    src/sys/dev/sk/if_sk.c
> > > > >   1.39      +0 -31     src/sys/dev/sk/if_skreg.h
> > > > 
> > > > I have problem with my test machine since this commit:
> > > > kernel is panicing on boot if i have my pci bge(4) NIC plugged in.
> > > > 
> > > > Last kernel messages are:
> > > > pci1: physical bus=1
> > > > pci1:2:0: bad VPD cksum, remain 244
> > > > 
> > > > Invoking ddb after panic gives this backtrace:
> > > > [skipped]
> > > > pci_read_vpd
> > > > pci_read_extcap
> > > > pci_read_device
> > > > pci_add_children
> > > > acpi_pci_attach
> > > > device_attach
> > > > [skipped]
> > > > 
> > > > (i'm unable to get crashdump)
> > > > 
> > > > If i unplug bge, kernel boots just fine.
> > > > 
> > > > P.S. i can provide any additional info needed and can test patches.
> > > 
> > > Can you get a line number from pci_read_vpd?  Even if you can't get a
> > > crash dump, you can use addr2line (or kgdb) w/ the ip of the panic...
> > > That would help..
> > > 
> > > Looks like some manufacturers aren't following the PCI standard.. :(
> > > 
> > > -- 
> > >   John-Mark Gurney				Voice: +1 415 225 5579
> > > 
> > >      "All that I will do, has been done, All that I have, has not."
> > 
> > Okay, there is some details:
> > 
> > When kernel boots with bge plugged in:
> > 1) message about bad VPD cksum
> > 2) then kernel seems to hang for about 30 seconds (i'm not able to break into
> > DDB)
> > 3) then i get panic.
> > 
> > Panic message is about trap inside 'swapper' process (backtrace shows 2 traps
> > first inside pci_read_vpd, next one in swapper).
> > 
> > 1st trap is at pci_read_vpd+0x2c6 it is /usr/src/sys/dev/pci/pci.c:665
> > I've added debug printf there:
> > 		case 3: /* VPD-R Keyword Value */
> > 			printf("off: %d i: %d\n", off, i);
> > 			cfg->vpd.vpd_ros[off].value[i++] = byte;
> > 
> > It seems that 30 seconds delay is memory rewriting (off is constant, while i
> > is incrementing).  I get panic message after:
> > off:5 i:15407
> 
> I just realized that if the card puts a length of zero there, we will end
> up wrapping dflen..  This should not happened, definately not for RV
> which is required to be at least 1 byte in length...  This will allow
> non-RV's to be zero bytes in length, but if an RV has zero bytes in
> length, we'll treat it as a checksum failure and remove all read-only
> VPD data, and not continue on processing read-write data...
> 
> Here is an updated patch that should handle your card properly...
> 
> -- 
>   John-Mark Gurney				Voice: +1 415 225 5579
> 
>      "All that I will do, has been done, All that I have, has not."

> Index: pci.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/dev/pci/pci.c,v
> retrieving revision 1.315
> diff -u -r1.315 pci.c
> --- pci.c	9 Oct 2006 16:15:55 -0000	1.315
> +++ pci.c	17 Oct 2006 17:56:15 -0000
> @@ -653,12 +653,36 @@
>  			cfg->vpd.vpd_ros[off].keyword[0] = byte;
>  			cfg->vpd.vpd_ros[off].keyword[1] = vpd_nextbyte(&vrs);
>  			dflen = vpd_nextbyte(&vrs);
> -			cfg->vpd.vpd_ros[off].value = malloc((dflen + 1) *
> -			    sizeof *cfg->vpd.vpd_ros[off].value,
> -			    M_DEVBUF, M_WAITOK);
> +			if (dflen == 0 &&
> +			    strncmp(vpd.vpd_ros[off], "RV", 2) == 0) {
> +				/*
> +				 * if this happens, we can't trust the rest
> +				 * of the VPD.
> +				 */
> +				printf("pci%d:%d:%d: bad keyword length: %d\n",
> +				    cfg->bus, cfg->slot, cfg->func, dflen);
> +				cksumvalid = 0;
> +				end = 1;
> +				break;
> +			} else if (dflen == 0) {
> +				cfg->vpd.vpd_ros[off].value = malloc(1 *
> +				    sizeof *cfg->vpd.vpd_ros[off].value,
> +				    M_DEVBUF, M_WAITOK)
> +				cfg->vpd.vpd_ros[off].value[0] = '\x00';
> +			} else
> +				cfg->vpd.vpd_ros[off].value = malloc(
> +				    (dflen + 1) *
> +				    sizeof *cfg->vpd.vpd_ros[off].value,
> +				    M_DEVBUF, M_WAITOK);
>  			remain -= 3;
>  			i = 0;
> -			state = 3;
> +			/* keep in sync w/ state 3's transistions */
> +			if (dflen == 0 && remain == 0)
> +				state = 0;
> +			else if (dflen == 0)
> +				state = 2;
> +			else
> +				state = 3;
>  			break;
>  
>  		case 3:	/* VPD-R Keyword Value */
> @@ -673,10 +697,13 @@
>  					    cfg->bus, cfg->slot, cfg->func,
>  					    vrs.cksum);
>  					cksumvalid = 0;
> +					end = 1;
> +					break;
>  				}
>  			}
>  			dflen--;
>  			remain--;
> +			/* keep in sync w/ state 2's transistions */
>  			if (dflen == 0)
>  				cfg->vpd.vpd_ros[off++].value[i++] = '\0';
>  			if (dflen == 0 && remain == 0) {
> @@ -736,6 +763,15 @@
>  			break;
>  		}
>  	}
> +
> +	if (cksumvalid == 0) {
> +		/* read-only data bad, clean up */
> +		for (; off; off--)
> +			free(cfg->vpd.vpd_ros[off].value, M_DEVBUF);
> +
> +		free(cfg->vpd.vpd_ros, M_DEVBUF);
> +		cfg->vpd.vpd_ros = NULL;
> +	}
>  #undef REG
>  }
>  

It works, thanks! (i had to correct syntax errors and change
vpd.vpd_ros[off] --> cfg->vpd_ros[off].keyword inside strncmp() call).


[skipped]
pci1: <ACPI PCI bus> on pcib1
pci1: physical bus=1
pci1:2:0: bad VPD cksum, remain 244
found-> vendor=0x173b, dev=0x03ea, revid=0x15
        bus=1, slot=2, func=0
        class=02-00-00, hdrtype=0x00, mfdev=0
        cmdreg=0x0116, statreg=0x02b0, cachelnsz=16 (dwords)
        lattimer=0x20 (960 ns), mingnt=0x40 (16000 ns), maxlat=0x00 (0 ns)
        intpin=a, irq=9
        powerspec 2  supports D0 D3  current D0
        VPD Ident: NETGEAR GA302T Gigabit Adapter
        MSI supports 8 messages, 64 bit
        map[10]: type 1, range 64, base ff8f0000, size 16, enabled
[skipped]

-- 
Oleg.




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20061018094946.GA37107>