Date: Mon, 12 Apr 2010 17:24:59 -0700
From: Pyun YongHyeon <pyunyh@gmail.com>
To: "Erich Jenkins, Fuujin Group Ltd" <erich@fuujingroup.com>
Cc: freebsd-net@freebsd.org, Evgenii Davidov <dado@korolev-net.ru>
Subject: Re: Broadcom BCM5701 / HP NC6770
Message-ID: <20100413002459.GI1444@michelle.cdnetworks.com>
In-Reply-To: <4BC3C347.3010701@fuujingroup.com>
References: <20100409173821.GD1085@michelle.cdnetworks.com>
 <4BC016F3.4020300@fuujingroup.com>
 <20100410212520.GB6481@michelle.cdnetworks.com>
 <4BC12097.4030508@fuujingroup.com>
 <4BC19324.3050800@fuujingroup.com>
 <20100412175701.GC1444@michelle.cdnetworks.com>
 <20100412194209.GF1444@michelle.cdnetworks.com>
 <4BC3B676.3070503@fuujingroup.com>
 <20100413000255.GH1444@michelle.cdnetworks.com>
 <4BC3C347.3010701@fuujingroup.com>
On Mon, Apr 12, 2010 at 07:05:11PM -0600, Erich Jenkins, Fuujin Group Ltd wrote:
> Pyun YongHyeon wrote:
> >On Mon, Apr 12, 2010 at 06:10:30PM -0600, Erich Jenkins, Fuujin Group Ltd
> >wrote:
> >>Pyun YongHyeon wrote:
> >>>On Mon, Apr 12, 2010 at 10:57:01AM -0700, Pyun YongHyeon wrote:
> >>>>On Sun, Apr 11, 2010 at 03:15:16AM -0600, Erich Jenkins, Fuujin Group
> >>>>Ltd wrote:
> >>>>>I've been muddling around in src/sys/dev on the old system and the new
> >>>>>system and there appear to be rather major changes to MII and bge,
> >>>>>possibly the whole stack?
> >>>>>
> >>>>It was not completely rewritten but many improvements were made.
> >>>>
> >>>>>There are a number of things that seem to have been merged with other
> >>>>>parts of the network stack, or perhaps written into the individual
> >>>>>drivers (someone working on the net stack would have to verify that).
> >>>>>
> >>>>>For instance, some files called in 5.3-REL seem to have gone away
> >>>>>completely, and in the new (unpatched) version of if_bge.c under
> >>>>>7.3-REL, calls to these modules are gone:
> >>>>>
> >>>>>- #include <vm/vm.h> /* for vtophys */
> >>>>>- #include <vm/pmap.h> /* for vtophys */
> >>>>One of the most significant changes would be bus_dma(9) conversion
> >>>>which is required to all drivers to make it work correctly on a
> >>>>variety of platforms. bus_dma(9) does not directly use vtophys
> >>>>anymore so these headers were nuked.
> >>>>
> >>>>>- #include <machine/clock.h> /* for DELAY */
> >>>>>- #include <machine/bus_memio.h>
> >>>>>
> >>>>>- #include <dev/pci/pcireg.h> (called but something changed in here)
> >>>>>- #include <dev/pci/pcivar.h> (ditto above)
> >>>>>
> >>>>No, these headers are still present.
> >>>>
> >>>>>It appears that the checksum features have been completely rewritten,
> >>>>Checksum offloading was not completely rewritten but workaround
> >>>>for buggy controllers was added.
> >>>>
> >>>>>and some of the ring settings have changed. It's interesting that the
> >>>>>driver only fills 256 of the rx rings in the hopes that the cpu is
> >>>>>"fast enough to keep up with the NIC". Would a subroutine here to grab
> >>>>>the cpu
> >>>>That magic number 256 is adequate for most cases but it may not be
> >>>>enough to handle heavy loads. Internally the controller use fixed
> >>>>512 RX buffers but bge(4) used only half of the buffers to save
> >>>>resources. I think you can increase SSLOTS to 512 to get full 512
> >>>>RX buffers.
> >>>>
> >>>>>clock and count (number of procs/pipelines) be more trouble than it's
> >>>>>worth to "automagically" increase the number of rx rings the driver
> >>>>>fills based on the system in which it's installed?
> >>>>>
> >>>>Dynamically increasing number of RX buffers is doable but it would
> >>>>add much more code. If there is high demand for that I would just
> >>>>increase number of RX buffers to 512. Controller can't be
> >>>>configured to have more than 512 RX buffers.
> >>>>
> >>>>>Something also changed in pci/pcireg.h and pci/pcivar.h, but I haven't
> >>>>>had the time to hunt down and expand the source tree from the 5.3-REL
> >>>>>branch yet.
> >>>>>
> >>>>>I have other machines with copper nics utilizing the bge driver, and
> >>>>>there are no issues at all. Perhaps I'm getting ahead of things, but
> >>>>Yes that is expected one. :-)
> >>>>
> >>>>>since this seems to have been broken through several releases, would
> >>>>>it make any sense to split the support between the BCM5701KHB chipset
> >>>>>and the more recent BCM chipset to avoid causing issues with
> >>>>>cards/systems not currently experiencing troubles?
> >>>>>
> >>>>I'd like to if I can. Supporting huge number of different
> >>>>controllers in single driver is maintenance nightmare. However,
> >>>>rewriting some part that require special handling for certain
> >>>>controller/revision is too risky because I don't have access to
> >>>>most controllers.
> >>>>
> >>>>One theory for the issue I got while reading the code is link state
> >>>>handling. As I said in previous mail, link state handling for TBI
> >>>>is somewhat tricky in bge(4) and driver seemed to rely on periodic
> >>>>register access to keep track of link state. I guess polling(4) may
> >>>>give different behavior on link state handling as it does not rely
> >>>>on interrupts at all. So would you try to use polling(4) and see
> >>>>that make any difference on your box?
> >>>If polling(4) make it work, try attached patch.
> >>>
> >>>
> >>>------------------------------------------------------------------------
> >>>
> >>>_______________________________________________
> >>>freebsd-net@freebsd.org mailing list
> >>>http://lists.freebsd.org/mailman/listinfo/freebsd-net
> >>>To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
> >>I'll get this set up. I've got a jail issues on 7.0-REL that I'm trying
> >>to figure out too, so it might take a few hours before I get to this.
> >>
> >
> >I believe bge(4) in 7.0-RELEASE and 7.3-RELEASE is quite different.
> >So I'm not sure whether the patch works on 7.0-RELEASE.
> >
> >>I just checked on a reported iSCSI error on a machine using a BCM5721
> >>nic (copper gigE) and I'm seeing issues like this:
> >>
> >>Apr 11 06:24:59 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad
> >>"Opcode": Got 0 expected 5.
> >>Apr 11 06:24:59 san0 iscsi-target: pid 863:target.c:1317: ***ERROR***
> >>iscsi_write_data_decap() failed
> >>Apr 11 16:51:52 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad
> >>"Opcode": Got 0 expected 5.
> >>Apr 11 16:51:52 san0 iscsi-target: pid 863:target.c:1317: ***ERROR***
> >>iscsi_write_data_decap() failed
> >>Apr 12 10:32:49 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad
> >>"Opcode": Got 0 expected 5.
> >>Apr 12 10:32:49 san0 iscsi-target: pid 863:target.c:1317: ***ERROR***
> >>iscsi_write_data_decap() failed
> >>Apr 12 11:55:42 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad
> >>"Opcode": Got 0 expected 5.
> >>Apr 12 11:55:42 san0 iscsi-target: pid 863:target.c:1317: ***ERROR***
> >>iscsi_write_data_decap() failed
> >>Apr 12 14:07:13 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad
> >>"Opcode": Got 0 expected 5.
> >>Apr 12 14:07:13 san0 iscsi-target: pid 863:target.c:1317: ***ERROR***
> >>iscsi_write_data_decap() failed
> >>
> >>Any chance this could be because of the NIC chipset? I don't see this on
> >>any of the machines configured identically, using the em driver for
> >>Intel GigE nics.
> >>
> >
> >Have no idea what happens here. Does this also happen on
> >7.3-RELEASE?
>
> Sorry, I meant to say I'd reinstall 7.3-REL to test this.
>
> The iSCSI issues are happening on 8.0-REL. Are there any major
> differences between 7.3 and 8.0 in the bge driver or network stack that
> could be contributing to this?
>

7.3-RELEASE has more recent bge(4) changes. Most of the changes are
related to RX buffer handling and bus_dma(9) fixes. See the CVS web
interface for more details.
8.0 has a new network stack with some nice/experimental features,
but I guess it wouldn't affect iSCSI behavior. That's just a wild
guess; I have no experience with iSCSI, so others may be able to point
out iSCSI differences between 7.3-RELEASE and 8.0-RELEASE.
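
For readers following the RX buffer discussion quoted above, here is a
small toy model in C. It is not code from bge(4). It only illustrates
the two numbers stated in the thread: the controller's standard RX ring
is fixed at 512 entries and the driver fills 256 of them by default.
The names HW_STD_RX_RING_CNT, DEFAULT_RX_SLOTS and rx_slots_to_fill()
are made up for this sketch, and clamping to a minimum of 1 is an
assumption of mine, not something the driver is known to do.

```c
#include <assert.h>

/*
 * Numbers taken from the thread; the identifiers are hypothetical
 * and do not come from the bge(4) sources.
 */
enum {
    HW_STD_RX_RING_CNT = 512,   /* fixed ring size in the controller */
    DEFAULT_RX_SLOTS   = 256    /* what the driver fills by default  */
};

/*
 * Hypothetical helper: given a requested number of RX buffers, return
 * how many could actually be posted. Requests above the hardware ring
 * size are clamped to 512, since the controller cannot be configured
 * for more. Requests below 1 are raised to 1 (an assumption).
 */
static int
rx_slots_to_fill(int requested)
{
    /* Sanity check on the constants used by this sketch. */
    assert(DEFAULT_RX_SLOTS <= HW_STD_RX_RING_CNT);

    if (requested < 1)
        return (1);
    if (requested > HW_STD_RX_RING_CNT)
        return (HW_STD_RX_RING_CNT);
    return (requested);
}
```

With this model, rx_slots_to_fill(DEFAULT_RX_SLOTS) gives 256, a
request for 512 gives the full ring, and a request for 1024 is still
capped at 512, which matches the limit described in the quoted text.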