From owner-freebsd-current@FreeBSD.ORG  Thu Jul 13 17:20:09 2006
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
X-Original-To: freebsd-current@FreeBSD.ORG
Delivered-To: freebsd-current@FreeBSD.ORG
Received: by hub.freebsd.org (Postfix, from userid 618)
	id A844A16A4DA; Thu, 13 Jul 2006 17:20:09 +0000 (UTC)
In-Reply-To: <20060712003110.GA9542@cdnetworks.co.kr> from Pyun YongHyeon at
	"Jul 12, 2006 09:31:10 am"
To: pyunyh@gmail.com
Date: Thu, 13 Jul 2006 17:20:09 +0000 (GMT)
X-Mailer: ELM [version 2.4ME+ PL54 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-Id: <20060713172009.A844A16A4DA@hub.freebsd.org>
From: wpaul@FreeBSD.ORG (Bill Paul)
Cc: csaba-ml@creo.hu, lydianconcepts@gmail.com, freebsd-current@FreeBSD.ORG,
	brueffer@FreeBSD.ORG
Subject: Re: Call for stge(4) testers

> 
> Actually jumbo frame support works but it takes a very long time
> (30~50sec) to work correctly after selecting an MTU larger than
> 1500. During that time I could see the NIC send the jumbo frame
> with tcpdump on sender side but the frame doesn't arrive at
> destination. Sometimes the NIC generated 'Tx underrun' and sometimes
> it didn't generate any errors.

tcpdump doesn't really monitor the sender side. It can't tell you,
from the TX side, what packets have made it out onto the wire: it
can only tell you what packets made it into the driver's send routine
(where the BPF tap point is). NICs normally cannot hear their own
transmissions.

tcpdump on the _other_ end will tell you what packets made it out
onto the wire, _unless_ the receiving NIC has been configured to
automatically discard error frames. Most of the time you want the
NIC to do this, since it saves you from having to process them on
the host, but in the past I've deliberately disabled this feature
on some cards so that I could get the bad frames onto the host and
see exactly what's wrong with them. (Are they truncated? Is the
checksum bad? If so, why? Did Homeland Security mess them up while
scanning them? etc...)

> However in the long run it did work
> after about a minute without any action from the user.

Hm. Well, here's the thing: Ethernet controllers can't tell time. I
think it's less an issue of how much time it takes before the NIC
starts sending jumbo frames correctly and more an issue of how
many failed transmission attempts occur before one succeeds. Did you
count the number of bad transmissions that occur before the chip
starts working? Was it always the same number? Did you instrument
the interrupt handler to see if TX completion events are actually
occurring for each transmission? (Just because you made it into
stge_start() and queued the packet up doesn't mean transmission
completed.)
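
As a rough sketch of that kind of instrumentation: the struct and
function names below are made up for illustration and are not part of
stge(4). The idea is only to count frames queued in the start routine
against completion events seen in the interrupt handler, so you can
tell how many attempts go missing before things start working.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-device counters; not fields of the real stge(4) softc. */
struct tx_stats {
	uint64_t queued;	/* frames handed to the chip by the start routine */
	uint64_t completed;	/* TX completion events seen in the interrupt handler */
	uint64_t errors;	/* completions that reported an error (e.g. underrun) */
};

/* Call at the point where the start routine queues a frame to the chip. */
static void
tx_note_queued(struct tx_stats *st)
{
	st->queued++;
}

/* Call from the interrupt handler, once per TX completion event. */
static void
tx_note_completed(struct tx_stats *st, int had_error)
{
	/* A completion without a matching queued frame means bad accounting. */
	assert(st->completed < st->queued);
	st->completed++;
	if (had_error)
		st->errors++;
}

/* Frames that were queued but never reported complete. */
static uint64_t
tx_outstanding(const struct tx_stats *st)
{
	return (st->queued - st->completed);
}
```

If tx_outstanding() keeps growing while tcpdump on the sender shows
frames going out, the frames reached the driver but the chip never
reported finishing them.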

> I even changed
> STGE_TxStartThresh to use a full frame size but it seemed that it had
> no effects. Anyway, STGE_TxStartThresh is not used at all if stge(4)
> enabled checksum offload capability.

Yeah, that's probably because the NIC wants the whole frame to be resident
in the TX FIFO memory so that it can calculate the checksum(s). This implies
store-and-forward behavior.
 
> I'm afraid the hardware doesn't support a single 9K buffer per
> descriptor. It seems that the hardware can support up to 4k buffer
> per descriptor so I chained received frames to a single jumbo
> frame.

Hunh. In the 3Com design, the RX descriptors are allowed to have a
fragment list, just like the TX descriptors, though you're also
allowed to tell the chip to use a special 'one fragment' layout if
your driver doesn't use the whole fraglist. It looks like in the TC902x,
they implemented only the 'one fragment' layout for RX descriptors.
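
To put numbers on the chaining: a minimal sketch, assuming the 4K
per-descriptor limit reported above is right. RX_BUF_SIZE and both
function names are hypothetical, not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-descriptor buffer limit, from the "up to 4k" report above. */
#define RX_BUF_SIZE	4096

/* Number of single-fragment RX descriptors a frame of frame_len bytes spans. */
static size_t
rx_descs_for_frame(size_t frame_len)
{
	assert(frame_len > 0);
	return ((frame_len + RX_BUF_SIZE - 1) / RX_BUF_SIZE);
}

/* Number of bytes that land in the last descriptor of the chain. */
static size_t
rx_last_frag_len(size_t frame_len)
{
	size_t rem;

	assert(frame_len > 0);
	rem = frame_len % RX_BUF_SIZE;
	return (rem == 0 ? RX_BUF_SIZE : rem);
}
```

With these assumptions a 9022 byte frame spans three descriptors, the
last one holding 830 bytes, while a standard 1514 byte frame still fits
in one.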

> That's exactly what I did in my tests. From your detailed
> explanation on jumbo frame internals and guide I've reread source
> code related to jumbo frame and managed to find a bug in my code.
> It was my programming error in selecting an MTU from ioctl handler.
> However I don't understand how the hardware can send/receive larger
> MTUs than 9022.

What is it about this that you don't understand, exactly? Is it
something about this particular chip, or about being able to have
frames larger than 9022 bytes in general? (Technically jumbo frames
can be up to 16K in length, but by convention most people use 9000
bytes.) Mind you, this is one chip I've never gotten around to writing
a driver for myself, so I'm sure I'm missing something.
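
For reference, here is the usual arithmetic relating an MTU to the
on-wire frame length. This is a sketch only: the constants carry the
same values as ETHER_HDR_LEN, ETHER_CRC_LEN and ETHER_VLAN_ENCAP_LEN in
FreeBSD's <net/ethernet.h>, but the function name is made up, and
whether the 9022 figure in the driver is derived this way is an
assumption on my part.

```c
#include <assert.h>
#include <stddef.h>

/* Same values as the FreeBSD <net/ethernet.h> constants, redefined locally. */
#define HDR_LEN		14	/* destination + source address + ethertype */
#define CRC_LEN		4	/* frame check sequence */
#define VLAN_LEN	4	/* 802.1Q tag, when present */

/* On-wire frame length for a given MTU (hypothetical helper). */
static size_t
frame_len_for_mtu(size_t mtu, int vlan_tagged)
{
	assert(mtu > 0);
	return (mtu + HDR_LEN + CRC_LEN + (vlan_tagged ? VLAN_LEN : 0));
}
```

A 9000 byte MTU with a VLAN tag comes out to 9022 bytes, and the
standard 1500 byte MTU without a tag gives the familiar 1518.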

> If you program the hardware to receive 10K it
> eventually works in a minute or two.

Like I said, it's probably a question of some sequence of events
occurring before it starts to work, not the passage of a certain amount
of time. It may be that you have to disable and re-enable the TX and/or
RX DMA engines before switching to the larger frame size. Also, verify
that you properly initialize all of the fields in the TX descriptors
on each transmission so you're not keeping stale bits set from
previous transmissions. There's also a slight chance this is a stack
bug, not a driver/chip bug. What happens if you do:

# ifconfig stge0 mtu 9000
# ifconfig stge0 down
# ifconfig stge0 up

Does it still take a minute for transmissions to succeed?
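
On the stale-bits point above, a minimal sketch of what I mean. The
descriptor layout and flag names are invented for illustration and are
not the TC902x's real ones; DMA syncing and the ownership handoff to
the chip are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor layout; not the real TC902x format. */
struct tx_desc {
	uint64_t next;		/* ring link, set once at init and preserved */
	uint64_t control;	/* per-transmission flags */
	uint64_t frag_addr;	/* bus address of the buffer */
	uint32_t frag_len;	/* length of the buffer in bytes */
};

#define TXD_DONE	0x1u	/* hypothetical: set by the chip on completion */
#define TXD_CSUM	0x2u	/* hypothetical: request checksum offload */

/* Rewrite every per-transmission field; only the ring link is kept. */
static void
tx_desc_fill(struct tx_desc *d, uint64_t addr, uint32_t len, int want_csum)
{
	assert(len > 0);
	d->control = 0;		/* drop DONE and any flags from the last use */
	d->frag_addr = addr;
	d->frag_len = len;
	if (want_csum)
		d->control |= TXD_CSUM;
}
```

The point is that control is assigned, not OR'ed into, so a DONE or
checksum bit left over from the previous frame can't leak into the
next one.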

-Bill

--
=============================================================================
-Bill Paul            (510) 749-2329 | Senior Engineer, Master of Unix-Fu
                 wpaul@windriver.com | Wind River Systems
=============================================================================
              <adamw> you're just BEGGING to face the moose
=============================================================================