Date:      Fri, 2 Oct 2009 13:14:00 -0700
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        Yohanes Nugroho <yohanes@gmail.com>
Cc:        freebsd-net@freebsd.org, freebsd-arm@freebsd.org
Subject:   Re: FreeBSD ARM network speed
Message-ID:  <20091002201400.GJ1512@michelle.cdnetworks.com>
In-Reply-To: <260bb65e0910012258w7c569505xa8cac5bd8bbd2aaa@mail.gmail.com>
References:  <260bb65e0910012258w7c569505xa8cac5bd8bbd2aaa@mail.gmail.com>

On Fri, Oct 02, 2009 at 12:58:38PM +0700, Yohanes Nugroho wrote:
> Hi All,
> 

Hi,


[...]

> The specification for the STR9104 SoC is available on Cavium website
> for those who are interested, but it is not very clear, so in
> developing the network driver, I followed the logic used by the Linux
> driver (the initialization sequence, etc). The current code is at
> http://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/str91xx/src/sys/arm/econa/if_ece.c&REV=4
> 
> Here is how the sending part works on STR9104:
> 
> - In the initialization part, I allocate a ring, the size of the ring
> is 256 entries (same as Linux version).

If the Ethernet controller does not support 1000baseT (I think it is
Fast Ethernet, because the ICPlus IP101A is a 10/100 PHY), allocating
256 descriptors is a waste of resources, I think, especially on 64MB
systems.
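
To put rough numbers on the ring size, here is a small sketch of the arithmetic. The 16-byte descriptor size is only an assumption (I have not checked the data sheet), ring_footprint() is a made-up helper, and the 2048-byte buffer is the usual mbuf cluster size that an Rx ring would pin per entry. A Tx ring has no preallocated buffers, so pass 0 for it.

```c
#include <stddef.h>

/*
 * Bytes pinned by one descriptor ring: the descriptors themselves plus
 * one preallocated buffer per entry (pass buf_size = 0 for a Tx ring,
 * which only borrows buffers while frames are in flight).
 */
size_t
ring_footprint(size_t entries, size_t desc_size, size_t buf_size)
{
	return (entries * (desc_size + buf_size));
}
```

With these assumed sizes, a 256-entry Rx ring pins 528384 bytes, while a 64-entry ring pins 132096 bytes.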

> - When being asked to send a packet, I will do the following thing:
>   - stop the network TX DMA
>   - put the address of each segment of the packet to the ring, and set
> a flag so that the entry in the ring will be sent by hardware
>   - start the network TX DMA
> 
> obviously there is a cleaning up part (freeing mbuf) that should be
> done. The network driver can generate interrupt when a packet has been
> sent (but can't tell which entry was sent). In the Linux version, this
> interrupt is not used, the clean up is done just after starting the TX
> DMA, at the end of the sending function, and I do the same in the
> FreeBSD driver. Usually only one entry needs to be removed, so
> it is quite fast.
> 
> Is there something obvious (or not so obvious) that I've missed?
> 

I briefly looked over the driver code and I can see missing
bus_dmamap_sync(9) calls in several places, as well as incorrect use
of bus_dma(9). This may also affect performance, because without
bus_dmamap_sync(9) the OWN bit the CPU reads may not reflect what
the controller wrote.
Another cause of poor performance might be m_devget(9). I don't know
whether the controller really needs this type of copying (sorry, I
have no time to read the data sheet), but m_devget(9) is a slow,
time-consuming operation because it has to copy the entire frame to
a new mbuf. If you had to use m_devget(9) to align buffers on an
ETHER_ALIGN boundary, I guess you can pass the alignment restriction
to bus_dma(9) instead. Of course, this requires that the controller
be able to receive frames on an even address boundary, or have no Rx
buffer alignment limitation.
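
For reference, the reason ETHER_ALIGN matters is plain arithmetic: the Ethernet header is 14 bytes, so a frame received at a 4-byte-aligned address leaves the IP header misaligned, while a 2-byte offset fixes that. A minimal userland sketch follows. The two constants mirror the values in FreeBSD's net/ethernet.h, and ip_header_aligned() is a made-up helper for illustration, not anything from the driver.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHER_HDR_LEN	14	/* dst MAC + src MAC + ethertype */
#define ETHER_ALIGN	2	/* offset that realigns the payload */

/*
 * Return nonzero if the IP header lands on a 4-byte boundary when the
 * controller writes the frame at buf_addr + offset.
 */
int
ip_header_aligned(uintptr_t buf_addr, size_t offset)
{
	return (((buf_addr + offset + ETHER_HDR_LEN) % 4) == 0);
}
```

If the controller can write to buf_addr + ETHER_ALIGN directly, the copy in m_devget(9) is not needed for alignment purposes.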

I believe you should not stop DMA before sending another frame, as
you did in the Rx handler. Basically, you should keep the controller
as busy as you can to get maximum performance, and you should
reclaim transmitted buffers as soon as you notice they have been
sent. Stopping DMA may take time, since the controller may have to
drain active DMA cycles. If the controller does not generate a Tx
completion interrupt after sending a frame, which is not likely, you
may have to implement a kind of polling in a separate thread, or use
polling(9).
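
To illustrate queueing and reclaiming without stopping DMA, here is a userland simulation, not driver code. The descriptor layout, the TXD_OWN bit, the ring size and all function names are assumptions made up for this sketch, and fake_hw_complete() stands in for the controller. Comments mark where a real driver would call bus_dmamap_sync(9). Walking from the oldest entry until an entry with OWN still set is found also works when the interrupt cannot tell which entry was sent.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TX_RING_CNT	64		/* hypothetical; must be a power of two */
#define TXD_OWN		0x80000000u	/* hypothetical: set while hardware owns the entry */

struct tx_desc {
	uint32_t addr;		/* bus address of the frame */
	uint32_t ctrl;		/* length in the low 16 bits, plus TXD_OWN */
};

struct tx_ring {
	struct tx_desc desc[TX_RING_CNT];
	void *buf[TX_RING_CNT];	/* stands in for the mbuf to free later */
	unsigned prod;		/* next entry the driver fills */
	unsigned cons;		/* oldest entry not yet reclaimed */
	unsigned queued;	/* entries queued but not yet reclaimed */
	unsigned hw;		/* simulation only: next entry hardware sends */
};

void
tx_ring_init(struct tx_ring *r)
{
	assert((TX_RING_CNT & (TX_RING_CNT - 1)) == 0);
	memset(r, 0, sizeof(*r));
}

/* Queue one frame while DMA keeps running. Returns -1 if the ring is full. */
int
tx_enqueue(struct tx_ring *r, uint32_t busaddr, uint32_t len, void *buf)
{
	struct tx_desc *d;

	if (r->queued == TX_RING_CNT)
		return (-1);
	d = &r->desc[r->prod];
	d->addr = busaddr;
	/* Hand the entry to hardware last, once the address is in place. */
	d->ctrl = (len & 0xffff) | TXD_OWN;
	/* Driver: bus_dmamap_sync(..., BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE) */
	r->buf[r->prod] = buf;
	r->prod = (r->prod + 1) & (TX_RING_CNT - 1);
	r->queued++;
	return (0);
}

/* Free every entry hardware has finished with. Returns the number freed. */
unsigned
tx_reclaim(struct tx_ring *r)
{
	unsigned n = 0;

	/* Driver: bus_dmamap_sync(..., BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE) */
	while (r->queued > 0) {
		if (r->desc[r->cons].ctrl & TXD_OWN)
			break;			/* still in flight, stop here */
		r->buf[r->cons] = NULL;		/* m_freem() in a driver */
		r->cons = (r->cons + 1) & (TX_RING_CNT - 1);
		r->queued--;
		n++;
	}
	return (n);
}

/* Simulation only: pretend the controller sent up to `count` frames. */
void
fake_hw_complete(struct tx_ring *r, unsigned count)
{
	while (count-- > 0 && (r->desc[r->hw].ctrl & TXD_OWN)) {
		r->desc[r->hw].ctrl &= ~TXD_OWN;
		r->hw = (r->hw + 1) & (TX_RING_CNT - 1);
	}
}
```

In a driver, tx_reclaim() would be called from the Tx completion interrupt and again at the start of the send path, so the ring rarely fills up.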

Good luck!


