Date:      Fri, 17 Feb 2006 10:01:01 -0500 (EST)
From:      Andrew Gallatin <gallatin@cs.duke.edu>
To:        freebsd-amd64@freebsd.org
Subject:   non-temporal copyin/copyout?
Message-ID:  <17397.58669.457047.277510@grasshopper.cs.duke.edu>

Has anybody considered using non-temporal copies for the in-kernel
bcopy on amd64?

A quick test in userspace shows that for large copies, an adapted
pagecopy (from amd64/amd64/support.S) more than doubles bcopy
bandwidth from 1.2GB/s to 2.5GB/s on my Athlon64 X2 3800+.
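
For illustration only, here is a minimal userspace sketch of a
non-temporal copy in C. It is not the adapted pagecopy used for the
numbers above. It assumes SSE2 streaming stores via _mm_stream_si128
and falls back to plain memcpy where SSE2 is unavailable; the name
nt_copy and the 64-byte unrolling are hypothetical choices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * nt_copy (hypothetical helper): copy len bytes from src to dst,
 * writing the bulk of the data with non-temporal stores so the
 * destination does not displace useful lines from the cache.
 * The buffers must not overlap.
 */
static void
nt_copy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	assert(d != NULL && s != NULL);
#if defined(__SSE2__)
	/* Streaming stores need a 16-byte aligned destination. */
	size_t head = (16 - ((uintptr_t)d & 15)) & 15;
	if (head > len)
		head = len;
	memcpy(d, s, head);
	d += head;
	s += head;
	len -= head;

	/* Move 64 bytes per iteration; the source may be unaligned. */
	while (len >= 64) {
		__m128i a = _mm_loadu_si128((const __m128i *)(s + 0));
		__m128i b = _mm_loadu_si128((const __m128i *)(s + 16));
		__m128i c = _mm_loadu_si128((const __m128i *)(s + 32));
		__m128i e = _mm_loadu_si128((const __m128i *)(s + 48));
		_mm_stream_si128((__m128i *)(d + 0), a);
		_mm_stream_si128((__m128i *)(d + 16), b);
		_mm_stream_si128((__m128i *)(d + 32), c);
		_mm_stream_si128((__m128i *)(d + 48), e);
		d += 64;
		s += 64;
		len -= 64;
	}
	/* Make the streaming stores visible before returning. */
	_mm_sfence();
#endif
	/* Remaining tail (or the whole copy when SSE2 is absent). */
	memcpy(d, s, len);
}
```

A sketch like this only pays off when the destination is too large
to be reused from cache soon; for small copies an ordinary bcopy is
likely the better choice.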

I'm bringing this up because I've noticed that FreeBSD 10GbE
performance is far below Solaris/amd64 and Linux/x86_64 when using the
PCI-e 10GbE adaptor that I'm writing drivers for.  For example, Solaris
can receive a netperf TCP stream at 9.75Gb/sec while using only 47%
CPU as measured by vmstat (i.e., it is using a little less than a
single core).  In contrast, FreeBSD is limited to 7.7Gb/sec, and uses
nearly 90% CPU.  When profiling with hwpmc, I see a profile which
shows up to 70% of the time is spent in copyout.

Thanks,

Drew