From owner-freebsd-sparc64@FreeBSD.ORG Wed Feb 13 01:47:54 2013
From: YongHyeon PYUN <pyunyh@gmail.com>
Date: Wed, 13 Feb 2013 10:47:38 +0900
To: Marius Strobl
Cc: Kurt Lidl, freebsd-sparc64@freebsd.org
Subject: Re: console stops with 9.1-RELEASE when under forwarding load
Message-ID: <20130213014738.GB3101@michelle.cdnetworks.com>
References: <20130122043541.GA67894@pix.net>
 <20130123223009.GA22474@alchemy.franken.de>
 <20130205061956.GB40942@pix.net>
 <20130205072553.GB1439@michelle.cdnetworks.com>
 <20130205203503.GR80850@alchemy.franken.de>
In-Reply-To: <20130205203503.GR80850@alchemy.franken.de>
Reply-To: pyunyh@gmail.com
List-Id: Porting FreeBSD to the Sparc

On Tue, Feb 05, 2013 at 09:35:03PM +0100, Marius Strobl wrote:
> On Tue, Feb 05, 2013 at 04:25:53PM +0900, YongHyeon PYUN wrote:
> > On Tue, Feb 05, 2013 at 01:19:56AM -0500, Kurt Lidl wrote:
> > > On Wed, Jan 23, 2013 at 11:30:09PM +0100, Marius Strobl wrote:
> > > > On Mon, Jan 21, 2013 at 11:35:41PM -0500, Kurt Lidl wrote:
> > > > > I'm not sure if this is better directed at freebsd-sparc64@
> > > > > or freebsd-net@, but I'm going to guess here...
> > > > >
> > > > > Anyway, in all cases I'm using an absolutely stock
> > > > > FreeBSD 9.1-RELEASE installation.
> > > > >
> > > > > I got several SunFire V120 machines recently, and have been
> > > > > testing them out to verify their operation. They all started
> > > > > out identically configured -- 1 GB of memory, 2x36GB disks,
> > > > > DVD-ROM, 650 MHz processor. The V120 has two on-board "gem"
> > > > > network interfaces, and the machine can take a single 32-bit
> > > > > PCI card.
> > > > >
> > > > > I've benchmarked the gem interfaces being able to source or
> > > > > sink about 90 Mbit/s of TCP traffic. This is comparable to
> > > > > the speed of "hme" interfaces that I've tested in my slower
> > > > > Netra-T1-105 machines.
> > > > >
> > > > > So I put an Intel 32-bit gig-E interface (a "GT" desktop
> > > > > Gig-E adapter) into the machine, and it comes up like this:
> > > > >
> > > > > em0: port 0xc00200-0xc0023f mem 0x20000-0x3ffff,0x40000-0x5ffff at device 5.0 on pci2
> > > > > em0: Memory Access and/or Bus Master bits were not set!
> > > > > em0: Ethernet address: 00:1b:21:
> > > > >
> > > > > That interface can source or sink TCP traffic at about
> > > > > 248 Mbit/s.
> > > > >
> > > > > Since I really want to make one of these machines my
> > > > > firewall/router, I took a different, dual-port Intel Gig-E
> > > > > server adapter (a 64-bit PCI card) and put it into one of
> > > > > the machines so I could look at the forwarding performance.
> > > > > It probes like this:
> > > > >
> > > > > em0: port 0xc00200-0xc0023f mem 0x20000-0x3ffff,0x40000-0x7ffff at device 5.0 on pci2
> > > > > em0: Memory Access and/or Bus Master bits were not set!
> > > > > em0: Ethernet address: 00:04:23:
> > > > > em1: port 0xc00240-0xc0027f mem 0xc0000-0xdffff,0x100000-0x13ffff at device 5.1 on pci2
> > > > > em1: Memory Access and/or Bus Master bits were not set!
> > > > > em1: Ethernet address: 00:04:23:
> > > > >
> > > > > This card can source traffic at about 250 Mbit/s and can
> > > > > sink traffic at around 204 Mbit/s.
> > > > >
> > > > > But the real question is: how is the forwarding performance?
> > > > >
> > > > > So I set up a test between some machines:
> > > > >
> > > > > A --tcp data--> em0-sparc64-em1 --tcp data--> B
> > > > > |                                             |
> > > > > \---------<--------tcp acks-------<-----------/
> > > > >
> > > > > That is, A sends to interface em0 on the sparc64, the
> > > > > sparc64 forwards out em1 to host B, and the ack traffic
> > > > > flows out a different interface from B back to A. (A and B
> > > > > are amd64 machines, with Gig-E interfaces that are
> > > > > considerably faster than the sparc64 machines.)
> > > > >
> > > > > This test works surprisingly well -- 270 Mbit/s of forwarded
> > > > > traffic, at around 29500 packets/second.
> > > > >
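(For reference, a forwarding test of this shape can be reproduced
roughly as follows. This is only a sketch: the mail doesn't say which
traffic generator was used, so iperf, the 60-second run time, and the
placeholder address <B> are assumptions.)

    # on the sparc64 router: enable IP forwarding
    sysctl net.inet.ip.forwarding=1

    # on host B: run a TCP sink
    iperf -s

    # on host A: push TCP data through the router toward B
    iperf -c <B> -t 60 -i 1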
> > > > > The problem appears when I change the test to send the tcp
> > > > > ack traffic back through the sparc64 (so the ack traffic
> > > > > goes from B into em1 and is then forwarded out em0 to A),
> > > > > while the data flows the same way as before.
> > > > >
> > > > > The console of the sparc64 becomes completely unresponsive
> > > > > while this test runs. The 'netstat 1' that I've been running
> > > > > just stops. When the data finishes transmitting, the netstat
> > > > > output gives one giant jump, counting all the packets that
> > > > > were sent during the test as if they happened in a single
> > > > > second.
> > > > >
> > > > > It's pretty clear that the process I'm running on the
> > > > > console isn't receiving any cycles at all. This is true for
> > > > > whatever I have running on the console of the machine -- a
> > > > > shell, vmstat, iostat, whatever. It just hangs until the
> > > > > forwarding test is over. Then console input/output resumes
> > > > > normally.
> > > > >
> > > > > Has anybody else seen this type of problem?
> > > > >
> > > >
> > > > I don't see what could be a sparc64-specific problem in this
> > > > case. You are certainly pushing the hardware beyond its
> > > > limits, though, and it would be interesting to know how a
> > > > similarly "powerful" i386 machine behaves in this case.
> > > > In any case, in order not to burn any CPU cycles needlessly,
> > > > you should use a kernel built from a config stripped down to
> > > > your requirements and with options SMP removed, to get the
> > > > maximum out of a UP machine. It could also be that SCHED_ULE
> > > > actually helps in this case (there's a bug in 9.1-RELEASE
> > > > causing problems with SCHED_ULE and SMP on sparc64, but for
> > > > UP it should be fine).
> > >
> > > I updated the kernel tree on one of my sparc64 machines to the
> > > latest version of 9-STABLE and gave the following combinations
> > > a try:
> > >     SMP+ULE
> > >     SMP+4BSD
> > >     non-SMP+ULE
> > >     non-SMP+4BSD
> > > They all performed about the same in terms of throughput, and
> > > about the same in terms of user responsiveness when under load.
> > > None were responsive when forwarding ~214 Mbit/s of traffic.
> > >
> > > I played around a bit with tuning the rx/tx queue depths for
> > > the em0/em1 devices, but none of that made any perceptible
> > > difference in the throughput or responsiveness of the machine.
> >
> > If my memory serves me right, em(4) requires a considerably fast
> > machine to offset the overhead of taskqueue(9). Because the
> > taskqueue handler is enqueued again and again under heavy RX
> > network load, most system cycles would be consumed in the
> > taskqueue handler.
> > Try polling(4) and see whether it makes any difference. I'm not
> > sure whether polling(4) works on sparc64, though.
>
> This might or might not work, or might at least cause ill effects.
> In general, Sun PCI bridges synchronize DMA on interrupts, and
> polling(4) bypasses that mechanism. For the host-PCI-bridges found
> in v210, psycho(4) additionally synchronizes DMA manually when
> bus_dmamap_sync(9) is called with BUS_DMASYNC_POSTREAD (as
> suggested in the datasheet). I'm not sure whether this is also
> sufficient for polling(4). In any case, sun4u hardware certainly
> wasn't built with something like polling(4) in mind.
> Hrm, according to my reading of the lem(4) source, it shouldn't use
> taskqueue(9) when the loader tunable hw.em.use_legacy_irq is set to
> 1 for the MACs in question. In any case, the latter certainly is
> easier to test than rebuilding a kernel with polling(4) support.

Right. If the driver is lem(4), using use_legacy_irq would be a
better way to eliminate taskqueue(9) overhead on slow boxes. You may
also want to tune the various interrupt delay tunables.

> Marius
>
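(To make the suggestions above concrete, a minimal /boot/loader.conf
sketch: hw.em.use_legacy_irq is the tunable named in Marius' mail,
while the interrupt delay tunable names and the example values are
assumptions for illustration -- verify them against lem(4)/em(4) on
your system before relying on them.)

    # bypass the taskqueue(9)-based interrupt handling in lem(4)
    hw.em.use_legacy_irq="1"

    # interrupt moderation; values are illustrative, in microseconds
    hw.em.rx_int_delay="100"
    hw.em.rx_abs_int_delay="1000"
    hw.em.tx_int_delay="100"
    hw.em.tx_abs_int_delay="1000"

These are loader tunables, so they take effect at the next boot.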