Date:      Mon, 18 Nov 2002 17:48:39 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Luigi Rizzo <rizzo@icir.org>
Cc:        David Gilbert <dgilbert@velocet.ca>, dolemite@wuli.nu, freebsd-hackers@FreeBSD.ORG, freebsd-net@FreeBSD.ORG
Subject:   Re: Small initial LRP processing patch vs. -current
Message-ID:  <3DD99877.2F7C6D12@mindspring.com>
References:  <20021109180321.GA559@unknown.nycap.rr.com> <3DCD8761.5763AAB2@mindspring.com> <15823.51640.68022.555852@canoe.velocet.net> <3DD1865E.B9C72DF5@mindspring.com> <15826.24074.605709.966155@canoe.velocet.net> <3DD2F33E.BE136568@mindspring.com> <3DD96FC0.B77331A1@mindspring.com> <20021118151109.B19767@xorpc.icir.org> <3DD99018.73B703A@mindspring.com> <20021118173155.C29018@xorpc.icir.org>

Luigi Rizzo wrote:
> > > This patch will not make any difference if you have device_polling
> > > enabled, because polling already does this -- queues a small number
> > > of packets (default is max 5 per card) and calls ip_input on them
> > > right away.
> >
> > The problem with this is that it introduces a livelock point at
> 
> no it doesn't because new packets are not grabbed off the card until
> the queue is empty.

It's still possible to run out of mbufs.
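
To make the tradeoff concrete, here is a rough sketch of the polling
idea as I understand it -- not the actual DEVICE_POLLING code, and
rxring_dequeue() is a made-up name -- the point being that the burst
is bounded, but the mbufs still come out of the one shared pool:

    #define POLL_BURST_MAX  5       /* per-card cap, as described above */

    static void
    xx_poll(struct ifnet *ifp, int count)
    {
            struct mbuf *m;
            int i;

            for (i = 0; i < count && i < POLL_BURST_MAX; i++) {
                    /* hypothetical helper: copy a frame off the RX ring */
                    m = rxring_dequeue(ifp);
                    if (m == NULL)  /* ring empty, or no mbufs left */
                            break;
                    ip_input(m);    /* process to completion, no netisr queue */
            }
    }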


> > > I do not understand this claim:
> > >
> > > > The basic theory here is that ipintr processing can be delayed
> > > > indefinitely, if interrupt load is high enough, and there will
> > > > be a maximum latency of 10ms for IP processing after ether_input(),
> > > > in the normal stack case, without the patches.
> > >
> > > because netisr are not timer driven to the best of my knowledge --
> > > they just fire right after the cards' interrupts are complete.
> >
> > That's almost right.  The soft interrupt handlers run when you
> > splx() out of a raised priority level.  In fact, this happens at
> > the end of clockintr, so NETISR *is* timer driven, until you hit
> 
> i think it happens at the end of the device interrupt!

It happens at splx().  This happens at the end of a device
interrupt, but... acking the interrupt can result in another
interrupt arriving before processing has gotten far enough for the
soft interrupts to run.
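
Simplified, the 4.x-style path being described looks something like
this (a sketch of the mechanism, not the literal i386 code, which
goes through doreti/splz):

    /* ether_input(), running at elevated spl: */
    IF_ENQUEUE(&ipintrq, m);        /* park the packet on the IP input queue */
    schednetisr(NETISR_IP);         /* mark the IP soft interrupt pending */

    /* ... later, when the priority level drops back down -- at the end
     * of the device interrupt, or of clockintr if we never got that far: */
    splx(s);                        /* pending soft interrupts run here;
                                     * ipintr() finally drains ipintrq */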

See the Jeffrey Mogul paper on receiver livelock, and the Rice
University paper on LRP.


> > Polling changes this somewhat.  The top end is reduced, in exchange
> > for not dropping off as badly
> 
> actually, this is not true in general, and not in the case of
> FreeBSD's DEVICE_POLLING.
> 
> Polling-enabled drivers fetch the cards' state from in-memory
> descriptors, not from the interrupt status register across the
> PCI bus. Also, they look for exceptions (which require going
> through the PCI bus) only every so often. So the claim that the top
> end is reduced is not true in general -- it depends on how the
> interrupt vs. polling code are written and optimised.

No.  That's more of a side-issue, and it's dictated by the hardware
and firmware implementation, more than anything else, I think.

The actual problem is that the balance between system time spent
polling in the kernel vs. running the application in user space is
based on reserving a fixed amount of time, rather than a load-dependent
amount of time for processing.
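
Concretely, the shape of the problem is this (illustrative arithmetic
only; kern.polling.user_frac is the real knob, the other names here
are made up):

    /* Fixed policy: a constant fraction of every tick is reserved for
     * userland, no matter what the offered load is. */
    int user_frac   = 50;                                   /* percent, fixed */
    int poll_budget = cycles_per_tick * (100 - user_frac) / 100;

    /* A load-dependent policy would instead adjust the reservation from
     * feedback, e.g. shrink it while the RX rings are backing up. */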

I understand that DEVICE_POLLING is your baby; I'm not attacking
your implementation.  It does what it was supposed to do.  Things
are better with polling than without; all I am saying is that they
could be better still.

The reason I asked for the second set of numbers (polling with/without
the ip_input code path change) was actually to support the idea
that polling and/or additional patches are still required.

You really want to achieve the highest possible throughput, without
ever dropping a packet.  If you drop a packet anywhere between the
network card and the application, then you are not handling the
highest load the hardware is capable of handling.  Polling only
deals with this up to the top of the TCP stack, and it trades
increased latency over interrupt-driven handling for reduced
interrupt processing overhead (you wait until ether_poll() runs,
instead of handling the packet immediately, which introduces an
unavoidable hardclock/2 average latency).
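
(To put numbers on that: with the stock HZ=100 clock, that is about
5 ms of added latency on average and roughly 10 ms worst case per
packet; running the polling kernel at HZ=1000, as is commonly done,
cuts it to roughly 0.5 ms average.)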

You avoid interrupt livelock, while still risking a deadly embrace
waiting for applications to service the sockets (hence the need for
scheduler hacks, as well).

When it comes down to it, latency := pool retention time, and the
smaller your pool retention time, the more connections you can
handle simultaneously with a given pool size.
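
(Little's law is the underlying arithmetic: buffers in use equals
arrival rate times retention time.  Purely as an illustration, a pool
of 10,000 mbufs held 10 ms apiece supports roughly 1,000,000 packet
arrivals per second; halve the retention time and the same pool
supports twice the arrival rate, i.e. twice the connections at a
given per-connection packet rate.)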

-- Terry




