Date:      Tue, 18 Mar 2008 08:40:18 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-arch@freebsd.org
Cc:        "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>, Poul-Henning Kamp <phk@phk.freebsd.dk>
Subject:   Re: Power-Mgt (Was: Re: cvs commit: src/sys/i386/cpufreq est.c )
Message-ID:  <200803180840.18275.jhb@freebsd.org>
In-Reply-To: <20080318085804.I50685@maildrop.int.zabbadoz.net>
References:  <3860.1205764623@critter.freebsd.dk> <20080318085804.I50685@maildrop.int.zabbadoz.net>

On Tuesday 18 March 2008 04:59:42 am Bjoern A. Zeeb wrote:
> On Mon, 17 Mar 2008, Poul-Henning Kamp wrote:
>
> Hi,
>
> > [Moved to arch@]
> >
> > In general, I think we must make power-aware computing our "next
> > SMPng project", not in the sense of delaying the next major release
> > five years, but in the sense that power consumption should permeate
> > our thinking about the operating system from now on.
> >
> > Overall, I think that means that we should:
> >
> > *  Enable performance neutral power savings on servers
> > 	- spin down unused disks. (geom/drivers)
> > 	- use only as many CPU cores as necessary (scheduler)
> > 	- light cpu-throttling.
> > 	- downgrading 1GB to 100MB ether when idle.
> >
> > *  Aim to meet or exceed energystar 4.0/5.0[1] on desktops and
> >   plugged laptops.
> > 	- Pretty much as above, but with specific targets.
> > 	- http://www.energystar.gov/index.cfm?c=revisions.computer_spec
> >
> > *  Be as battery-frugal as possible on battery driven laptops.
> > 	- Any trick in and off the book.
>
> so while this topic is one,
>
> what actually happens to an unrecognized card or a card with no driver
> loaded currently? How much power does an unused card use and can we do
> anything about that? Are we perhaps already doing something about
> that?

We power off PCI cards (to D3) that aren't recognized by a driver already.  
However, what would be more useful would be to power down cards that have a 
driver but aren't in use.  Ethernet NICs are one example.  I see a couple of 
possibilities:

1) Shut off "down" interfaces (ifconfig foo down) and only turn them on when
   the user puts them "up".

2) If the device supports D1/D2 then put it into one of those when it has no
   link (no NICs that we support do D1/D2 currently though).  Otherwise, if
   the device has no carrier, power it down to D3 but periodically (say, every
   5 seconds, maybe configurable) power it back up to D0 to check for link.

3) Shut off "down" interfaces but use 2) for "up" interfaces.

I think 3) is what I'd prefer.  Esp. if we make the timer configurable.  It 
would also be nice to power down sound cards when no userland app has them 
open, and to power down USB controllers if no USB devices are connected 
(ideally to a D1/D2 state where they still get an interrupt on device 
insertion).
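
As a rough illustration of the policy in option 3, the decision can be written as a small pure function.  This is a userspace sketch, not driver code: struct nic_status, enum nic_pm, and nic_target_state() are hypothetical names invented for the example, and the D-state values only stand in for the real PCI power states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PCI D-states discussed above. */
enum nic_pm { NIC_D0, NIC_D2, NIC_D3 };

/* Hypothetical summary of what the driver knows about the interface. */
struct nic_status {
	bool if_up;       /* administratively "up" (ifconfig foo up) */
	bool link_active; /* carrier detected */
	bool has_d1d2;    /* hardware can report link changes from D1/D2 */
};

/*
 * Option 3: power off "down" interfaces and apply option 2 to "up" ones.
 * *needs_poll is set when the device must be woken periodically to check
 * for link because it has no usable D1/D2 state.
 */
static enum nic_pm
nic_target_state(const struct nic_status *st, bool *needs_poll)
{
	assert(st != NULL && needs_poll != NULL);

	*needs_poll = false;
	if (!st->if_up)
		return (NIC_D3);
	if (st->link_active)
		return (NIC_D0);
	if (st->has_d1d2)
		return (NIC_D2);
	/* No D1/D2: sit in D3 and wake up on a timer to look for link. */
	*needs_poll = true;
	return (NIC_D3);
}
```

Only the last case needs the configurable timer mentioned above; the other three are settled by a single power state change.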

To avoid lots of code duplication I think we would need to provide some sort 
of "idle" device support in new-bus.  Possibly something like this:

/*
 * Routines to manage putting the device into an "idle" power state when it
 * is not in use.
 */

/*
 * How long we have to be idle before we are turned off.  This should probably
 * default to some sort of value (say 5 seconds).  It should be exposed via
 * sysctl by new-bus itself (e.g. dev.foo.0.idle_timer).  We may want to have
 * different defaults for different classes of devices (e.g. maybe there is
 * a NIC_IDLE_TIMER constant that NIC drivers use to set this in their attach
 * routines).
 */
int	device_set_idle_timer(device_t dev, int ticks);

/*
 * Note that a device is idle.  If the device was previously active,
 * this starts the idle timer.  If the timer completes w/o being cancelled
 * it invokes device_idle(dev);
 */
int	device_is_idle(device_t dev);

/*
 * Note that a device is now active.  If the device was previously idle
 * then the idle timer is stopped.  If the timer wasn't running then it
 * invokes device_active(dev);
 */
int	device_is_active(device_t dev);

/*
 * device_if.h method, so becomes DEVMETHOD(device_idle, foo_idle);
 * This is invoked when the idle timer expires (i.e. device has been
 * idle for a complete idle timer duration).  This method should power
 * down the device in some way.
 */
int	DEVICE_IDLE(device_t parent, device_t child);

/*
 * device_if.h method invoked when a powered down device becomes active
 * again.  This should power the device back up.
 */
int	DEVICE_ACTIVE(device_t parent, device_t child);
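
To make the intended semantics of these routines concrete, here is a minimal userspace model of the state machine they describe.  It is a sketch under assumptions: the sim_* names and struct idle_dev are hypothetical, time is advanced by calling sim_tick() by hand, and counters stand in for the DEVICE_IDLE() and DEVICE_ACTIVE() methods.  It also assumes that marking an already active device active is a no-op, which the description above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>

enum idle_state { DEV_ACTIVE, DEV_IDLE_PENDING, DEV_IDLE };

struct idle_dev {
	enum idle_state state;
	int idle_timer;   /* ticks of idleness required before power-down */
	int remaining;    /* ticks left on the running timer */
	int idle_calls;   /* times the DEVICE_IDLE() stand-in ran */
	int active_calls; /* times the DEVICE_ACTIVE() stand-in ran */
};

/* Models device_set_idle_timer() on a freshly attached, active device. */
static void
sim_init(struct idle_dev *d, int ticks)
{
	assert(ticks > 0);
	d->state = DEV_ACTIVE;
	d->idle_timer = ticks;
	d->remaining = 0;
	d->idle_calls = 0;
	d->active_calls = 0;
}

/* Models device_is_idle(): start the timer only if previously active. */
static void
sim_is_idle(struct idle_dev *d)
{
	if (d->state == DEV_ACTIVE) {
		d->state = DEV_IDLE_PENDING;
		d->remaining = d->idle_timer;
	}
}

/*
 * Models device_is_active(): a running timer is simply cancelled; if the
 * timer already expired, the device is powered back up.
 */
static void
sim_is_active(struct idle_dev *d)
{
	if (d->state == DEV_IDLE)
		d->active_calls++; /* stand-in for DEVICE_ACTIVE() */
	d->state = DEV_ACTIVE;
}

/* One clock tick; expiry of the timer powers the device down. */
static void
sim_tick(struct idle_dev *d)
{
	if (d->state == DEV_IDLE_PENDING && --d->remaining == 0) {
		d->state = DEV_IDLE;
		d->idle_calls++; /* stand-in for DEVICE_IDLE() */
	}
}
```

In this model, activity that arrives before the timer expires never touches the hardware power state.  That is the point of the timer: short idle periods cost nothing.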

So one possible impl of 3) for a NIC might be:

static int
foo_ioctl(struct ifnet *ifp, u_long command, caddr_t data)
{
	struct foo_softc *sc;

	sc = ifp->if_softc;

	...
	case SIOCSIFFLAGS:
		FOO_LOCK(sc);
		if (ifp->if_flags & IFF_UP)
			device_is_active(sc->foo_dev);
		else
			device_is_idle(sc->foo_dev);
		foo_init(sc);
		FOO_UNLOCK(sc);
		break;
	...
}

/* Routine that gets called on link status change interrupt. */
static void
foo_handle_link(struct foo_softc *sc)
{

	...
	if (sc->foo_ifp->if_flags & IFF_UP) {
		if (link_active)
			device_is_active(sc->foo_dev);
		else
			device_is_idle(sc->foo_dev);
	}
}

/* If the device supports D1/D2 which interrupts on link status change: */
static void
foo_intr(void *arg)
{
	struct foo_softc *sc;

	sc = arg;

	/* Invoked first so it can power on the device before we access it. */
	device_is_active(sc->foo_dev);
	...
}

int
foo_idle(device_t dev)
{
	struct foo_softc *sc;

	sc = device_get_softc(dev);
	if (sc->foo_ifp->if_flags & IFF_UP)
		device_set_powerstate(dev, D2);
	else
		device_set_powerstate(dev, D3);
	return (0);
}

int
foo_active(device_t dev)
{

	device_set_powerstate(dev, D0);
	return (0);
}

For the case where D1/D2 isn't supported, foo_intr() would remain unchanged and 
foo_active() would be as above.  In the IFF_UP case, foo_idle() would be 
responsible for starting its own internal timer that would periodically power 
the device up, check for link, and then power the device back down if there is 
still no link.
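
A minimal sketch of what one firing of that internal timer might do, written as a userspace simulation rather than real driver code.  struct sim_nic, enum poll_pm, sim_read_link(), and sim_link_poll() are all hypothetical names, and the cable_plugged flag stands in for reading the PHY's link status register.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two power states this case uses. */
enum poll_pm { POLL_D0, POLL_D3 };

struct sim_nic {
	enum poll_pm power;
	bool cable_plugged; /* stand-in for the hardware link status */
	int power_ups;      /* how many times we woke the device */
};

/* Link status can only be read while the device is powered (D0). */
static bool
sim_read_link(const struct sim_nic *n)
{
	assert(n->power == POLL_D0);
	return (n->cable_plugged);
}

/*
 * Body of the driver's private poll timer.  Returns true if link was
 * found, in which case the device is left in D0 and polling can stop.
 * Otherwise the device goes back to D3 until the next poll.
 */
static bool
sim_link_poll(struct sim_nic *n)
{
	n->power = POLL_D0;
	n->power_ups++;
	if (sim_read_link(n))
		return (true);
	n->power = POLL_D3;
	return (false);
}
```

In a real driver the true case would presumably go through device_is_active() so that new-bus learns the device is powered again; that wiring is left out here.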

Behind the scenes the new-bus code would have a task and callout backing the 
idle timer (callout enqueues the task and the task invokes DEVICE_IDLE()).  
It would have to use its own internal locking I think to handle the various 
edge cases of cancelling the timer, etc.
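
One of those edge cases is a cancel that arrives after the callout has fired but before the task has run.  The single-threaded model below sketches one possible way to handle it, using a generation counter so that a stale task does nothing.  Everything here is hypothetical (struct idle_timer and the it_* functions are not existing new-bus, callout, or taskqueue interfaces), and real code would need a mutex around these fields.

```c
#include <assert.h>
#include <stdbool.h>

struct idle_timer {
	bool armed;        /* callout is scheduled */
	bool task_queued;  /* callout fired, task waiting to run */
	unsigned gen;      /* bumped on every cancel */
	unsigned task_gen; /* generation captured when the task was queued */
	int idle_calls;    /* times the DEVICE_IDLE() stand-in ran */
};

/* Device went idle: schedule the callout. */
static void
it_arm(struct idle_timer *t)
{
	t->armed = true;
}

/* Device went active: stop the callout and invalidate any queued task. */
static void
it_cancel(struct idle_timer *t)
{
	t->armed = false;
	t->gen++;
}

/* Callout handler: only enqueues the task, remembering the generation. */
static void
it_callout_fire(struct idle_timer *t)
{
	if (!t->armed)
		return;
	t->armed = false;
	t->task_queued = true;
	t->task_gen = t->gen;
}

/* Task handler: powers the device down unless a cancel intervened. */
static void
it_task_run(struct idle_timer *t)
{
	if (!t->task_queued)
		return;
	t->task_queued = false;
	if (t->task_gen != t->gen)
		return; /* stale: device became active after the callout */
	t->idle_calls++; /* stand-in for DEVICE_IDLE() */
}
```

The generation check is what prevents a device that has just become active from being powered down by a task queued moments earlier.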

This is also the first time I've written this down and I'm still thinking 
about how we can provide some infrastructure in new-bus to avoid having to 
duplicate a lot of work in device drivers themselves.

-- 
John Baldwin
