Date:      Wed, 07 Jun 2006 17:21:28 -0400
From:      "Stephane E. Potvin" <sepotvin@videotron.ca>
To:        freebsd-acpi@FreeBSD.org
Subject:   [patch] Support for asymmetrical per-cpu Cx states
Message-ID:  <44874358.6050608@videotron.ca>


What started as a quick check into why my laptop was not giving me Cx
states deeper than C1 turned into a major rework of the way the acpi_cpu
driver works.

The attached patch makes the following modifications to the driver:

- Support for asymmetrical, per-CPU Cx states. This is done by parsing
the _CST packages for each CPU and tracking all the states individually,
on a per-CPU basis.

- Support for reverting to generic FADT/P_BLK-based Cx control if _CST
packages are not present on all CPUs. In that case, the new driver still
supports per-CPU Cx state handling: it determines the highest Cx level
that can be supported by all the CPUs and configures the available Cx
states based on that.
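To illustrate that fallback computation (function and variable names are
mine, not the driver's): the deepest Cx state usable on every CPU is simply
the minimum of the per-CPU state counts.

```c
/*
 * Sketch of the common-Cx computation described above.  In the patch this
 * is the loop over cpu_devices that shrinks the global cpu_cx_count down
 * to the smallest per-softc cpu_cx_count; names here are illustrative.
 */
static int
common_cx_count(const int *percpu_cx_count, int ncpus)
{
	int i, count;

	count = percpu_cx_count[0];
	for (i = 1; i < ncpus; i++)
		if (percpu_cx_count[i] < count)
			count = percpu_cx_count[i];
	return (count);
}
```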

- Fixed the case where multiple CPUs in the system share the same
registers for Cx state handling. To do that, I added a new flags
parameter to the acpi_PkgGas and acpi_bus_alloc_gas functions that lets
the caller add the RF_SHAREABLE flag. This will probably fix the case
where enabling the HT core would remove all Cx states (except C1), but I
have no means to verify that at this time. I have not added this flag to
the other callers in the tree, but I suspect some of them could use it
when multiple CPUs are present.
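The flags plumbing itself is trivial; a userland model of it (the macro
values below are made up for illustration, not the ones from rman.h) looks
like this:

```c
/*
 * Model of the new u_int flags argument: it is simply OR'ed into the
 * allocation flags passed to bus_alloc_resource_any(), so a caller that
 * knows the Cx register is shared between CPUs passes RF_SHAREABLE and
 * every other caller passes 0.  Values are illustrative only.
 */
#define RF_ACTIVE	0x0002u
#define RF_SHAREABLE	0x0100u

static unsigned int
gas_alloc_flags(unsigned int caller_flags)
{
	/* RF_ACTIVE is always requested; caller flags are added on top. */
	return (RF_ACTIVE | caller_flags);
}
```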

- I also found that on Core Duo CPUs, both cores seem to be taken out of
C3 when either core needs to transition out. This broke the short-sleep
detection logic, so for now I disable it when there is more than one CPU
in the system, which fixed the problem in my case. Proper quirk handling
will probably be required to fix this for known non-working systems.
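For reference, the back-off heuristic I disabled behaves roughly like this
userland sketch (names are illustrative; the 1 us and 1000-sleep thresholds
are the driver's):

```c
/*
 * Userland sketch of the short-sleep back-off: 1000 consecutive sub-1us
 * sleeps on a single-CPU system push cx_lowest one level shallower.
 * With more than one CPU the heuristic is disabled, as described above,
 * because the per-CPU sleep counters are confused by shared C3 exits.
 */
struct cx_model {
	int	cx_lowest;	/* index of the deepest allowed state */
	int	short_slp;	/* consecutive short sleeps seen */
};

static void
note_sleep(struct cx_model *m, int sleep_us, int ncpus)
{
	if (ncpus >= 2 || sleep_us > 1) {
		m->short_slp = 0;
		return;
	}
	if (++m->short_slp == 1000 && m->cx_lowest != 0) {
		m->cx_lowest--;		/* back off one level */
		m->short_slp = 0;
	}
}
```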

- Added support for controlling cx_lowest on a per-CPU basis. I also
implemented a global cx_lowest sysctl that changes cx_lowest for all
CPUs at once, for convenience and backward compatibility. The value it
returns on read is somewhat meaningless (is there an easy way to have a
write-only sysctl?). I have not done the same for cx_supported, as I was
not sure how to handle the case where the supported Cx states are not
symmetrical. Sample output for the new sysctls:

hw.acpi.cpu.0.cx_supported: C1/1 C2/1 C3/57
hw.acpi.cpu.0.cx_lowest: C3
hw.acpi.cpu.0.cx_usage: 0.00% 43.16% 56.83%
hw.acpi.cpu.1.cx_supported: C1/1 C2/1 C3/57
hw.acpi.cpu.1.cx_lowest: C3
hw.acpi.cpu.1.cx_usage: 0.00% 45.65% 54.34%
hw.acpi.cpu.cx_lowest: C3
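Both the per-CPU and the global handlers parse a written value the same way
the old driver-wide one did; a minimal standalone version of that parsing
(the function name is mine) is:

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a "C<n>" string, as written to hw.acpi.cpu.*.cx_lowest, into a
 * zero-based Cx index, mirroring the strtol() logic in the sysctl
 * handlers.  Returns 0 on success or EINVAL for malformed/out-of-range
 * input.
 */
static int
parse_cx_lowest(const char *state, int cx_count, int *val)
{
	if (strlen(state) < 2 || toupper((unsigned char)state[0]) != 'C')
		return (EINVAL);
	*val = (int)strtol(state + 1, NULL, 10) - 1;
	if (*val < 0 || *val > cx_count - 1)
		return (EINVAL);
	return (0);
}
```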

I would appreciate any feedback, positive or negative, on this :)

Steph



Index: etc/rc.d/power_profile
===================================================================
RCS file: /home/FreeBSD/ncvs/src/etc/rc.d/power_profile,v
retrieving revision 1.9
diff -u -r1.9 power_profile
--- etc/rc.d/power_profile	21 Dec 2005 01:19:20 -0000	1.9
+++ etc/rc.d/power_profile	7 Jun 2006 21:16:12 -0000
@@ -76,7 +76,7 @@
 # Set the various sysctls based on the profile's values.
 node="hw.acpi.cpu.cx_lowest"
 highest_value="C1"
-lowest_value="`(sysctl -n hw.acpi.cpu.cx_supported | \
+lowest_value="`(sysctl -n hw.acpi.cpu.0.cx_supported | \
 	awk '{ print "C" split($0, a) }' -) 2> /dev/null`"
 eval value=\$${profile}_cx_lowest
 sysctl_set
Index: sys/dev/acpica/acpi.c
===================================================================
RCS file: /home/FreeBSD/ncvs/src/sys/dev/acpica/acpi.c,v
retrieving revision 1.224
diff -u -r1.224 acpi.c
--- sys/dev/acpica/acpi.c	16 May 2006 14:36:22 -0000	1.224
+++ sys/dev/acpica/acpi.c	7 Jun 2006 20:50:37 -0000
@@ -1106,7 +1106,7 @@
 /* Allocate an IO port or memory resource, given its GAS. */
 int
 acpi_bus_alloc_gas(device_t dev, int *type, int *rid, ACPI_GENERIC_ADDRESS *gas,
-    struct resource **res)
+    struct resource **res, u_int flags)
 {
     int error, res_type;
 
@@ -1139,7 +1139,7 @@
 
     bus_set_resource(dev, res_type, *rid, gas->Address,
 	gas->RegisterBitWidth / 8);
-    *res = bus_alloc_resource_any(dev, res_type, rid, RF_ACTIVE);
+    *res = bus_alloc_resource_any(dev, res_type, rid, RF_ACTIVE | flags);
     if (*res != NULL) {
 	*type = res_type;
 	error = 0;
Index: sys/dev/acpica/acpi_cpu.c
===================================================================
RCS file: /home/FreeBSD/ncvs/src/sys/dev/acpica/acpi_cpu.c,v
retrieving revision 1.59
diff -u -r1.59 acpi_cpu.c
--- sys/dev/acpica/acpi_cpu.c	25 Oct 2005 21:15:47 -0000	1.59
+++ sys/dev/acpica/acpi_cpu.c	7 Jun 2006 20:50:37 -0000
@@ -51,9 +51,6 @@
 
 /*
  * Support for ACPI Processor devices, including C[1-3] sleep states.
- *
- * TODO: implement scans of all CPUs to be sure all Cx states are
- * equivalent.
  */
 
 /* Hooks for the ACPI CA debugging infrastructure */
@@ -80,6 +77,16 @@
     int			 cpu_cx_count;	/* Number of valid Cx states. */
     int			 cpu_prev_sleep;/* Last idle sleep duration. */
     int			 cpu_features;	/* Child driver supported features. */
+    /* Runtime state. */
+    int		 	 cpu_non_c3;	/* Index of lowest non-C3 state. */
+    int		 	 cpu_short_slp;	/* Count of < 1us sleeps. */
+    u_int		 cpu_cx_stats[MAX_CX_STATES];/* Cx usage history. */
+    /* Values for sysctl. */
+    struct sysctl_ctx_list acpi_cpu_sysctl_ctx;
+    struct sysctl_oid	*acpi_cpu_sysctl_tree;
+    int		 	 cpu_cx_lowest;
+    char 		 cpu_cx_supported[64];
+    int			 cpu_rid;
 };
 
 struct acpi_cpu_device {
@@ -110,20 +117,17 @@
 /* Platform hardware resource information. */
 static uint32_t		 cpu_smi_cmd;	/* Value to write to SMI_CMD. */
 static uint8_t		 cpu_cst_cnt;	/* Indicate we are _CST aware. */
-static int		 cpu_rid;	/* Driver-wide resource id. */
 static int		 cpu_quirks;	/* Indicate any hardware bugs. */
 
 /* Runtime state. */
+static int		 cpu_disable_idle; /* Disable idle function */
 static int		 cpu_cx_count;	/* Number of valid states */
-static int		 cpu_non_c3;	/* Index of lowest non-C3 state. */
-static int		 cpu_short_slp;	/* Count of < 1us sleeps. */
-static u_int		 cpu_cx_stats[MAX_CX_STATES];/* Cx usage history. */
 
 /* Values for sysctl. */
 static struct sysctl_ctx_list acpi_cpu_sysctl_ctx;
 static struct sysctl_oid *acpi_cpu_sysctl_tree;
+static int		 cpu_cx_generic;
 static int		 cpu_cx_lowest;
-static char 		 cpu_cx_supported[64];
 
 static device_t		*cpu_devices;
 static int		 cpu_ndevices;
@@ -140,15 +144,17 @@
 static int	acpi_cpu_read_ivar(device_t dev, device_t child, int index,
 		    uintptr_t *result);
 static int	acpi_cpu_shutdown(device_t dev);
-static int	acpi_cpu_cx_probe(struct acpi_cpu_softc *sc);
+static void	acpi_cpu_cx_probe(struct acpi_cpu_softc *sc);
+static void	acpi_cpu_generic_cx_probe(struct acpi_cpu_softc *sc);
 static int	acpi_cpu_cx_cst(struct acpi_cpu_softc *sc);
 static void	acpi_cpu_startup(void *arg);
-static void	acpi_cpu_startup_cx(void);
+static void	acpi_cpu_startup_cx(struct acpi_cpu_softc *sc);
 static void	acpi_cpu_idle(void);
 static void	acpi_cpu_notify(ACPI_HANDLE h, UINT32 notify, void *context);
-static int	acpi_cpu_quirks(struct acpi_cpu_softc *sc);
+static int	acpi_cpu_quirks(void);
 static int	acpi_cpu_usage_sysctl(SYSCTL_HANDLER_ARGS);
 static int	acpi_cpu_cx_lowest_sysctl(SYSCTL_HANDLER_ARGS);
+static int	acpi_cpu_global_cx_lowest_sysctl(SYSCTL_HANDLER_ARGS);
 
 static device_method_t acpi_cpu_methods[] = {
     /* Device interface */
@@ -255,9 +261,10 @@
     struct acpi_softc	  *acpi_sc;
     ACPI_STATUS		   status;
     u_int		   features;
-    int			   cpu_id, drv_count, i;
+    int			   cpu_id, drv_count, i, dev_id;
     driver_t 		  **drivers;
     uint32_t		   cap_set[3];
+    char		   cpu_num[8];
 
     ACPI_FUNCTION_TRACE((char *)(uintptr_t)__func__);
 
@@ -265,6 +272,7 @@
     sc->cpu_dev = dev;
     sc->cpu_handle = acpi_get_handle(dev);
     cpu_id = acpi_get_magic(dev);
+    dev_id = device_get_unit(dev);
     cpu_softc[cpu_id] = sc;
     pcpu_data = pcpu_find(cpu_id);
     pcpu_data->pc_device = dev;
@@ -288,10 +296,29 @@
     ACPI_DEBUG_PRINT((ACPI_DB_INFO, "acpi_cpu%d: P_BLK at %#x/%d\n",
 		     device_get_unit(dev), sc->cpu_p_blk, sc->cpu_p_blk_len));
 
+    /*
+     * If this is the first cpu we attach, create and initialize the generic
+     * resources that will be used by all cpu devices.
+     */
     acpi_sc = acpi_device_get_parent_softc(dev);
-    sysctl_ctx_init(&acpi_cpu_sysctl_ctx);
-    acpi_cpu_sysctl_tree = SYSCTL_ADD_NODE(&acpi_cpu_sysctl_ctx,
-	SYSCTL_CHILDREN(acpi_sc->acpi_sysctl_tree), OID_AUTO, "cpu",
+    if (dev_id == 0) {
+	/* Assume we won't be using generic Cx mode by default */
+	cpu_cx_generic = 0;
+
+	/* Install root sysctl tree */
+	sysctl_ctx_init(&acpi_cpu_sysctl_ctx);
+	acpi_cpu_sysctl_tree = SYSCTL_ADD_NODE(&acpi_cpu_sysctl_ctx,
+	    SYSCTL_CHILDREN(acpi_sc->acpi_sysctl_tree), OID_AUTO, "cpu",
+	    CTLFLAG_RD, 0, "");
+
+	/* Queue post cpu-probing task handler */
+	AcpiOsQueueForExecution(OSD_PRIORITY_LO, acpi_cpu_startup, NULL);
+    }
+
+    snprintf(cpu_num, sizeof(cpu_num), "%d", dev_id);
+    sysctl_ctx_init(&sc->acpi_cpu_sysctl_ctx);
+    sc->acpi_cpu_sysctl_tree = SYSCTL_ADD_NODE(&sc->acpi_cpu_sysctl_ctx,
+	SYSCTL_CHILDREN(acpi_cpu_sysctl_tree), OID_AUTO, cpu_num,
 	CTLFLAG_RD, 0, "");
 
     /*
@@ -327,17 +354,8 @@
 	AcpiEvaluateObject(sc->cpu_handle, "_PDC", &arglist, NULL);
     }
 
-    /*
-     * Probe for Cx state support.  If it isn't present, free up unused
-     * resources.
-     */
-    if (acpi_cpu_cx_probe(sc) == 0) {
-	status = AcpiInstallNotifyHandler(sc->cpu_handle, ACPI_DEVICE_NOTIFY,
-					  acpi_cpu_notify, sc);
-	if (device_get_unit(dev) == 0)
-	    AcpiOsQueueForExecution(OSD_PRIORITY_LO, acpi_cpu_startup, NULL);
-    } else
-	sysctl_ctx_free(&acpi_cpu_sysctl_ctx);
+    /* Probe for Cx state support. */
+    acpi_cpu_cx_probe(sc);
 
     /* Finally,  call identify and probe/attach for child devices. */
     bus_generic_probe(dev);
@@ -440,7 +458,7 @@
     bus_generic_shutdown(dev);
 
     /* Disable any entry to the idle function. */
-    cpu_cx_count = 0;
+    cpu_disable_idle = 1;
 
     /* Signal and wait for all processors to exit acpi_cpu_idle(). */
     smp_rendezvous(NULL, NULL, NULL, NULL);
@@ -448,105 +466,101 @@
     return_VALUE (0);
 }
 
-static int
+static void
 acpi_cpu_cx_probe(struct acpi_cpu_softc *sc)
 {
-    ACPI_GENERIC_ADDRESS gas;
-    struct acpi_cx	*cx_ptr;
-    int			 error;
-
     ACPI_FUNCTION_TRACE((char *)(uintptr_t)__func__);
 
+    /* Use initial sleep value of 1 sec. to start with lowest idle state. */
+    sc->cpu_prev_sleep = 1000000;
+    sc->cpu_cx_lowest = 0;
+
     /*
-     * Bus mastering arbitration control is needed to keep caches coherent
-     * while sleeping in C3.  If it's not present but a working flush cache
-     * instruction is present, flush the caches before entering C3 instead.
-     * Otherwise, just disable C3 completely.
+     * Check for the ACPI 2.0 _CST sleep states object.  If we can't find
+     * any, we'll revert to the generic FADT/P_BLK Cx control method, which
+     * is handled by acpi_cpu_startup.  Probing for generic Cx states is
+     * deferred until all CPUs have been probed, since some of them may
+     * already have valid _CST packages.
      */
-    if (AcpiGbl_FADT->V1_Pm2CntBlk == 0 || AcpiGbl_FADT->Pm2CntLen == 0) {
-	if (AcpiGbl_FADT->WbInvd && AcpiGbl_FADT->WbInvdFlush == 0) {
-	    cpu_quirks |= CPU_QUIRK_NO_BM_CTRL;
-	    ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-		"acpi_cpu%d: no BM control, using flush cache method\n",
-		device_get_unit(sc->cpu_dev)));
-	} else {
-	    cpu_quirks |= CPU_QUIRK_NO_C3;
-	    ACPI_DEBUG_PRINT((ACPI_DB_INFO,
-		"acpi_cpu%d: no BM control, C3 not available\n",
-		device_get_unit(sc->cpu_dev)));
-	}
+    if (!cpu_cx_generic && acpi_cpu_cx_cst(sc) != 0) {
+	/*
+	 * We were unable to find a _CST package for this cpu or there
+	 * was an error parsing it. Switch back to generic mode.
+	 */
+	cpu_cx_generic = 1;	
+	device_printf(sc->cpu_dev, "Switching to generic Cx mode\n");
     }
 
     /*
-     * First, check for the ACPI 2.0 _CST sleep states object.
-     * If not usable, fall back to the P_BLK's P_LVL2 and P_LVL3.
+     * TODO: _CSD Package should be checked here.
      */
+}
+
+static void
+acpi_cpu_generic_cx_probe(struct acpi_cpu_softc *sc)
+{
+    ACPI_GENERIC_ADDRESS	 gas;
+    struct acpi_cx	 	*cx_ptr;
+
     sc->cpu_cx_count = 0;
-    error = acpi_cpu_cx_cst(sc);
-    if (error != 0) {
-	cx_ptr = sc->cpu_cx_states;
-
-	/* C1 has been required since just after ACPI 1.0 */
-	cx_ptr->type = ACPI_STATE_C1;
-	cx_ptr->trans_lat = 0;
-	cpu_non_c3 = 0;
-	cx_ptr++;
-	sc->cpu_cx_count++;
-
-	/* 
-	 * The spec says P_BLK must be 6 bytes long.  However, some systems
-	 * use it to indicate a fractional set of features present so we
-	 * take 5 as C2.  Some may also have a value of 7 to indicate
-	 * another C3 but most use _CST for this (as required) and having
-	 * "only" C1-C3 is not a hardship.
-	 */
-	if (sc->cpu_p_blk_len < 5)
-	    goto done;
+    cx_ptr = sc->cpu_cx_states;
+
+    /* Use initial sleep value of 1 sec. to start with lowest idle state. */
+    sc->cpu_prev_sleep = 1000000;
+
+    /* C1 has been required since just after ACPI 1.0 */
+    cx_ptr->type = ACPI_STATE_C1;
+    cx_ptr->trans_lat = 0;
+    cx_ptr++;
+    sc->cpu_cx_count++;
+
+    /*
+     * The spec says P_BLK must be 6 bytes long.  However, some systems
+     * use it to indicate a fractional set of features present so we
+     * take 5 as C2.  Some may also have a value of 7 to indicate
+     * another C3 but most use _CST for this (as required) and having
+     * "only" C1-C3 is not a hardship.
+     */
+    if (sc->cpu_p_blk_len < 5)
+	return; 
 
-	/* Validate and allocate resources for C2 (P_LVL2). */
-	gas.AddressSpaceId = ACPI_ADR_SPACE_SYSTEM_IO;
-	gas.RegisterBitWidth = 8;
-	if (AcpiGbl_FADT->Plvl2Lat <= 100) {
-	    gas.Address = sc->cpu_p_blk + 4;
-	    acpi_bus_alloc_gas(sc->cpu_dev, &cx_ptr->res_type, &cpu_rid, &gas,
-		&cx_ptr->p_lvlx);
-	    if (cx_ptr->p_lvlx != NULL) {
-		cpu_rid++;
+    /* Validate and allocate resources for C2 (P_LVL2). */
+    gas.AddressSpaceId = ACPI_ADR_SPACE_SYSTEM_IO;
+    gas.RegisterBitWidth = 8;
+    if (AcpiGbl_FADT->Plvl2Lat <= 100) {
+	gas.Address = sc->cpu_p_blk + 4;
+	acpi_bus_alloc_gas(sc->cpu_dev, &cx_ptr->res_type, &sc->cpu_rid, &gas,
+	    &cx_ptr->p_lvlx, RF_SHAREABLE);
+	if (cx_ptr->p_lvlx != NULL) {
+		sc->cpu_rid++;
 		cx_ptr->type = ACPI_STATE_C2;
 		cx_ptr->trans_lat = AcpiGbl_FADT->Plvl2Lat;
-		cpu_non_c3 = 1;
 		cx_ptr++;
 		sc->cpu_cx_count++;
-	    }
 	}
-	if (sc->cpu_p_blk_len < 6)
-	    goto done;
+    }
+    if (sc->cpu_p_blk_len < 6)
+	return;
 
-	/* Validate and allocate resources for C3 (P_LVL3). */
-	if (AcpiGbl_FADT->Plvl3Lat <= 1000 &&
-	    (cpu_quirks & CPU_QUIRK_NO_C3) == 0) {
-	    gas.Address = sc->cpu_p_blk + 5;
-	    acpi_bus_alloc_gas(sc->cpu_dev, &cx_ptr->res_type, &cpu_rid, &gas,
-		&cx_ptr->p_lvlx);
-	    if (cx_ptr->p_lvlx != NULL) {
-		cpu_rid++;
+    /* Validate and allocate resources for C3 (P_LVL3). */
+    if (AcpiGbl_FADT->Plvl3Lat <= 1000) {
+	gas.Address = sc->cpu_p_blk + 5;
+	acpi_bus_alloc_gas(sc->cpu_dev, &cx_ptr->res_type, &sc->cpu_rid, &gas,
+	    &cx_ptr->p_lvlx, RF_SHAREABLE);
+	if (cx_ptr->p_lvlx != NULL) {
+		sc->cpu_rid++;
 		cx_ptr->type = ACPI_STATE_C3;
 		cx_ptr->trans_lat = AcpiGbl_FADT->Plvl3Lat;
 		cx_ptr++;
 		sc->cpu_cx_count++;
-	    }
 	}
     }
 
-done:
-    /* If no valid registers were found, don't attach. */
-    if (sc->cpu_cx_count == 0)
-	return (ENXIO);
+    /* Update the largest cx_count seen so far */
+    if (sc->cpu_cx_count > cpu_cx_count)
+	cpu_cx_count = sc->cpu_cx_count;
 
-    /* Use initial sleep value of 1 sec. to start with lowest idle state. */
-    sc->cpu_prev_sleep = 1000000;
-
-    return (0);
+    return;
 }
 
 /*
@@ -570,8 +584,10 @@
     buf.Pointer = NULL;
     buf.Length = ACPI_ALLOCATE_BUFFER;
     status = AcpiEvaluateObject(sc->cpu_handle, "_CST", NULL, &buf);
-    if (ACPI_FAILURE(status))
+    if (ACPI_FAILURE(status)) {
+	device_printf(sc->cpu_dev, "Unable to find _CST method\n");
 	return (ENXIO);
+    }
 
     /* _CST is a package with a count and at least one Cx package. */
     top = (ACPI_OBJECT *)buf.Pointer;
@@ -603,11 +619,13 @@
 	    device_printf(sc->cpu_dev, "skipping invalid Cx state package\n");
 	    continue;
 	}
+	device_printf(sc->cpu_dev, "type = %d, trans_lat = %d, power = %d\n",
+	    cx_ptr->type, cx_ptr->trans_lat, cx_ptr->power);
 
 	/* Validate the state to see if we should use it. */
 	switch (cx_ptr->type) {
 	case ACPI_STATE_C1:
-	    cpu_non_c3 = i;
+	    sc->cpu_non_c3 = i;
 	    cx_ptr++;
 	    sc->cpu_cx_count++;
 	    continue;
@@ -618,7 +636,7 @@
 				 device_get_unit(sc->cpu_dev), i));
 		continue;
 	    }
-	    cpu_non_c3 = i;
+	    sc->cpu_non_c3 = i;
 	    break;
 	case ACPI_STATE_C3:
 	default:
@@ -642,10 +660,31 @@
 #endif
 
 	/* Allocate the control register for C2 or C3. */
-	acpi_PkgGas(sc->cpu_dev, pkg, 0, &cx_ptr->res_type, &cpu_rid,
-	    &cx_ptr->p_lvlx);
+	{
+		ACPI_GENERIC_ADDRESS gas;
+		ACPI_OBJECT *obj;
+
+		obj = &pkg->Package.Elements[0];
+		if (obj == NULL || obj->Type != ACPI_TYPE_BUFFER ||
+		    obj->Buffer.Length  < sizeof(ACPI_GENERIC_ADDRESS) + 3)
+		{
+			device_printf(sc->cpu_dev, "Invalid Gas\n");
+		}
+		else
+		{
+			memcpy(&gas, obj->Buffer.Pointer + 3, sizeof(gas));
+			device_printf(sc->cpu_dev, "gas = %08x\n", (uint32_t) obj->Buffer.Pointer);
+			device_printf(sc->cpu_dev, "AddressSpaceId = %02x\n", gas.AddressSpaceId);
+			device_printf(sc->cpu_dev, "RegisterBitWidth = %02x\n", gas.RegisterBitWidth);
+			device_printf(sc->cpu_dev, "RegisterBitOffset = %02x\n", gas.RegisterBitOffset);
+			device_printf(sc->cpu_dev, "Address = %016llx\n", gas.Address);
+		}
+	}
+		
+	acpi_PkgGas(sc->cpu_dev, pkg, 0, &cx_ptr->res_type, &sc->cpu_rid,
+	    &cx_ptr->p_lvlx, RF_SHAREABLE);
 	if (cx_ptr->p_lvlx) {
-	    cpu_rid++;
+	    sc->cpu_rid++;
 	    ACPI_DEBUG_PRINT((ACPI_DB_INFO,
 			     "acpi_cpu%d: Got C%d - %d latency\n",
 			     device_get_unit(sc->cpu_dev), cx_ptr->type,
@@ -666,81 +705,121 @@
 acpi_cpu_startup(void *arg)
 {
     struct acpi_cpu_softc *sc;
-    int count, i;
+    int i;
 
     /* Get set of CPU devices */
     devclass_get_devices(acpi_cpu_devclass, &cpu_devices, &cpu_ndevices);
 
-    /* Check for quirks via the first CPU device. */
-    sc = device_get_softc(cpu_devices[0]);
-    acpi_cpu_quirks(sc);
-
     /*
-     * Make sure all the processors' Cx counts match.  We should probably
-     * also check the contents of each.  However, no known systems have
-     * non-matching Cx counts so we'll deal with this later.
+     * Set up any quirks that might be necessary now that we have
+     * probed all the CPUs.
      */
-    count = MAX_CX_STATES;
-    for (i = 0; i < cpu_ndevices; i++) {
-	sc = device_get_softc(cpu_devices[i]);
-	count = min(sc->cpu_cx_count, count);
+    acpi_cpu_quirks();
+
+    cpu_cx_count = 0;
+    if (cpu_cx_generic) {
+	/*
+	 * We are using generic Cx mode, probe for available Cx states
+	 * for all processors.
+	 */
+	for (i = 0; i < cpu_ndevices; i++) {
+		sc = device_get_softc(cpu_devices[i]);
+		acpi_cpu_generic_cx_probe(sc);
+	}
+
+	/*
+	 * Find the highest Cx state common to all CPUs
+	 * in the system, taking quirks into account.
+	 */
+	for (i = 0; i < cpu_ndevices; i++) {
+		sc = device_get_softc(cpu_devices[i]);
+		if (sc->cpu_cx_count < cpu_cx_count)
+			cpu_cx_count = sc->cpu_cx_count;
+	}
+    } else {
+	/*
+	 * We are using _CST mode, remove C3 state if necessary.
+	 * Update the largest Cx state supported in the global cpu_cx_count.
+	 * It will be used in the global Cx sysctl handler.
+	 * As we now know for sure that we will be using _CST mode,
+	 * install our notify handler.
+	 */
+	for (i = 0; i < cpu_ndevices; i++) {
+		sc = device_get_softc(cpu_devices[i]);
+		if (cpu_quirks & CPU_QUIRK_NO_C3) {
+			sc->cpu_cx_count = sc->cpu_non_c3 + 1;
+		}
+		if (sc->cpu_cx_count > cpu_cx_count)
+			cpu_cx_count = sc->cpu_cx_count;
+		AcpiInstallNotifyHandler(sc->cpu_handle,
+		    ACPI_DEVICE_NOTIFY, acpi_cpu_notify, sc);
+	}
     }
-    cpu_cx_count = count;
 
     /* Perform Cx final initialization. */
-    sc = device_get_softc(cpu_devices[0]);
-    if (cpu_cx_count > 0)
-	acpi_cpu_startup_cx();
+    for (i = 0; i < cpu_ndevices; i++) {
+    	sc = device_get_softc(cpu_devices[i]);
+	acpi_cpu_startup_cx(sc);
+    }
+
+    /* Add a sysctl handler to handle global Cx lowest setting */
+    SYSCTL_ADD_PROC(&acpi_cpu_sysctl_ctx,
+	SYSCTL_CHILDREN(acpi_cpu_sysctl_tree),
+	OID_AUTO, "cx_lowest", CTLTYPE_STRING | CTLFLAG_RW,
+	NULL, 0, acpi_cpu_global_cx_lowest_sysctl, "A",
+	"Global lowest Cx sleep state to use");
+
+    /* Take over idling from cpu_idle_default(). */
+    cpu_cx_lowest = 0;
+    cpu_disable_idle = 0;
+    cpu_idle_hook = acpi_cpu_idle;
 }
 
 static void
-acpi_cpu_startup_cx()
+acpi_cpu_startup_cx(struct acpi_cpu_softc *sc)
 {
-    struct acpi_cpu_softc *sc;
     struct sbuf sb;
     int i;
 
     /*
-     * Set up the list of Cx states, eliminating C3 states by truncating
-     * cpu_cx_count if quirks indicate C3 is not usable.
+     * Set up the list of Cx states
      */
-    sc = device_get_softc(cpu_devices[0]);
-    sbuf_new(&sb, cpu_cx_supported, sizeof(cpu_cx_supported), SBUF_FIXEDLEN);
-    for (i = 0; i < cpu_cx_count; i++) {
-	if ((cpu_quirks & CPU_QUIRK_NO_C3) == 0 ||
-	    sc->cpu_cx_states[i].type != ACPI_STATE_C3)
-	    sbuf_printf(&sb, "C%d/%d ", i + 1, sc->cpu_cx_states[i].trans_lat);
-	else
-	    cpu_cx_count = i;
+    sc->cpu_non_c3 = 0;
+    sbuf_new(&sb, sc->cpu_cx_supported, sizeof(sc->cpu_cx_supported),
+        SBUF_FIXEDLEN);
+    for (i = 0; i < sc->cpu_cx_count; i++) {
+	sbuf_printf(&sb, "C%d/%d ", i + 1, sc->cpu_cx_states[i].trans_lat);
+	if (sc->cpu_cx_states[i].type < ACPI_STATE_C3)
+		sc->cpu_non_c3 = i;
     }
+
     sbuf_trim(&sb);
     sbuf_finish(&sb);
-    SYSCTL_ADD_STRING(&acpi_cpu_sysctl_ctx,
-		      SYSCTL_CHILDREN(acpi_cpu_sysctl_tree),
-		      OID_AUTO, "cx_supported", CTLFLAG_RD, cpu_cx_supported,
-		      0, "Cx/microsecond values for supported Cx states");
-    SYSCTL_ADD_PROC(&acpi_cpu_sysctl_ctx,
-		    SYSCTL_CHILDREN(acpi_cpu_sysctl_tree),
+    SYSCTL_ADD_STRING(&sc->acpi_cpu_sysctl_ctx,
+		      SYSCTL_CHILDREN(sc->acpi_cpu_sysctl_tree),
+		      OID_AUTO, "cx_supported", CTLFLAG_RD,
+		      sc->cpu_cx_supported, 0,
+		      "Cx/microsecond values for supported Cx states");
+    SYSCTL_ADD_PROC(&sc->acpi_cpu_sysctl_ctx,
+		    SYSCTL_CHILDREN(sc->acpi_cpu_sysctl_tree),
 		    OID_AUTO, "cx_lowest", CTLTYPE_STRING | CTLFLAG_RW,
-		    NULL, 0, acpi_cpu_cx_lowest_sysctl, "A",
+		    (void *)sc, 0, acpi_cpu_cx_lowest_sysctl, "A",
 		    "lowest Cx sleep state to use");
-    SYSCTL_ADD_PROC(&acpi_cpu_sysctl_ctx,
-		    SYSCTL_CHILDREN(acpi_cpu_sysctl_tree),
+    SYSCTL_ADD_PROC(&sc->acpi_cpu_sysctl_ctx,
+		    SYSCTL_CHILDREN(sc->acpi_cpu_sysctl_tree),
 		    OID_AUTO, "cx_usage", CTLTYPE_STRING | CTLFLAG_RD,
-		    NULL, 0, acpi_cpu_usage_sysctl, "A",
+		    (void *)sc, 0, acpi_cpu_usage_sysctl, "A",
 		    "percent usage for each Cx state");
 
 #ifdef notyet
     /* Signal platform that we can handle _CST notification. */
-    if (cpu_cst_cnt != 0) {
+    if (!cpu_cx_generic) {
 	ACPI_LOCK(acpi);
 	AcpiOsWritePort(cpu_smi_cmd, cpu_cst_cnt, 8);
 	ACPI_UNLOCK(acpi);
     }
 #endif
 
-    /* Take over idling from cpu_idle_default(). */
-    cpu_idle_hook = acpi_cpu_idle;
 }
 
 /*
@@ -758,7 +837,7 @@
     int		bm_active, cx_next_idx, i;
 
     /* If disabled, return immediately. */
-    if (cpu_cx_count == 0) {
+    if (cpu_disable_idle != 0) {
 	ACPI_ENABLE_IRQS();
 	return;
     }
@@ -779,28 +858,34 @@
      * find the lowest state that has a latency less than or equal to
      * the length of our last sleep.
      */
-    cx_next_idx = cpu_cx_lowest;
+    cx_next_idx = sc->cpu_cx_lowest;
     if (sc->cpu_prev_sleep < 100) {
 	/*
 	 * If we sleep too short all the time, this system may not implement
 	 * C2/3 correctly (i.e. reads return immediately).  In this case,
 	 * back off and use the next higher level.
+	 * It seems that on dual-core CPUs (like the Intel Core Duo), both
+	 * cores are taken out of C3 as soon as one of them needs to exit
+	 * it.  This breaks the short-sleep detection logic, as the sleep
+	 * counter is local to each CPU.  As a workaround, disable the
+	 * logic for now when there is more than one CPU.  The right fix is
+	 * probably to add quirks for systems that don't really support C3.
 	 */
-	if (sc->cpu_prev_sleep <= 1) {
-	    cpu_short_slp++;
-	    if (cpu_short_slp == 1000 && cpu_cx_lowest != 0) {
-		if (cpu_non_c3 == cpu_cx_lowest && cpu_non_c3 != 0)
-		    cpu_non_c3--;
-		cpu_cx_lowest--;
-		cpu_short_slp = 0;
+	if (mp_ncpus < 2 && sc->cpu_prev_sleep <= 1) {
+	    sc->cpu_short_slp++;
+	    if (sc->cpu_short_slp == 1000 && sc->cpu_cx_lowest != 0) {
+		if (sc->cpu_non_c3 == sc->cpu_cx_lowest && sc->cpu_non_c3 != 0)
+		    sc->cpu_non_c3--;
+		sc->cpu_cx_lowest--;
+		sc->cpu_short_slp = 0;
 		device_printf(sc->cpu_dev,
 		    "too many short sleeps, backing off to C%d\n",
-		    cpu_cx_lowest + 1);
+		    sc->cpu_cx_lowest + 1);
 	    }
 	} else
-	    cpu_short_slp = 0;
+	    sc->cpu_short_slp = 0;
 
-	for (i = cpu_cx_lowest; i >= 0; i--)
+	for (i = sc->cpu_cx_lowest; i >= 0; i--)
 	    if (sc->cpu_cx_states[i].trans_lat <= sc->cpu_prev_sleep) {
 		cx_next_idx = i;
 		break;
@@ -819,13 +904,13 @@
 	if (bm_active != 0) {
 	    AcpiSetRegister(ACPI_BITREG_BUS_MASTER_STATUS, 1,
 		ACPI_MTX_DO_NOT_LOCK);
-	    cx_next_idx = min(cx_next_idx, cpu_non_c3);
+	    cx_next_idx = min(cx_next_idx, sc->cpu_non_c3);
 	}
     }
 
     /* Select the next state and update statistics. */
     cx_next = &sc->cpu_cx_states[cx_next_idx];
-    cpu_cx_stats[cx_next_idx]++;
+    sc->cpu_cx_stats[cx_next_idx]++;
     KASSERT(cx_next->type != ACPI_STATE_C0, ("acpi_cpu_idle: C0 sleep"));
 
     /*
@@ -901,15 +986,35 @@
 }
 
 static int
-acpi_cpu_quirks(struct acpi_cpu_softc *sc)
+acpi_cpu_quirks(void)
 {
     device_t acpi_dev;
 
     /*
-     * C3 on multiple CPUs requires using the expensive flush cache
-     * instruction.
+     * Bus mastering arbitration control is needed to keep caches coherent
+     * while sleeping in C3.  If it's not present but a working flush cache
+     * instruction is present, flush the caches before entering C3 instead.
+     * Otherwise, just disable C3 completely.
      */
-    if (mp_ncpus > 1)
+    if (AcpiGbl_FADT->V1_Pm2CntBlk == 0 || AcpiGbl_FADT->Pm2CntLen == 0) {
+	if (AcpiGbl_FADT->WbInvd && AcpiGbl_FADT->WbInvdFlush == 0) {
+	    cpu_quirks |= CPU_QUIRK_NO_BM_CTRL;
+	    ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+		"acpi_cpu: no BM control, "
+		"using flush cache method\n"));
+	} else {
+	    cpu_quirks |= CPU_QUIRK_NO_C3;
+	    ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+		"acpi_cpu: no BM control, "
+		"C3 not available\n"));
+	}
+    }
+
+    /*
+     * If we are using generic Cx mode, C3 on multiple CPUs requires using
+     * the expensive flush cache instruction.
+     */
+    if (cpu_cx_generic && mp_ncpus > 1)
 	cpu_quirks |= CPU_QUIRK_NO_BM_CTRL;
 
     /* Look for various quirks of the PIIX4 part. */
@@ -943,18 +1048,20 @@
 static int
 acpi_cpu_usage_sysctl(SYSCTL_HANDLER_ARGS)
 {
+    struct acpi_cpu_softc *sc;
     struct sbuf	 sb;
     char	 buf[128];
     int		 i;
     uintmax_t	 fract, sum, whole;
 
+    sc = (struct acpi_cpu_softc *) arg1;
     sum = 0;
-    for (i = 0; i < cpu_cx_count; i++)
-	sum += cpu_cx_stats[i];
+    for (i = 0; i < sc->cpu_cx_count; i++)
+	sum += sc->cpu_cx_stats[i];
     sbuf_new(&sb, buf, sizeof(buf), SBUF_FIXEDLEN);
-    for (i = 0; i < cpu_cx_count; i++) {
+    for (i = 0; i < sc->cpu_cx_count; i++) {
 	if (sum > 0) {
-	    whole = (uintmax_t)cpu_cx_stats[i] * 100;
+	    whole = (uintmax_t)sc->cpu_cx_stats[i] * 100;
 	    fract = (whole % sum) * 100;
 	    sbuf_printf(&sb, "%u.%02u%% ", (u_int)(whole / sum),
 		(u_int)(fract / sum));
@@ -976,32 +1083,74 @@
     char	 state[8];
     int		 val, error, i;
 
-    sc = device_get_softc(cpu_devices[0]);
-    snprintf(state, sizeof(state), "C%d", cpu_cx_lowest + 1);
+    sc = (struct acpi_cpu_softc *) arg1;
+    snprintf(state, sizeof(state), "C%d", sc->cpu_cx_lowest + 1);
     error = sysctl_handle_string(oidp, state, sizeof(state), req);
     if (error != 0 || req->newptr == NULL)
 	return (error);
     if (strlen(state) < 2 || toupper(state[0]) != 'C')
 	return (EINVAL);
     val = (int) strtol(state + 1, NULL, 10) - 1;
-    if (val < 0 || val > cpu_cx_count - 1)
+    if (val < 0 || val > sc->cpu_cx_count - 1)
 	return (EINVAL);
 
     ACPI_SERIAL_BEGIN(cpu);
-    cpu_cx_lowest = val;
+    sc->cpu_cx_lowest = val;
 
     /* If not disabling, cache the new lowest non-C3 state. */
-    cpu_non_c3 = 0;
-    for (i = cpu_cx_lowest; i >= 0; i--) {
+    sc->cpu_non_c3 = 0;
+    for (i = sc->cpu_cx_lowest; i >= 0; i--) {
 	if (sc->cpu_cx_states[i].type < ACPI_STATE_C3) {
-	    cpu_non_c3 = i;
+	    sc->cpu_non_c3 = i;
 	    break;
 	}
     }
 
     /* Reset the statistics counters. */
-    bzero(cpu_cx_stats, sizeof(cpu_cx_stats));
+    bzero(sc->cpu_cx_stats, sizeof(sc->cpu_cx_stats));
+    ACPI_SERIAL_END(cpu);
+
+    return (0);
+}
+
+static int
+acpi_cpu_global_cx_lowest_sysctl(SYSCTL_HANDLER_ARGS)
+{
+    struct	acpi_cpu_softc *sc;
+    char	state[8];
+    int		val, error, i, j;
+
+    snprintf(state, sizeof(state), "C%d", cpu_cx_lowest + 1);
+    error = sysctl_handle_string(oidp, state, sizeof(state), req);
+    if (error != 0 || req->newptr == NULL)
+	return (error);
+    if (strlen(state) < 2 || toupper(state[0]) != 'C')
+	return (EINVAL);
+    val = (int) strtol(state + 1, NULL, 10) - 1;
+    if (val < 0 || val > cpu_cx_count - 1)
+        return (EINVAL);
+
+    cpu_cx_lowest = val;
+
+    /*
+     * Update the new lowest usable Cx state for all CPUs.
+     */
+    ACPI_SERIAL_BEGIN(cpu);
+    for (i = 0; i < cpu_ndevices; i++) {
+	sc = device_get_softc(cpu_devices[i]);
+	sc->cpu_cx_lowest = cpu_cx_lowest;
+	sc->cpu_non_c3 = 0;
+	for (j = sc->cpu_cx_lowest; j >= 0; j--) {
+		if (sc->cpu_cx_states[j].type < ACPI_STATE_C3) {
+			sc->cpu_non_c3 = j;
+			break;
+		}
+	}
+	/* Reset the statistics counters. */
+	bzero(sc->cpu_cx_stats, sizeof(sc->cpu_cx_stats));
+    }
     ACPI_SERIAL_END(cpu);
 
     return (0);
 }
+
Index: sys/dev/acpica/acpi_package.c
===================================================================
RCS file: /home/FreeBSD/ncvs/src/sys/dev/acpica/acpi_package.c,v
retrieving revision 1.8
diff -u -r1.8 acpi_package.c
--- sys/dev/acpica/acpi_package.c	11 Sep 2005 18:39:01 -0000	1.8
+++ sys/dev/acpica/acpi_package.c	7 Jun 2006 20:50:37 -0000
@@ -104,7 +104,7 @@
 
 int
 acpi_PkgGas(device_t dev, ACPI_OBJECT *res, int idx, int *type, int *rid,
-    struct resource **dst)
+    struct resource **dst, u_int flags)
 {
     ACPI_GENERIC_ADDRESS gas;
     ACPI_OBJECT *obj;
@@ -116,7 +116,7 @@
 
     memcpy(&gas, obj->Buffer.Pointer + 3, sizeof(gas));
 
-    return (acpi_bus_alloc_gas(dev, type, rid, &gas, dst));
+    return (acpi_bus_alloc_gas(dev, type, rid, &gas, dst, flags));
 }
 
 ACPI_HANDLE
Index: sys/dev/acpica/acpi_perf.c
===================================================================
RCS file: /home/FreeBSD/ncvs/src/sys/dev/acpica/acpi_perf.c,v
retrieving revision 1.23
diff -u -r1.23 acpi_perf.c
--- sys/dev/acpica/acpi_perf.c	12 Dec 2005 11:15:20 -0000	1.23
+++ sys/dev/acpica/acpi_perf.c	7 Jun 2006 20:50:37 -0000
@@ -191,7 +191,7 @@
 	pkg = (ACPI_OBJECT *)buf.Pointer;
 	if (ACPI_PKG_VALID(pkg, 2)) {
 		rid = 0;
-		error = acpi_PkgGas(dev, pkg, 0, &type, &rid, &res);
+		error = acpi_PkgGas(dev, pkg, 0, &type, &rid, &res, 0);
 		switch (error) {
 		case 0:
 			bus_release_resource(dev, type, rid, res);
@@ -323,7 +323,7 @@
 	}
 
 	error = acpi_PkgGas(sc->dev, pkg, 0, &sc->perf_ctrl_type, &sc->px_rid,
-	    &sc->perf_ctrl);
+	    &sc->perf_ctrl, 0);
 	if (error) {
 		/*
 		 * If the register is of type FFixedHW, we can only return
@@ -339,7 +339,7 @@
 	sc->px_rid++;
 
 	error = acpi_PkgGas(sc->dev, pkg, 1, &sc->perf_sts_type, &sc->px_rid,
-	    &sc->perf_status);
+	    &sc->perf_status, 0);
 	if (error) {
 		if (error == EOPNOTSUPP) {
 			sc->info_only = TRUE;
Index: sys/dev/acpica/acpi_throttle.c
===================================================================
RCS file: /home/FreeBSD/ncvs/src/sys/dev/acpica/acpi_throttle.c,v
retrieving revision 1.9
diff -u -r1.9 acpi_throttle.c
--- sys/dev/acpica/acpi_throttle.c	21 Feb 2006 03:15:26 -0000	1.9
+++ sys/dev/acpica/acpi_throttle.c	7 Jun 2006 20:50:37 -0000
@@ -278,7 +278,7 @@
 		}
 		memcpy(&gas, obj.Buffer.Pointer + 3, sizeof(gas));
 		acpi_bus_alloc_gas(sc->cpu_dev, &sc->cpu_p_type, &thr_rid,
-		    &gas, &sc->cpu_p_cnt);
+		    &gas, &sc->cpu_p_cnt, 0);
 		if (sc->cpu_p_cnt != NULL && bootverbose) {
 			device_printf(sc->cpu_dev, "P_CNT from _PTC %#jx\n",
 			    gas.Address);
@@ -298,7 +298,7 @@
 		gas.AddressSpaceId = ACPI_ADR_SPACE_SYSTEM_IO;
 		gas.RegisterBitWidth = 32;
 		acpi_bus_alloc_gas(sc->cpu_dev, &sc->cpu_p_type, &thr_rid,
-		    &gas, &sc->cpu_p_cnt);
+		    &gas, &sc->cpu_p_cnt, 0);
 		if (sc->cpu_p_cnt != NULL) {
 			if (bootverbose)
 				device_printf(sc->cpu_dev,
Index: sys/dev/acpica/acpivar.h
===================================================================
RCS file: /home/FreeBSD/ncvs/src/sys/dev/acpica/acpivar.h,v
retrieving revision 1.100
diff -u -r1.100 acpivar.h
--- sys/dev/acpica/acpivar.h	6 Dec 2005 14:47:28 -0000	1.100
+++ sys/dev/acpica/acpivar.h	7 Jun 2006 20:50:37 -0000
@@ -311,7 +311,8 @@
 void		acpi_UserNotify(const char *subsystem, ACPI_HANDLE h,
 		    uint8_t notify);
 int		acpi_bus_alloc_gas(device_t dev, int *type, int *rid,
-		    ACPI_GENERIC_ADDRESS *gas, struct resource **res);
+		    ACPI_GENERIC_ADDRESS *gas, struct resource **res,
+		    u_int flags);
 
 struct acpi_parse_resource_set {
     void	(*set_init)(device_t dev, void *arg, void **context);
@@ -417,7 +418,7 @@
 int		acpi_PkgInt32(ACPI_OBJECT *res, int idx, uint32_t *dst);
 int		acpi_PkgStr(ACPI_OBJECT *res, int idx, void *dst, size_t size);
 int		acpi_PkgGas(device_t dev, ACPI_OBJECT *res, int idx, int *type,
-		    int *rid, struct resource **dst);
+		    int *rid, struct resource **dst, u_int flags);
 ACPI_HANDLE	acpi_GetReference(ACPI_HANDLE scope, ACPI_OBJECT *obj);
 
 /* Default number of task queue threads to start. */
