Date:      Thu, 21 Apr 2011 03:02:09 +1000 (EST)
From:      Ian Smith <smithi@nimnet.asn.au>
To:        Daniel Gerzo <danger@freebsd.org>
Cc:        Alexander Motin <mav@freebsd.org>, freebsd-stable@freebsd.org
Subject:   kern.smp.maxid error on i386 UP [was: powerd / cpufreq question]
Message-ID:  <20110420164100.Y43371@sola.nimnet.asn.au>
In-Reply-To: <20110413024230.Y35056@sola.nimnet.asn.au>
References:  <4D9EEDAF.3020803@rulez.sk> <20110411125416.S35056@sola.nimnet.asn.au> <4DA37E31.4020700@FreeBSD.org> <20110413024230.Y35056@sola.nimnet.asn.au>

On Wed, 13 Apr 2011, Ian Smith wrote:
 > On Tue, 12 Apr 2011, Daniel Gerzo wrote:
 >  > On 11.4.2011 6:08, Ian Smith wrote:
[..]
 >  > > Are those kern.cp_times values as they came, or did you remove trailing
 >  > > zeroes?  Reason I ask is that on my Thinkpad T23, single-core 1133/733
 >  > > MHz, sysctl kern.cp_time shows the usual 5 values, but kern.cp_times has
 >  > > the same 5 values for cpu0, but then 5 zeroes for each of cpu1 through
 >  > > cpu31, on 8.2-PRE about early January.  I need to update the script to
 >  > > remove surplus data for non-existing cpus, but wonder if the extra data
 >  > > also appeared on your 12 core box?
 >  > 
 >  > I haven't removed anything, it's a pure copy&paste.
 > 
 > Thanks.  I'll check the single-cpu case again after updating to 8.2-R

OK, this is still a problem on at least my i386 single-core Thinkpad T23 
at 8.2-R.  It dates from 8.0, I think, and is certainly evident in a 
sysctl -a at 8.1-R.

FreeBSD t23.smithi.id.au 8.2-RELEASE FreeBSD 8.2-RELEASE #1: Thu Apr 14 
21:45:47 EST 2011 root@t23.smithi.id.au:/usr/obj/usr/src/sys/GENERIC i386

Verbose dmesg: http://smithi.id.au/t23_dmesg_boot-v.8.2-R.txt 
sysctl -a:     http://smithi.id.au/t23_sysctl-a_8.2-R.txt

kern.ccpu: 0
  <cpu count="1" mask="0x1">0</cpu>
kern.smp.forward_signal_enabled: 1
kern.smp.topology: 0
kern.smp.cpus: 1
kern.smp.disabled: 0
kern.smp.active: 0
kern.smp.maxcpus: 32
kern.smp.maxid: 31	<<<<<<<
hw.ncpu: 1

kern.cp_times: 38548 1 120437 195677 9660939 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

/usr/src/sys/kern/kern_clock.c:
return SYSCTL_OUT(req, 0, sizeof(long) * CPUSTATES * (mp_maxid + 1));
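
To make the arithmetic concrete, here is a minimal userland sketch of the 
size calculation in that return statement.  It is not kernel code: the 
helper name cp_times_nlongs is made up for illustration, and CPUSTATES is 
defined locally as 5, matching the 5 values per cpu shown above.

```c
#include <stddef.h>

/* Assumed here: 5 states per cpu, as in the 5 values kern.cp_time prints. */
#define CPUSTATES 5

/*
 * Hypothetical helper: number of longs copied out for a given mp_maxid,
 * following the expression CPUSTATES * (mp_maxid + 1).
 */
static size_t
cp_times_nlongs(int mp_maxid)
{
	return ((size_t)CPUSTATES * (size_t)(mp_maxid + 1));
}
```

With mp_maxid at 31 this gives 160 longs, only 5 of which belong to the 
one real cpu; with mp_maxid at 0 it gives just those 5.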

Consumers of kern.cp_times such as powerd, top, dtrace(?) and others have 
to loop over 32 cpus, all but one of them non-existent.  There also seem 
to be many places in the kernel doing eg: 
for (cpu = 0; cpu <= mp_maxid; cpu++) { 
and while CPU_FOREACH / CPU_ABSENT will skip over the absent cpus, the 
extra iterations seem wasteful at best on the machines least likely to 
have cycles to spare.
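
A userland mock of that loop pattern shows the overhead.  This is only a 
sketch: mock_cpu_absent and walk_cpus are hypothetical names, and the mask 
test is an assumption modelled on a 32-bit all_cpus bitmask, not the 
kernel's actual macro.

```c
#include <stdint.h>

/* Mock of an "is this cpu absent?" test against a 32-bit cpu mask. */
static int
mock_cpu_absent(uint32_t all_cpus, int cpu)
{
	return ((all_cpus & ((uint32_t)1 << cpu)) == 0);
}

/*
 * Walk cpu ids 0..mp_maxid, counting every loop iteration and how many
 * of those iterations land on a cpu that is actually present.
 */
static void
walk_cpus(uint32_t all_cpus, int mp_maxid, int *iterations, int *present)
{
	*iterations = 0;
	*present = 0;
	for (int cpu = 0; cpu <= mp_maxid; cpu++) {
		(*iterations)++;
		if (mock_cpu_absent(all_cpus, cpu))
			continue;	/* skipped, but the test still cost a cycle or two */
		(*present)++;
	}
}
```

Called with all_cpus = 0x1 and mp_maxid = 31, the loop runs 32 times to 
find 1 cpu; with mp_maxid = 0 it runs once.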

eg: powerd parses kern.cp_times to count cpus, wasting cycles adding 
up the 31 'empty' cpus.  I haven't explored other userland consumers.
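
Until the kernel side is fixed, a consumer could trim the surplus slots 
itself.  The following is a rough sketch of one way to do that, not what 
powerd actually does: cp_times_used_cpus is a hypothetical helper, and it 
rests on the assumption that a real cpu always has at least one non-zero 
counter, so trailing all-zero slots can be treated as absent.

```c
#include <stddef.h>

/* Assumed here: 5 states per cpu, as in the sysctl output above. */
#define CPUSTATES 5

/*
 * Hypothetical helper: given the longs read from kern.cp_times, return
 * how many leading cpu slots are worth looking at, dropping trailing
 * slots whose counters are all zero.
 */
static size_t
cp_times_used_cpus(const long *times, size_t nlongs)
{
	size_t ncpus = nlongs / CPUSTATES;

	while (ncpus > 0) {
		const long *slot = &times[(ncpus - 1) * CPUSTATES];
		int busy = 0;

		for (int i = 0; i < CPUSTATES; i++) {
			if (slot[i] != 0)
				busy = 1;
		}
		if (busy)
			break;		/* last slot with data found */
		ncpus--;		/* all-zero slot: treat as absent */
	}
	return (ncpus);
}
```

Only trailing slots are dropped, so an all-zero slot sitting between two 
busy ones is still counted.  Reading hw.ncpu instead would avoid the 
heuristic altogether.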

Clearly kern.smp.maxid (ie mp_maxid) should be 0, not 31.  On i386, 
non-APIC i386 at least, mp_maxid is not set to (mp_ncpus - 1) as on some 
other archs: after having been initialised to (MAXCPU - 1) in 
/sys/i386/i386/mp_machdep.c, it's never updated for non-smp machines.

I haven't chased all of these rabbits down all of their holes by any 
means, but it seems that making /sys/i386/i386/mp_machdep.c do what it 
says it's gonna do ('with an id of 0') should help.  Paste, tabs lost:

int
cpu_mp_probe(void)
{
        /*
         * Always record BSP in CPU map so that the mbuf init code works
         * correctly.
         */
        all_cpus = 1;
        if (mp_ncpus == 0) {
                /*
                 * No CPUs were found, so this must be a UP system.  Setup
                 * the variables to represent a system with a single CPU
                 * with an id of 0.
                 */
                mp_ncpus = 1;
+		mp_maxid = 0;
                return (0);
        }

        /* At least one CPU was found. */
        if (mp_ncpus == 1) {
                /*
                 * One CPU was found, so this must be a UP system with
                 * an I/O APIC.
                 */
+		mp_maxid = 0;
                return (0);
        }

        /* At least two CPUs were found. */
        return (1);
}

Note that the second added line above already exists in 
/sys/amd64/amd64/mp_machdep.c, maybe to fix a similar problem, though 
that should only apply to 'a UP system with an I/O APIC'.  It might be 
better to fix this in cpu_mp_probe's caller, /sys/kern/subr_smp.c:

static void
mp_start(void *dummy)
{
        mtx_init(&smp_ipi_mtx, "smp rendezvous", NULL, MTX_SPIN);

        /* Probe for MP hardware. */
        if (smp_disabled != 0 || cpu_mp_probe() == 0) {
                mp_ncpus = 1;
+		mp_maxid = 0;
                all_cpus = PCPU_GET(cpumask);
                return;
        }

        cpu_mp_start();
        printf("FreeBSD/SMP: Multiprocessor System Detected: %d CPUs\n",
            mp_ncpus);
        cpu_mp_announce();
}

I'm probably a long way off base for a solution, but think I've located 
the problem.  Thoughts?  Is this a known issue?  Might any developers 
actually still have a single-cpu i386 system to check this on? :)

Very happy to test any patches etc.

cheers, Ian


