From owner-freebsd-arch@FreeBSD.ORG Fri Mar 14 05:58:23 2008 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 13B791065676 for ; Fri, 14 Mar 2008 05:58:23 +0000 (UTC) (envelope-from joseph.koshy@gmail.com) Received: from fg-out-1718.google.com (fg-out-1718.google.com [72.14.220.156]) by mx1.freebsd.org (Postfix) with ESMTP id 8CA9B8FC25 for ; Fri, 14 Mar 2008 05:58:22 +0000 (UTC) (envelope-from joseph.koshy@gmail.com) Received: by fg-out-1718.google.com with SMTP id 16so3271767fgg.35 for ; Thu, 13 Mar 2008 22:58:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=ba2RW5BQ+x3ZpKBwn6UASgrnMPMgo60kJVs+Ojjo2Lg=; b=caT+5axDR36Znl8/hXU6ZpvYxlxikwT0gCmJ8VTDQZym9squ++4unFfDXTZ4zMhXsPWK9PrTDj6RhZ5QNYfdKYWhFph2F+vQa62R46GpEYCOAyZHp2heK0LSSOuPI6Cu5zIxClirFL/lmyiONIbY9GGbpFKPPAKB5gxXMMj2uPg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=DpvGQU5I4tXIOvC2YQg38dLo9oK3zI9TIHkKpr3w0TgacKTPSAYH6vHcfptbUtwDhQ35QaSNSPE0hHzYaX8Jy0Nvco58LAz864RL37et7edRfdzmrLu/iuF86O7OTme13hq11ky3vZ7nTYzuuTHULUSFUsTSUnJo4gwet5yThrA= Received: by 10.86.68.20 with SMTP id q20mr2306814fga.59.1205472763829; Thu, 13 Mar 2008 22:32:43 -0700 (PDT) Received: by 10.86.99.18 with HTTP; Thu, 13 Mar 2008 22:32:43 -0700 (PDT) Message-ID: <84dead720803132232k15c3aad7pe59875f0c84e0c27@mail.gmail.com> Date: Fri, 14 Mar 2008 11:02:43 +0530 From: "Joseph Koshy" To: "John Baldwin" In-Reply-To: <200803131516.12284.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <20080313180805.GA83406@dragon.NUXI.org> <200803131516.12284.jhb@freebsd.org> Cc: freebsd-arch@freebsd.org Subject: Re: [PATCH] hwpmc(4) changes to use 'mp_maxid' instead of 'mp_ncpus'. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 14 Mar 2008 05:58:23 -0000 On Fri, Mar 14, 2008 at 12:46 AM, John Baldwin wrote: > On Thursday 13 March 2008 02:08:05 pm David O'Brien wrote: > > Hi folks, > > Some folks at Juniper have submitted these changes to hwpmc(4). > > I am sending them here for public review. > > > > Their thoughts are: > > The mp_ncpus refers to the count of the active CPU's. Where as > > mp_maxid refers to the count of all the cpus on the SMP. Using > > mp_ncpus in the cpu_id range-check of hwpmc module would lead to the > > assumption that all the active CPU's in the SMP are not interleaved. > > But for running on some platforms, the active and inactive cpus could > > be interleaved making hwpmc not work for the cpus whose cpu_id is > > greater than the active-cpu count. jhb> This is correct, but you need to handle CPUs that are absent. It might be jhb> sufficient to update pmc_cpu_is_disabled() in kern_pmc.c to check jhb> CPU_ABSENT(cpu) and claim the CPU is disabled if it is absent, but I'm not jhb> sure that will catch everything as that seems aimed at handling having a jhb> non-absent CPU halted (such as disabling HTT on i386). That is inline with the feedback (and sample patch to kern_pmc.c) that I had sent in to O'Brien. But there are other problems with the patch at various levels, probably not obvious to someone who is just looking at the kernel code. First, the relevance. My understanding is that these changes are for a proprietary SMP platform that uses non-mainstream (Tier3 or Tier4) CPUs. It so happens that Juniper decided to numbers CPUs 'sparsely' in their kernel variant and that is the motivation for this patch. IMO, as a policy, code changes for exotic hardware need to be maintained by vendors of said exotic hardware and not dumped on volunteers. Second, when I designed the PMCTools API I didn't consider that CPU numbers could be 'sparse'. [They aren't sparsely allocated on the i386/amd64---the code I looked at when I was designing PmcTools.] So there are assumptions sprinkled throughout userland that that the integers 0..hw.ncpus can select a valid CPU. While all that can be tracked down and changed, and documentation updated, it is still work that I would prefer to defer until there is a chance that someone in the general public can use it. I do need to prioritize how I spend my volunteer hours. Third, IFF we as a project are going to support 'sparse CPU numbering, I would like to see the form that takes before making changes to HWPMC and tools. For example: - How will userland and in-kernel modules find out which CPUs are physically present? Would there be a bitmask on the lines of today's machdep.hlt_cpus that we could query? Could we make the 'all_cpus' bitmask visible to userland? What happens when we start supporting systems with more than 32 processors? - Will sysctl hw.ncpus represent the count of present CPUs or will it represent the maximum CPU id? - How will userland distinguish between absent CPUs those that could be temporarily administratively disabled? - Are we going to support 'transient' CPUs [that come and go]? Why would we want sparse CPU numbering otherwise? Nit: 'mp_maxid' appears to be an index, not a count as claimed above. If support for sparse CPU numbering is something useful, I feel the correct sequence should be to discuss it here, add sparse CPU numbering to the base i386/amd64 kernels (say) first and then propagate the feature to auxiliary code like HWPMC and userland. Changing HWPMC and its userland before the base kernel itself changes does not seem to be the right thing to do. Regards, Koshy