From: Matthew Fleming <mdf356@gmail.com>
To: freebsd-current@freebsd.org
Cc: sbruno@freebsd.org, "Robert N. M. Watson", Joshua Neal
Subject: Re: MAXCPU preparations
Date: Wed, 29 Sep 2010 14:14:08 +0000
List-Id: Discussions about the use of FreeBSD-current

On Wed, Sep 29, 2010 at 1:54 PM, Robert N. M. Watson wrote:
>
> On 29 Sep 2010, at 12:49, John Baldwin wrote:
>
>> On Tuesday, September 28, 2010 6:24:32 pm Robert N. M. Watson wrote:
>>>
>>> On 28 Sep 2010, at 19:40, Sean Bruno wrote:
>>>
>>>>> If you go fully dynamic you should use mp_maxid + 1 rather than maxcpus.
>>>>
>>>> I assume that mp_maxid is the new kern.smp.maxcpus?  Can you inject some
>>>> history here so I can understand why one is "better" than the other?
>>>
>>> So, unlike maxcpus, mp_maxid is in theory susceptible to races in a brave
>>> new world in which we support hotplug CPUs -- a brave new world we're not
>>> yet ready for, however. If you do use mp_maxid, be aware you need to add
>>> one to get the size of the array -- maxcpus is the number of CPUs that can
>>> be used, whereas mp_maxid is the highest CPU ID (counting from 0) that is
>>> used.
>>
>> Hmm, I'm not sure that is true.  My feeling on mp_maxid is that it is the
>> largest supported CPU ID.  Platforms that support hotplug would need to set
>> mp_maxid such that it has room for hotpluggable CPUs.  You don't want to
>> go reallocate the UMA data structures for every zone when a CPU is
>> hotplugged, for example.
>
> Yes, we'll have to break one (or even both) of two current assumptions with the move to hotplug: contiguous in-use CPU IDs and mp_maxid representing the greatest possible CPU ID as a constant value. The former is guaranteed, but I wonder about the latter.
> On face value, you might say "oh, we know how many sockets there are", but of course, we don't know how many threads will arrive when a package is inserted.  For many subsystems, DPCPU will present a more than adequate answer for avoiding resizing, although not for low-level systems (perhaps such as UMA?). Likewise, on virtual machine platforms where hotplug actually might reflect a longer-term scheduling choice by the admin/hypervisor (i.e., resource partitioning), it may be harder to reason about what the future holds.
>

As a historical note, when AIX added hotplug CPU support, we kept the MAXCPU define as the largest number of CPUs supported by the OS image. At the time it was 256; as it shifted to 1024, there was a large cleanup effort to eliminate as many statically sized arrays as possible. AIX also has an mp_maxid equivalent, which changed when new, higher core numbers came into use. For various reasons, new CPUs were added only while a single CPU was running, so any loop that looked like for (i = 0; i <= mp_maxid; i++) could get interrupted by an IPI (the handler spun until the core was added by the coordinating CPU) and find that it had a stale mp_maxid value. This was not a bug in 99% of the uses of the code, since whatever the loop was doing with each CPU was either not atomic (e.g. summing per-CPU counters) or was some kind of initialization work that was also done by the new CPU before it was fully awake. I don't necessarily recommend this as an architected plan for adding CPUs, but I wanted to offer the historical note that it can work.

Also, CPU IDs may not be contiguous with hotplug. Even if they are contiguous at boot, there is no reason to assume CPUs are offlined from the highest ID down. For reasons of cache sharing, the best performance may be obtained by picking a specific CPU to offline. It may also make sense to reserve CPU IDs so that, e.g., CPUs N*T through (N+1)*T - 1 are all HMT threads on the same core N (for T-way HMT).
In this case CPU IDs become sparse if HMT is completely disabled, and the best performance for the box overall is probably obtained by offlining T3 and T2 for each core while keeping T0 and T1 active. But discontiguous IDs shouldn't be a problem, as all the code I've seen checks CPU_ABSENT(i) in the loop.

Cheers,
matthew