From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 17:27:55 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0CD6516A401 for ; Thu, 15 Mar 2007 17:27:55 +0000 (UTC) (envelope-from max@love2party.net) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.188]) by mx1.freebsd.org (Postfix) with ESMTP id 7559E13C458 for ; Thu, 15 Mar 2007 17:27:54 +0000 (UTC) (envelope-from max@love2party.net) Received: from [88.64.186.120] (helo=amd64.laiers.local) by mrelayeu.kundenserver.de (node=mrelayeu8) with ESMTP (Nemesis), id 0ML31I-1HRtjw0TZu-0000eb; Thu, 15 Mar 2007 18:27:51 +0100 From: Max Laier Organization: FreeBSD To: freebsd-arch@freebsd.org Date: Thu, 15 Mar 2007 18:27:32 +0100 User-Agent: KMail/1.9.5 References: <45F906ED.8070100@aueb.gr> In-Reply-To: <45F906ED.8070100@aueb.gr> X-Face: ,,8R(x[kmU]tKN@>gtH1yQE4aslGdu+2]; R]*pL,U>^H?)gW@49@wdJ`H<=?utf-8?q?=25=7D*=5FBD=0A=09U=5For=3D=5CmOZf764=26nYj=3DJYbR1PW0ud?=>|!~,,CPC.1-D$FG@0h3#'5"k{V]a~.<=?utf-8?q?mZ=7D44=23Se=7Em=0A=09Fe=7E=5C=5DX5B=5D=5Fxj?=(ykz9QKMw_l0C2AQ]}Ym8)fU MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart5645397.FyxE2rQ9mp"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200703151827.39963.max@love2party.net> X-Provags-ID: V01U2FsdGVkX1+lropN8exSTmzfKWjeHkRJZMf83DcCdPzLYTe C5UWS1kvJsPZmiRJ/UrEMzCAz8Tte7QCzs6hX3oWbCdJmYL0EL VC2uibCBz+9YbsWR0jDqQ== Cc: Diomidis Spinellis , freebsd-threads@freebsd.org Subject: Re: Multithreaded qsort(3) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 17:27:55 -0000 --nextPart5645397.FyxE2rQ9mp Content-Type: text/plain; charset="iso-8859-7" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline On Thursday 15 March 2007 09:42, Diomidis Spinellis wrote: > Greetings, > > I've added multhread support to our qsort implementation. You can see > the code at and some > performance figures at the end of this email. I propose to add it to > our libc. I need your opinions on the interface, and also any comments > you may have on the code. I see the following design dimensions: > > 1. Naming and availability > -------------------------- > - Option 1.1: Replace qsort with this implementation > * Pros: Programs gain performance without any source code change; > abstraction (number of processors/cores should be invisible, just as > the CPU architecture or disk drive interface) > * Cons: May confuse multithreaded programs; programs require linking > with pthreads library; programs whose compare function is not thread > safe will break > > - Option 1.2: Name it qsort_mt > * Pros: POLA; programs can tune the call according to their need > * Cons: Programs require source code changes; we violate abstraction; > namespace polution > > My proposal: option 1.2: name it qsort_mt and make it available in a > separate library (libc_mt). I'd suggest to name it qsort() and put it in a separate library (not=20 necessarily named libc_mt, as I don't believe there are that many=20 functions in libc, that can actually gain from multithreading). This approach, would allow LD_PRELOAD'ing the _mt version without the need= =20 to recompile. That said, I'm not sure this really belongs in the base system. As=20 qsort_mt it's way too obscure and unportable for any real application to=20 use it. As a replacement for qsort it won't work (as you pointed out=20 yourself). I do think that we need efforts like this, but they should be=20 made outside of FreeBSD, otherwise they won't be much useful in general. > 2. Interface > ------------ > - Option 2.1: qsort compatible > * Pros: Drop in replacement; doesn't need programmer understanding; > compatible with option 1.1 > * Cons: Can't be tuned for a specific program or workload > > - Option 2.2: Add nthreads parameter, specifying the maximum number of > threads > * Pros: Multithreaded programs can tune load balancing; all programs > can optimise nthreads according to the workload characteristics. > * Cons: Libc routines are generaly expected to Do he Right Thing, > without handholding; must agree on mechanism for determining the number > of threads; the benefits of tuning are unlikely to be dramatic. > > - Option 2.3: Add forkelem, specifying minimum number of elements for a > new thread (below forkelem, I use single-threaded qsort). > * Pros: programs can optimise nthreads according to the workload > characteristics. > * Cons: Libc routines are generaly expected to Do he Right Thing, > without handholding; the benefits of tuning are unlikely to be > dramatic. > > - Option 2.4: Have the library tune itsself for a given system or > individual programs by measuring different values > * Pros: higher performance when the tuned values are reused > * Cons: Lack of prior art; performance drop when determining tuning > values; security considerations if values are cached; complexity > > (Options 2.2 and 2.3 are orthogonal; options 2.1 and 2.4 are > orthogonal) > > My proposal: go for 2.1, based on the way C library routines work. > > 3. Determining the number of threads > ------------------------------------ > - Option 3.1: NPROCS environment variable > * Pros: prior art: CrayOS, BSD/OS; users can tune it. > * Cons: May be too blunt; requires setting by user or administrator > > - Option 3.[234]: hw.ncpu or kern.smp.cpus sysctl, or pmc_ncpu() > * Pros: automatic, right for the underlying hardware > * Cons: even blunter > > - Option 3.5: Add library interface for the above as int nthreads_mt() > * Pros: we abstract policy; programs can replace it; reusability; > extensibility. > * Cons: namespace polution; it is not clear that a single number is > correct for all uses of a program. > > (All the above options are orthogonal) > > My proposal: Add library interface using NPROCS and falling back to a > sysctl variable. This interface can be extended in the future. > > Performance figures > ------------------- > Reported times are elapsed, user, and system seconds. > > $ uname -a > FreeBSD sledge.freebsd.org 7.0-CURRENT FreeBSD 7.0-CURRENT #924: Wed > Mar 14 14:25:07 UTC 2007 > root@sledge.freebsd.org:/h/src/sys/amd64/compile/SLEDGE amd64 > > $ qsort_mt -? > qsort_mt: option requires an argument -- h > usage: qsort_mt [-stv] [-f forkelements] [-h threads] [-n elements] > -l Run the libc version of qsort > -s Test with 20-byte strings, instead of integers > -t Print timing results > -v Verify the integer results > Defaults are 1e7 elements, 2 threads, 100 fork elements > > > $ gcc -DTEST -O qsort_mt.c -o qsort_mt -lthr -lpmc > > $ qsort_mt -t -h 2 -n 100000000 -l # Integers; qsort(3) > 46.5 46.2 0.314 > $ ./qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt > 27.5 46.3 0.301 > $ qsort_mt -t -h 2 -n 10000000 -s -l # Strings; qsort(3) > 41.3 40.1 1.2 > $ ./qsort_mt -t -h 2 -n 10000000 -s # Strings; qsort_mt > 26.7 40.1 1.16 > > $ gcc -DTEST -O qsort_mt.c -o qsort_mt -lpthread -lpmc > $ qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt > 28 47.1 0.368 > $ qsort_mt -t -h 2 -n 10000000 -s # Strings; qsort_mt > 27.1 40.6 1.06 > > > Thanks, > > Diomidis - dds@ > http://www.spinellis.gr > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" =2D-=20 /"\ Best regards, | mlaier@freebsd.org \ / Max Laier | ICQ #67774661 X http://pf4freebsd.love2party.net/ | mlaier@EFnet / \ ASCII Ribbon Campaign | Against HTML Mail and News --nextPart5645397.FyxE2rQ9mp Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (FreeBSD) iD8DBQBF+YILXyyEoT62BG0RAju3AJ4vEppT5JGbBZYcyhTi/oaeTvkv2QCfV8PG zAnh7BCCQi8z2Ag7I2OAeJI= =CehS -----END PGP SIGNATURE----- --nextPart5645397.FyxE2rQ9mp--