From: Alfred Perlstein
To: Ivan Voras
Cc: freebsd-threads@freebsd.org
Date: Sun, 7 Oct 2007 09:21:53 -0700
Subject: Re: Unexpected threading performance result
Message-ID: <20071007162153.GX31826@elvis.mu.org>

* Ivan Voras [071007 07:52] wrote:
> Hi,
>
> For an unrelated purpose, I'm benchmarking performance of tree
> algorithms in SMP environments and my preliminary run has an unexpected
> result. Here's the typical output from the (small) benchmark program,
> run on a dual-core Athlon64 (i386 mode):
>
> Running benchmarks on small_nonuniform, 1000000 samples
> Step 1: Running 100 loops
> ** Step 1 benchmark completed 100 loops in 84.44 seconds.
> Step 2: Running 2 threads with 100 loops each
> ** Step 2 benchmark completed 100 loops in 2 threads in 167.46 seconds.
>
> The interpretation is: running the same loop twice, in two parallel
> threads doesn't result in a speedup, but it looks like the execution is
> serialized.
> The problem is: the loops are completely independent, no
> locking in their execution, and 'top' confirms that both threads in the
> program are running at 100% CPU each.

...

1) Could you please make this example program compile and work? It's
   difficult to diagnose the problem if we don't know whether things
   like n_samples are initialised properly.

2) Please try calling pthread_attr_setscope() with PTHREAD_SCOPE_SYSTEM
   on a pthread_attr_t object and passing that object into
   pthread_create(). That may help.

3) What's the deal with RB_FIND? What does it do? Is that data
   structure locked with an exclusive lock?

>
> I verified this behaviour on:
>
> - 7-CURRENT, i386, ULE
> - 7-CURRENT, i386, 4BSD
> - 6-STABLE, amd64, 4BSD
>
> I can't really explain this behaviour, but it might not be related to
> FreeBSD - maybe I made a mistake in the program or there's a
> hardware-related reason for it (maybe CPU cache trashing from the tree
> traversal?). In any case, can someone shed some light on this?
>
> The main part of the (small) program is pasted below.
>
>
> double time_start, time_b1, time_b2;
> int n_data, n_samples;
> int *data, *samples;
>
>
> void bench_loop()
> {
>     int i;
>     struct node *nd, find;
>     for (i = 0; i < n_samples; i++) {
>         find.data = samples[i];
>         nd = RB_FIND(node_tree, &head, &find);
>         if (nd == NULL)
>             errx(1, "Sample %d was not found", find.data);
>     }
> }
>
> void step1()
> {
>     int n;
>     /* step 1 - simple tree traversal */
>     printf("Step 1: Running %d loops\n", STEP1_ITER);
>     for (n = 0; n < STEP1_ITER; n++)
>         bench_loop();
>     time_b1 = gettime();
>     printf("** Step 1 benchmark completed %d loops in %0.2lf seconds.\n",
>         STEP1_ITER, time_b1 - time_start);
> }
>
> void *step2_thread(void *arg) {
>     int n;
>     for (n = 0; n < STEP2_ITER; n++)
>         bench_loop();
>     return NULL;
> }
>
> void step2()
> {
>     /* step 2 - run tree traversal in parallel threads */
>     int n;
>     pthread_t threads[STEP2_THREADS];
>
>     printf("Step 2: Running %d threads with %d loops each\n",
>         STEP2_THREADS, STEP2_ITER);
>     for (n = 0; n < STEP2_THREADS; n++) {
>         if (pthread_create(&threads[n], NULL, step2_thread, NULL) != 0)
>             err(1, "Cannot spawn thread");
>     }
>     for (n = 0; n < STEP2_THREADS; n++)
>         pthread_join(threads[n], NULL);
>     time_b2 = gettime();
>     printf("** Step 2 benchmark completed %d loops in %d threads in %0.2lf seconds.\n",
>         STEP2_ITER, STEP2_THREADS, time_b2 - time_start);
> }

-- 
- Alfred Perlstein
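
P.S. For point 2, here is a minimal sketch of what requesting system
scope could look like. It is not taken from the benchmark: the names
worker(), run_system_scope_threads() and N_THREADS are hypothetical
stand-ins for step2_thread(), step2() and STEP2_THREADS, and the worker
body is only a placeholder for the real loop. Whether system scope
changes the timing at all depends on the threading library in use.

```c
#include <pthread.h>

#define N_THREADS 2	/* hypothetical; stands in for STEP2_THREADS */

/* Placeholder worker: the real one would call bench_loop() repeatedly. */
static void *
worker(void *arg)
{
	int *ran = arg;

	*ran = 1;	/* record that this thread actually ran */
	return (NULL);
}

/*
 * Create N_THREADS threads with PTHREAD_SCOPE_SYSTEM requested through
 * a pthread_attr_t, then join them. Returns 0 on success, otherwise the
 * error number reported by the failing pthread call.
 */
int
run_system_scope_threads(int ran[N_THREADS])
{
	pthread_attr_t attr;
	pthread_t threads[N_THREADS];
	int i, rc, started = 0;

	rc = pthread_attr_init(&attr);
	if (rc != 0)
		return (rc);

	/* Ask for each thread to be scheduled system-wide. */
	rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
	if (rc != 0) {
		pthread_attr_destroy(&attr);
		return (rc);
	}

	for (i = 0; i < N_THREADS; i++) {
		rc = pthread_create(&threads[i], &attr, worker, &ran[i]);
		if (rc != 0)
			break;
		started++;
	}
	pthread_attr_destroy(&attr);

	/* Join only the threads that were actually started. */
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	return (rc);
}
```

In the posted program the only change needed would be to set up such an
attribute object in step2() and pass its address as the second argument
of pthread_create() instead of NULL.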