Date: Sun, 7 Oct 2007 09:21:53 -0700
From: Alfred Perlstein <alfred@freebsd.org>
To: Ivan Voras <ivoras@freebsd.org>
Cc: freebsd-threads@freebsd.org
Subject: Re: Unexpected threading performance result
Message-ID: <20071007162153.GX31826@elvis.mu.org>
In-Reply-To: <fearqk$sot$1@sea.gmane.org>
References: <fearqk$sot$1@sea.gmane.org>

* Ivan Voras <ivoras@freebsd.org> [071007 07:52] wrote:
> Hi,
>
> For an unrelated purpose, I'm benchmarking performance of tree
> algorithms in SMP environments and my preliminary run has an unexpected
> result. Here's the typical output from the (small) benchmark program,
> run on a dual-core Athlon64 (i386 mode):
>
> Running benchmarks on small_nonuniform, 1000000 samples
> Step 1: Running 100 loops
> ** Step 1 benchmark completed 100 loops in 84.44 seconds.
> Step 2: Running 2 threads with 100 loops each
> ** Step 2 benchmark completed 100 loops in 2 threads in 167.46 seconds.
>
> The interpretation is: running the same loop twice, in two parallel
> threads doesn't result in a speedup, but it looks like the execution is
> serialized. The problem is: the loops are completely independent, no
> locking in their execution, and 'top' confirms that both threads in the
> program are running at 100% CPU each.

...

1) Could you please make this example program compile and work? It's
   difficult to diagnose the problem if we don't know whether things
   like n_samples are initialised properly.

2) Please try calling pthread_attr_setscope() with PTHREAD_SCOPE_SYSTEM
   on a pthread_attr_t object and passing that object to
   pthread_create(). That may help.

3) What's the deal with RB_FIND? What does that do? Is that data
   structure locked with an exclusive lock?

>
> I verified this behaviour on:
>
> - 7-CURRENT, i386, ULE
> - 7-CURRENT, i386, 4BSD
> - 6-STABLE, amd64, 4BSD
>
> I can't really explain this behaviour, but it might not be related to
> FreeBSD - maybe I made a mistake in the program or there's a
> hardware-related reason for it (maybe CPU cache thrashing from the tree
> traversal?). In any case, can someone shed some light on this?
>
> The main part of the (small) program is pasted below.
>
>
> double time_start, time_b1, time_b2;
> int n_data, n_samples;
> int *data, *samples;
>
>
> void bench_loop()
> {
>     int i;
>     struct node *nd, find;
>     for (i = 0; i < n_samples; i++) {
>         find.data = samples[i];
>         nd = RB_FIND(node_tree, &head, &find);
>         if (nd == NULL)
>             errx(1, "Sample %d was not found", find.data);
>     }
> }
>
> void step1()
> {
>     int n;
>     /* step 1 - simple tree traversal */
>     printf("Step 1: Running %d loops\n", STEP1_ITER);
>     for (n = 0; n < STEP1_ITER; n++)
>         bench_loop();
>     time_b1 = gettime();
>     printf("** Step 1 benchmark completed %d loops in %0.2lf seconds.\n",
>         STEP1_ITER, time_b1 - time_start);
> }
>
> void *step2_thread(void *arg) {
>     int n;
>     for (n = 0; n < STEP2_ITER; n++)
>         bench_loop();
>     return NULL;
> }
>
> void step2()
> {
>     /* step 2 - run tree traversal in parallel threads */
>     int n;
>     pthread_t threads[STEP2_THREADS];
>
>     printf("Step 2: Running %d threads with %d loops each\n",
>         STEP2_THREADS, STEP2_ITER);
>     for (n = 0; n < STEP2_THREADS; n++) {
>         if (pthread_create(&threads[n], NULL, step2_thread, NULL) != 0)
>             err(1, "Cannot spawn thread");
>     }
>     for (n = 0; n < STEP2_THREADS; n++)
>         pthread_join(threads[n], NULL);
>     time_b2 = gettime();
>     printf("** Step 2 benchmark completed %d loops in %d threads in %0.2lf seconds.\n",
>         STEP2_ITER, STEP2_THREADS, time_b2 - time_start);
> }
>
> _______________________________________________
> freebsd-threads@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-threads
> To unsubscribe, send any mail to "freebsd-threads-unsubscribe@freebsd.org"

-- 
- Alfred Perlstein
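
For reference, suggestion 2) could look roughly like the sketch below. It is
only an illustration, not the original program: worker(), now_seconds(),
run_system_scope_threads() and N_THREADS are hypothetical names, and the
worker is a stand-in for bench_loop() that does no tree lookups. The sketch
also times the threaded step from its own start. In the pasted excerpt both
steps subtract the same time_start, so if time_start is not reset in the part
of the program that was not posted, the Step 2 figure would include the time
spent in Step 1. That may be worth checking.

```c
#define _POSIX_C_SOURCE 200809L

#include <pthread.h>
#include <time.h>

#define N_THREADS 2	/* hypothetical, mirrors STEP2_THREADS */

/* Hypothetical stand-in for bench_loop(): independent CPU-bound work. */
static void *worker(void *arg)
{
	volatile unsigned long *sum = arg;
	unsigned long i;

	for (i = 0; i < 1000000UL; i++)
		*sum += i;
	return NULL;
}

/* Monotonic wall-clock time in seconds. */
static double now_seconds(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Run N_THREADS workers created with system contention scope.
 * Returns 0 on success or a pthread error code. *elapsed receives the
 * duration of this step only, measured from its own start.
 */
static int run_system_scope_threads(double *elapsed)
{
	pthread_t threads[N_THREADS];
	unsigned long sums[N_THREADS] = {0};
	pthread_attr_t attr;
	double start;
	int n, started = 0, rc;

	rc = pthread_attr_init(&attr);
	if (rc != 0)
		return rc;
	rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
	if (rc != 0) {
		pthread_attr_destroy(&attr);
		return rc;
	}

	start = now_seconds();
	for (n = 0; n < N_THREADS; n++) {
		/* Pass the attribute object instead of NULL. */
		rc = pthread_create(&threads[n], &attr, worker, &sums[n]);
		if (rc != 0)
			break;
		started++;
	}
	/* Join only the threads that were actually created. */
	for (n = 0; n < started; n++)
		pthread_join(threads[n], NULL);
	*elapsed = now_seconds() - start;

	pthread_attr_destroy(&attr);
	return rc;
}
```

To try it, call run_system_scope_threads(&elapsed) from main(), print elapsed,
and build with something like "cc -pthread". Whether the scope attribute
changes anything depends on the threading library in use.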