From: Tim Kientzle
To: chump1@hushmail.com
Cc: "freebsd-hackers@freebsd.org"
Subject: Re: pthread basics and contention
Date: Fri, 3 Jan 2014 22:41:15 -0800
Message-Id: <4EFEA29F-4D6E-4B4A-8C26-E15FA62B574C@kientzle.com>
References: <20140103051707.C11B6200F5@smtp.hushmail.com>

Depending on the calculation involved, you may be memory bus constrained, not CPU constrained.

On modern processors, it is often the case that it takes longer to get the data to/from memory than to actually compute anything. In those cases, splitting your work into threads just gives you more CPUs waiting on the same slow memory.

Deciding whether this is the issue or not requires good processor-level profiling tools.

Tim

On Fri, Jan 3, 2014 at 12:17 AM, wrote:
>
> I have a fairly simple task that involves processing something in a 2D array, MxN times. I took a naive approach, 1x process 1x thread, and it took a little longer than desired. Well now, I could do better with some multi processing, especially on a multi core box, right?
>
> Well, I have not had much luck. At first I spawned M threads and had each iterate over each N in turn, with M between 25-35. It took much, much longer than the single thread. I figured contention and overhead were costing me big, and gave it a shot with a scaled down version of the problem, M=10.
> Still, much slower than the single thread. A little confused, I went back to the big problem set (25-35), and made a new program that spawned only two threads, and each is limited to processing only even or only odd data sets. Even that still takes twice as long as the single thread version! What is up with that?
>
> More important asides, I am barely doing any real processing at all. It is basically a no-op, barely doing more than incrementing the counter. Should I expect to see performance gains once I am doing real work in the processing portion of my program? Should I expect to see much different behavior on a different OS? Also I have one physical processor, two cores. Would I see better gains with more cores? How do you find processes and threads scale against hardware overall?
>
> Thanks!
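
For reference, the two-thread even/odd split described in the quoted message might look roughly like the sketch below. This is not the original poster's program: the array dimensions, the fill pattern, and the names grid, struct worker, sum_rows, fill_grid, sum_single, and sum_threaded are all assumptions made for illustration. Each worker keeps its running total in a local variable and stores it once at the end, so the threads do not write shared state inside the loop. With this little work per element, thread startup and memory traffic can still outweigh any gain, which is the situation the reply describes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define ROWS 32      /* hypothetical M, in the 25-35 range mentioned */
#define COLS 4096    /* hypothetical N */
#define NTHREADS 2   /* one worker per core on a two-core machine */

static int grid[ROWS][COLS];

struct worker {
    int first_row;   /* first row this worker handles */
    int stride;      /* step between rows; NTHREADS gives an even/odd split */
    uint64_t sum;    /* result, written once when the worker finishes */
};

/* Thread body: sum every stride-th row, starting at first_row. */
static void *sum_rows(void *arg)
{
    struct worker *w = arg;
    uint64_t local = 0;  /* accumulate locally, not in shared memory */

    for (int r = w->first_row; r < ROWS; r += w->stride)
        for (int c = 0; c < COLS; c++)
            local += (uint64_t)grid[r][c];

    w->sum = local;
    return NULL;
}

/* Fill the array with an arbitrary, non-negative test pattern. */
void fill_grid(void)
{
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            grid[r][c] = r + c;
}

/* Baseline: the same loop run in the calling thread over all rows. */
uint64_t sum_single(void)
{
    struct worker w = { .first_row = 0, .stride = 1, .sum = 0 };
    sum_rows(&w);
    return w.sum;
}

/* Even/odd split across NTHREADS workers. Returns 0 on success,
 * or the pthread_create error code on failure. */
int sum_threaded(uint64_t *out)
{
    pthread_t tid[NTHREADS];
    struct worker w[NTHREADS];
    uint64_t total = 0;
    int started = 0;
    int rc = 0;

    assert(out != NULL);

    for (int i = 0; i < NTHREADS; i++) {
        w[i] = (struct worker){ .first_row = i, .stride = NTHREADS, .sum = 0 };
        rc = pthread_create(&tid[i], NULL, sum_rows, &w[i]);
        if (rc != 0)
            break;
        started++;
    }

    /* Join only the threads that were actually started. */
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        total += w[i].sum;
    }

    if (rc != 0)
        return rc;
    *out = total;
    return 0;
}
```

To try it, call fill_grid() once, then time sum_single() against sum_threaded() from a small main() and build with cc -pthread. Whether the threaded version wins depends on how much work is done per element and on memory bandwidth, so measure rather than assume.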