From: Nate Williams
Date: Mon, 3 Dec 2007 08:18:32 -0700
To: "Arno J. Klaassen"
Cc: Daniel Eischen, java@freebsd.org, nate@yogotech.com, julian@freebsd.org, David Xu
Subject: Re: cvs commit: src/lib/libkse/thread thr_kern.c
Message-ID: <18260.7752.293904.909660@caddis.yogotech.com>
References: <200711301716.lAUHGEV1064334@repoman.freebsd.org> <47536361.8090203@freebsd.org>
List-Id: Porting Java to FreeBSD

[ Java hang ]

> But the only easy way for me to reproduce it is just compiling jacorb
> (www.jacorb.org) on releng_6 (about ten days old) using libthr : after
> a while java hangs (can only be killed by -9) and 'top -H' shows three
> threads each taking 70-90% CPU-time.
>
> If I take a 'gcore' snapshot of it (dunno how trustful that is)
> it shows all threads in _thr_umtx_wait() (script-log attached).
>
> But :
>
> - only 2x2 smp-amd64 releng_6, 1x2 smp goes OK
> - only easy to produce when using optimized VM (I'll retry
>   harder to produce a hang with java_g)
> - no prob on releng_7 (2x2 smp included) for this test
>
> This is thin, but all I have for now ...

Obviously this isn't necessarily the case, but more often than not,
hangs in the VM on multiple CPUs are related to bugs in the Java code.
Rarely do developers write code that is multi-thread safe, and just
because code works fine on one platform doesn't mean that the code is
truly multi-thread safe.

We had code that was originally developed under FreeBSD, but used code
from multiple vendors. We found problems in the code that were
triggered by our usage that needed to get fixed.
Then, when we migrated the code to Windows, we found more bugs (both in
our code and other code). Finally, we found another set of bugs when we
ran the code under Solaris, and yet another class of bugs when we
switched thread models under Solaris (Solaris used to give you the
option of using different threading models for Java).

That may not be the case here, but just because something is in ports
doesn't necessarily mean it is bug free. It's certainly possible that
FreeBSD's SMP libraries may now expose incorrect assumptions that the
author of the Java code never considered, which will cause deadlocks
under a different threading model.

Java's threading model allows one to write bug-free code, as the model
is very simple, but you have to be *very* careful with it. When you
are, performance can suffer, so developers often take short-cuts that
don't appear to negatively affect their code during development on
their platform of choice.

Nate
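
To illustrate the kind of short-cut described above, here is a minimal
sketch. It is not taken from jacorb or from the code discussed in this
thread; the class and method names (CounterSketch, UnsafeCounter,
SafeCounter, hammer) are hypothetical. It shows a lost-update race
rather than a hang, because a race of this kind is easier to demonstrate
in a few lines. The unsynchronized counter is always correct with one
thread and may or may not lose updates with several, depending on the
platform, the number of CPUs, and the threading library.

```java
public class CounterSketch {
    interface Counter {
        void increment();
        int get();
    }

    // The short-cut: no locking. count++ is a read-modify-write sequence,
    // so two threads can read the same value and one update gets lost.
    static final class UnsafeCounter implements Counter {
        private int count;
        public void increment() { count++; }
        public int get() { return count; }
    }

    // The careful version: every access goes through the object's monitor.
    // Slower under contention, but correct on any number of CPUs.
    static final class SafeCounter implements Counter {
        private int count;
        public synchronized void increment() { count++; }
        public synchronized int get() { return count; }
    }

    // Start `threads` workers that each call increment() `perThread` times,
    // wait for all of them to finish, and return the final count.
    static int hammer(Counter counter, int threads, int perThread)
            throws InterruptedException {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Always 400000.
        System.out.println("safe:   " + hammer(new SafeCounter(), 4, 100_000));
        // May print less than 400000 on a multi-CPU machine; the exact
        // value varies from run to run and from platform to platform.
        System.out.println("unsafe: " + hammer(new UnsafeCounter(), 4, 100_000));
    }
}
```

A test run that happens to print 400000 for the unsafe counter says
nothing about whether the code is correct; it only says the race was not
hit on that machine during that run.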