From: Huang wen hui
Date: Sun, 15 Jan 2006 00:30:57 +0800
To: java@freebsd.org
Subject: Performance patch for jdk1.5.0/amd64
Message-ID: <43C92741.7000803@gddsn.org.cn>

Hi,

I recently noticed that jdk1.5.0-p2/amd64 is slower than jdk1.5.0-p2/i386 on the same hardware in some situations. In my test case the elapsed time is 15115 ms vs. 3779 ms. Running with the -Xprof option shows that some methods (especially those using a lot of CPU time) run in interpreted mode on amd64 but run as compiled code on i386.

The following patch seems to solve this problem. I do not fully understand the patch; I just backported it from jdk16. But it really does speed up the JDK.

Patched jdk1.5.0 test result:

wfdb2# ~hwh/j2sdk-image/bin/java -Xmx256m -jar TestDatabaseEvtData.jar "2006-01-14 00:00:00" "2006-01-14 00:30:00"
Use DatabaseEvtData.properties as log4j configuration
INFO [main] (DatabaseEvtData.java:501) - DECODE MiniSeed elapse time: 3993 ms.

orig jdk1.5.0 test result:

wfdb2# ~hwh/j2sdk-image.orig/bin/java -Xmx256m -jar TestDatabaseEvtData.jar "2006-01-14 00:00:00" "2006-01-14 00:30:00"
Use DatabaseEvtData.properties as log4j configuration
INFO [main] (DatabaseEvtData.java:501) - DECODE MiniSeed elapse time: 16141 ms.

# cat amd64.ad.patch
--- ../../hotspot/src/cpu/amd64/vm/amd64.ad.orig Sat Jan 14 20:06:02 2006
+++ ../../hotspot/src/cpu/amd64/vm/amd64.ad Sat Jan 14 20:05:37 2006
@@ -6095,6 +6095,18 @@
   ins_pipe(pipe_slow); // XXX
 %}
 
+instruct prefetcht0(memory mem)
+%{
+  match(Prefetch mem);
+  predicate(!VM_Version::has_prefetchw());
+  ins_cost(125);
+
+  format %{ "prefetcht0 $mem\t# prefetch into L1" %}
+  opcode(0x0F, 0x18); /* Opcode 0F 18 /1 */
+  ins_encode(REX_mem(mem), OpcP, OpcS, RM_opc_mem(0x01, mem));
+  ins_pipe(pipe_slow);
+%}
+
 instruct prefetch(memory mem)
 %{
   match(Prefetch mem);

# cat prefetch_bsd_amd64.inline.hpp.patch
--- ../../hotspot/src/os_cpu/bsd_amd64/vm/prefetch_bsd_amd64.inline.hpp.orig Sat Jan 14 23:51:41 2006
+++ ../../hotspot/src/os_cpu/bsd_amd64/vm/prefetch_bsd_amd64.inline.hpp Sat Jan 14 23:52:54 2006
@@ -6,24 +6,14 @@
  * SUN PROPRIETARY/CONFIDENTIAL. Use is subject to license terms.
  */
 
-inline void Prefetch::read(void* loc, intx interval)
+inline void Prefetch::read(void* loc, intx interval)
 {
-  __builtin_prefetch((char*) loc + interval, 0); // prefetcht0 (%rsi, %rdi,1)
+  __asm__ ("prefetcht0 (%0,%1,1)" : : "r" (loc), "r" (interval));
 }
 
 inline void Prefetch::write(void* loc, intx interval)
 {
-  // Force prefetchw. The gcc builtin produces prefetcht0 or prefetchw
-  // depending on command line switches we don't control here.
-  // Use of this method should be gated by VM_Version::has_prefetchw.
-  /*
-   * How do we invoke VM_Version::has_prefetchw here?
-   * Can we do something at compile time instead to remove that overhead?
-   */
-//#ifdef __amd64__
-//  __asm__ ("prefetchw (%0,%1,1)" : : "r" (loc), "r" (interval));
-//#elif __em64t__
+  // Do not use the 3dnow prefetchw instruction. It isn't supported on em64t.
+  // __asm__ ("prefetchw (%0,%1,1)" : : "r" (loc), "r" (interval));
   __asm__ ("prefetcht0 (%0,%1,1)" : : "r" (loc), "r" (interval));
-//#endif
-  // __builtin_prefetch((char*) loc + interval, 1); // prefetcht0/prefetchw (%rsi,%rdi,1)
 }
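
For readers who want to try the inline-asm form outside the HotSpot tree, here is a minimal standalone sketch. It is not part of the patch: the names prefetch_read and sum_with_prefetch and the 256-byte prefetch distance are assumptions made up for illustration. On x86-64 with GCC or Clang it emits prefetcht0 directly, as the patched Prefetch::read does; elsewhere it falls back to the compiler builtin or does nothing.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not HotSpot code): hint that the memory at
// loc + interval will be read soon.
static inline void prefetch_read(const void* loc, std::intptr_t interval) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    // Same instruction form as the patched Prefetch::read: always prefetcht0,
    // regardless of compiler command-line switches.
    __asm__ __volatile__("prefetcht0 (%0,%1,1)" : : "r"(loc), "r"(interval));
#elif defined(__GNUC__) || defined(__clang__)
    // Other targets: let the compiler pick a suitable prefetch instruction.
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(loc) +
                                static_cast<std::uintptr_t>(interval);
    __builtin_prefetch(reinterpret_cast<const void*>(addr), 0);
#else
    (void)loc;
    (void)interval;
#endif
}

// Sum a vector while prefetching a fixed distance ahead of the current element.
// A prefetch is only a hint, so it cannot change the computed result.
static long long sum_with_prefetch(const std::vector<int>& v) {
    const std::intptr_t ahead = 256;  // bytes; arbitrary value for illustration
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        prefetch_read(&v[i], ahead);
        total += v[i];
    }
    return total;
}
```

Whether a prefetch like this helps at all depends on the access pattern and the CPU, so any speedup should be measured rather than assumed.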