Date: Sun, 15 Jan 2006 00:30:57 +0800
From: Huang wen hui <hwh@gddsn.org.cn>
To: java@freebsd.org
Subject: Performance patch for jdk1.5.0/amd64
Message-ID: <43C92741.7000803@gddsn.org.cn>
hi,

I recently noticed that jdk1.5.0-p2/amd64 is slower than jdk1.5.0-p2/i386
on the same hardware in some situations; in my test case the difference is
15115 ms vs 3779 ms. Running with the -Xprof option shows that some methods
(especially ones using a lot of CPU time) run in interpreted mode on amd64,
but as compiled code on i386. The following patch seems to solve this
problem. I do not totally understand this patch; it is just a back-port
from jdk16, but it really does speed up the JDK.

Patched jdk1.5.0 test result:

wfdb2# ~hwh/j2sdk-image/bin/java -Xmx256m -jar TestDatabaseEvtData.jar "2006-01-14 00:00:00" "2006-01-14 00:30:00"
Use DatabaseEvtData.properties as log4j configuration
INFO [main] (DatabaseEvtData.java:501) - DECODE MiniSeed elapse time: 3993 ms.

Original jdk1.5.0 test result:

wfdb2# ~hwh/j2sdk-image.orig/bin/java -Xmx256m -jar TestDatabaseEvtData.jar "2006-01-14 00:00:00" "2006-01-14 00:30:00"
Use DatabaseEvtData.properties as log4j configuration
INFO [main] (DatabaseEvtData.java:501) - DECODE MiniSeed elapse time: 16141 ms.

# cat amd64.ad.patch
--- ../../hotspot/src/cpu/amd64/vm/amd64.ad.orig	Sat Jan 14 20:06:02 2006
+++ ../../hotspot/src/cpu/amd64/vm/amd64.ad	Sat Jan 14 20:05:37 2006
@@ -6095,6 +6095,18 @@
   ins_pipe(pipe_slow); // XXX
 %}
 
+instruct prefetcht0(memory mem)
+%{
+  match(Prefetch mem);
+  predicate(!VM_Version::has_prefetchw());
+  ins_cost(125);
+
+  format %{ "prefetcht0 $mem\t# prefetch into L1" %}
+  opcode(0x0F, 0x18); /* Opcode 0F 18 /1 */
+  ins_encode(REX_mem(mem), OpcP, OpcS, RM_opc_mem(0x01, mem));
+  ins_pipe(pipe_slow);
+%}
+
 instruct prefetch(memory mem)
 %{
   match(Prefetch mem);

# cat prefetch_bsd_amd64.inline.hpp.patch
--- ../../hotspot/src/os_cpu/bsd_amd64/vm/prefetch_bsd_amd64.inline.hpp.orig	Sat Jan 14 23:51:41 2006
+++ ../../hotspot/src/os_cpu/bsd_amd64/vm/prefetch_bsd_amd64.inline.hpp	Sat Jan 14 23:52:54 2006
@@ -6,24 +6,14 @@
  * SUN PROPRIETARY/CONFIDENTIAL. Use is subject to license terms.
 */
 
-inline void Prefetch::read(void* loc, intx interval)
+inline void Prefetch::read(void* loc, intx interval)
 {
-  __builtin_prefetch((char*) loc + interval, 0); // prefetcht0 (%rsi, %rdi,1)
+  __asm__ ("prefetcht0 (%0,%1,1)" : : "r" (loc), "r" (interval));
 }
 
 inline void Prefetch::write(void* loc, intx interval)
 {
-  // Force prefetchw. The gcc builtin produces prefetcht0 or prefetchw
-  // depending on command line switches we don't control here.
-  // Use of this method should be gated by VM_Version::has_prefetchw.
-  /*
-   * How do we invoke VM_Version::has_prefetchw here?
-   * Can we do something at compile time instead to remove that overhead?
-   */
-//#ifdef __amd64__
-//  __asm__ ("prefetchw (%0,%1,1)" : : "r" (loc), "r" (interval));
-//#elif __em64t__
+  // Do not use the 3dnow prefetchw instruction. It isn't supported on em64t.
+  // __asm__ ("prefetchw (%0,%1,1)" : : "r" (loc), "r" (interval));
   __asm__ ("prefetcht0 (%0,%1,1)" : : "r" (loc), "r" (interval));
-//#endif
-  // __builtin_prefetch((char*) loc + interval, 1); // prefetcht0/prefetchw (%rsi,%rdi,1)
 }