From owner-freebsd-ia32@FreeBSD.ORG  Mon Dec 11 16:31:16 2006
Return-Path: <owner-freebsd-ia32@FreeBSD.ORG>
X-Original-To: freebsd-ia32@freebsd.org
Delivered-To: freebsd-ia32@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 2FB0516A47B
	for <freebsd-ia32@freebsd.org>; Mon, 11 Dec 2006 16:31:16 +0000 (UTC)
	(envelope-from ranjith_kumar_b4u@yahoo.com)
Received: from rrr2-v2.mail.re1.yahoo.com (rrr2-v2.mail.re1.yahoo.com
	[66.196.101.127])
	by mx1.FreeBSD.org (Postfix) with SMTP id D16A843CC9
	for <freebsd-ia32@freebsd.org>; Mon, 11 Dec 2006 16:29:50 +0000 (GMT)
	(envelope-from ranjith_kumar_b4u@yahoo.com)
Received: (qmail 20498 invoked from network); 11 Dec 2006 16:05:22 -0000
Received: from web58616.mail.re3.yahoo.com (68.142.236.250)
	by rrr2-v2.mail.re1.yahoo.com with SMTP; 11 Dec 2006 16:05:22 -0000
Received: (qmail 13931 invoked by uid 60001); 11 Dec 2006 16:05:22 -0000
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=X-YMail-OSG:Received:Date:From:Subject:To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID;
	b=5FwI4FntFna8Fm5ubSfs1RjWEmL8oVqLFJ65eMA2ap76Gv2YvwAAFHV9TDjsdIQ2eVnNH8lUyTyPhtnAI+ZvOylu7xgZMSlWPWdl05ScmhZ5XR5oUXsQN0OfTm7ZVS/IihbNwTnEPf9cppMKzsgnbQr5OuhP7Ha3P29EA3nHi+w=;
X-YMail-OSG: EgUdHRsVM1l_HIntkEKJcPKX5C3HV910EgqnyCgX
Received: from [202.68.145.230] by web58616.mail.re3.yahoo.com via HTTP;
	Mon, 11 Dec 2006 08:05:21 PST
Date: Mon, 11 Dec 2006 08:05:21 -0800 (PST)
From: ranjith kumar <ranjith_kumar_b4u@yahoo.com>
To: freebsd-ia32@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Message-ID: <234144.12400.qm@web58616.mail.re3.yahoo.com>
X-Mailman-Approved-At: Mon, 11 Dec 2006 19:27:08 +0000
Subject: Re: prefetching on pentium 4 
X-BeenThere: freebsd-ia32@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: FreeBSD on the IA-32 platform <freebsd-ia32.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-ia32>,
	<mailto:freebsd-ia32-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-ia32>
List-Post: <mailto:freebsd-ia32@freebsd.org>
List-Help: <mailto:freebsd-ia32-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-ia32>,
	<mailto:freebsd-ia32-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 11 Dec 2006 16:31:16 -0000

--- Attilio Rao  wrote:

> 2006/12/6, ranjith kumar:
> > Hi,
> >     There are four prefetch instructions on the Pentium 4 (IA-32)
> > processor: prefetchnta, prefetcht0, prefetcht1 and prefetcht2.
> >
> > For the Pentium 4, the IA-32 optimization manuals say that
> > prefetcht0, prefetcht1 and prefetcht2 are identical.
> >
> > They also say that ONLY the prefetchnta instruction prefetches data
> > into the L2 cache without polluting the caches.
> >
> > If all four instructions prefetch data into the L2 cache (not into
> > the L1 cache), what does it mean to say that prefetchnta does not
> > pollute the caches?
> >
> > That is, what is the difference between prefetchnta and the other
> > instructions?
> 
> First of all, it is important to say that a prefetch* instruction is
> only a hint to the CPU, not a *command*, so the CPU decides (in a way
> that is not specified) whether or not to honor the caching request.
> From this point of view, the prefetch* instructions leave the CPU as
> much freedom as possible.
> The different numbers indicate how critical the data is for the CPU
> (0 - most critical, 2 - least critical); a more critical hint means
> prefetching the cache line into a higher level of the cache
> hierarchy.
> In principle, this would mean:
>
> prefetcht0 -> L1 prefetching
> prefetcht1 -> L2 prefetching
> prefetcht2 -> L3 prefetching
>
> And this is what really happens on, for example, the P3 (since the P3
> has no L3 cache, prefetcht2 == prefetcht1).
> On the P4 things are different because you cannot manipulate the L1
> cache directly, so what happens is:
>
> prefetcht0 -> L2 prefetching
> prefetcht1 -> L2 prefetching
> prefetcht2 -> L3 prefetching
> (if no L3 cache is present, prefetcht2 is the same as the others;
> hence the statement that all three instructions behave the same).
>
> prefetchnta is completely different because it fetches a cache line
> into the NT cache structure.
> Non-temporal caches are global caches which are particularly powerful
> because they do not need snooping messages between CPUs (and, in this
> way, they reduce the CPU<->cache traffic); they are used by the NTI
> family.
Thanks. But when I ran two programs, one prefetching with prefetchnta
and the other with prefetcht0, the second program ran faster.
(I used a Pentium 4 processor and the gcc compiler.) What could be the
reason? When is prefetchnta preferable to prefetcht0?
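
A comparison of this kind could be set up roughly as follows. This is
only a sketch, not the programs that were actually run: it assumes
GCC's `__builtin_prefetch(addr, rw, locality)`, for which GCC on x86
typically emits prefetchnta when locality is 0 and prefetcht0 when
locality is 3. The function names and the `PREFETCH_AHEAD` distance are
hypothetical.

```c
#include <stddef.h>

/* Hypothetical prefetch distance in elements; a tuning assumption. */
#define PREFETCH_AHEAD 64

/*
 * Sum an array while hinting at upcoming elements with locality 0
 * (typically emitted as prefetchnta by GCC on x86).
 * Issuing a hint for every element is redundant within one cache
 * line; it is kept that way here only for simplicity.
 */
long sum_prefetch_nta(const int *a, size_t n)
{
	long sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i + PREFETCH_AHEAD < n)
			__builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 0);
		sum += a[i];
	}
	return sum;
}

/* Same loop with locality 3 (typically emitted as prefetcht0). */
long sum_prefetch_t0(const int *a, size_t n)
{
	long sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i + PREFETCH_AHEAD < n)
			__builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 3);
		sum += a[i];
	}
	return sum;
}
```

Both functions return the same sum; only the hint differs, so any
timing difference between them comes from how the processor treats the
two hints for this access pattern.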

Also, the "IA-32 systems programmers manual" says nothing about a
non-temporal cache structure. The caches in IA-32 processors are the L1
cache, L2 cache, write-combining buffer, store buffer, instruction TLB,
data TLB and L3 cache (not present in the Pentium 4). Are the
non-temporal cache and the write-combining buffer the same thing?
Thanks in advance.

> 
> Attilio
> 
> 
> -- 
> Peace can only be achieved by understanding - A.
> Einstein
> 



From owner-freebsd-ia32@FreeBSD.ORG  Mon Dec 11 23:25:16 2006
Return-Path: <owner-freebsd-ia32@FreeBSD.ORG>
X-Original-To: freebsd-ia32@freebsd.org
Delivered-To: freebsd-ia32@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 71B8C16A500
	for <freebsd-ia32@freebsd.org>; Mon, 11 Dec 2006 23:25:16 +0000 (UTC)
	(envelope-from asmrookie@gmail.com)
Received: from nf-out-0910.google.com (nf-out-0910.google.com [64.233.182.187])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9468143CB4
	for <freebsd-ia32@freebsd.org>; Mon, 11 Dec 2006 23:23:56 +0000 (GMT)
	(envelope-from asmrookie@gmail.com)
Received: by nf-out-0910.google.com with SMTP id x37so56698nfc
	for <freebsd-ia32@freebsd.org>; Mon, 11 Dec 2006 15:25:03 -0800 (PST)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com;
	h=received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth;
	b=lPCoKzsxchuc8idu5nD9DbpWl6Gr9J5tojotmObFGRl13MR8BoUWBBNgLbxG28vL6Blo/kVCCtdrMIktIfx8iJUykxaTrptyeKSSrJcZXR7mGjLskdkKY4gkslfNdfTKEPafMfd84JVcF5NDehOs00NIyHRt+Mwv23xsh1MmKm4=
Received: by 10.82.138.6 with SMTP id l6mr541241bud.1165879471682;
	Mon, 11 Dec 2006 15:24:31 -0800 (PST)
Received: by 10.82.178.4 with HTTP; Mon, 11 Dec 2006 15:24:31 -0800 (PST)
Message-ID: <3bbf2fe10612111524i60cb7807wfbb9228b6c8d4b39@mail.gmail.com>
Date: Tue, 12 Dec 2006 00:24:31 +0100
From: "Attilio Rao" <attilio@freebsd.org>
Sender: asmrookie@gmail.com
To: "ranjith kumar" <ranjith_kumar_b4u@yahoo.com>
In-Reply-To: <234144.12400.qm@web58616.mail.re3.yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <234144.12400.qm@web58616.mail.re3.yahoo.com>
X-Google-Sender-Auth: be232a89163aa9ee
Cc: freebsd-ia32@freebsd.org
Subject: Re: prefetching on pentium 4
X-BeenThere: freebsd-ia32@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: FreeBSD on the IA-32 platform <freebsd-ia32.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-ia32>,
	<mailto:freebsd-ia32-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-ia32>
List-Post: <mailto:freebsd-ia32@freebsd.org>
List-Help: <mailto:freebsd-ia32-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-ia32>,
	<mailto:freebsd-ia32-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 11 Dec 2006 23:25:16 -0000

2006/12/11, ranjith kumar <ranjith_kumar_b4u@yahoo.com>:
> Thanks. But when I ran two programs, one prefetching with prefetchnta
> and the other with prefetcht0, the second program ran faster.
> (I used a Pentium 4 processor and the gcc compiler.) What could be the
> reason? When is prefetchnta preferable to prefetcht0?

As I said, prefetchnta is particularly important on SMP systems.
Are you using a dual-core CPU?
In that case, to keep their caches synchronized the CPUs need to
perform snooping (explained in detail in the "IA32 Software Developers
Manual, vol 3"; sorry, I can't recall the chapter number, but it is the
one about cache tricks), which occupies the CPU-cache buses.
With prefetchnta, the bytes are fetched into the NT cache system, so
the snooping traffic doesn't affect load/store performance.

> Also, the "IA-32 systems programmers manual" says nothing about a
> non-temporal cache structure. The caches in IA-32 processors are the
> L1 cache, L2 cache, write-combining buffer, store buffer, instruction
> TLB, data TLB and L3 cache (not present in the Pentium 4). Are the
> non-temporal cache and the write-combining buffer the same thing?

No, they are not.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein

From owner-freebsd-ia32@FreeBSD.ORG  Wed Dec 13 12:25:05 2006
Return-Path: <owner-freebsd-ia32@FreeBSD.ORG>
X-Original-To: freebsd-ia32@freebsd.org
Delivered-To: freebsd-ia32@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 6013A16A412
	for <freebsd-ia32@freebsd.org>; Wed, 13 Dec 2006 12:25:05 +0000 (UTC)
	(envelope-from ranjith_kumar_b4u@yahoo.com)
Received: from rrr4-v2.mail.re1.yahoo.com (rrr4-v2.mail.re1.yahoo.com
	[66.196.101.251])
	by mx1.FreeBSD.org (Postfix) with SMTP id 0C8AD43CA2
	for <freebsd-ia32@freebsd.org>; Wed, 13 Dec 2006 12:23:35 +0000 (GMT)
	(envelope-from ranjith_kumar_b4u@yahoo.com)
Received: (qmail 23063 invoked from network); 13 Dec 2006 09:59:23 -0000
Received: from web58615.mail.re3.yahoo.com (68.142.236.213)
	by rrr4-v2.mail.re1.yahoo.com with SMTP; 13 Dec 2006 09:59:23 -0000
Received: (qmail 6760 invoked by uid 60001); 13 Dec 2006 09:59:23 -0000
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=X-YMail-OSG:Received:Date:From:Subject:To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID;
	b=4UJoT4MDaBsCXz+qtDkLDkk10z2frwEE0zapNQLLQh5ytdOeNKaKOnI66CzCZszUTMLNttc4FxiNny2ixSxxuW9d0Y5NkKQXUAmvJAoR1gGMBroD9xY4Sg7mrUXO5o3l4dXQmkS6gaK0LJv0Z1HfqkGV5hheQF9tlDJh/WQ2bFE=;
X-YMail-OSG: OJdqXM4VM1n.tjz8tzj39rhxSqN2Gph6XbFbUagMuXduya8w23ZWN_Vvr.A9j3zdq.G8Ik5mFQq9om5c6P6PpI0j29sTENUo8fvo5Kz0ZVtwWRAjqrH4KA--
Received: from [59.163.25.48] by web58615.mail.re3.yahoo.com via HTTP;
	Wed, 13 Dec 2006 01:59:23 PST
Date: Wed, 13 Dec 2006 01:59:23 -0800 (PST)
From: ranjith kumar <ranjith_kumar_b4u@yahoo.com>
To: freebsd-ia32@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Message-ID: <75870.6526.qm@web58615.mail.re3.yahoo.com>
X-Mailman-Approved-At: Wed, 13 Dec 2006 12:53:00 +0000
Subject: writing to performance event select registers
X-BeenThere: freebsd-ia32@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: FreeBSD on the IA-32 platform <freebsd-ia32.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-ia32>,
	<mailto:freebsd-ia32-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-ia32>
List-Post: <mailto:freebsd-ia32@freebsd.org>
List-Help: <mailto:freebsd-ia32-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-ia32>,
	<mailto:freebsd-ia32-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 13 Dec 2006 12:25:05 -0000

Hi,
    I want to measure the number of last-level cache misses on a
Pentium 4 processor. The IA-32 programmers manuals say that there are
architectural (i.e. the same across all IA-32 processors) performance
monitoring counters starting at address 0C1H and performance event
select registers starting at address 186H.
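
Whether a given processor reports the architectural counters can be
checked from user space before any MSR is written. The sketch below
assumes the `<cpuid.h>` header shipped with GCC and Clang; the function
name is hypothetical. CPUID leaf 0AH returns the architectural
performance monitoring version in EAX bits 7:0, and a result of 0 (or
an unsupported leaf) suggests that the registers at 0C1H and 186H
should not be assumed to exist on that processor.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/*
 * Return the architectural performance monitoring version ID
 * (CPUID leaf 0AH, EAX bits 7:0), or 0 when the leaf is not
 * available or the code is not built for x86.
 */
unsigned int arch_perfmon_version(void)
{
#if defined(__i386__) || defined(__x86_64__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 if the requested leaf is unsupported. */
	if (!__get_cpuid(0x0A, &eax, &ebx, &ecx, &edx))
		return 0;
	return eax & 0xff;
#else
	return 0;
#endif
}
```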

1) When I load a kernel module that writes a value to the performance
event select register (address 186H) with the wrmsr instruction, the
system hangs. Why?
The program is:
#include <linux/module.h> /* Needed by all modules */
#include <linux/kernel.h> /* Needed for KERN_INFO */
//#include<xmmintrin.h>

int i, j, k = 0;
unsigned int xx, yy, xx1, yy1, xx2, yy2;
unsigned int t1, t2, t3, t4, BIG = 0xffffffff;

int init_module(void)
{
	/* wrmsr writes EDX:EAX to the MSR whose address is in ECX. */
	asm volatile (" movl $0x186, %%ecx;"      /* event select register */
		      " movl $0x0,   %%edx;"      /* high 32 bits of the value */
		      " movl $0x0009412E, %%eax;" /* low 32 bits of the value */
		      " wrmsr;"
			:
			:
			: "%eax", "%edx", "%ecx");

	/* t1..t4 are never assigned, so this prints zeros. */
	printk(KERN_INFO " Initially %u=t1 %u=t2 %u=t3 %u=t4\n",
	       t1, t2, t3, t4);
	return 0;
}

void cleanup_module(void)
{
	printk(KERN_INFO "Goodbye world \n");
}
-------------------------------------------------------
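
For reference, the value written above can be split into fields with a
small user-space helper. This is a sketch that assumes the
IA32_PERFEVTSELx bit layout documented in the Intel Software
Developer's Manual for the architectural event select registers; the
struct and function names are hypothetical. Under that layout,
0x0009412E selects event 2EH with unit mask 41H and sets the USR and PC
bits, while the OS and EN bits are clear.

```c
#include <stdint.h>

/* Fields of an event select value (assumed IA32_PERFEVTSELx layout). */
struct perfevtsel {
	unsigned int event; /* bits 0-7:   event select */
	unsigned int umask; /* bits 8-15:  unit mask */
	unsigned int usr;   /* bit 16:     count in user mode */
	unsigned int os;    /* bit 17:     count in kernel mode */
	unsigned int edge;  /* bit 18:     edge detect */
	unsigned int pc;    /* bit 19:     pin control */
	unsigned int intr;  /* bit 20:     interrupt on overflow */
	unsigned int en;    /* bit 22:     enable counter */
	unsigned int inv;   /* bit 23:     invert counter mask */
	unsigned int cmask; /* bits 24-31: counter mask */
};

/* Split the low 32 bits of an event select value into its fields. */
struct perfevtsel decode_perfevtsel(uint32_t v)
{
	struct perfevtsel d;

	d.event = v & 0xff;
	d.umask = (v >> 8) & 0xff;
	d.usr   = (v >> 16) & 1;
	d.os    = (v >> 17) & 1;
	d.edge  = (v >> 18) & 1;
	d.pc    = (v >> 19) & 1;
	d.intr  = (v >> 20) & 1;
	d.en    = (v >> 22) & 1;
	d.inv   = (v >> 23) & 1;
	d.cmask = (v >> 24) & 0xff;
	return d;
}
```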


Thanks in advance.

