From owner-freebsd-alpha  Thu Nov 21 14:49:40 2002
Delivered-To: freebsd-alpha@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 7FED637B401
	for <freebsd-alpha@freebsd.org>; Thu, 21 Nov 2002 14:49:37 -0800 (PST)
Received: from aismail.ais.msu.edu (ais.msu.edu [35.8.113.169])
	by mx1.FreeBSD.org (Postfix) with ESMTP id CDA5143E88
	for <freebsd-alpha@freebsd.org>; Thu, 21 Nov 2002 14:49:36 -0800 (PST)
	(envelope-from murphyp1@ais.msu.edu)
Received: by ais.msu.edu with Internet Mail Service (5.5.2656.59)
	id <W1Q6VRMF>; Thu, 21 Nov 2002 17:49:36 -0500
Message-ID: <17F0EC17EF87D311BF65009027D3C39D04DA0E8B@ais.msu.edu>
From: "Murphy, Patrick" <murphyp1@ais.msu.edu>
To: "'freebsd-alpha@freebsd.org'" <freebsd-alpha@freebsd.org>
Subject: RE: Extreme time drift in SMP mode
Date: Thu, 21 Nov 2002 17:49:32 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2656.59)
Content-Type: text/plain
Sender: owner-freebsd-alpha@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-alpha.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-alpha>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-alpha>
X-Loop: FreeBSD.org

I have a 2100A with 3 CPUs and have seen the clock drift problem.  I just
upgraded to the 5.0-Current system as of yesterday to try the change
recommended in this thread.  Without the change, a "sleep 10" command takes
20 seconds to complete and the system clock loses time.  With the
interrupt.c patch, the "sleep 10" command completes in 7 seconds and the
system clock gains time.  So, it seems that the clock is off by a factor of
2 in SMP mode.  If anyone would like a different patch tested, I would be
glad to do so.  In case you want the dmesg info for the system, it is:

FreeBSD 5.0-CURRENT #1: Thu Nov 21 15:53:40 EST 2002
 
    murphyp1@bsdalpha.ais.msu.edu:/usr2/FreeBSD-Obj-Current/usr2/FreeBSD-Source-Current/sys/GENERIC
Preloaded elf kernel "/boot/kernel/kernel" at 0xfffffc0000866000.
DEC AlphaServer 2100A
AlphaServer 2100A 5/250, 250MHz
8192 byte page size, 3 processors.
CPU: EV5 (21164) major=5 minor=5
OSF PAL rev: 0x4000100020116
real memory  = 1071570944 (1021 MB)
avail memory = 1034870784 (986 MB)
FreeBSD/SMP: Multiprocessor System Detected: 3 CPUs
Timecounters tick every 0.976 msec
Timecounter "i8254"  frequency 1193182 Hz
Waiting 15 seconds for SCSI devices to settle
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
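
A quick check of the numbers above, as a sketch only.  The helper names
clock_rate and expected_rate are made up for illustration, and the 1.5x
figure rests on the assumption that the interrupt.c patch lets all 3 CPUs
count clock interrupts:

```c
#include <assert.h>

/*
 * Speed of the system clock relative to wall time, inferred from a sleep:
 * a "sleep 10" that really takes 20 s means the clock runs at 0.5x.
 */
static double clock_rate(double requested_s, double observed_s)
{
    assert(observed_s > 0.0);
    return requested_s / observed_s;
}

/*
 * Assumption (not confirmed in this thread): each CPU that counts clock
 * interrupts contributes the same share as the boot CPU does alone.
 */
static double expected_rate(double one_cpu_rate, int counting_cpus)
{
    assert(counting_cpus >= 1);
    return one_cpu_rate * counting_cpus;
}
```

Unpatched, clock_rate(10, 20) gives 0.5x.  Patched, clock_rate(10, 7)
gives roughly 1.43x, which is close to expected_rate(0.5, 3) = 1.5x
(a "sleep 10" in about 6.7 seconds).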

Patrick Murphy
Michigan State University


-----Original Message-----

Thanks for your time in explaining this.  

I should be receiving some more CPUs for this machine in a day or two.
I'll let you know how that changes the timing issues.

-----Original Message-----
From: owner-freebsd-alpha@FreeBSD.org
[mailto:owner-freebsd-alpha@FreeBSD.org] On Behalf Of John Baldwin
Sent: Wednesday, November 13, 2002 11:03 AM
To: Michael A. Mackey
Cc: freebsd-alpha@FreeBSD.org; Terry Lambert
Subject: Re: Extreme time drift in SMP mode


On 12-Nov-2002 Michael A. Mackey wrote:
> I guess I don't understand the problem.
> 
> It seemed to me that the problem was that not all the interrupts were
> being delivered because the Lynx architecture expects each processor
> to generate interrupts.  Before the fix, the system lost time by an
> amount which was equivalent to throwing away half of the interrupts.
> After my modification, each processor is allowed to generate clock
> interrupts, and the system receives the complete set of interrupts,
> yielding the result that the system keeps time correctly.
> 
> I'm sure that this is a naive picture of what's going on (and I'm not
> a kernel developer), but it works.  I realize that it is probably
> specific to the Lynx architecture, and I of course would be happy for
> a 'correct' way to allow this old box to happily crank along solving
> PDEs.
> 
> Anyway, I sure am glad to have such high quality software to run on
> this box.
> Keep up the great work FreeBSD-Alpha!

I'll try to explain.

For better or for worse, FreeBSD currently uses the following model
of clock interrupts to drive hardclock() (update timecounters, handle
profiling, drive softclock) and statclock() (update statistics):

For each "virtual" system-wide clock interrupt, all CPUs execute
statclock_process() and hardclock_process() (should be renamed to
*_thread() at this point) to perform process-specific updates
(profiling, stats, etc.) that need to happen on all CPUs.  All but
one of these CPUs execute these functions directly from their clock
interrupt.  The remaining CPU executes statclock() and hardclock()
directly, which call the _process() variants as part of their task but
also perform system-wide updates such as updating the timecounters and
driving softclock().

On i386, each clock interrupt is sent to only one processor, in a sort
of round-robin fashion.  Each time a clock interrupt occurs, the
receiving CPU acts as the "master" CPU and executes hardclock() and
statclock().  It then IPIs all the other CPUs in the system to
simulate a system-wide clock interrupt, and those CPUs then execute
the _process() functions.
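
The i386 model described above can be sketched as a small user-space
simulation.  This is not kernel code: struct clock_sim, clock_process(),
i386_clock_intr() and i386_run() are hypothetical names, and the counters
merely stand in for the work done by hardclock()/statclock() and the
_process() variants:

```c
#include <assert.h>

#define MAX_CPUS 8

/* Hypothetical counters standing in for kernel state. */
struct clock_sim {
    int  ncpus;
    long global_ticks;            /* system-wide work (timecounters, softclock) */
    long process_ticks[MAX_CPUS]; /* per-CPU work (the _process() variants) */
    long ipis_sent;               /* IPIs needed to fan the tick out */
};

/* Per-CPU part, run on every CPU for each virtual clock interrupt. */
static void clock_process(struct clock_sim *s, int cpu)
{
    s->process_ticks[cpu]++;
}

/* One hardware clock interrupt delivered to a single CPU. */
static void i386_clock_intr(struct clock_sim *s, int cpu)
{
    /* The receiving CPU acts as "master" for this tick. */
    s->global_ticks++;
    clock_process(s, cpu);

    /* It then IPIs every other CPU, which runs only the per-CPU part. */
    for (int other = 0; other < s->ncpus; other++) {
        if (other == cpu)
            continue;
        s->ipis_sent++;
        clock_process(s, other);
    }
}

/* Deliver nticks interrupts, one CPU at a time, round-robin. */
static void i386_run(struct clock_sim *s, int ncpus, long nticks)
{
    assert(ncpus >= 1 && ncpus <= MAX_CPUS);
    s->ncpus = ncpus;
    s->global_ticks = 0;
    s->ipis_sent = 0;
    for (int i = 0; i < MAX_CPUS; i++)
        s->process_ticks[i] = 0;
    for (long t = 0; t < nticks; t++)
        i386_clock_intr(s, (int)(t % ncpus));
}
```

With 3 CPUs and 1024 interrupts, every counter advances 1024 times, at
the cost of 2048 simulated IPIs.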

When we did SMP on Alpha we wanted to avoid sending all those IPI's
if possible.  For one thing, IPI's in general are expensive.  On the
Alpha they are a bit worse though as you can only IPI one CPU at a
time whereas on i386 you can send broadcast IPI's to all other CPU's
at once.  On at least the 4100 and DS20 type machines, we found that
the clock interrupt was broadcast to all CPU's, but in a round-robin
fashion.  That is, if we were getting X clock interrupts / sec in UP
on CPU 0, we still got X clock ints / sec on CPU 0, but we also got
X clock ints / sec on CPU 1, 2, etc.  but offset so that they didn't
all get interrupted at once.  Thus, we made the boot processor the
"master" CPU for all clock interrupts and had it call hardclock() and
statclock() while all the other CPU's would call the _process()
variants.  Basically, the system was doing the global IPI for us
except that since the interrupts were staggered, there was less
contention on common locks.
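
For comparison, a minimal sketch of the broadcast model used on the 4100
and DS20 type machines, again as a user-space simulation with
hypothetical names (struct clock_sim, alpha_clock_intr(),
alpha_run_broadcast()), not actual kernel code:

```c
#include <assert.h>

#define MAX_CPUS 8
#define BOOT_CPU 0

/* Hypothetical counters standing in for kernel state. */
struct clock_sim {
    int  ncpus;
    long global_ticks;            /* system-wide work, boot CPU only */
    long process_ticks[MAX_CPUS]; /* per-CPU work, every CPU */
};

/* Handler run by each CPU when it receives its own clock interrupt. */
static void alpha_clock_intr(struct clock_sim *s, int cpu)
{
    if (cpu == BOOT_CPU)
        s->global_ticks++;       /* hardclock()/statclock() system-wide part */
    s->process_ticks[cpu]++;     /* _process() part */
}

/* Hardware delivers every tick to every CPU (staggered in real life). */
static void alpha_run_broadcast(struct clock_sim *s, int ncpus, long nticks)
{
    assert(ncpus >= 1 && ncpus <= MAX_CPUS);
    s->ncpus = ncpus;
    s->global_ticks = 0;
    for (int i = 0; i < MAX_CPUS; i++)
        s->process_ticks[i] = 0;
    for (long t = 0; t < nticks; t++)
        for (int cpu = 0; cpu < ncpus; cpu++)
            alpha_clock_intr(s, cpu);
}
```

Here no IPIs are needed: as long as the hardware really does deliver each
tick to every CPU, the boot CPU alone keeps system time correctly.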

Now enter the 2100 into the picture.  I'm not really sure what it is
doing with its clock interrupts.  I'm not sure if it is acting like an
i386 and round-robin'ing the clock interrupts to all processors or if
it is acting like other Alphas but slowing down the clock.  Probably
it is acting like an i386, and we might need to just change the 2100
clock interrupt handler to use the i386 model and IPI the other CPUs
when a clock interrupt comes in.  If anyone can stick more than 2
CPUs in a 2100 system and see if the clock runs 3x or 4x as slow, that
might help.
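
What the round-robin hypothesis would predict can be sketched as
follows.  The function name boot_cpu_ticks_round_robin is made up for
illustration, and the model assumes the hardware hands each tick to
exactly one CPU in turn while only the boot CPU advances system time:

```c
#include <assert.h>

#define BOOT_CPU 0

/*
 * Number of ticks the boot CPU would count if hw_ticks interrupts were
 * distributed round-robin over ncpus CPUs.  Under this assumption the
 * system clock runs about ncpus times too slowly.
 */
static long boot_cpu_ticks_round_robin(int ncpus, long hw_ticks)
{
    long counted = 0;

    assert(ncpus >= 1);
    for (long t = 0; t < hw_ticks; t++) {
        if (t % ncpus == BOOT_CPU)
            counted++;
    }
    return counted;
}
```

With 1000 hardware ticks, the boot CPU would count 500 of them on 2 CPUs
and 250 on 4 CPUs.  A slowdown that stays at 2x regardless of CPU count
would not fit this simple model.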

The reason I would prefer to just dink with the timer_freq is that it
is simple and doesn't change the model that system-wide things like
timekeeping only happen once per "virtual" clock interrupt.

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
