From owner-freebsd-alpha Thu Nov 21 14:49:40 2002
Message-ID: <17F0EC17EF87D311BF65009027D3C39D04DA0E8B@ais.msu.edu>
From: "Murphy, Patrick"
To: "'freebsd-alpha@freebsd.org'"
Subject: RE: Extreme time drift in SMP mode
Date: Thu, 21 Nov 2002 17:49:32 -0500

I have a 2100A with 3 CPUs and have seen the clock drift problem. I just
upgraded to the 5.0-Current system as of yesterday to try the change
recommended in this thread.

Without the change, a "sleep 10" command takes 20 seconds to complete and
the system clock loses time. With the interrupt.c patch, the "sleep 10"
command completes in 7 seconds and the system clock gains time. So, it
seems that the clock is off by a factor of 2 in SMP mode.

If anyone would like a different patch tested, I would be glad to do so.

In case you want the dmesg info for the system, it is:

FreeBSD 5.0-CURRENT #1: Thu Nov 21 15:53:40 EST 2002
    murphyp1@bsdalpha.ais.msu.edu:/usr2/FreeBSD-Obj-Current/usr2/FreeBSD-Source-Current/sys/GENERIC
Preloaded elf kernel "/boot/kernel/kernel" at 0xfffffc0000866000.
DEC AlphaServer 2100A
AlphaServer 2100A 5/250, 250MHz
8192 byte page size, 3 processors.
CPU: EV5 (21164) major=5 minor=5
OSF PAL rev: 0x4000100020116
real memory  = 1071570944 (1021 MB)
avail memory = 1034870784 (986 MB)
FreeBSD/SMP: Multiprocessor System Detected: 3 CPUs
Timecounters tick every 0.976 msec
Timecounter "i8254" frequency 1193182 Hz
Waiting 15 seconds for SCSI devices to settle
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!

Patrick Murphy
Michigan State University

-----Original Message-----
Thanks for your time in explaining this. I should be receiving some more
cpu's for this thing in a day or two - I'll let you know how that changes
the timing issues.

-----Original Message-----
From: owner-freebsd-alpha@FreeBSD.org
[mailto:owner-freebsd-alpha@FreeBSD.org] On Behalf Of John Baldwin
Sent: Wednesday, November 13, 2002 11:03 AM
To: Michael A. Mackey
Cc: freebsd-alpha@FreeBSD.org; Terry Lambert
Subject: Re: Extreme time drift in SMP mode

On 12-Nov-2002 Michael A. Mackey wrote:
> I guess I don't understand the problem.
>
> It seemed to me that the problem was that not all the interrupts were
> being delivered because the Lynx architecture expects each processor to
> generate interrupts. Before the fix, the system lost time by an amount
> which was equivalent to throwing away half of the interrupts. After my
> modification, each processor is allowed to generate clock interrupts,
> and the system receives the complete set of interrupts, yielding the
> result that the system keeps time correctly.
>
> I'm sure that this is a naive picture of what's going on (and I'm not a
> kernel developer), but it works. I realize that it is probably specific
> to the Lynx architecture, and I of course would be happy for a 'correct'
> way to allow this old box to happily crank along solving PDE's.
>
>
> Anyway, I sure am glad to have such high quality software to run on this
> box.
> Keep up the great work FreeBSD-Alpha!

I'll try to explain.
For better or for worse, FreeBSD currently uses the following model of
clock interrupts to drive hardclock() (update timecounters, handle
profiling, drive softclock) and statclock() (update statistics):

For each "virtual" system-wide clock interrupt, all CPU's execute
statclock_process() and hardclock_process() (should be renamed to
*_thread() at this point) to perform process-specific updates (profiling,
stats, etc.) that need to happen on all CPU's. All but one of these CPU's
execute these functions directly from their clock interrupt. One CPU
executes statclock() and hardclock() directly, which call the _process()
variants as part of their task, but also perform system-wide updates such
as updating the timecounters and driving softclock().

On i386, clock interrupts are only sent to one processor in a sort of
round-robin fashion. What we do there is that each time a clock interrupt
occurs, the receiving CPU acts as the "master" CPU and executes
hardclock() and statclock(). It then IPI's all the other CPU's in the
system to simulate a system-wide clock interrupt, and all the other CPU's
in the system then execute the _process() functions.

When we did SMP on Alpha we wanted to avoid sending all those IPI's if
possible. For one thing, IPI's in general are expensive. On the Alpha they
are a bit worse though, as you can only IPI one CPU at a time whereas on
i386 you can send broadcast IPI's to all other CPU's at once.

On at least the 4100 and DS20 type machines, we found that the clock
interrupt was broadcast to all CPU's, but in a round-robin fashion. That
is, if we were getting X clock interrupts / sec in UP on CPU 0, we still
got X clock ints / sec on CPU 0, but we also got X clock ints / sec on
CPU 1, 2, etc., but offset so that they didn't all get interrupted at
once. Thus, we made the boot processor the "master" CPU for all clock
interrupts and had it call hardclock() and statclock() while all the other
CPU's would call the _process() variants.
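
To make the two models concrete, here is a toy Python simulation. It is
not FreeBSD code: it only counts how often the system-wide hardclock()
work and the per-CPU _process() work would run. The names simulate,
delivery and policy are made up for this sketch, statclock() is left out
because it follows the same pattern, and the stagger between CPU's on the
broadcast hardware is ignored.

```python
def simulate(ncpus, ticks, delivery, policy):
    """Count clock work for `ticks` virtual clock interrupts.

    delivery (assumed hardware behaviour):
      "broadcast"   - every CPU receives every tick (4100/DS20 as described)
      "round_robin" - exactly one CPU receives each tick (i386 as described)
    policy (assumed kernel behaviour):
      "fixed_master"    - only CPU 0 runs hardclock(); others run _process()
      "receiver_master" - the receiving CPU runs hardclock() and IPI's the rest
    """
    if delivery not in ("broadcast", "round_robin"):
        raise ValueError("unknown delivery: %r" % delivery)
    if policy not in ("fixed_master", "receiver_master"):
        raise ValueError("unknown policy: %r" % policy)

    hardclock = 0            # system-wide updates (timecounters, softclock)
    process = [0] * ncpus    # per-CPU hardclock_process() calls
    ipis = 0                 # IPI's sent, one per target CPU

    for tick in range(ticks):
        if delivery == "broadcast":
            receivers = range(ncpus)
        else:
            receivers = [tick % ncpus]

        for cpu in receivers:
            if policy == "fixed_master":
                if cpu == 0:
                    hardclock += 1   # hardclock() also does the _process() work
                process[cpu] += 1
            else:
                hardclock += 1
                process[cpu] += 1
                for other in range(ncpus):
                    if other != cpu:
                        ipis += 1    # one IPI per remote CPU
                        process[other] += 1

    return {"hardclock": hardclock, "process": process, "ipis": ipis}


if __name__ == "__main__":
    for delivery in ("round_robin", "broadcast"):
        for policy in ("receiver_master", "fixed_master"):
            print(delivery, policy, simulate(3, 3000, delivery, policy))
```

In this toy model the two matched combinations (round_robin with
receiver_master, broadcast with fixed_master) both run hardclock() once
per tick, and only the first one pays for IPI's. Pairing round_robin
delivery with fixed_master makes hardclock() run on only 1/ncpus of the
ticks, and pairing broadcast delivery with receiver_master makes it run
ncpus times per tick. That is a property of the sketch, not a measurement
of any real machine.
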
Basically, the system was doing the global IPI for us, except that since
the interrupts were staggered, there was less contention on common locks.

Now enter the 2100 into the picture. I'm not really sure what it is doing
with its clock interrupts. I'm not sure if it is acting like an i386 and
round-robin'ing the clock interrupts to all processors or if it is acting
like other Alpha's but slowing down the clock. Probably it is acting like
an i386, and we might need to just change the 2100 clock interrupt handler
to use the i386 model and go and IPI the other CPU's when a clock
interrupt comes in. If anyone can stick more than 2 CPU's in a 2100 system
and see if the clock runs 3x or 4x as slow, that might help. The reason I
would prefer to just dink with the timer_freq is that it is simple and
doesn't change the model that system-wide things like timekeeping only
happen once per "virtual" clock interrupt.

-- 
John Baldwin <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/