From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 04:44:04 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1E977106566B for ; Tue, 28 Sep 2010 04:44:04 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 9C4D68FC12 for ; Tue, 28 Sep 2010 04:44:03 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8S4PA0d058131 for ; Mon, 27 Sep 2010 21:25:14 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201009280425.o8S4PA0d058131@gw.catspoiler.org> Date: Mon, 27 Sep 2010 21:25:10 -0700 (PDT) From: Don Lewis To: stable@FreeBSD.org MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: Subject: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 04:44:04 -0000 CPU time accounting is broken on one of my machines running 8-STABLE. I ran a test with a simple program that just loops and consumes CPU time: % time ./a.out 94.544u 0.000s 19:14.10 8.1% 62+2054k 0+0io 0pf+0w The display in top shows the process with WCPU at 100%, but TIME increments very slowly. Several hours after booting, I got a bunch of "calcru: runtime went backwards" messages, but they stopped right away and never appeared again. Aug 23 13:40:07 scratch ntpd[1159]: ntpd 4.2.4p5-a (1) Aug 23 13:43:18 scratch ntpd[1160]: kernel time sync status change 2001 Aug 23 18:05:57 scratch dbus-daemon: [system] Reloaded configuration Aug 23 18:06:16 scratch dbus-daemon: [system] Reloaded configuration Aug 23 18:12:40 scratch ntpd[1160]: time reset +18.059948 s [snip] Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 6836685136 usec to 5425839798 usec for pid 1526 (csh) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 4747 usec to 2403 usec for pid 1519 (csh) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 5265 usec to 2594 usec for pid 1494 (hald-addon-mouse-sy) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 7818 usec to 3734 usec for pid 1488 (console-kit-daemon) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 977 usec to 459 usec for pid 1480 (getty) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 958 usec to 450 usec for pid 1479 (getty) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 957 usec to 449 usec for pid 1478 (getty) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 952 usec to 447 usec for pid 1477 (getty) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 959 usec to 450 usec for pid 1476 (getty) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 975 usec to 458 usec for pid 1475 (getty) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1026 usec to 482 usec for pid 1474 (getty) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1333 usec to 626 usec for pid 1473 (getty) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 2469 usec to 1160 usec for pid 1440 (inetd) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 719 usec to 690 usec for pid 1402 (sshd) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 120486 usec to 56770 usec for pid 1360 (cupsd) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 6204 usec to 2914 usec for pid 1289 (dbus-daemon) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 179 usec to 84 usec for pid 1265 (moused) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 22156 usec to 10407 usec for pid 1041 (nfsd) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1292 usec to 607 usec for pid 1032 (mountd) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 8801 usec to 4134 usec for pid 664 (devd) Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 19 usec to 9 usec for pid 9 (sctp_iterator) If I reboot and run the test again, the CPU time accounting seems to be working correctly. % time ./a.out 1144.226u 0.000s 19:06.62 99.7% 5+168k 0+0io 0pf+0w I'm not sure how long this problem has been present. I do remember seeing the calcru messages with an August 23rd kernel. I have not seen the calcru messages when running -CURRENT on the same hardware. I also have not seen this same problem on my other Athlon 64 box running the August 23rd kernel. Before reboot: # sysctl kern.timecounter kern.timecounter.tick: 1 kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000) kern.timecounter.hardware: ACPI-fast kern.timecounter.stepwarnings: 0 kern.timecounter.tc.i8254.mask: 4294967295 kern.timecounter.tc.i8254.counter: 3534 kern.timecounter.tc.i8254.frequency: 1193182 kern.timecounter.tc.i8254.quality: 0 kern.timecounter.tc.ACPI-fast.mask: 16777215 kern.timecounter.tc.ACPI-fast.counter: 8685335 kern.timecounter.tc.ACPI-fast.frequency: 3579545 kern.timecounter.tc.ACPI-fast.quality: 1000 kern.timecounter.tc.TSC.mask: 4294967295 kern.timecounter.tc.TSC.counter: 2204228369 kern.timecounter.tc.TSC.frequency: 2500018183 kern.timecounter.tc.TSC.quality: 800 kern.timecounter.invariant_tsc: 0 After reboot: % sysctl kern.timecounter kern.timecounter.tick: 1 kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000) kern.timecounter.hardware: ACPI-fast kern.timecounter.stepwarnings: 0 kern.timecounter.tc.i8254.mask: 4294967295 kern.timecounter.tc.i8254.counter: 2241 kern.timecounter.tc.i8254.frequency: 1193182 kern.timecounter.tc.i8254.quality: 0 kern.timecounter.tc.ACPI-fast.mask: 16777215 kern.timecounter.tc.ACPI-fast.counter: 4636239 kern.timecounter.tc.ACPI-fast.frequency: 3579545 kern.timecounter.tc.ACPI-fast.quality: 1000 kern.timecounter.tc.TSC.mask: 4294967295 kern.timecounter.tc.TSC.counter: 1429996208 kern.timecounter.tc.TSC.frequency: 2500020459 kern.timecounter.tc.TSC.quality: 800 kern.timecounter.invariant_tsc: 0 Here's my kernel config file (uni-processor i386 kernel running on an Athlon 64 x2 CPU): include GENERIC nocpu I486_CPU nocpu I586_CPU nooptions SCHED_4BSD # 4BSD scheduler options SCHED_ULE # Debugging for use in -current options KDB # Enable kernel debugger support. options DDB # Support DDB. options GDB # Support remote GDB. nooptions SMP # Symmetric MultiProcessor Kernel nodevice apic # I/O APIC nodevice atapicd # ATAPI CDROM drives device atapicam # emulate ATAPI devices as SCSI ditto via CAM /var/run/dmesg.boot: Copyright (c) 1992-2010 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 8.1-STABLE #6: Thu Sep 23 16:03:29 PDT 2010 dl@scratch.catspoiler.org:/usr/obj/usr/src/sys/GENERICDDB i386 Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 4800+ (2500.02-MHz 686-class CPU) Origin = "AuthenticAMD" Id = 0x60fb1 Family = f Model = 6b Stepping = 1 Features=0x178bfbff Features2=0x2001 AMD Features=0xea500800 AMD Features2=0x11f real memory = 4294967296 (4096 MB) avail memory = 3607351296 (3440 MB) kbd1 at kbdmux0 ACPI Warning: Optional field Pm2ControlBlock has zero address or length: 0x0000000000000000/0x1 (20100331/tbfadt-655) acpi0: on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) acpi0: reservation of 0, a0000 (3) failed acpi0: reservation of 100000, dbdf0000 (3) failed Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0 cpu0: on acpi0 acpi_hpet0: iomem 0xfeff0000-0xfeff03ff on acpi0 device_attach: acpi_hpet0 attach returned 12 acpi_button0: on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pci0: at device 0.0 (no driver attached) isab0: at device 1.0 on pci0 isa0: on isab0 pci0: at device 1.1 (no driver attached) pci0: at device 1.2 (no driver attached) ohci0: mem 0xfe02f000-0xfe02ffff irq 10 at device 2.0 on pci0 ohci0: [ITHREAD] usbus0: on ohci0 ehci0: mem 0xfe02e000-0xfe02e0ff irq 11 at device 2.1 on pci0 ehci0: [ITHREAD] usbus1: EHCI version 1.0 usbus1: on ehci0 ohci1: mem 0xfe02d000-0xfe02dfff irq 5 at device 4.0 on pci0 ohci1: [ITHREAD] usbus2: on ohci1 ehci1: mem 0xfe02c000-0xfe02c0ff irq 10 at device 4.1 on pci0 ehci1: [ITHREAD] usbus3: EHCI version 1.0 usbus3: on ehci1 atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 6.0 on pci0 ata0: on atapci0 ata0: [ITHREAD] ata1: on atapci0 ata1: [ITHREAD] pci0: at device 7.0 (no driver attached) pcib1: at device 8.0 on pci0 pci1: on pcib1 fwohci0: mem 0xfd0ff000-0xfd0ff7ff,0xfd0f8000-0xfd0fbfff irq 11 at device 7.0 on pci1 fwohci0: [ITHREAD] fwohci0: OHCI version 1.10 (ROM=1) fwohci0: No. of Isochronous channels is 4. fwohci0: EUI64 00:50:8d:00:00:99:f0:69 fwohci0: Phy 1394a available S400, 2 ports. fwohci0: Link S400, max_rec 2048 bytes. firewire0: on fwohci0 dcons_crom0: on firewire0 dcons_crom0: bus_addr 0x1090000 fwe0: on firewire0 if_fwe0: Fake Ethernet address: 02:50:8d:99:f0:69 fwe0: Ethernet address: 02:50:8d:99:f0:69 fwip0: on firewire0 fwip0: Firewire address: 00:50:8d:00:00:99:f0:69 @ 0xfffe00000000, S400, maxrec 2048 fwohci0: Initiate bus reset fwohci0: fwohci_intr_core: BUS reset fwohci0: fwohci_intr_core: node_id=0x00000000, SelfID Count=1, CYCLEMASTER mode ahc0: port 0xcc00-0xccff mem 0xfd0fe000-0xfd0fefff irq 11 at device 9.0 on pci1 ahc0: [ITHREAD] aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs atapci1: port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xdc00-0xdc0f mem 0xfe026000-0xfe027fff irq 10 at device 9.0 on pci0 atapci1: [ITHREAD] atapci1: AHCI v1.10 controller with 4 3Gbps ports, PM supported ata2: on atapci1 ata2: [ITHREAD] ata3: on atapci1 ata3: [ITHREAD] ata4: on atapci1 ata4: [ITHREAD] ata5: on atapci1 ata5: [ITHREAD] nfe0: port 0xd800-0xd807 mem 0xfe02b000-0xfe02bfff,0xfe02a000-0xfe02a0ff,0xfe029000-0xfe02900f irq 15 at device 10.0 on pci0 miibus0: on nfe0 e1000phy0: PHY 1 on miibus0 e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto nfe0: Ethernet address: 00:50:8d:9f:6d:e3 nfe0: [FILTER] pcib2: at device 11.0 on pci0 pci2: on pcib2 pcib3: at device 12.0 on pci0 pci3: on pcib3 atapci2: port 0xac00-0xac7f mem 0xfdcff000-0xfdcff07f,0xfdcf8000-0xfdcfbfff irq 5 at device 0.0 on pci3 atapci2: [ITHREAD] ata6: on atapci2 ata6: [ITHREAD] ata7: on atapci2 ata7: [ITHREAD] pcib4: at device 13.0 on pci0 pci4: on pcib4 pcib5: at device 14.0 on pci0 pci5: on pcib5 pcib6: at device 15.0 on pci0 pci6: on pcib6 pcib7: at device 16.0 on pci0 pci7: on pcib7 pcib8: at device 17.0 on pci0 pci8: on pcib8 vgapci0: mem 0xfb000000-0xfbffffff,0xe0000000-0xefffffff,0xfc000000-0xfcffffff irq 10 at device 18.0 on pci0 acpi_tz0: on acpi0 atrtc0: port 0x70-0x73 on acpi0 fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 fdc0: [FILTER] atkbdc0: port 0x60,0x64 irq 1 on acpi0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] psm0: irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: [ITHREAD] psm0: model MouseMan+, device ID 0 pmtimer0 on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 ppc0: parallel port not found. powernow0: on cpu0 Timecounter "TSC" frequency 2500018183 Hz quality 800 Timecounters tick every 1.000 msec firewire0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me) firewire0: bus manager 0 usbus0: 12Mbps Full Speed USB v1.0 usbus1: 480Mbps High Speed USB v2.0 usbus2: 12Mbps Full Speed USB v1.0 usbus3: 480Mbps High Speed USB v2.0 ugen0.1: at usbus0 uhub0: on usbus0 ugen1.1: at usbus1 uhub1: on usbus1 ugen2.1: at usbus2 uhub2: on usbus2 ugen3.1: at usbus3 uhub3: on usbus3 uhub0: 6 ports with 6 removable, self powered uhub2: 6 ports with 6 removable, self powered ad6: 476940MB at ata3-master UDMA100 SATA 3Gb/s uhub1: 6 ports with 6 removable, self powered uhub3: 6 ports with 6 removable, self powered unknown: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00 sks=0x40 0x00 0x01 cd0 at ata0 bus 0 scbus1 target 0 lun 0 cd0: Removable CD-ROM SCSI-0 device cd0: 3.300MB/s transfers cd0: cd present [138590 x 2048 byte records] da0 at ahc0 bus 0 scbus0 target 0 lun 0 da0: Fixed Direct Access SCSI-3 device da0: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit) da0: Command Queueing enabled da0: 35003MB (71687370 512 byte sectors: 255H 63S/T 4462C) Trying to mount root from ufs:/dev/ad6s1a nfe0: link state changed to UP