Date:      Sat, 1 Jan 2011 18:30:16 GMT
From:      Sergey Kandaurov <pluknet@gmail.com>
To:        freebsd-bugs@FreeBSD.org
Subject:   Re: bin/30360: vmstat(8) returns impossible data
Message-ID:  <201101011830.p01IUG2w059319@freefall.freebsd.org>

The following reply was made to PR bin/30360; it has been noted by GNATS.

From: Sergey Kandaurov <pluknet@gmail.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/30360: vmstat(8) returns impossible data
Date: Sat, 1 Jan 2011 21:23:24 +0300

 This is a type overflow bug, and I don't think it is easy to fix,
 because fixing it would break the cp_time ABI.
 cp_time is (roughly) an array[CPUSTATES] of longs.
 A long is 4 bytes on i386 and 8 bytes on amd64,
 which is why I don't see this bug on amd64 boxes.
 
 On i386 the bug does not always show up in the output of
 sysctl kern.cp_time, but generally it does.
 That's because the exported cp_time[] format (used by /sbin/sysctl) is
 different ("UL"), which extends the usable range (for a while) by
 treating the signed values as unsigned.
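 
 As a rough sketch of the ranges involved (the helper names are made up
 for illustration, and int32_t/uint32_t merely stand in for the 4-byte
 i386 long and unsigned long), the same true tick count reads back
 differently depending on how the low 32 bits are interpreted. The signed
 view goes wrong past 2^31 - 1, while the unsigned view survives until
 2^32:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: 32-bit types stand in for i386 long/unsigned long. */

/* Value seen when the low 32 bits of the true count are read as unsigned. */
int64_t seen_as_unsigned32(uint64_t true_ticks)
{
    return (int64_t)(true_ticks & UINT64_C(0xFFFFFFFF));
}

/* The same 32 bits decoded by hand as a two's-complement signed value. */
int64_t seen_as_signed32(uint64_t true_ticks)
{
    int64_t low = seen_as_unsigned32(true_ticks);
    return low > INT32_MAX ? low - INT64_C(4294967296) : low;
}

/* Walk across the two thresholds. */
void wrap_demo(void)
{
    /* Last value that is still correct in the signed view. */
    assert(seen_as_signed32(UINT64_C(2147483647)) == INT64_C(2147483647));
    /* One tick later the signed view is negative, the unsigned one is fine. */
    assert(seen_as_signed32(UINT64_C(2147483648)) < 0);
    assert(seen_as_unsigned32(UINT64_C(2147483648)) == INT64_C(2147483648));
    /* At 2^32 the unsigned view wraps to zero as well. */
    assert(seen_as_unsigned32(UINT64_C(4294967296)) == 0);
}
```

 The decode is done with 64-bit arithmetic rather than a cast, so the
 sketch does not depend on implementation-defined conversions.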
 
 In this example the bug manifests for `id' with /sbin/sysctl as well,
 on i386 (uptime 597 days):
 # sysctl kern.cp_time
 kern.cp_time: 4021277307 75175092 2025746497 49748493 2746074583
 # vmstat
  procs      memory      page                    disks     faults      cpu
  r b w     avm    fre  flt  re  pi  po  fr  sr da0 da1   in   sy  cs us sy id
  1 5 0   93720 458992   14   0   0   3  53   1   0   0   37    1   5 -61 633 -472
 
 Both boxes reported by arundel@, hub and freefall, are i386.
 
 In the following example /sbin/sysctl abuses the "UL" format, but that
 doesn't help vmstat, which uses libdevstat, which in turn correctly
 treats cp_time[] as signed long.
 
 # sysctl kern.cp_time
 kern.cp_time: 795491304 5844771 246148418 43709451 2752874123
 # ./test
 printf("%lu\n", l): 2752874123
 printf("%ld\n", l): -1542093173 [compare]
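 
 The ./test source is not shown here. A minimal sketch that gives the
 same two strings on any platform could look like the following, where
 format_both is a hypothetical name and uint32_t/int32_t are assumed to
 model the 4-byte i386 unsigned long/long:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format one 32-bit pattern twice, once as
 * unsigned (like "%lu" on i386) and once as signed (like "%ld").
 * Both buffers must hold at least len bytes.
 */
void format_both(uint32_t bits, char *as_unsigned, char *as_signed, size_t len)
{
    int32_t s;

    assert(as_unsigned != NULL && as_signed != NULL);
    /* Reinterpret the bits; int32_t is two's complement by definition. */
    memcpy(&s, &bits, sizeof s);
    snprintf(as_unsigned, len, "%" PRIu32, bits);
    snprintf(as_signed, len, "%" PRId32, s);
}
```

 Calling format_both(UINT32_C(2752874123), u, s, sizeof u) with two
 16-byte buffers fills them with 2752874123 and -1542093173, matching
 the two lines above.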
 
 # ./vmstat
  procs      memory      page                   disk   faults         cpu
  r b w     avm    fre  flt  re  pi  po  fr  sr aa0   in   sy  cs us sy id
  3 3 0   5776M   172M  173  39  22   5 617 444   0  743  193  60
 cpustats(): before 'total += cur.cp_time[state]': cp_time[]: 795758944
 cpustats(): before 'total += cur.cp_time[state]': total: 0.000000
 cpustats(): after  'total += cur.cp_time[state]': cp_time[]: 795758944
 cpustats(): after  'total += cur.cp_time[state]': total: 795758944.000000
 
 cpustats(): before 'total += cur.cp_time[state]': cp_time[]: 5844771
 cpustats(): before 'total += cur.cp_time[state]': total: 795758944.000000
 cpustats(): after  'total += cur.cp_time[state]': cp_time[]: 5844771
 cpustats(): after  'total += cur.cp_time[state]': total: 801603715.000000
 
 cpustats(): before 'total += cur.cp_time[state]': cp_time[]: 246218512
 cpustats(): before 'total += cur.cp_time[state]': total: 801603715.000000
 cpustats(): after  'total += cur.cp_time[state]': cp_time[]: 246218512
 cpustats(): after  'total += cur.cp_time[state]': total: 1047822227.000000
 
 cpustats(): before 'total += cur.cp_time[state]': cp_time[]: 43723365
 cpustats(): before 'total += cur.cp_time[state]': total: 1047822227.000000
 cpustats(): after  'total += cur.cp_time[state]': cp_time[]: 43723365
 cpustats(): after  'total += cur.cp_time[state]': total: 1091545592.000000
 
 cpustats(): before 'total += cur.cp_time[state]': cp_time[]: -1541158615 [compare]
 cpustats(): before 'total += cur.cp_time[state]': total: 1091545592.000000
 cpustats(): after  'total += cur.cp_time[state]': cp_time[]: -1541158615
 cpustats(): after  'total += cur.cp_time[state]': total: -449613023.000000
 
  -178 -64 343
 ^^1   ^^2    ^^3
 
 (1) and (2) are negative because each is a positive sum of cp_time
 values multiplied by the negative factor derived from total cp_time;
 (3) is positive because it is the negative cp_time[CP_IDLE] multiplied
 by the same negative factor.
 After summation, total has the wrong sign and the wrong value, hence
 the out-of-range percentage values.
 
 -- 
 wbr,
 pluknet


