Date:      Fri, 21 Sep 2001 13:55:58 +0100
From:      Josef Karthauser <joe@tao.org.uk>
To:        Dag-Erling Smorgrav <des@ofug.org>
Cc:        Jun Kuriyama <kuriyama@imgsrc.co.jp>, Julian Elischer <julian@FreeBSD.org>, current@freebsd.org
Subject:   Re: Problems with interrupts on -current.
Message-ID:  <20010921135558.A761@tao.org.uk>
In-Reply-To: <20010916144848.A726@tao.org.uk>; from joe@tao.org.uk on Sun, Sep 16, 2001 at 02:48:48PM +0100
References:  <200109120838.f8C8cDa51745@freefall.freebsd.org> <7m8zfimoi6.wl@waterblue.imgsrc.co.jp> <20010914125530.C3913@tao.org.uk> <xzp3d5osgtw.fsf@flood.ping.uio.no> <20010916013520.A689@tao.org.uk> <20010916144848.A726@tao.org.uk>

[This is the continuation of a thread that started on -committers]

On Sun, Sep 16, 2001 at 02:48:48PM +0100, Josef Karthauser wrote:
> On Sun, Sep 16, 2001 at 01:35:20AM +0100, Josef Karthauser wrote:
> > On Sat, Sep 15, 2001 at 03:51:07PM +0200, Dag-Erling Smorgrav wrote:
> > > Josef Karthauser <joe@tao.org.uk> writes:
> > > > Is there a possibility that this commit is causing me to lose key
> > > > presses?  I'm finding it hard to imagine that I'm mistyping as
> > > > I've never noticed it before.  (Every N, where N is > 30 or 40, a key
> > > > that I press doesn't register and I have to press it again).
> > >
> > > Educated guess: your interrupt latency just went to hell (where mine's
> > > been for three months now, I'm still waiting to hear if Matt could
> > > make any sense out of my crash dump) and you're losing interrupts.  If
> > > you have a serial mouse, try moving it around a lot and see if it
> > > seems to hang (you should see mentions of interrupt-level buffer
> > > overflows in your /var/log/messages).  Also, just for kicks, check how
> > > much CPU time your syncer process is using, and try running sync(8)
> > > and see if your keyboard wedges for a couple of seconds when you do
> > > that.
> >
> > My mouse is /dev/psm0. From time to time the ata device's
> > interrupt rate goes through the roof for no apparent reason (i.e.
> > several hundred interrupts/sec).  Sync never wedges anything.
>
> There's almost definitely an interrupt problem.  I regularly have
> the machine wedge almost solid when rsyncing a lot of data to and
> fro.  The machine begins to behave erratically, which I now think
> happens mainly because all the timers stop working (maybe the
> interrupts stop working?); 'systat -vmstat' doesn't produce any
> numbers because the initial time delay never passes.  :(  Also, I
> don't appear to be able to enter the kernel debugger when this
> happens!  :(  Can someone in the know give me a hand debugging this?
> It really ought to be fixed, but my knowledge isn't sufficient to
> find this on my own.
>
> Thanks,
> Joe

This also happens from time to time:


    6 users    Load  1.39  1.23  1.14                  Sep 21 13:32

Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
        Tot   Share      Tot    Share    Free         in  out     in  out
Act   62696    8932   111764    14728   15052 count
All  249864   12164  2806932    25860         pages
                                                                 Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt      1 cow    1743 total
           6 32     12398   13  866 1823        26  45516 wire        stray irq0
                                                    90820 act         stray irq6
 8.3%Sys   5.1%Intr  0.2%User  0.0%Nice 86.4%Idl   102140 inact       stray irq7
|    |    |    |    |    |    |    |    |    |      11388 cache     1 acpi0 irq9
====+++                                              3664 free   1505 ata0 irq14
                                                          daefr       uhci0 irq5
Namei         Name-cache    Dir-cache                   5 prcfr     2 pcm0 irq5
    Calls     hits    %     hits    %                     react     7 atkbd0 irq
      688      687  100                                   pdwak       psm0 irq12
                                        4 zfod            pdpgs   100 clk irq0
Disks   ad0   fd0                         ofod            intrn   128 rtc irq8
KB/t   6.00  0.00                       9 %slo-z    35712 buf
tps    1507     0                       7 tfree        10 dirtybuf
MB/s   8.83  0.00                                   17913 desiredvnodes
% busy   98     0                                   14595 numvnodes
                                                     4798 freevnodes

Look at the number of interrupts that the ata device is generating:
1505 per second on ata0 (irq14).  This is in no way normal!  It happens
randomly and causes the machine to basically grind to a halt.

For comparison, here's the output of systat -vmstat on the same machine
after I rebooted it, while it was running a background fsck:


    4 users    Load  1.01  0.42  0.16                  Sep 21 13:50

Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
        Tot   Share      Tot    Share    Free         in  out     in  out
Act   40328    3848    71980     4408   53308 count
All  200248    6884  1085132    10232         pages
                                                                 Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow     329 total
           2 30       622   11  955  402    2   34  35928 wire        stray irq0
                                                    35492 act         stray irq6
 1.4%Sys   1.9%Intr  1.2%User  0.6%Nice 94.9%Idl   128800 inact       stray irq7
|    |    |    |    |    |    |    |    |    |         28 cache       acpi0 irq9
=+-                                                 53280 free     97 ata0 irq14
                                                          daefr       uhci0 irq5
Namei         Name-cache    Dir-cache                     prcfr     1 pcm0 irq5
    Calls     hits    %     hits    %                     react     3 atkbd0 irq
      536      534  100                                   pdwak       psm0 irq12
                                        8 zfod            pdpgs   100 clk irq0
Disks   ad0   fd0                       1 ofod            intrn   128 rtc irq8
KB/t   7.99  0.00                       7 %slo-z    35712 buf
tps      97     0                       1 tfree        33 dirtybuf
MB/s   0.76  0.00                                   17913 desiredvnodes
% busy   98     0                                    1655 numvnodes
                                                       29 freevnodes


Who's responsible for this area?  I'm happy to help in getting to the
bottom of it.  Is it an interrupt routing problem?  Is it an ata device
problem?  Is it something else (maybe locking) altogether?

This problem has existed in -current for at least 6 weeks.
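
Since the spikes come and go at random, one way to catch them might be to
log per-device interrupt rates over time.  The following is only a minimal
sketch in Python, assuming that `vmstat -i` prints one row per interrupt
source ending in a cumulative total and an average rate, and that the
summary row is named "Total".  The function names, the 5 second interval
and the 1000/sec threshold are illustrative choices, not anything taken
from this thread.

```python
import subprocess
import time


def parse_vmstat_i(text):
    """Return {source name: cumulative interrupt count} from `vmstat -i` text.

    Assumption: each data row ends in two integers (total, then average
    rate) and everything before them is the interrupt source name.
    """
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        total, rate = fields[-2], fields[-1]
        if not (total.isdigit() and rate.isdigit()):
            continue  # header line or a row in an unexpected format
        counts[" ".join(fields[:-2])] = int(total)
    return counts


def rates(before, after, seconds):
    """Interrupts per second for each source present in both samples."""
    return {name: (after[name] - before[name]) / seconds
            for name in after if name in before}


def spikes(per_second, threshold):
    """Keep only the sources whose rate exceeds the threshold."""
    return {name: rate for name, rate in per_second.items()
            if rate > threshold}


def run_vmstat():
    """Run `vmstat -i` and return its output as text."""
    result = subprocess.run(["vmstat", "-i"], capture_output=True,
                            text=True, check=True)
    return result.stdout


def watch(interval=5, threshold=1000):
    """Poll forever, printing a timestamped line for each spiking source."""
    previous = parse_vmstat_i(run_vmstat())
    while True:
        time.sleep(interval)
        current = parse_vmstat_i(run_vmstat())
        found = spikes(rates(previous, current, interval), threshold)
        for name, rate in sorted(found.items()):
            if name != "Total":  # assumed name of the summary row
                print(time.strftime("%H:%M:%S"), name, round(rate), "/sec")
        previous = current
```

Calling watch() from a terminal and leaving it running should give
timestamps for each burst, which could then be matched against disk
activity or entries in /var/log/messages.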

Thanks for any suggestions,
Joe

