Date:      Tue, 24 Aug 1999 13:29:43 -0700
From:      John Plevyak <jplevyak@inktomi.com>
To:        freebsd-hackers@freebsd.org
Subject:   K6/3 on 3.2-STABLE
Message-ID:  <19990824132943.B11107@proxydev.inktomi.com>


I am experiencing reproducible crashes with FreeBSD (3.2-STABLE) on 
a K6/3-450 running on an ASUS P5S-VM motherboard.  The problem is highly
repeatable (happens about 1/4 of the way through compiling the kernel)
and goes away if a K6/2-450 is substituted for the K6/3-450 with
all other things held equal.

The problem shows up in sys/kern/vfs_bio.c on line 757 in brelse:

        if ((bp->b_flags & B_INVAL) ||
                (bp->b_flags & (B_LOCKED|B_DELWRI)) == 0) {
                if (bp->b_flags & B_DELWRI) {
                        --numdirtybuffers;
                        bp->b_flags &= ~B_DELWRI;
                }
                vfs_bio_need_satisfy();
        }       


The corresponding assembly code is:

.stabn 68,0,757,.LM335-brelse
.LM335:
        testb $32,37(%esi)
        jne .L560
        testl $16512,36(%esi)
        jne .L559
.L560:
.stabn 68,0,759,.LM336-brelse
.LM336:
        cmpb $0,36(%esi)
        jge .L561
.stabn 68,0,760,.LM337-brelse
.LM337: 
        decl numdirtybuffers  
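
As a cross-check that the compiler output matches the C source, the
immediates in the listing can be mapped back to the flag bits. The sketch
below is only an illustration: the flag values are inferred from the
listing itself (0x4080 in the testl, 0x20 at byte offset 37, the
0xffffff7f mask in the later andl), not copied from sys/buf.h, and the
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values inferred from the immediates in the listing (an assumption;
 * they are not copied from sys/buf.h). */
#define B_DELWRI 0x00000080u
#define B_INVAL  0x00002000u
#define B_LOCKED 0x00004000u

/* The condition as written in the C source of brelse(). */
static int enters_block_c(uint32_t b_flags)
{
    return (b_flags & B_INVAL) ||
        (b_flags & (B_LOCKED | B_DELWRI)) == 0;
}

/* The same condition, following the two tests the compiler emitted. */
static int enters_block_asm(uint32_t b_flags)
{
    /* testb $32,37(%esi): x86 is little-endian, so offset 37 holds
     * bits 8..15 of the word at offset 36, and 0x20 << 8 == B_INVAL. */
    uint8_t byte1 = (uint8_t)(b_flags >> 8);
    if (byte1 & 32)
        return 1;               /* jne .L560 */
    /* testl $16512,36(%esi): 16512 == 0x4080 == B_LOCKED | B_DELWRI. */
    if (b_flags & 16512u)
        return 0;               /* jne .L559 */
    return 1;                   /* fall through to .L560 */
}

/* cmpb $0,36(%esi); jge .L561: a signed compare of the low byte against
 * zero is a test of bit 7, which is B_DELWRI. */
static int delwri_set_asm(uint32_t b_flags)
{
    return (b_flags & 0xffu) >= 0x80u;
}

/* Sanity checks on the inferred masks. */
static void check_masks(void)
{
    assert((B_LOCKED | B_DELWRI) == 16512u);
    assert((32u << 8) == B_INVAL);
}
```

Under these assumptions both versions agree for every combination of the
three flags, so the generated code itself looks correct.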

The problem is that the K6/3 ends up executing from the *middle* of an
instruction instead of at an instruction boundary.

In the kernel image this code corresponds to:

0xc017ed35 <brelse+1077>:       movl   %ebx,0xc02d681c
0xc017ed3b <brelse+1083>:       testb  $0x20,0x25(%esi)
0xc017ed3f <brelse+1087>:       jne    0xc017ed4a <brelse+1098>
0xc017ed41 <brelse+1089>:       testl  $0x4080,0x24(%esi)
0xc017ed48 <brelse+1096>:       jne    0xc017ed62 <brelse+1122>
0xc017ed4a <brelse+1098>:       cmpb   $0x0,0x24(%esi)
0xc017ed4e <brelse+1102>:       jnl    0xc017ed5d <brelse+1117>
0xc017ed50 <brelse+1104>:       decl   0xc030680c
0xc017ed56 <brelse+1110>:       andl   $0xffffff7f,0x24(%esi)
0xc017ed5d <brelse+1117>:       call   0xc017e720 <vfs_bio_need_satisfy>

But the kernel crashes with $pc == 0xc017ed46, which lies inside the
testl instruction at 0xc017ed41. Disassembling from that address gives:

(gdb) x/10i 0xc017ed46
0xc017ed46 <brelse+1094>:       addb   %al,(%eax)
0xc017ed48 <brelse+1096>:       jne    0xc017ed62 <brelse+1122>
0xc017ed4a <brelse+1098>:       cmpb   $0x0,0x24(%esi)
0xc017ed4e <brelse+1102>:       jnl    0xc017ed5d <brelse+1117>
0xc017ed50 <brelse+1104>:       decl   0xc030680c

and since eax is 0, this results in a protection fault.
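
The arithmetic behind that disassembly can be sketched as follows. The
addresses come from the listings in this message; the byte encoding of
the testl is an assumption, reconstructed by hand from the IA-32 encoding
rules and not dumped from the actual kernel image. The helper names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Addresses taken from the disassembly in this message. */
#define TESTL_ADDR 0xc017ed41u
#define NEXT_ADDR  0xc017ed48u
#define FAULT_PC   0xc017ed46u

/* Assumed encoding of testl $0x4080,0x24(%esi), reconstructed by hand:
 * f7 = TEST r/m32,imm32 (group 3, /0), 46 = ModRM for disp8(%esi),
 * 24 = displacement, 80 40 00 00 = immediate 0x00004080, little-endian. */
static const uint8_t testl_bytes[] = {
    0xf7, 0x46, 0x24, 0x80, 0x40, 0x00, 0x00
};

/* Byte offset of the faulting pc within the testl instruction. */
static size_t fault_offset(void)
{
    return (size_t)(FAULT_PC - TESTL_ADDR);
}

/* Returns the byte the CPU would fetch at FAULT_PC + i, as long as that
 * address is still inside the testl encoding. */
static uint8_t byte_at_fault(size_t i)
{
    size_t off = fault_offset() + i;
    assert(off < sizeof testl_bytes);
    return testl_bytes[off];
}

/* The encoding must span exactly TESTL_ADDR up to NEXT_ADDR. */
static int length_matches_listing(void)
{
    return TESTL_ADDR + sizeof testl_bytes == NEXT_ADDR;
}
```

With this encoding the faulting pc is 5 bytes into the 7-byte testl, and
the two bytes fetched there are the upper half of the immediate, 00 00.
That pair decodes as addb %al,(%eax), and its two-byte length is why the
gdb output lines up again with the jne at 0xc017ed48.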

As I said, this is very repeatable: it happens about 1/4 of the way
through building the kernel, and it is next to impossible to get through
an entire kernel build.

Swapping out the K6/3 for a K6/2 solves the problem, as does running
the same binaries on Intel hardware.

Has anyone else had any similar experience with the K6/3?

Has anyone had success with the K6/3?

One further note: recompiling the kernel with egcs-1.1.2 causes the
problem to go away (probably different instruction selection/scheduling),
but a problem remains in libc (in the 'free' function) which prevents
reliable operation.  'make world' with egcs-1.1.2 requires a number of
changes, so I haven't tried that yet, but in any case I don't feel
comfortable with changing out the compiler and hoping that
the bug doesn't just move somewhere else.

Any ideas/pointers appreciated.

john

-- 
John Bradley Plevyak,    PhD,    jplevyak@inktomi.com,     PGP KeyID: 051130BD
Inktomi Corporation,  1900 S. Norfolk Street,  Suite 310,  San Mateo, CA 94403
W:(650)653-2830 F:(650)653-2889 P:(888)491-1332/5103192436.4911332@pagenet.net





