Date:      Wed, 28 Mar 2001 05:26:35 +0900
From:      Yoshihiro Koya <Yoshihiro.Koya@math.yokohama-cu.ac.jp>
To:        kris@obsecurity.org
Cc:        Yoshihiro.Koya@math.yokohama-cu.ac.jp, freebsd-stable@freebsd.org
Subject:   Re: buildworld broken in 4.3-RC?
Message-ID:  <20010328052635H.koya@pluto.math.yokohama-cu.ac.jp>
In-Reply-To: <20010326214833.A13267@xor.obsecurity.org>
References:  <20010326164442.A10495@xor.obsecurity.org> <20010327102038Q.koya@pluto.math.yokohama-cu.ac.jp> <20010326214833.A13267@xor.obsecurity.org>

Hello,

I trimmed the CC: field.

From: Kris Kennaway <kris@obsecurity.org>
Subject: Re: buildworld broken in 4.3-RC?
Date: Mon, 26 Mar 2001 21:48:33 -0800
Message-ID: <20010326214833.A13267@xor.obsecurity.org>

> On Tue, Mar 27, 2001 at 10:20:38AM +0900, Yoshihiro Koya wrote:
> 
> > This make world session was frequently prevented by such internal 
> > compiler error.  That cc was compiled with CPUTYPE=k6-2.
> > But I'm not suffered such a problem now.
> 
> The fact that it was failing in a different place each time is a very
> strong indicator that it was hardware-related: compilers are pretty
> deterministic beasts in what they do; given the same input they will
> go through the same set of steps and produce the same output.  If this
> was a bug in gcc, I'd expect it to fail in the same place each time
> when it tries to compile the magic code.

I ran an experiment: I set CPUTYPE=k6-2 in my /etc/make.conf and
ran make world again.  As I expected, it was interrupted.
For example, I got something like the following message twice
during the compilation.

cc -fpic -DPIC -O -pipe -march=k6 -DTERMIOS -DANSI_SOURCE \
-I/usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto \
-I/usr/obj/usr/src/secure/lib/libcrypto -DNO_IDEA -DL_ENDIAN \
-DSHA1_ASM -DBN_ASM -DMD5_ASM -DRMD160_ASM -DNO_IDEA \
-I/usr/obj/usr/src/i386/usr/include \
-c /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/rc2/rc2_ecb.c \
-o rc2_ecb.So
cc: Internal compiler error: program cc1 got fatal signal 11
*** Error code 1

During this run, I got two core files, and I checked where cc1 caught
signal 11 in each of them.

presario# cd /usr/obj/usr/src/lib/libc
presario# ls *.core
cc1.core
presario# gdb /usr/obj/usr/src/i386/usr/libexec/cc1 ./cc1.core
GNU gdb 4.18
[snip]
(no debugging symbols found)...
Core was generated by `cc1'.
Program terminated with signal 11, Segmentation fault.
#0  0x6093b58 in ?? ()
(gdb) where
#0  0x6093b58 in ?? ()
#1  0x808da59 in operands_match_p ()
#2  0x808dc28 in safe_from_earlyclobber ()
#3  0x809e513 in constrain_operands ()
#4  0x8083282 in reload_cse_regs ()
#5  0x8082c64 in count_occurrences ()
#6  0x8082e7b in reload_cse_regs ()
#7  0x8069585 in rest_of_compilation ()
#8  0x8053b18 in finish_function ()
#9  0x80488fe in yyparse ()
#10 0x8067e17 in check_global_declarations ()
#11 0x806aefb in main ()
#12 0x8048135 in _start ()
(gdb) disassemble 0x6093b58
No function contains specified address.

(gdb) disassemble 0x808da59
Dump of assembler code for function operands_match_p:
[snip]
0x808da4e <operands_match_p+1586>:	pushl  0x20(%ebp)
0x808da51 <operands_match_p+1589>:	pushl  0x1c(%ebp)
0x808da54 <operands_match_p+1592>:	
    call   0x8093910 <refers_to_regno_for_reload_p>
0x808da59 <operands_match_p+1597>:	test   %eax,%eax
0x808da5b <operands_match_p+1599>:	sete   %al
0x808da5e <operands_match_p+1602>:	and    $0xff,%eax
0x808da63 <operands_match_p+1607>:	jmp    0x808dbed <operands_match_p+2001>
0x808da68 <operands_match_p+1612>:	cmpl   $0x0,0x14(%ebp)
0x808da6c <operands_match_p+1616>:	jne    0x808dbc9 <operands_match_p+1965>
[snip]
(gdb) q
presario# pwd
/usr/obj/usr/src/lib/libc
presario# cd ../../secure
presario# cd lib
presario# cd libcrypto
presario# ls *.core
cc1.core
presario# gdb /usr/obj/usr/src/i386/usr/libexec/cc1 ./cc1.core
GNU gdb 4.18
[snip]
(no debugging symbols found)...
Core was generated by `cc1'.
Program terminated with signal 11, Segmentation fault.
#0  0x6093b58 in ?? ()
(gdb) where
#0  0x6093b58 in ?? ()
#1  0x808da59 in operands_match_p ()
#2  0x808dc28 in safe_from_earlyclobber ()
#3  0x809e513 in constrain_operands ()
#4  0x8083282 in reload_cse_regs ()
#5  0x8082c64 in count_occurrences ()
#6  0x8082e7b in reload_cse_regs ()
#7  0x8069585 in rest_of_compilation ()
#8  0x8053b18 in finish_function ()
#9  0x80488fe in yyparse ()
#10 0x8067e17 in check_global_declarations ()
#11 0x806aefb in main ()
#12 0x8048135 in _start ()
(gdb) q

Please note the address 0x6093b58.  My two core files tell me that
cc1 caught the signal at that address both times.

If the failure were caused by a hardware fault,
would it be possible for cc1 to catch the signal at the same address
each time?  I would expect the gdb results above to be more
random if I had a hardware problem.
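
The comparison above was done by eye on two core files.  As a minimal
sketch of how it could be repeated over more of them, assuming the
output of gdb's "where" command for each core file has been saved as
text, something like the following could be used.  The names
parse_backtrace and same_crash_site are hypothetical helpers, and the
sample traces are shortened excerpts of the two backtraces shown above.

```python
import re

# Matches gdb frame lines such as "#1  0x808da59 in operands_match_p ()".
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-fA-F]+) in (\S+) \(\)")


def parse_backtrace(text):
    """Map frame number -> (address, function name) for one gdb backtrace."""
    frames = {}
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            number = int(match.group(1))
            frames[number] = (int(match.group(2), 16), match.group(3))
    return frames


def same_crash_site(backtraces):
    """True if all backtraces are non-empty and contain identical frames."""
    parsed = [parse_backtrace(bt) for bt in backtraces]
    if not parsed or not parsed[0]:
        return False
    return all(frames == parsed[0] for frames in parsed)


# Shortened excerpts of the libc and libcrypto backtraces (assumed saved as text).
LIBC_TRACE = """\
#0  0x6093b58 in ?? ()
#1  0x808da59 in operands_match_p ()
#2  0x808dc28 in safe_from_earlyclobber ()
"""
LIBCRYPTO_TRACE = """\
#0  0x6093b58 in ?? ()
#1  0x808da59 in operands_match_p ()
#2  0x808dc28 in safe_from_earlyclobber ()
"""

if __name__ == "__main__":
    print(same_crash_site([LIBC_TRACE, LIBCRYPTO_TRACE]))
```

Identical frames in every backtrace would only show that the crash is
repeatable at that point; by itself it would not decide between a
compiler bug and a hardware fault.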

On the other hand, I have begun to believe your opinion:

> Perhaps what you're seeing is a hardware fault which is only triggered
> by the particular combination of instructions output by gcc with

The result of the gdb observation also agrees with your opinion
above.

But I cannot say anything definite now.  I only have two core files,
and that is probably too few to draw a conclusion.

Anyway, thank you very much for your suggestion.

koya
