Date:      Tue, 01 Jul 2014 17:23:14 +0200
From:      Willem Jan Withagen <wjw@digiware.nl>
To:        "Rang, Anton" <anton.rang@isilon.com>,  "O. Hartmann" <ohartman@zedat.fu-berlin.de>, Dimitry Andric <dim@FreeBSD.org>
Cc:        Adrian Chadd <adrian@freebsd.org>, FreeBSD CURRENT <freebsd-current@freebsd.org>
Subject:   Re: [CURRENT]: weird memory/linker problem?
Message-ID:  <53B2D262.2040502@digiware.nl>
In-Reply-To: <F21EDC44C64DB34B90AF485AC3CEDD4B3539868C@MX104CL01.corp.emc.com>
References:  <20140622165639.17a1ba1e.ohartman@zedat.fu-berlin.de> <CAJ-Vmok0Oh6XGe62acXE-82pTmEaouibd1GqDT0pCo8P6x6Hog@mail.gmail.com> <20140623163115.03bdd675.ohartman@zedat.fu-berlin.de> <F427210C-D7A9-499F-AFF9-C0B29CC6D51B@FreeBSD.org> <20140701150755.548ed6b9.ohartman@zedat.fu-berlin.de> <F21EDC44C64DB34B90AF485AC3CEDD4B3539868C@MX104CL01.corp.emc.com>

On 2014-07-01 16:48, Rang, Anton wrote:
> DOT => DOD
>
> 444F54 => 444F44
>
> That's a single-bit flip.  Bad memory, perhaps?

Very likely, especially if the system does not have ECC.
On rare occasions an alpha particle, a power cycle, or
anything else disruptive damages a memory cell. And it may take
a particular pattern of accesses to actually expose the error.
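
The single-bit claim in the quoted message can be checked by XORing the two
identifiers byte by byte. The following is only a minimal sketch; the helper
name flipped_bits is made up for illustration and is not part of any tool
mentioned in this thread.

```python
def flipped_bits(a, b):
    """Return (byte_index, bit_index) for every bit that differs between a and b.

    Hypothetical helper for illustration; bit_index 0 is the least significant bit.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    diffs = []
    for i, (x, y) in enumerate(zip(a, b)):
        delta = x ^ y  # bits set in delta are the bits that differ
        for bit in range(8):
            if delta & (1 << bit):
                diffs.append((i, bit))
    return diffs


good = b"DOT"
bad = b"DOD"
print(good.hex().upper(), "=>", bad.hex().upper())  # 444F54 => 444F44
print(flipped_bits(good, bad))                      # [(2, 4)]
```

Exactly one differing bit (bit 4 of the third byte, 0x54 vs 0x44) is what
one would expect from a single faulty memory cell, as opposed to a typo or a
miscompilation.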

In the past (the 1990s) 'make buildworld' used to be a rather good memory 
tester. But nowadays, have a look at
	http://www.memtest.org/

This tool has found all of the bad memory in all the systems I have used 
or built for others.
Note that it might take a few runs and some more heat to actually 
trigger the faulty cell, but memtest86 will usually find it.

Note that on big systems with lots of memory it can take a very long 
time to run just one full test set to completion.

--WjW


>
> Anton
>
> -----Original Message-----
> From: owner-freebsd-current@freebsd.org [mailto:owner-freebsd-current@freebsd.org] On Behalf Of O. Hartmann
> Sent: Tuesday, July 01, 2014 8:08 AM
> To: Dimitry Andric
> Cc: Adrian Chadd; FreeBSD CURRENT
> Subject: Re: [CURRENT]: weird memory/linker problem?
>
> On Mon, 23 Jun 2014 17:22:25 +0200
> Dimitry Andric <dim@FreeBSD.org> wrote:
>
>> On 23 Jun 2014, at 16:31, O. Hartmann <ohartman@zedat.fu-berlin.de> wrote:
>>> On Sun, 22 Jun 2014 10:10:04 -0700
>>> Adrian Chadd <adrian@freebsd.org> wrote:
>>>> When they segfault, where do they segfault?
>> ...
>>> GIMP, LaTeX work, nothing special, but a bit memory consuming
>>> regarding GIMP) I tried updating the ports tree and surprisingly the
>>> tree was left in an unclean condition when /usr/bin/svn segfaulted
>>> (on console: pid 18013 (svn), uid 0: exited on signal 11 (core dumped)).
>>>
>>> Using /usr/local/bin/svn, which is from the devel/subversion port,
>>> performs well, while FreeBSD 11's svn contribution dies as described. It did not do so hours ago!
>>
>> I think what Adrian meant was: can you run svn (or another crashing
>> program) in gdb, and post a backtrace?  Or maybe run ktrace, and see
>> where it dies?
>>
>> Alternatively, put a core dump and the executable (with debug info) in
>> a tarball, and upload it somewhere, so somebody else can analyze it.
>>
>> -Dimitry
>>
>
> It's me again, with the same weird story.
>
> After a couple of days of silence, the mysterious entity in my computer is back. This time it is again a weird compiler failure message (while trying to buildworld):
>
> [...]
> c++  -O2 -pipe -O3 -O3
> c++ -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include
> -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/tools/clang/include
> -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support -I.
> -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/../../lib/clang/include
> -DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -fno-strict-aliasing -DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd11.0\"
> -DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd11.0\" -DDEFAULT_SYSROOT=\"\"
> -Qunused-arguments -I/usr/obj/usr/src/tmp/legacy/usr/include -std=c++11 -fno-exceptions -fno-rtti -Wno-c++11-extensions -c /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/Host.cpp -o Host.o
> --- GraphWriter.o ---
> In file included from /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/GraphWriter.cpp:14:
> /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:269:10: error: use of undeclared identifier 'DOD'; did you mean 'DOT'?
>     O << DOD::EscapeString(Label);
>          ^~~
>          DOT
> /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:35:11: note: 'DOT' declared here
> namespace DOT {  // Private functions...
>           ^
> 1 error generated.
> *** [GraphWriter.o] Error code 1
>
>
> Well, in the past I saw many of those messages, especially labels of routines not being found in shared objects/libraries, or even "funny" misspellings like the one shown above.
>
> I cannot reproduce them after a reboot, but once the error has occurred, it sticks for as long as the system keeps running. So in order to compile the OS successfully, I reboot.
>
> Does anyone have an idea what this could be? Since at the moment it affects only one machine (the other CoreDuo has been retired in the meantime), it feels a bit like a miscompilation on a certain type of CPU.
>
> Thanks for your patience,
>
> Oliver
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>



