Date:      Tue, 19 Mar 2013 11:55:45 +0200
From:      Andriy Gapon <avg@FreeBSD.org>
To:        freebsd-current@FreeBSD.org
Cc:        Bjorn Larsson <bjwela@gmail.com>, Sergey Dyatko <sergey.dyatko@gmail.com>
Subject:   Re: gptzfsboot problem on HP P410i Smart Array
Message-ID:  <51483621.2060503@FreeBSD.org>
In-Reply-To: <CAJ0WZYBQcujPbW%2BiZVkPMY=voGgHQnuVLLi=DKb%2BL-%2B1OW_Arw@mail.gmail.com>
References:  <CAAG5QCs0G1ztH715j5pnsFmne30xZwUT5o_YkQW9k1dDc-=-Nw@mail.gmail.com> <50311741.3000204@yandex.ru> <CAAG5QCst%2BS6U7HRBAmvxhxZb-dhk1O9yuQMUxvrYT%2BT0T_V%2BzA@mail.gmail.com> <CAJ0WZYBQcujPbW%2BiZVkPMY=voGgHQnuVLLi=DKb%2BL-%2B1OW_Arw@mail.gmail.com>

on 19/03/2013 07:41 Sergey Dyatko said the following:
> I was faced with same problem on my laptop. Adding printf() into main()
> before dsk = malloc(sizeof(struct dsk)); fix boot. Yesterday, avg@ proposed
> patch:
> Index: /usr/src/sys/boot/i386/zfsboot/zfsboot.c
> ===================================================================
> --- /usr/src/sys/boot/i386/zfsboot/zfsboot.c    (revision 248421)
> +++ /usr/src/sys/boot/i386/zfsboot/zfsboot.c    (working copy)
> @@ -302,6 +302,7 @@
>       * region in the SMAP, use the last 3MB of 'extended' memory as a
>       * high heap candidate.
>       */
> +       high_heap_size = 0;
>      if (bios_extmem >= HEAP_MIN && high_heap_size < HEAP_MIN) {
>         high_heap_size = HEAP_MIN;
>         high_heap_base = bios_extmem + 0x100000 - HEAP_MIN;
> 
> it works for me, without printf() :) Can you test it ?

A comment about the nature of this patch.

Based on the previous investigation by Christoph Hoffmann and jhb:
http://thread.gmane.org/gmane.os.freebsd.current/134199/focus=134309
I made a guess that either the BIOS/firmware provides an incorrect memory map,
or some agent in the BIOS/firmware (e.g. an SMM handler) or the controller
firmware writes outside of the memory range reserved for it.
I think that jhb made a similar guess at the time, while Christoph conjectured
that the memory corruption was related to CPU caches or some such.
My conjecture is that it is simply a combination of timing and a particular
memory range.

Just in case, here is how the memory map looks on Sergey's system:
SMAP type=01 base=0000000000000000 end=000000000009fc00 len=000000000009fc00
SMAP type=02 base=000000000009fc00 end=00000000000a0000 len=0000000000000400
SMAP type=02 base=00000000000e0000 end=0000000000100000 len=0000000000020000
SMAP type=01 base=0000000000100000 end=00000000bc1a1000 len=00000000bc0a1000
SMAP type=04 base=00000000bc1a1000 end=00000000bc1a4000 len=0000000000003000
SMAP type=01 base=00000000bc1a4000 end=00000000bdf04000 len=0000000001d60000
SMAP type=04 base=00000000bdf04000 end=00000000bdf3f000 len=000000000003b000
SMAP type=01 base=00000000bdf3f000 end=00000000bdf6a000 len=000000000002b000
SMAP type=02 base=00000000bdf6a000 end=00000000bdfbf000 len=0000000000055000
SMAP type=01 base=00000000bdfbf000 end=00000000bdfeb000 len=000000000002c000
SMAP type=03 base=00000000bdfeb000 end=00000000bdfff000 len=0000000000014000
SMAP type=01 base=00000000bdfff000 end=00000000be000000 len=0000000000001000
SMAP type=02 base=00000000be000000 end=00000000c0000000 len=0000000002000000
SMAP type=02 base=00000000f8000000 end=00000000fc000000 len=0000000004000000
SMAP type=02 base=00000000fec00000 end=00000000fec01000 len=0000000000001000
SMAP type=02 base=00000000fed10000 end=00000000fed14000 len=0000000000004000
SMAP type=02 base=00000000fed18000 end=00000000fed1a000 len=0000000000002000
SMAP type=02 base=00000000fed1c000 end=00000000fed20000 len=0000000000004000
SMAP type=02 base=00000000fee00000 end=00000000fee01000 len=0000000000001000
SMAP type=02 base=00000000ffe00000 end=0000000100000000 len=0000000000200000
SMAP type=01 base=0000000100000000 end=0000000140000000 len=0000000040000000

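The algorithm for placing the heap picks the range at bc1a4000, which lies
between two ranges of type '4' (ACPI NVS memory).
So my idea was just to try a different memory range: resetting high_heap_size
to 0 discards that choice, and the fallback code then uses the last 3MB of the
range starting at 0x100000. It seems that it worked.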
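
For illustration, here is a rough sketch of how a range like bc1a4000 ends up
being chosen. It assumes the rule is "take the largest type-1 range that starts
above 1MB and below 4GB, truncated at 4GB"; that rule is my reading of the
behaviour and is not quoted from zfsboot.c. The names smap_entry, sergey_map
and pick_high_heap are made up for this example; the table is copied from the
SMAP listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SMAP_TYPE_MEMORY 1              /* usable RAM */
#define ONE_MB           0x100000ull
#define FOUR_GB          0x100000000ull

/* Hypothetical, simplified view of one SMAP line: type, base, length. */
struct smap_entry {
    int      type;
    uint64_t base;
    uint64_t length;
};

/* The memory map from the listing above. */
static const struct smap_entry sergey_map[] = {
    { 1, 0x0000000000000000ull, 0x000000000009fc00ull },
    { 2, 0x000000000009fc00ull, 0x0000000000000400ull },
    { 2, 0x00000000000e0000ull, 0x0000000000020000ull },
    { 1, 0x0000000000100000ull, 0x00000000bc0a1000ull },
    { 4, 0x00000000bc1a1000ull, 0x0000000000003000ull },
    { 1, 0x00000000bc1a4000ull, 0x0000000001d60000ull },
    { 4, 0x00000000bdf04000ull, 0x000000000003b000ull },
    { 1, 0x00000000bdf3f000ull, 0x000000000002b000ull },
    { 2, 0x00000000bdf6a000ull, 0x0000000000055000ull },
    { 1, 0x00000000bdfbf000ull, 0x000000000002c000ull },
    { 3, 0x00000000bdfeb000ull, 0x0000000000014000ull },
    { 1, 0x00000000bdfff000ull, 0x0000000000001000ull },
    { 2, 0x00000000be000000ull, 0x0000000002000000ull },
    { 2, 0x00000000f8000000ull, 0x0000000004000000ull },
    { 2, 0x00000000fec00000ull, 0x0000000000001000ull },
    { 2, 0x00000000fed10000ull, 0x0000000000004000ull },
    { 2, 0x00000000fed18000ull, 0x0000000000002000ull },
    { 2, 0x00000000fed1c000ull, 0x0000000000004000ull },
    { 2, 0x00000000fee00000ull, 0x0000000000001000ull },
    { 2, 0x00000000ffe00000ull, 0x0000000000200000ull },
    { 1, 0x0000000100000000ull, 0x0000000040000000ull },
};
#define SERGEY_MAP_LEN (sizeof(sergey_map) / sizeof(sergey_map[0]))

/*
 * Assumed selection rule (not taken verbatim from zfsboot.c): choose the
 * largest usable range with 1MB < base < 4GB, truncating at the 4GB boundary.
 * Returns the index of the chosen entry, or -1 if no range qualifies.
 */
static int
pick_high_heap(const struct smap_entry *map, size_t n,
    uint64_t *base, uint64_t *size)
{
    uint64_t best_size = 0;
    int best = -1;

    assert(map != NULL && base != NULL && size != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t len = map[i].length;

        if (map[i].type != SMAP_TYPE_MEMORY)
            continue;
        if (map[i].base <= ONE_MB || map[i].base >= FOUR_GB)
            continue;
        if (map[i].base + len > FOUR_GB)
            len = FOUR_GB - map[i].base;
        if (len > best_size) {
            best_size = len;
            best = (int)i;
        }
    }
    if (best >= 0) {
        *base = map[best].base;
        *size = best_size;
    }
    return best;
}
```

Under that assumption, calling pick_high_heap(sergey_map, SERGEY_MAP_LEN, ...)
selects the entry at bc1a4000, and the entries immediately before and after it
in the table are the two type '4' ranges.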

P.S. I am not sure why our algorithm for selecting the heap location is what it
is. On the systems that I have, the "bios_extmem" range (the one starting at
0x100000) is usually the largest one and has more than enough space for both
the heap and the other things that are placed there.
Additionally, in the case of zfsboot I think that we do not use memory above 1MB
for anything other than the heap.
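
To put numbers on that for the map above, here is a small sketch of the
fallback placement from the patched code path. The expression for the base is
the one in the patch context (bios_extmem + 0x100000 - HEAP_MIN); taking
HEAP_MIN as 3MB is inferred from the "last 3MB" comment there, and the names
heap_range and fallback_heap are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define ONE_MB   0x100000ull
#define HEAP_MIN (3ull * 1024 * 1024)   /* assumed: "last 3MB" per the comment */

/* Hypothetical helper type: a half-open physical range [base, end). */
struct heap_range {
    uint64_t base;
    uint64_t end;
};

/*
 * Place the heap in the last HEAP_MIN bytes of the range that starts at 1MB.
 * bios_extmem is the length of that range, as in the patch context.
 */
static struct heap_range
fallback_heap(uint64_t bios_extmem)
{
    struct heap_range r;

    assert(bios_extmem >= HEAP_MIN);
    r.base = bios_extmem + ONE_MB - HEAP_MIN;
    r.end = r.base + HEAP_MIN;
    return r;
}
```

With bios_extmem = 0xbc0a1000 (the length of the range starting at 0x100000 in
the map above), this gives a heap at bbea1000..bc1a1000, i.e. 3MB out of
roughly 3GB available in that range.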

-- 
Andriy Gapon


