Date:      Wed, 23 Jul 2014 17:34:17 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-ports-bugs@FreeBSD.org
Subject:   [Bug 192066] New: sysutils/grub2 and ZFS: wrong lz4 endianness
Message-ID:  <bug-192066-13@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192066

            Bug ID: 192066
           Summary: sysutils/grub2 and ZFS: wrong lz4 endianness
           Product: Ports Tree
           Version: Latest
          Hardware: Any
                OS: Any
            Status: Needs Triage
          Severity: Affects Only Me
          Priority: ---
         Component: Individual Port(s)
          Assignee: freebsd-ports-bugs@FreeBSD.org
          Reporter: aaz@q-fu.com

Created attachment 144914
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=144914&action=edit
fix

I am using GRUB to boot the kernel directly from ZFS.

Not long after an upgrade to a recent 10-stable r268881, GRUB stopped being able
to see the pool and boot. Having completed an appropriate recovery effort and
finally booting the system again, I used gdb on grub-probe to determine that the
problem was in lz4 decompression of the uberblock.

Here is the problematic code in GRUB 2.00 (with FreeBSD port patches):

grub-core/fs/zfs/zfs_lz4.c:

    #if BYTE_ORDER == BIG_ENDIAN

Apparently <sys/endian.h> isn't included, so BYTE_ORDER and BIG_ENDIAN are
undefined. The preprocessor treats undefined identifiers in #if as 0, so the
test becomes 0 == 0 and the code incorrectly assumes a big-endian system. Based
on this assumption, it byte-swaps a 2-byte offset field in the compressed data,
which makes the data appear corrupt, and decompression fails.
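
The preprocessor behavior described above can be reproduced in isolation. The
following is a minimal sketch, not code from GRUB: DEMO_BYTE_ORDER,
DEMO_BIG_ENDIAN, DEMO_ASSUMED_BIG_ENDIAN and demo_assumed_big_endian() are
hypothetical names, chosen so that no system header can define them by accident.

```c
#include <assert.h>

/*
 * DEMO_BYTE_ORDER and DEMO_BIG_ENDIAN are hypothetical stand-ins for
 * BYTE_ORDER and BIG_ENDIAN. They are deliberately never defined, which
 * mimics compiling zfs_lz4.c without <sys/endian.h>.
 */
#if DEMO_BYTE_ORDER == DEMO_BIG_ENDIAN
/* Both undefined identifiers are replaced by 0, so this branch is taken. */
#define DEMO_ASSUMED_BIG_ENDIAN 1
#else
#define DEMO_ASSUMED_BIG_ENDIAN 0
#endif

/* Returns 1 if the #if test above selected the "big-endian" branch. */
int demo_assumed_big_endian(void)
{
    return DEMO_ASSUMED_BIG_ENDIAN;
}
```

demo_assumed_big_endian() returns 1 regardless of the real byte order of the
host, because the comparison is decided entirely by the missing definitions.
Compilers that support it can flag this with a warning such as -Wundef.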

I am not sure why this problem happened to manifest just now, since GRUB hasn't
been updated in a while, but I think the recent kernel happens to lz4-compress
the uberblock and earlier kernels happened to lzjb-compress or not compress it,
leaving the problem unnoticed.

This causes disturbing messages like "error: no such device: <pool id>." and
"lz4 decompression failed" at the GRUB prompt, and this:

# grub-probe -d /dev/gpt/mypool
grub-probe: error: unknown filesystem.

The fix is simply to add #include <sys/endian.h> at the top of zfs_lz4.c. With
that change:

# grub-probe -d /dev/gpt/mypool
zfs

Note I am also using the patch from bug 188524 for the "hole_birth" feature and
I haven't enabled the "embedded_data" feature on my pool yet. A newly created
pool doesn't work in GRUB because of those feature flags, regardless of lz4.

The latest GRUB source uses grub_le_to_cpu16() instead of BYTE_ORDER, so the
problem should resolve itself in future versions.
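
To illustrate why an explicit little-endian conversion avoids the problem,
here is a minimal sketch that does not depend on BYTE_ORDER at all. It is not
GRUB's implementation of grub_le_to_cpu16(); demo_read_le16() and
demo_swap16() are hypothetical helpers, and the sample bytes are made up for
the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Read a 2-byte little-endian value (such as an lz4 match offset) from a
 * byte buffer. The result is the same on little- and big-endian hosts
 * because the bytes are combined explicitly.
 */
uint16_t demo_read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * The unconditional byte swap that a wrongly selected "big-endian" branch
 * would apply to a value that was already read correctly.
 */
uint16_t demo_swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

For the bytes 0x34 0x12, demo_read_le16() yields 0x1234 on any host. Applying
demo_swap16() to that value gives 0x3412, which is the kind of bogus offset
that makes valid compressed data look corrupt.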



