Date:      Sun, 29 Dec 2002 15:57:49 +1100 (EST)
From:      Bruce Evans <bde@zeta.org.au>
To:        Jos Backus <jos@catnook.com>
Cc:        freebsd-current@FreeBSD.ORG
Subject:   Re: spec_getpages I/O read failure on md0
Message-ID:  <20021229152736.N39955-100000@gamplex.bde.org>
In-Reply-To: <20021228193431.GA13948@lizzy.catnook.com>

On Sat, 28 Dec 2002, Jos Backus wrote:

> I am using the following fstab entry:
>
>     /dev/md0 /tmp mfs rw,nosuid,nodev,-s=32m 0 0
>
> And three times now have I seen this during an installworld, both on a UP and
> an SMP system running a very recent -current:
>
>     ===> usr.sbin/pcvt/ispcvt
>     install -s -o root -g wheel -m 555   ispcvt /usr/sbin
>     install -o root -g wheel -m 444 ispcvt.8.gz  /usr/share/man/man8
>     ===> usr.sbin/pcvt/vgaio
>     echo:Input/output error
>     *** Error code 1
> ...
> Accompanied by
>
>     Dec 28 01:42:12 lizzy kernel: spec_getpages:(md0) I/O read failure: (error=22) bp 0xce5f9310 vp 0xc41e8708

Known bugs in spec_getpages() (or its callers or infrastructure):
(1) does not understand the disk driver's si_iosize_max, so it fails if sizes
    larger than this are requested.  Larger sizes can occur at least for
    calls from exec_map_first_page() for execve() in some cases, since
    VM_INITIAL_PAGEIN is constant (but machine-dependent) and may exceed
    si_iosize_max.  VM_INITIAL_PAGEIN is 16 pages, so it works for most
    devices on i386's but not for any device with a limit of 64K on alphas.
    This used to cause exec failures for afd (zip) devices on i386's because
    the limit was 32K.
(2) see below.
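
The size comparison behind (1) can be sketched in user-space C.  This is
only an illustration, not kernel code: INITIAL_PAGEIN_PAGES,
initial_pagein_bytes() and request_fits() are names made up for the
sketch, and the page sizes passed in (4K for i386, 8K for alpha) are
assumptions supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for the 16-page initial pagein described above. */
#define INITIAL_PAGEIN_PAGES 16

/* Bytes requested by an initial pagein for a given page size. */
size_t
initial_pagein_bytes(size_t page_size)
{
	assert(page_size > 0);
	return (INITIAL_PAGEIN_PAGES * page_size);
}

/* Nonzero if the request fits within the driver's transfer limit. */
int
request_fits(size_t request, size_t iosize_max)
{
	return (request <= iosize_max);
}
```

With 4K pages the request is 64K, which fits a 64K limit but not the old
32K afd limit; with 8K pages it is 128K, which does not fit a 64K limit.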

>     Dec 28 01:42:12 lizzy kernel: size: 2048, resid: 2048, a_count: 2028, valid: 0x0
>     Dec 28 01:42:12 lizzy kernel: nread: 0, reqpage: 0, pindex: 1, pcount: 1
>     Dec 28 01:42:12 lizzy kernel: vm_fault: pager read error, pid 40673 (sh)
>     Dec 28 01:43:26 lizzy kernel: spec_getpages:(md0) I/O read failure: (error=22) bp 0xce5fb158 vp 0xc41e8708
>     Dec 28 01:43:26 lizzy kernel: size: 14848, resid: 14848, a_count: 14404, valid: 0x0
>     Dec 28 01:43:26 lizzy kernel: nread: 0, reqpage: 0, pindex: 0, pcount: 4

This seems to be a different but related problem: spec_getpages() understands
the disk driver's si_bsize_phys but si_bsize_phys is apparently not
initialized correctly.  We start with a count of 14404 = 3 * 4096 + 2116
= 28 * 512 + 68.  This is not a multiple of any reasonable block size, so
it must be rounded up.  We round it up to a multiple of 512 (29 * 512),
apparently because we use the default block size of DEV_BSIZE = 512.  The
md driver doesn't like this, apparently because you are using swap-backed
mode which gives a block size of PAGE_SIZE = 4096.
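
The rounding can be checked with a small user-space sketch.  round_up()
and is_block_multiple() are helper names invented here, not the code in
spec_getpages(); the values fed to them are the a_count and size figures
from the log above.

```c
#include <assert.h>
#include <stddef.h>

/* Round count up to the next multiple of blksize. */
size_t
round_up(size_t count, size_t blksize)
{
	assert(blksize > 0);
	return ((count + blksize - 1) / blksize * blksize);
}

/* Nonzero if size is a whole number of blksize-byte blocks. */
int
is_block_multiple(size_t size, size_t blksize)
{
	assert(blksize > 0);
	return (size % blksize == 0);
}
```

round_up(14404, 512) gives 14848, the size in the log, and
is_block_multiple(14848, 4096) is false, so a driver that insists on
4096-byte blocks would reject it.  Rounding to 4096 instead would give
16384.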

The md driver doesn't set any of the si_ size parameters so it has no chance
of getting this stuff right when the parameters are not the defaults.  The
defaults are set bogusly in too many places to DFLTPHYS for si_iosize_max
and to DEV_BSIZE for si_bsize_phys.
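
A rough user-space model of the fallback, to show why a driver that sets
nothing ends up with 512-byte blocks.  struct dev_params and
apply_defaults() are hypothetical and only borrow the field names; the
real defaults are applied in several places in the kernel, not in one
function.  The values 64K and 512 are the usual DFLTPHYS and DEV_BSIZE.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_DFLTPHYS  (64 * 1024)	/* usual value of DFLTPHYS */
#define SKETCH_DEV_BSIZE 512		/* usual value of DEV_BSIZE */

/* Hypothetical stand-in for the two size fields a driver may set. */
struct dev_params {
	size_t si_iosize_max;	/* largest single transfer, 0 if unset */
	size_t si_bsize_phys;	/* physical block size, 0 if unset */
};

/* Fill in defaults for any field the driver left unset. */
void
apply_defaults(struct dev_params *p)
{
	assert(p != NULL);
	if (p->si_iosize_max == 0)
		p->si_iosize_max = SKETCH_DFLTPHYS;
	if (p->si_bsize_phys == 0)
		p->si_bsize_phys = SKETCH_DEV_BSIZE;
}
```

A driver that leaves both fields zero gets 64K and 512 here; one that
sets si_bsize_phys to 4096 before this point keeps 4096, which is what a
swap-backed md would apparently need.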

Errors caused by these bugs can be non-deterministic because the pages may
be loaded into memory by other means.  E.g., when exec off zip drives
was broken, exec would succeed after several attempts because each attempt
loaded another 8 i386 pages so that all pages were eventually in memory.

Bruce

