Date:      Thu, 12 Sep 2013 08:55:38 -0600
From:      Ian Lepore <ian@FreeBSD.org>
To:        Tim Kientzle <kientzle@FreeBSD.org>
Cc:        "freebsd-arm@freebsd.org" <freebsd-arm@FreeBSD.org>
Subject:   Re: Panic mounting root on BeagleBone Black
Message-ID:  <1378997738.1111.631.camel@revolution.hippie.lan>
In-Reply-To: <47E403AE-01A2-4AC8-8028-41F0298FAC3E@freebsd.org>
References:  <47E403AE-01A2-4AC8-8028-41F0298FAC3E@freebsd.org>

On Wed, 2013-09-11 at 06:43 -0700, Tim Kientzle wrote:
> Just built a new image for BBB from SVN r255438.
>
> At the second boot, I got this:
>
> Mounting local file systems:.
> mmcsd0: Error indicated: 1 Timeout
> g_vfs_done():mmcsd0s2a[READ(offset=2016903168, length=4096)]error = 5
> vnode_pager_getpages: I/O read error
> vm_fault: pager read error, pid 126 (ps)
> mmcsd0: Error indicated: 1 Timeout
> g_vfs_done():mmcsd0s2a[READ(offset=131072, length=32768)]error = 5
> sdhci_ti0-slot0: Got data interrupt 0x00000010, but there is no active command.
> sdhci_ti0-slot0: ============== REGISTER DUMP ==============
> sdhci_ti0-slot0: Sys addr: 0x00000000 | Version:  0x00003101
> sdhci_ti0-slot0: Blk size: 0x00000200 | Blk cnt:  0x00000010
> sdhci_ti0-slot0: Argument: 0x0024679e | Trn mode: 0x0000193a
> sdhci_ti0-slot0: Present:  0x01f70000 | Host ctl: 0x00000006
> sdhci_ti0-slot0: Power:    0x0000000d | Blk gap:  0x00000000
> sdhci_ti0-slot0: Wake-up:  0x00000000 | Clock:    0x00000007
> sdhci_ti0-slot0: Timeout:  0x0000000d | Int stat: 0x00000000
> sdhci_ti0-slot0: Int enab: 0x017f00fb | Sig enab: 0x017f00fb
> sdhci_ti0-slot0: AC12 err: 0x00000000 | Slot int: 0x00000000
> sdhci_ti0-slot0: Caps:     0x06e10080 | Max curr: 0x00000000
> sdhci_ti0-slot0: ===========================================
>
> …. few more similar messages, then ….
>
> mmcsd0: Error indicated: 1 Timeout
> g_vfs_done():mmcsd0s2a[WRITE(offset=20808192, length=512)]error = 5
> g_vfs_done():mmcsd0s2a[WRITE(offset=1276346368, length=24576)]error = 5
> panic: brelse: inappropriate B_PAGING or B_CLUSTER bp 0xcd148778
> [bt snipped]
>

This was a single occurrence, right?  Like you're not dead in the water
or anything?

There's insanity in that info... the register dump shows a multi-block
write (8kbytes) was set up, but the command that timed out was a read.
If a prior write had timed out why isn't there a g_vfs_done() error
logged for it?
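
As a quick check on the 8 KB figure, the arithmetic from the quoted dump
can be sketched in Python.  The field layout (block length in the low 12
bits of the block size register) is assumed from the standard SDHCI
register definitions, not from anything stated in this thread, and the
sketch makes no attempt to decode the transfer direction.

```python
# Values copied from the quoted register dump.
BLK_SIZE_REG = 0x00000200
BLK_CNT_REG = 0x00000010

# Lengths of the two failed READ requests quoted above.
FAILED_READ_LENGTHS = [4096, 32768]


def transfer_bytes(blk_size_reg, blk_cnt_reg):
    """Return the total transfer size in bytes programmed in the controller.

    Assumption: the block length lives in the low 12 bits of the block
    size register, as in the standard SDHCI layout.
    """
    block_len = blk_size_reg & 0xFFF
    return block_len * blk_cnt_reg


total = transfer_bytes(BLK_SIZE_REG, BLK_CNT_REG)
print(total)                         # 8192
print(total in FAILED_READ_LENGTHS)  # False
```

So the transfer size in the dump (16 blocks of 512 bytes) matches neither
of the reads that were reported as failing.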

I think what we really need is some better error recovery in the mmc and
sd layers.  Retrying a failed IO is cheap and easy.  More complex
recovery is possible too (power cycling and re-initializing the card
and/or controller).  But that has its own difficulties -- what if the
nature of the problem was that the user swapped cards? -- you don't want
to retry a write under those conditions.
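
To make the idea concrete, here is a rough model of a bounded retry that
refuses to retry once the card identity has changed.  It is a Python
sketch of the logic only, not driver code: retry_io, read_card_id,
CardChanged and the retry limit are all hypothetical names and values
chosen for illustration, and how a real driver would detect a swapped
card is left open.

```python
class CardChanged(Exception):
    """The card now present is not the one the request was issued against."""


def retry_io(do_io, read_card_id, expected_card_id, max_retries=3):
    """Run do_io(), retrying on timeout, but never across a card swap.

    do_io            -- hypothetical callable that performs one transfer
    read_card_id     -- hypothetical callable returning the current card's ID
    expected_card_id -- ID of the card the request was originally queued for
    """
    attempts = 0
    while True:
        try:
            return do_io()
        except TimeoutError as err:
            attempts += 1
            if attempts > max_retries:
                raise  # give up and report the original timeout
            if read_card_id() != expected_card_id:
                # A different card is in the slot: retrying a write here
                # would put stale data on the wrong card.
                raise CardChanged("card changed, not retrying") from err


# Usage: a transfer that times out twice and then succeeds.
demo_calls = {"count": 0}


def flaky_transfer():
    demo_calls["count"] += 1
    if demo_calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


print(retry_io(flaky_transfer, lambda: "card-A", "card-A"))  # ok
```

The identity check sits between attempts, so the cheap path (a plain
retry on the same card) stays cheap and only the swap case bails out.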

-- Ian
