Date:      Thu, 21 Mar 2013 15:58:35 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        David Wolfskill <david@catwhisker.org>
Cc:        current@freebsd.org
Subject:   Re: Silent reboots in head @r248550 starting xdm with x11/nvidia-driver
Message-ID:  <20130321135835.GX3794@kib.kiev.ua>
In-Reply-To: <20130321133446.GF42912@albert.catwhisker.org>
References:  <20130320160056.GG32811@albert.catwhisker.org> <20130320171340.GE3794@kib.kiev.ua> <20130320173759.GK32811@albert.catwhisker.org> <20130320174458.GG3794@kib.kiev.ua> <20130320180239.GN32811@albert.catwhisker.org> <20130320200857.GN3794@kib.kiev.ua> <20130321013610.GB42912@albert.catwhisker.org> <20130321080441.GS3794@kib.kiev.ua> <20130321133446.GF42912@albert.catwhisker.org>

On Thu, Mar 21, 2013 at 06:34:46AM -0700, David Wolfskill wrote:
> On Thu, Mar 21, 2013 at 10:04:41AM +0200, Konstantin Belousov wrote:
> > ...
> > This gives me an idea. The only, so to say, 'vm' change in r248508 was the
> > addition of the bio_transient_map submap. The vfs.unmapped_buf_allowed
> > tunable did not eliminate the submap creation. Please try r248569
> > with vfs.unmapped_buf_allowed set to 0.
>
> OK; I believe that worked.
>
> "Believe" because (in the normal course of things) I updated to:
>
> FreeBSD g1-235.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #845  r248575M/248575: Thu Mar 21 05:35:06 PDT 2013     root@g1-235.catwhisker.org:/usr/obj/usr/src/sys/CANARY  i386
>
> which is a little beyond r248569.  (I still have r248508 on a
> different slice, and figured I could update that to precisely r248569
> if this test was incorrect or inconclusive.)
Not needed. BTW, your system uses UFS, right?

>
> In any case: after booting the above (r248575) to verify that it worked
> as long as I did not load nvidia.ko first, I then rebooted, escaped to
> loader prompt, set vfs.unmapped_buf_allowed=0; boot.
>
> And after that came up OK, I (manually) loaded nvidia.ko, then
> re-started X (xdm); the nVidia banner displayed just before the xdm
> login screen did.  (I have my xdm startup script "prefer" the nvidia
> driver, but if nvidia.ko isn't loaded, it reverts to the nv driver
> automagically.)
>
> > If this combination allows the nvidia driver to start, please revert
> > the setting of vfs.unmapped_buf_allowed, and instead set
> > kern.bio_transient_maxcnt e.g. to 256 or even 128.
>
> OK; rebooting, escaping to loader, *not* setting vfs.unmapped_buf_allowed,
> and setting kern.bio_transient_maxcnt=256 also allowed nvidia driver
> to be used at r248575.
OK, this is closer to a solution than a workaround (for now). See below.

>
> > Also, on the machine without the tunables customization, please show
> > the output of sysctl kern.nbuf, kern.bio_transient_maxcnt. Also show
> > the output of pciconf -lvb.
>
> OK; I rebooted (to revert the vfs.unmapped_buf_allowed setting) and
> obtained the above (augmented a wee bit by some of the others
> mentioned); I've attached that as "sysctl.txt".  I've also attached
> a copy of dmesg.boot, in case that's useful.
>
> I then tried rebooting r248575 and loading nvidia.ko *without* the
> tunable customization, and verified that I still saw (what looks
> like) a "reset" when I start X that way (as reported initially).
>
> > From what I see in your report, you use i386 arch. What is the amount
> > of memory installed in the machine?
>
> 4GB.
>=20
> Is the above what you had in mind, or would you like me to try at
> precisely r248569?  Anything else?
r248569 is fine.


> Script started on Thu Mar 21 06:07:41 2013
> g1-235(10.0-C)[1] uname -a
> FreeBSD g1-235.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #845  r248575M/248575: Thu Mar 21 05:35:06 PDT 2013     root@g1-235.catwhisker.org:/usr/obj/usr/src/sys/CANARY  i386
> g1-235(10.0-C)[2] sysctl vfs.unmapped_buf_allowed kern.bio_transient_maxcnt kern.nbuf
> vfs.unmapped_buf_allowed: 1
> kern.bio_transient_maxcnt: 697
> kern.nbuf: 7224
Could you please take some more measurements on r248575M?

Please show kern.nbuf for the vfs.unmapped_buf_allowed=0 case.
Also, from there, run "kgdb /boot/kernel/kernel /dev/mem" and do
p *buffer_map
Reboot without applying any unmapped/transient tuning, run kgdb
again, and do
p *buffer_map
p *bio_transient_map

Reboot with the kern.bio_transient_maxcnt tunable set to 256 and again
print buffer_map and bio_transient_map from kgdb.
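
For orientation only, here is a rough estimate of how much KVA the two
submaps might reserve, computed from the sysctl values quoted above. This is
a sketch under assumptions that do not come from this thread: that each
transient slot reserves MAXPHYS bytes and each buffer reserves BKVASIZE
bytes, with MAXPHYS = 128KB and BKVASIZE = 16KB taken as assumed defaults.
The kgdb output is the real measurement; this only suggests what to expect.

```python
# Rough KVA estimate for buffer_map and bio_transient_map.
# ASSUMPTIONS (not from this thread): MAXPHYS = 128 KiB, BKVASIZE = 16 KiB,
# and each submap is sized as count * unit.
MAXPHYS = 128 * 1024
BKVASIZE = 16 * 1024
MIB = 1024 * 1024


def transient_map_bytes(maxcnt):
    """KVA reserved for bio_transient_map, assuming maxcnt * MAXPHYS."""
    return maxcnt * MAXPHYS


def buffer_map_bytes(nbuf):
    """KVA reserved for buffer_map, assuming nbuf * BKVASIZE."""
    return nbuf * BKVASIZE


default_transient = transient_map_bytes(697)  # kern.bio_transient_maxcnt as reported
tuned_transient = transient_map_bytes(256)    # the suggested tunable value
buffer_kva = buffer_map_bytes(7224)           # kern.nbuf as reported
freed = default_transient - tuned_transient

print(f"bio_transient_map, maxcnt=697: {default_transient / MIB:.1f} MiB")
print(f"bio_transient_map, maxcnt=256: {tuned_transient / MIB:.1f} MiB")
print(f"buffer_map, nbuf=7224:         {buffer_kva / MIB:.1f} MiB")
print(f"KVA freed by the tunable:      {freed / MIB:.1f} MiB")
```

If those assumptions hold, the default of 697 slots comes to roughly 87MB
and 256 slots to 32MB, so the tunable would give back about 55MB of KVA.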

> none1@pci0:0:3:3:       class=0x070002 card=0x02501028 chip=0x2a478086 rev=0x07 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = 'Mobile 4 Series Chipset AMT SOL Redirection'
>     class      = simple comms
>     subclass   = UART
>     bar   [10] = type I/O Port, range 32, base 0xef88, size 8, enabled
>     bar   [14] = type Memory, range 32, base 0xf6fda000, size 4096, enabled
Oh, you do have a serial port on your notebook, usable remotely without a
serial cable. Your chipset seems to be AMT-capable, and you could use
comms/amtterm from another machine to get a serial console.

> vgapci0@pci0:1:0:0:     class=0x030000 card=0x02501028 chip=0x065c10de rev=0xa1 hdr=0x00
>     vendor     = 'NVIDIA Corporation'
>     device     = 'G96M [Quadro FX 770M]'
>     class      = display
>     subclass   = VGA
>     bar   [10] = type Memory, range 32, base 0xf5000000, size 16777216, enabled
>     bar   [14] = type Prefetchable Memory, range 64, base 0xe0000000, size 268435456, enabled
>     bar   [1c] = type Memory, range 64, base 0xf2000000, size 33554432, enabled
>     bar   [24] = type I/O Port, range 32, base 0xdf00, size 128, enabled

My current theory is that the nvidia aperture size is 256MB, as indicated
by bar [14], and that the nvidia driver tries to map the whole aperture into KVA.

With 4GB of RAM on i386, the available 1GB of KVA becomes quite tightly
populated, and even small changes in the layout make the mapping of
256MB impossible. If I am right, this is more an issue with nvidia.
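
To put numbers on that, the sketch below converts the memory BAR sizes from
the pciconf output to megabytes and compares each with the 1GB of KVA. It
only illustrates the theory: it assumes the driver wants bar [14] as one
contiguous mapping, and it says nothing about what else already occupies KVA.

```python
# Memory BAR sizes of the NVIDIA device, copied from the pciconf -lvb output.
bars = {
    "10": 16777216,   # Memory, range 32
    "14": 268435456,  # Prefetchable Memory, range 64 (the aperture)
    "1c": 33554432,   # Memory, range 64
}
MIB = 1024 * 1024
KVA_BYTES = 1024 * MIB  # 1GB of KVA on i386, as stated in this message

for name, size in bars.items():
    share = 100 * size / KVA_BYTES
    print(f"bar [{name}]: {size // MIB} MiB, {share:.1f}% of KVA")

# ASSUMPTION: the whole aperture is mapped as a single contiguous range.
aperture_share = bars["14"] / KVA_BYTES
print(f"aperture alone needs {aperture_share:.0%} of KVA in one piece")
```

So the aperture by itself would need a contiguous quarter of the whole KVA,
which is why a modest shift in the layout could be enough to break it.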

Still, the layout should not have changed much, if at all. I want the
kgdb information listed above to confirm or refute this.

If you could configure the AMT SOL console, then my theory about nvidia mapping
the whole aperture could be confirmed or refuted.

Thank you.
