Date:      Fri, 10 Jan 2014 14:11:47 -0800
From:      Alfred Perlstein <bright@mu.org>
To:        Hans Petter Selasky <hans.petter.selasky@bitfrost.no>
Cc:        Tommy Stiansen <ts@norse-corp.com>, FreeBSD Hackers <freebsd-hackers@freebsd.org>, Alfred Perlstein <alfred@freebsd.org>, Neel Natu <neel@FreeBSD.org>, Alan Cox <alc@rice.edu>
Subject:   Re: usb + other drivers stop working on 128GB+ memory machines
Message-ID:  <BE87FE39-756E-46F6-A186-BCC4DCEFF201@mu.org>
In-Reply-To: <zarafa.52d06f98.4be3.6f35d1467ec79bee@mail.lockless.no>
References:  <50BDB148.1060607@mu.org> <zarafa.52d06f98.4be3.6f35d1467ec79bee@mail.lockless.no>


> On Jan 10, 2014, at 2:09 PM, Hans Petter Selasky <hans.petter.selasky@bitfrost.no> wrote:
>
> Hi,
>
> The newer XHCI chipset does support full 64-bit ranges, if the HW guys did their job. We've seen in the past 32-bit hardware being cut down to 2GB of RAM in the hardware because some OS'es don't support more :-)
>
> I think in general that keeping DMA buffers below 4GB is a good idea. Wasn't the allocator changed some years back to allocate from the top of memory instead of the bottom?
It appears not. Or maybe it got reverted?

When you have so much RAM, simply stealing some % unconditionally might be OK so long as devices run.
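
To make the idea concrete, here is a toy sketch in C of an allocator that hands out high page frames first and keeps a low range in reserve for constrained requests. It is not FreeBSD's physical memory allocator and not the patch discussed in this thread; NPAGES, LOW_PAGES, alloc_any() and alloc_below() are hypothetical names made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define NPAGES    16u  /* toy machine with 16 page frames (assumption) */
#define LOW_PAGES  4u  /* frames 0..3 stand in for "below 4GB" (assumption) */

static uint8_t page_used[NPAGES];

/*
 * General allocation: scan from the top of memory downwards, so the
 * low frames are only consumed once everything above them is gone.
 * Returns the frame index, or -1 if no frame is free.
 */
static int alloc_any(void)
{
    for (int i = (int)NPAGES - 1; i >= 0; i--) {
        if (!page_used[i]) {
            page_used[i] = 1;
            return i;
        }
    }
    return -1;
}

/*
 * Constrained allocation: only frames with index below `limit` are
 * acceptable, as for a device that cannot address higher memory.
 * Returns the frame index, or -1 if the low range is exhausted.
 */
static int alloc_below(unsigned limit)
{
    if (limit > NPAGES)
        limit = NPAGES;
    for (unsigned i = 0; i < limit; i++) {
        if (!page_used[i]) {
            page_used[i] = 1;
            return (int)i;
        }
    }
    return -1;
}
```

With this ordering, early boot-time consumers calling alloc_any() eat frames from the top, and a later alloc_below(LOW_PAGES) from a driver still finds a free low frame.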

Note some devices even have 24-bit limits!
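
For reference, a quick sketch of the arithmetic behind those limits: a device with N address bits can reach physical addresses up to 2^N - 1, so 24 bits covers 16MB and 32 bits covers 4GB. dma_max_addr() and dma_reachable() below are hypothetical helpers written for this note, not part of FreeBSD's busdma API.

```c
#include <stdint.h>

/* Highest physical address reachable with `bits` DMA address bits. */
static uint64_t dma_max_addr(unsigned bits)
{
    return (bits >= 64) ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/*
 * A buffer is usable for DMA only if its first and last byte are both
 * addressable by the device. Written to avoid overflow in paddr + len.
 */
static int dma_reachable(uint64_t paddr, uint64_t len, unsigned bits)
{
    uint64_t max = dma_max_addr(bits);

    return len != 0 && paddr <= max && len - 1 <= max - paddr;
}
```

A 4KB page sitting at the 5GB mark passes dma_reachable() for a 64-bit device but fails it for a 32-bit or 24-bit one, which is why such devices depend on something being left free in low memory.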




>
> --HPS
>
> -----Original message-----
> > From:Alfred Perlstein <alfred@freebsd.org>
> > Sent: Friday 10th January 2014 22:43
> > To: Alan Cox <alc@rice.edu>; Hans Petter Selasky <hans.petter.selasky@bitfrost.no>; Neel Natu <neel@FreeBSD.org>; FreeBSD Hackers <freebsd-hackers@freebsd.org>
> > Cc: Tommy Stiansen <ts@norse-corp.com>
> > Subject: usb + other drivers stop working on 128GB+ memory machines
> >
> > Hey Alan, Neel and Hans,
> >
> > We're testing FreeBSD 10 here and still having problems, once we go over
> > 128GB of memory then USB stops working.  When we artificially limit
> > memory to 128GB or lower we are OK.
> >
> > Is there any chance we can revisit this patch so that large memory
> > systems don't use up the lower memory space which seems to be needed by
> > some drivers?
> >
> > I'm having a bit of trouble explaining to people that too much memory ==
> > no keyboard on FreeBSD.
> >
> > I have the patch that seemed to work for us before.  Any chance this can
> > go into FreeBSD soon?
> >
> >
> >
> > -Alfred
> >
> >
> > -------- Original Message --------
> > Subject: 	Re: Questions about FreeBSD amd64 memory layout.
> > Date: 	Tue, 04 Dec 2012 00:16:08 -0800
> > From: 	Alfred Perlstein <bright@mu.org>
> > To: 	Alan Cox <alc@rice.edu>
> > CC: 	Alan Cox <alc@FreeBSD.org>, Xin LI <delphij@delphij.net>
> >
> >
> >
> > On 12/3/12 11:23 PM, Alan Cox wrote:
> > > On 12/03/2012 18:15, Alfred Perlstein wrote:
> > >> Hello Alan,
> > >>
> > >> The other day I ran a copy of FreeBSD 9.1 with my maxusers patches
> > >> (from current).
> > >>
> > >> The machine had 256 gigs of RAM.
> > >>
> > >> Due to that much memory, maxusers was upwards of 24860.
> > >>
> > >> What then happened was that the mfi driver, and I think also the USB
> > >> driver would not work.
> > >>
> > >> The mfi driver stopped working because it got the following error:
> > >> mfi0: Cannot allocate verbuf_h_dmamap memory
> > >>
> > >> This appears to be due to this in the mfi driver:
> > >>>          /* Start: LSIP200113393 */
> > >>>          if (bus_dma_tag_create( sc->mfi_parent_dmat,    /* parent */
> > >>>                                  1, 0,                   /* algnmnt, boundary */
> > >>>                                  BUS_SPACE_MAXADDR_32BIT,/* lowaddr */
> > >>>                                  BUS_SPACE_MAXADDR,      /* highaddr */
> > >>>                                  NULL, NULL,             /* filter, filterarg */
> > >>>                                  MEGASAS_MAX_NAME*sizeof(bus_addr_t), /* maxsize */
> > >>>                                  1,                      /* msegments */
> > >>>                                  MEGASAS_MAX_NAME*sizeof(bus_addr_t), /* maxsegsize */
> > >>>                                  0,                      /* flags */
> > >>>                                  NULL, NULL,             /* lockfunc, lockarg */
> > >>>                                  &sc->verbuf_h_dmat)) {
> > >>>                  device_printf(sc->mfi_dev, "Cannot allocate verbuf_h_dmat DMA tag\n");
> > >>>                  return (ENOMEM);
> > >>>          }
> > >>>          if (bus_dmamem_alloc(sc->verbuf_h_dmat, (void **)&sc->verbuf,
> > >>>              BUS_DMA_NOWAIT, &sc->verbuf_h_dmamap)) {
> > >>>                  device_printf(sc->mfi_dev, "Cannot allocate verbuf_h_dmamap memory\n");
> > >> What I'm thinking is happening is that by the time we get to mfi
> > >> driver enough of the below 4GB memory is used up by callout wheels,
> > >> nbufs, various hash tables, etc that we wind up unable to get memory
> > >> in this region.
> > >>
> > >> This could be (and probably is) a wrong assumption, but it's what makes
> > >> sense to me right now.
> > >>
> > >
> > > > I can believe it, or more precisely I know of nothing that immediately
> > > > disproves it.
> > >
> > >
> > >> I'm wondering how the kernel map gets populated, and if it would be
> > >> possible, and if it would be advisable to change the allocation
> > >> strategy to come from the tail end of physical memory instead of the
> > >> front.
> > >>
> > >
> > > > There is no intentional "allocation strategy" in the sense that you are
> > > > using the phrase here.  Much of the VM system, including the physical
> > > > memory allocator, is initialized early in the boot process, in fact,
> > > > before callout wheels, nbufs, etc. are allocated.  So, the standard
> > > > physical memory allocator is being used for callout wheels, nbufs, etc.,
> > > > and this allocator takes pages from the cache/free page queues in
> > > > whatever arbitrary order they happen to be in.  I can believe that we
> > > > currently initialize the cache/free page queues in an order that results
> > > > in the allocation of pages from low physical addresses first.
> > > >
> > > > The physical memory allocator does, however, have a way of dealing with
> > > > low physical address ranges that you don't want to allocate from except
> > > > explicitly, e.g., contigmalloc()/kmem_alloc_contig(), or as a last
> > > > resort.  This is currently only used for the physical address range for
> > > > ISA DMA.
> > >
> > > I've attached a patch that abuses the ISA DMA range, extending it to
> > > 4GB.  See if this patch enables you to boot.
> > >
> > >
> > It does!  Everything is fixed now.
> >
> > What now?  Can I help somehow?
> >
> > ~ % sysctl -a| grep maxuser
> > kern.maxusers: 33049
> > ~ % dmesg| grep mfi
> > mfi0: <ThunderBolt> port 0x8000-0x80ff mem
> > 0xc7a60000-0xc7a63fff,0xc7a00000-0xc7a3ffff irq 26 at device 0.0 on pci1
> > mfi0: Using MSI
> > mfi0: Megaraid SAS driver Ver 4.23
> > mfi0: MaxCmd = 3f0 MaxSgl = 46 state = b75003f0
> > mfi0: 1436 (407894536s/0x0020/info) - Shutdown command received from host
> > mfi0: 1437 (boot + 4s/0x0020/info) - Firmware initialization started
> > (PCI ID 005b/1000/0690/15d9)
> > mfi0: 1438 (boot + 4s/0x0020/info) - Firmware version 3.190.05-1669
> > mfi0: 1439 (boot + 5s/0x0020/info) - Package version 23.7.0-0029
> > mfi0: 1440 (boot + 5s/0x0020/info) - Board Revision
> > mfi0: 1441 (boot + 25s/0x0002/info) - Inserted: PD 10(e0xfc/s0)
> > mfi0: 1442 (boot + 25s/0x0002/info) - Inserted: PD 10(e0xfc/s0) Info:
> > enclPd=fc, scsiType=0, portMap=00, sasAddr=4433221103000000,0000000000000000
> > mfi0: 1443 (boot + 26s/0x0001/info) - Policy change on VD 00/0 to
> > [ID=00,dcp=65,ccp=64,ap=0,dc=0] from [ID=00,dcp=65,ccp=65,ap=0,dc=0]
> > mfi0: 1444 (407894583s/0x0020/info) - Time established as 12/04/12
> > 0:03:03; (37 seconds since power on)
> > mfi0: 1445 (407894819s/0x0020/info) - Host driver is loaded and operational
> > mfid0 on mfi0
> > mfid0: 2861022MB (5859373056 sectors) RAID volume (no label) is optimal
> > Trying to mount root from ufs:/dev/mfid0p2 [rw]...
> >
> >
> >
> > _______________________________________________
> > freebsd-hackers@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"


