Date: Sat, 14 Nov 2020 11:48:34 -0700 From: Scott Long <scottl@samsco.org> To: Konstantin Belousov <kostikbel@gmail.com> Cc: Alexander Motin <mav@freebsd.org>, "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org> Subject: Re: MAXPHYS bump for FreeBSD 13 Message-ID: <634E35E9-9E2B-40CE-9C70-BB130BD9D614@samsco.org> In-Reply-To: <X7Aj0a6ZtIQfBscC@kib.kiev.ua> References: <aac44f21-3a6d-c697-bb52-1a669b11aa3b@FreeBSD.org> <X7Aj0a6ZtIQfBscC@kib.kiev.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
> On Nov 14, 2020, at 11:37 AM, Konstantin Belousov = <kostikbel@gmail.com> wrote: >=20 > On Sat, Nov 14, 2020 at 10:01:05AM -0500, Alexander Motin wrote: >> On Fri, 13 Nov 2020 21:09:37 +0200 Konstantin Belousov wrote: >>> To put the specific numbers, for struct buf it means increase by = 1792 >>> bytes. For bio it does not, because it does not embed vm_page_t[] = into >>> the structure. >>>=20 >>> Worse, typical struct buf addend for excess vm_page pointers is = going >>> to be unused, because normal size of the UFS block is 32K. It is >>> going to be only used by clusters and physbufs. >>>=20 >>> So I object against bumping this value without reworking buffers >>> handling of b_pages[]. Most straightforward approach is stop using >>> MAXPHYS to size this array, and use external array for clusters. >>> Pbufs can embed large array. >>=20 >> I am not very familiar with struct buf usage, so I'd appreciate some >> help there. >>=20 >> Quickly looking on pbuf, it seems trivial to allocate external = b_pages >> array of any size in pbuf_init, that should easily satisfy all of = pbuf >> descendants. Cluster and vnode/swap pagers code are pbuf descendants >> also. Vnode pager I guess may only need replacement for >> nitems(bp->b_pages) in few places. > I planned to look at making MAXPHYS a tunable. >=20 > You are right, we would need: > 1. move b_pages to the end of struct buf and declaring it as flexible. > This would make KBI worse because struct buf depends on some debugging > options, and than b_pages offset depends on config. >=20 > Another option could be to change b_pages to pointer, if we are fine = with > one more indirection. But in my plan, real array is always allocated = past > struct buf, so flexible array is more correct even. >=20 I like this, and I was in the middle of writing up an email that = described it. There could be multiple malloc types or UMA zones of different sizes, depending on the intended i/o size, or just a runtime change to the size = of a single allocation size. > 2. Preallocating both normal bufs and pbufs together with the arrays. >=20 > 3. I considered adding B_SMALLPAGES flag to b_flags and use it to = indicate > that buffer has 'small' b_pages. All buffers rotated through = getnewbuf()/ > buf_alloc() should have it set. >=20 This would work nicely with a variable sized allocator, yes. > 4. There could be some places which either malloc() or allocate struct = buf > on stack (I tend to believe that I converted all later places to = formed). > They would need to get special handling. >=20 I couldn=E2=80=99t find any places that allocated a buf on the stack or = embedded it into another structure. > md(4) uses pbufs. >=20 > 4. My larger concern is, in fact, cam and drivers. >=20 Can you describe your concern? >>=20 >> Could you or somebody help with vfs/ffs code, where I suppose the >> smaller page lists are used? > Do you plan to work on this ? I can help, sure. >=20 > Still, I wanted to make MAXPHYS (and 'small' MAXPHYS, this is not same = as > DFLPHYS), a tunable, in the scope of this work. Sounds great, thank you for looking at it. Scott
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?634E35E9-9E2B-40CE-9C70-BB130BD9D614>