Date:      Thu, 3 Sep 2009 11:50:55 -0600 (MDT)
From:      Scott Long <scottl@samsco.org>
To:        Ivan Voras <ivoras@freebsd.org>
Cc:        svn-src-head@freebsd.org, Alexander Motin <mav@freebsd.org>, src-committers@freebsd.org, svn-src-all@freebsd.org
Subject:   Re: svn commit: r196777 - head/sys/dev/ahci
Message-ID:  <20090903114121.C20031@pooker.samsco.org>
In-Reply-To: <9bbcef730909031037y4aecd692t4812718b1fd7e78e@mail.gmail.com>
References:  <200909031237.n83CbIgk032551@svn.freebsd.org> <1872D962-9297-4C45-9F73-4BB823C49D74@samsco.org> <4A9FD8B4.2080605@FreeBSD.org> <20090903095224.N20031@pooker.samsco.org> <9bbcef730909031037y4aecd692t4812718b1fd7e78e@mail.gmail.com>

On Thu, 3 Sep 2009, Ivan Voras wrote:
> 2009/9/3 Scott Long <scottl@samsco.org>:
>> On Thu, 3 Sep 2009, Alexander Motin wrote:
>
>>> It would be nice if every level would do its own job.
>>
>> It's the job of the driver to handle the limitations of the hardware, yes.
>> Again, if you want to experiment with pushing this functionality into GEOM,
>> be my guest.  But until then, consider following my advice.
>
> Speaking as a user who goes "huh, well" every time he sees a RAID
> card with a GB of cache talking to the OS in 64 kB chunks, eventually
> removing this limitation seems a Nice Thing to Have.
>
> I don't know how things look at the driver side, but GEOM by itself
> has no problems passing around requests of at least "long" in length.
> Specifically, it cares about struct bio, where bio_bcount is a "long"
> and bio_length and bio_completed are off_t. So, ssize_t looks ok as a
> high boundary.
>
> There was a time (apparently much of it was a bug in reporting and is
> now fixed :( ) when MAXPHYS could be manually redefined to be 256K or
> more and iostat would nicely state the higher value. I think the
> concern raised at the topic was that it doesn't play nice with
> bufcache, and I think the specific problem was possible out-of-memory
> situations. Now that kernel limits on AMD64 are much increased (to
> values no longer fitting in uint32_t) I wonder if the problem is so
> serious?
>

The problem is lack of kernel address space, not lack of RAM, but that's
just semantics in this discussion.  I've tested increasing MAXPHYS in
increments up to 1M.  Performance increases logarithmically, and effectively
hits a max at 512K for the variety of controllers that I tested.  The gain
from 64K to 128K is huge, the gain from 128K to 256K is ok, the gain from
256K to 512K is measurable but less significant, and the gain from 512K to
1M is almost not measurable.
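
As a back-of-envelope illustration only (a simplified model, not the
measurements described above): if each request carries some fixed cost,
then what matters is how many requests a transfer is split into.  The
helper below is a hypothetical stand-in; max_io plays the role of MAXPHYS.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of requests needed to move `total` bytes when a single request
 * may carry at most `max_io` bytes.  `max_io` stands in for MAXPHYS; the
 * function name and the model itself are illustrative assumptions.
 */
size_t
requests_needed(size_t total, size_t max_io)
{
	assert(max_io > 0);
	return ((total + max_io - 1) / max_io);	/* round up */
}
```

For a 1M transfer this gives 16 requests at 64K, 8 at 128K, 4 at 256K,
2 at 512K and 1 at 1M.  Each doubling saves half as many requests as the
previous one, which is at least consistent with the shrinking gains
reported above.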

I have simple patches to increase MAXPHYS.  The introduction of the
maxio parameter in the CAM SIM interface is there in preparation for this.
However, a _LOT_LOT_LOT_ of drivers in the tree falsely assume that
MAXPHYS and DEFLTPHYS are 128K and 64K respectively, and size their data
structures accordingly.  Changing these values will cause the drivers to
fail in bad ways.  So an audit needs to be done.  Also, MAXPHYS is abused
by the swapper in struct buf, so that needs to be reviewed as well.
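
To make the failure mode concrete, here is a minimal sketch of the kind
of thing an audit would look for.  Everything named xx_ or XX_ is
hypothetical and not taken from any driver in the tree, and PAGE_SIZE and
MAXPHYS are redefined locally with assumed values so the fragment stands
alone; in the kernel they come from the system headers.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the kernel constants; the values are assumptions. */
#ifndef PAGE_SIZE
#define PAGE_SIZE	4096
#endif
#ifndef MAXPHYS
#define MAXPHYS		(128 * 1024)
#endif

/* Fragile: only right while MAXPHYS is 128K and pages are 4K. */
#define XX_NSEGS_HARDCODED	33

/* Derived: one segment per page, plus one for an unaligned start. */
#define XX_NSEGS(maxio)	(((maxio) + PAGE_SIZE - 1) / PAGE_SIZE + 1)

/* Hypothetical per-command state for an imaginary "xx" driver. */
struct xx_sg_entry {
	unsigned long	addr;
	unsigned long	len;
};

struct xx_command {
	/* Sized from MAXPHYS, so it follows any change to that constant. */
	struct xx_sg_entry sg[XX_NSEGS(MAXPHYS)];
};

/* Does a worst-case transfer of `len` bytes fit in `nsegs` entries? */
int
xx_fits(size_t len, size_t nsegs)
{
	return (XX_NSEGS(len) <= nsegs);
}
```

With the assumed values, a 128K transfer fits in the hard-coded table,
but a 256K transfer needs up to 65 entries and would run off the end of
it.  A table sized with XX_NSEGS() grows along with the limit.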

Even though kernel address space is less restricted on 64-bit platforms,
it's still not free and limitless.  Large I/Os require more work in the
VM to assign address space, which in turn causes more lock contention.  I
haven't done any practical measurements of this on common workloads, but I
can anecdotally say that I see increased lock contention from it in
locking profiles.  If FreeBSD wants to seriously increase MAXPHYS, this
needs to be looked at and either proven to not be important, or fixed.

Scott
