From owner-freebsd-arch@FreeBSD.ORG Sat Mar 20 20:41:36 2010
Message-ID: <4BA532FF.6040407@elischer.org>
Date: Sat, 20 Mar 2010 13:41:35 -0700
From: Julian Elischer
To: Alexander Motin
References: <4BA4E7A9.3070502@FreeBSD.org> <201003201753.o2KHrH5x003946@apollo.backplane.com> <891E2580-8DE3-4B82-81C4-F2C07735A854@samsco.org> <4BA52179.9030903@FreeBSD.org>
In-Reply-To: <4BA52179.9030903@FreeBSD.org>
Cc: Scott Long, FreeBSD-Current, freebsd-arch@freebsd.org
Subject: Re: Increasing MAXPHYS
List-Id: Discussion related to FreeBSD architecture

Alexander Motin wrote:
> Scott Long wrote:
>> On Mar 20, 2010, at 11:53 AM, Matthew Dillon
wrote:
>>> Diminishing returns get hit pretty quickly with larger MAXPHYS values.
>>> As long as the I/O can be pipelined the reduced transaction rate
>>> becomes less interesting when the transaction rate is less than a
>>> certain level. Off the cuff I'd say 2000 tps is a good basis for
>>> considering whether it is an issue or not. 256K is actually quite
>>> a reasonable value. Even 128K is reasonable.
>> I agree completely. I did quite a bit of testing on this in 2008 and 2009.
>> I even added some hooks into CAM to support this, and I thought that I had
>> discussed this extensively with Alexander at the time. Guess it was yet another
>> wasted conversation with him =-( I'll repeat it here for the record.

In the Fusion-io driver we find that the limiting factor is not the
size of MAXPHYS, but the fact that we cannot push more than 170k tps
through geom (on my test machine; I've seen more on some beefier
machines). That is only a limit on small transactions, though; for
large transfers the DMA engine tops out before a bigger MAXPHYS would
make any difference.

Where it may make a difference is that Linux, it looks like, only
pushes 128k at a time, so many hardware engines have likely never been
tested with anything greater (not sure about Windows). Some drivers
may also be written with the assumption that they will not see more.
Of course, they should be able to limit the transaction size down
themselves if they are written well.

>
> AFAIR at that time you've agreed that 256K gives improvements, and 64K
> of DFLTPHYS limiting most SCSI SIMs is too small. That's why you've
> implemented that hooks in CAM. I have not forgot that conversation (pity
> that it quietly died for SCSI SIMs). I agree that too high value could
> be just a waste of resources. As you may see I haven't blindly committed
> it, but asked public opinion. If you think 256K is OK - let it be 256K.
> If you think that 256K needed only for media servers - OK, but lets make
> it usable there.
>
>> Besides the nswbuf sizing problem, there is a real problem that a lot of drivers
>> have incorrectly assumed over the years that MAXPHYS and DFLTPHYS are
>> particular values, and they've sized their data structures accordingly. Before
>> these values are changed, an audit needs to be done OF EVERY SINGLE
>> STORAGE DRIVER. No exceptions. This isn't a case of changing MAXPHYS
>> in the ata driver, testing that your machine boots, and then committing the change
>> to source control. Some drivers will have non-obvious restrictions based on
>> the number of SG elements allowed in a particular command format. MPT
>> comes to mind (its multi message SG code seems to be broken when I tried
>> testing large MAXPHYS on it), but I bet that there are others.
>
> As you should remember, we have made it in such way, that all unchecked
> drivers keep using DFLTPHYS, which is not going to be changed ever. So
> there is no problem. I would more worry about non-CAM storages and above
> stuff, like some rare GEOM classes.
>
>> I'm fine with raising MAXPHYS in production once the problems are
>> addressed.
>
> That's why in my post I've asked people about any known problems. I've
> addressed several related issues in last months, and I am looking for
> more. To address problems, it would be nice to know about them first.
>
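
For anyone who wants to put numbers on the tps argument above, here is a
minimal sketch of the arithmetic relating per-I/O size to transaction
rate. The helper names xfer_count() and tps_needed() are hypothetical and
exist only for this illustration; they are not part of GEOM, CAM, or any
driver, and the throughput figure used in the note below is an
illustrative round number, not a measurement.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): number of transactions needed
 * to move total_bytes when a single transaction carries at most max_io
 * bytes, e.g. a MAXPHYS-like limit.  Rounds up without overflowing.
 */
static uint64_t
xfer_count(uint64_t total_bytes, uint64_t max_io)
{
	assert(max_io > 0);
	return (total_bytes / max_io + (total_bytes % max_io != 0));
}

/*
 * Hypothetical helper (illustration only): transactions per second needed
 * to sustain bytes_per_sec when every I/O is capped at max_io bytes.
 */
static uint64_t
tps_needed(uint64_t bytes_per_sec, uint64_t max_io)
{
	return (xfer_count(bytes_per_sec, max_io));
}
```

With an assumed stream of 250 MiB/s, tps_needed() gives 4000 tps at 64K
per I/O, 2000 tps at 128K, and 1000 tps at 256K. Under that assumption,
128K already sits at the 2000 tps level mentioned above, far below the
170k tps ceiling reported for geom.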