Subject: Re: mfi driver performance too bad on LSI MegaRAID SAS 9260-8i
From: Michelle Sullivan <michelle@sorbs.net>
To: Borja Marcos, "O. Hartmann"
Cc: Jason Zhang, freebsd-performance@freebsd.org, freebsd-current@freebsd.org, freebsd-stable@freebsd.org, freebsd-hardware@freebsd.org
Date: Mon, 01 Aug 2016 19:30:43 +0200
Message-id: <579F8743.8030104@sorbs.net>
In-reply-to: <0CA1A1F1-AFDD-4763-84C3-2FC059F44789@sarenet.es>

Borja Marcos wrote:
>> On 01 Aug 2016, at 15:12, O. Hartmann wrote:
>>
>> First, thanks for responding so quickly.
>>
>>> - The third option is to make the driver expose the SAS devices like an HBA
>>> would do, so that they are visible to the CAM layer and the disks are handled
>>> by the stock "da" driver, which is the ideal solution.
>>
>> I didn't find any switch which offers me the opportunity to put the PRAID
>> CP400i into a simple HBA mode.
>
> The switch is in the FreeBSD mfi driver, the loader tunable I mentioned,
> regardless of what the card firmware does or pretends to do.
>
> It's not visible doing a "sysctl -a", but it exists, and it's even unique.
> It's defined here:
>
> https://svnweb.freebsd.org/base/stable/10/sys/dev/mfi/mfi_cam.c?revision=267084&view=markup
> (line 93)
>
>>> In order to do it you need a couple of things. You need to set the variable
>>> hw.mfi.allow_cam_disk_passthrough=1 and to load the mfip.ko module.
>>>
>>> When booting installation media, enter command mode and use these commands:
>>>
>>> -----
>>> set hw.mfi.allow_cam_disk_passthrough=1
>>> load mfip
>>> boot
>>> -----
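
A side note on the commands quoted above: once a system is installed, the usual place to make these settings permanent is /boot/loader.conf, which the loader processes before the kernel attaches the controller. A minimal sketch using the standard loader.conf spellings of the same tunable and module (whether the card's BIOS will still boot from a disk exposed this way is a separate question):

-----
# /boot/loader.conf
hw.mfi.allow_cam_disk_passthrough="1"   # expose disks behind mfi(4) to CAM / da(4)
mfip_load="YES"                         # load the mfip.ko pass-through module at boot
-----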

>> Well, I'm well aware of this problem and the solution (now), but I run into a
>> hen-and-egg problem, literally. As long as I can boot off the installation
>> medium, I have a kernel which deals with the setting. But the boot medium is
>> supposed to be an SSD sitting on the PRAID CP400i controller itself! So I'll
>> never be able to boot the system without crippling the ability to have the
>> full-speed ZFS configuration which I expect to get from HBA mode, but not
>> from any of the forced RAID modes offered by the controller.
>
> Been there plenty of times, even argued quite strongly about the advantages of
> ZFS against hardware-based RAID 5 cards. :) I remember when the Dell salesmen
> couldn't possibly understand why I wanted a "software based RAID rather than a
> robust, hardware based solution" :D

There are reasons for using either... Nowadays it seems the conversations have degenerated into something like Windows vs Linux vs Mac, where everyone thinks their answer is the right one (just as you (Borja Marcos) suggested you did with the Dell salesman), when in reality each has its own advantages and disadvantages.

E.g.: I'm running 2 ZFS servers on LSI 9260-16i's... big mistake! (the ZFS, not the LSIs)... One is a movie server, the other a PostgreSQL database server. Most would agree the latter is a bad use of ZFS; the die-hards won't, but then they don't understand database servers and how they work on disk. The former gets mixed views: some argue that ZFS is the only way to ensure the movies will always work; personally I think of all the years before ZFS, when my data on disk worked without failure until the disks themselves failed... and RAID stopped that happening... What suddenly changed? Are disks and RAM suddenly not reliable at transferring data?

Anyhow, back to the issue: there is another part of this particular hardware that people just throw away. The LSI 9260-* controllers have been designed to provide on-card hardware RAID. The caching, whether using the CacheCade SSD or just the onboard ECC memory, is *ONLY* used when running some sort of RAID set and logical volumes... This is why LSI recommend 'MegaCli -CfgEachDskRaid0': it does enable caching (a rough example of the full command is below, after the quoted text). A good read on how to set up something similar is here: https://calomel.org/megacli_lsi_commands.html (disclaimer: I haven't parsed it all, so the author could be clueless, but it seems to give generally good advice). Going the way of 'JBOD' is a bad thing to do, just don't; performance sucks. As for the recommended command above, I can't comment, because currently I don't use it, nor will I need to in the near future... but...

If you (O. Hartmann) want or need to use ZFS with any OS, including FreeBSD, don't go with the LSI 92xx series controllers; it's just the wrong thing to do. Pick an HBA that is designed to give you direct access to the drives, not one you have to kludge and cajole. That can include LSI controllers with caches that use the mfi driver, just not those that are not designed to work in a non-RAID mode (with or without the passthrough tunable/mode above).

> At worst, you can set up a simple boot from a thumb drive or, even better, a
> SATADOM installed inside the server. I guess it will have SATA ports on the
> mainboard. That's what I usually do. FreeNAS uses a similar approach as well.
> And some modern servers can also boot from an SD card, which you can use just
> to load the kernel.
>
> Depending on the number of disks you have, you can also sacrifice two to set
> up a mirror with a "normal" boot system, and use the rest of the disks for
> ZFS. Actually I've got an old server I set up in 2012. It has 16 disks, and I
> created a logical volume (mirror) with 2 disks for boot, the other 14 disks
> for ZFS.
>
> If I installed this server now I would do it differently, booting off a thumb
> drive. But I was younger and more naive :)
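
To make the -CfgEachDskRaid0 point above concrete: the kind of invocation the calomel page walks through creates one single-drive RAID0 logical volume per physical disk, with write-back and read-ahead enabled. This is only a sketch (the exact flag spellings and defaults vary between MegaCli versions and firmware), so check MegaCli's help output before running anything, and remember that write-back with a bad or absent BBU risks losing data on power failure:

-----
# One RAID0 logical drive per physical disk: write-back (WB), read-ahead (RA),
# direct I/O, keep write-back even if the BBU is bad (only sensible on a good UPS).
MegaCli -CfgEachDskRaid0 WB RA Direct CachedBadBBU -aALL

# Verify the resulting logical drives and their cache policies.
MegaCli -LDInfo -Lall -aALL
-----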

If I installed mine now I would do them differently as well... Neither would run ZFS; both would use their on-card RAID with UFS on top of it... ZFS would be reserved for the multi-user NFS file servers. (And trust me here: when it comes to media servers, where the media is just stored, not changed/updated/edited, the 9260-16i with a good high-speed SSD as CacheCade really performs well... and on a moderately powerful MB/CPU combo with good RAM and several gigabit interfaces it's surprising how many unicast transcoded media streams it can handle. Read: my twin fibres are saturated before the machine gets anywhere near full load, and I can still write to it at 13 MB/s from my old Mac Mini over NFS... which is about all the Mini can manage even without any load.)

So, the moral of the story/choices: don't go with ZFS because people tell you it's best, because it isn't; go with ZFS if it suits your hardware and application, and if ZFS suits your application, get hardware for it.

Regards,

--
Michelle Sullivan
http://www.mhix.org/