Date: Wed, 18 Jan 2012 10:00:05 +0530
From: "Desai, Kashyap" <Kashyap.Desai@lsi.com>
To: John <jwd@freebsd.org>, "Kenneth D. Merry" <ken@freebsd.org>
Cc: "freebsd-scsi@freebsd.org" <freebsd-scsi@freebsd.org>
Subject: RE: mps driver chain_alloc_fail / performance ?
Message-ID: <B2FD678A64EAAD45B089B123FDFC3ED7299D0AA748@inbmail01.lsi.com>
In-Reply-To: <20120117020218.GA59053@FreeBSD.org>
References: <20120114051618.GA41288@FreeBSD.org>
 <20120114232245.GA57880@nargothrond.kdm.org>
 <B2FD678A64EAAD45B089B123FDFC3ED7299CF90E7C@inbmail01.lsi.com>
 <20120117020218.GA59053@FreeBSD.org>
> -----Original Message-----
> From: John [mailto:jwd@freebsd.org]
> Sent: Tuesday, January 17, 2012 7:32 AM
> To: Desai, Kashyap; Kenneth D. Merry
> Cc: freebsd-scsi@freebsd.org
> Subject: Re: mps driver chain_alloc_fail / performance ?
>
> ----- Desai, Kashyap's Original Message -----
> > Which driver version is this ? Our 09.00.00.00 driver (which is in
> > the pipeline to be committed) has a chain buffer count of 2048.
>
> I'm not sure how to answer your question directly. We're using the
> driver that comes with FreeBSD, not a driver directly from LSI. If we
> can get a copy of your 9.0 driver, we can try testing against it.

If you run "sysctl -a | grep mps", you can see the driver version.

>
> > And our test team has verified it with 150+ drives.
>
> Currently, we have 8 shelves, 25 drives per shelf, dual attached and
> configured with geom multipath using Active/Active. Ignoring SSDs and
> OS disks on the internal card, we see 400 da devices on mps1 & mps2.
> For the record, the shelves are:
>
> ses0 at mps1 bus 0 scbus7 target 0 lun 0
> ses0: <HP D2700 SAS AJ941A 0131> Fixed Enclosure Services SCSI-5 device
> ses0: 600.000MB/s transfers
> ses0: Command Queueing enabled
> ses0: SCSI-3 SES Device
>
> > As suggested by Ken, can you try increasing MPS_CHAIN_FRAMES to 4096
> > or 2048?
>
> Absolutely. The current value is 2048.
> We are currently running with this patch to increase the value and
> output a single alerting message:
>
> --- sys/dev/mps/mpsvar.h.orig	2012-01-15 19:28:51.000000000 -0500
> +++ sys/dev/mps/mpsvar.h	2012-01-15 20:14:07.000000000 -0500
> @@ -34,7 +34,7 @@
>  #define MPS_REQ_FRAMES		1024
>  #define MPS_EVT_REPLY_FRAMES	32
>  #define MPS_REPLY_FRAMES	MPS_REQ_FRAMES
> -#define MPS_CHAIN_FRAMES	2048
> +#define MPS_CHAIN_FRAMES	4096
>  #define MPS_SENSE_LEN		SSD_FULL_SIZE
>  #define MPS_MSI_COUNT		1
>  #define MPS_SGE64_SIZE		12
> @@ -242,8 +242,11 @@
>  		sc->chain_free--;
>  		if (sc->chain_free < sc->chain_free_lowwater)
>  			sc->chain_free_lowwater = sc->chain_free;
> -	} else
> +	} else {
>  		sc->chain_alloc_fail++;
> +		if (sc->chain_alloc_fail == 1)
> +			device_printf(sc->mps_dev, "Insufficient chain_list buffers.\n");
> +	}
>  	return (chain);
>  }
>
> If the logic for outputting the message is appropriate, I think
> it would be nice to get it committed.

If this works for you and you really want to commit it, I would suggest
adding a module parameter to pass the chain_max value.

Basically, the current implementation is not the correct way to handle
the out-of-chain scenario. The driver should calculate the maximum
number of chains required per HBA at run time from the IOC facts reply
from the FW, and it should try to allocate that many chain buffers at
run time (instead of having a #define for the chain maximum). If the
driver cannot get that memory from the system at run time, it should
fail to detect the HBA at load time.

From our Linux driver logs, I find we need 29700 chain buffers per HBA
(SAS2008 PCI-Express). So it is better to increase MPS_CHAIN_FRAMES to
(24 * 1024) until we have more robust support in the driver.

Hope this helps you.

~ Kashyap

>
> > ~ Kashyap
> >
> > > Kenneth D. Merry said:
> > >
> > > The firmware on those boards is a little old. You might consider
> > > upgrading.
>
> We updated the FW this morning and we're now showing:
>
> mps0: <LSI SAS2116> port 0x5000-0x50ff mem 0xf5ff0000-0xf5ff3fff,0xf5f80000-0xf5fbffff irq 30 at device 0.0 on pci13
> mps0: Firmware: 12.00.00.00
> mps0: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
> mps1: <LSI SAS2116> port 0x7000-0x70ff mem 0xfbef0000-0xfbef3fff,0xfbe80000-0xfbebffff irq 48 at device 0.0 on pci33
> mps1: Firmware: 12.00.00.00
> mps1: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
> mps2: <LSI SAS2116> port 0x6000-0x60ff mem 0xfbcf0000-0xfbcf3fff,0xfbc80000-0xfbcbffff irq 56 at device 0.0 on pci27
> mps2: Firmware: 12.00.00.00
> mps2: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
>
> We last updated around November of last year.
>
> > > > # camcontrol inquiry da10
> > > > pass21: <HP EG0600FBLSH HPD2> Fixed Direct Access SCSI-5 device
> > > > pass21: Serial Number 6XR14KYV0000B148LDKM
> > > > pass21: 600.000MB/s transfers, Command Queueing Enabled
> > >
> > > That's a lot of drives! I've only run up to 60 drives.
>
> See above. In general, I'm relatively pleased with how the system
> responds with all these drives.
>
> > > > When running the system under load, I see the following reported:
> > > >
> > > > hw.mps.2.allow_multiple_tm_cmds: 0
> > > > hw.mps.2.io_cmds_active: 0
> > > > hw.mps.2.io_cmds_highwater: 1019
> > > > hw.mps.2.chain_free: 2048
> > > > hw.mps.2.chain_free_lowwater: 0
> > > > hw.mps.2.chain_alloc_fail: 13307   <---- ??
>
> The current test case run is showing:
>
> hw.mps.2.debug_level: 0
> hw.mps.2.allow_multiple_tm_cmds: 0
> hw.mps.2.io_cmds_active: 109
> hw.mps.2.io_cmds_highwater: 1019
> hw.mps.2.chain_free: 4042
> hw.mps.2.chain_free_lowwater: 3597
> hw.mps.2.chain_alloc_fail: 0
>
> It may be a few hours before it progresses to the point where it
> ran low last time.
>
> > > Bump MPS_CHAIN_FRAMES to something larger. You can try 4096 and
> > > see what happens.
>
> Agreed. Let me know if you think there is anything we should add to
> the patch above.
>
> > > > A few layers up, it seems like it would be nice if the buffer
> > > > exhaustion was reported outside of debug being enabled... at
> > > > least maybe the first time.
> > >
> > > It used to report being out of chain frames every time it happened,
> > > which wound up being too much. You're right, doing it once might
> > > be good.
>
> Thanks, that's how I tried to put the patch together.
>
> > > Once you bump up the number of chain frames to the point where you
> > > aren't running out, I doubt the driver will be the big bottleneck.
> > > It'll probably be other things higher up the stack.
>
> Question: what "should" the layer of code above the mps driver do if
> the driver returns ENOBUFS? I'm wondering if it might explain some
> incorrect results.
>
> > > What sort of ZFS topology did you try?
> > >
> > > I know for raidz2, and perhaps for raidz, ZFS is faster if your
> > > number of data disks is a power of 2.
> > >
> > > If you want raidz2 protection, try creating arrays in groups of
> > > 10, so you wind up having 8 data disks.
>
> The fastest we've seen is with a pool made of mirrors, though this
> uses up the most space. It also caused the most alloc fails (which
> leads to my question about ENOBUFS).
>
> Thank you both for your help. Any comments are always welcome! If I
> haven't answered a question, or otherwise said something that doesn't
> make sense, let me know.
>
> Thanks,
> John