Date: Tue, 22 Mar 2016 13:37:52 -0700
From: Doug Ambrisko <ambrisko@ambrisko.com>
To: Garrett Wollman <wollman@csail.mit.edu>
Cc: freebsd-stable@freebsd.org
Subject: Re: Hangs with mrsas?
Message-ID: <20160322203752.GA73172@ambrisko.com>
In-Reply-To: <22257.42636.358484.165317@khavrinen.csail.mit.edu>
References: <22237.53738.967189.432979@khavrinen.csail.mit.edu> <20160322184238.GA58487@ambrisko.com> <22257.42636.358484.165317@khavrinen.csail.mit.edu>

On Tue, Mar 22, 2016 at 04:09:48PM -0400, Garrett Wollman wrote:
| <<On Tue, 22 Mar 2016 11:42:38 -0700, Doug Ambrisko <ambrisko@ambrisko.com> said:
|
| > You could try:
| > https://people.freebsd.org/~ambrisko/mrsas.patch
|
| I take it that the important part of this patch is changing the DMA
| tag and scatter/gather setup to allow 64-bit addresses? (Why would
| the original driver have been limited to 32-bit addresses? It's quite
| new hardware!)

Yes, primarily ... there are some other things, such as letting the OS
set things up, especially in the ioctl path, since user-land probably
won't set up a proper SG list for the kernel. The DMA address space for
the card was limited to 256K in 32 bit address space, so it didn't take
much to fragment it, and then allocations could fail or have to wait to
get memory. On initial boot things worked "okay", but after some run
time with our appliance (we run 64 bit) memory allocations would have
issues.

We found this was made worse with RAID cards that didn't have cache.
I assume no cache makes I/O operations take longer and so ties up
memory longer. With the same SW running on cards with cache we didn't
see these issues, so I assume they completed fast enough not to hold
onto memory for very long.

With these changes our appliances without RAID cache run faster and
don't run into "strange" issues now. We run in RAID 10 mode. The patch
also adds RAID card event messages to dmesg. On the plus side, this
code exposed a VM bug in 9.2 for us!

There is still a bug: with a card without cache, if I send lots of
management commands quickly to reconfigure the RAID, the driver reports
that the firmware had an OCR issue and never recovers. If I put a
sleep 1 after each command then it is okay. I need to try this again
and dump the term log to see if the firmware will give me a clue.

With the cards that we are currently using, the RAID cache is an
option, so the only thing I'm changing is the HW and not the firmware.
However, the firmware seems to flip itself into a different device when
I add or remove cache.

Thanks,

Doug A.
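
For anyone wanting to script the "sleep 1 after each command" workaround described above, here is a minimal sketch. It is not part of the patch or the driver: run_throttled and the DELAY variable are hypothetical names, and the management commands themselves are whatever tool you already use, fed in one per line.

```sh
#!/bin/sh
# Hypothetical helper (not part of mrsas or the patch): read one command
# per line from stdin and run each in turn, sleeping after every command
# so the firmware is not sent management requests back to back.
# DELAY is an assumed knob; it defaults to 1 second to match "sleep 1".
run_throttled() {
    delay="${DELAY:-1}"
    while IFS= read -r cmd; do
        # Skip blank lines in the command list.
        [ -n "$cmd" ] || continue
        # Run the command with stdin detached so it cannot swallow the
        # rest of the command list; stop at the first failure.
        sh -c "$cmd" </dev/null || return 1
        sleep "$delay"
    done
}
```

With the function defined in the current shell, something like `run_throttled < raid-commands.txt` would run the list, where raid-commands.txt is an assumed file holding one management command per line. Stopping at the first failure is a design choice in this sketch, so a half-applied reconfiguration is noticed rather than pushed further.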