Date: Tue, 14 Jul 1998 06:01:13 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: grog@lemis.com (Greg Lehey)
Cc: tlambert@primenet.com, gibbs@plutotech.com, andre@pipeline.ch, Matthew.Alton@anheuser-busch.com, Hackers@FreeBSD.ORG
Subject: Re: Software RAID-5 performance
Message-ID: <199807140601.XAA00726@usr06.primenet.com>
In-Reply-To: <19980714122952.L754@freebie.lemis.com> from "Greg Lehey" at Jul 14, 98 12:29:52 pm

> Non-interleaved I/O, on the other hand, can be a big penalty (if we're
> talking about the same thing). If I have an array with 5 drives, each
> capable of a realistic 5 MB/s, and a stripe width of 64 kB, and I
> write 256 kB to it, I need to do:

We are talking about read(2) calls occurring serially, and write(2)
calls occurring non-serially, presuming write(2) is implemented
correctly for the non-blocking I/O case.

This means that reads will trigger a fetch, but return an EWOULDBLOCK,
and writes copy to a buffer (presuming, as you did, that you will be
writing a stripe, which puts it on a page boundary with a page
increment).

> 2. Calculate parity. On the 486/66, this looks like being about 8
> ms.

This is the overhead I was referring to.

> 3. Write the blocks. If you can do this in parallel, it'll take
> about 13 ms. Serially, it'll take about 50 ms.

Writes occur in parallel if they are queued when the write is
requested, and success is indicated by permission to write and
available buffer space. Hard errors are a separate issue.

Due to the nature of an async fd, I think it's safe to say that
writes, at least in page increments on page boundaries, complete
immediately if there is buffer space, and therefore writes from
multiple user space threads are interleaved.

For this case, the 8 ms parity calculation takes it from 13 ms to
21 ms, or to be generous, one and a half times slower.

> > Software RAID is a data integrity issue, not a performance one,
> > and I think making the performance argument for whatever reason
> > (protection domain crossing, interleaved I/O, SMP scalability,
> > etc.) is a strawman at best.
>
> I'm not sure that I understand what you're saying here. Obviously
> offloading the checksum calculation (or anything else, for that
> matter) to an external box will offload the CPU. And I can't see any
> particular difference in data integrity between the two approaches.

If you have a specific need for RAID-5 assurances, then performance is
a secondary consideration.

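The non-blocking read(2)/write(2) behaviour described above can be
sketched roughly as follows. This is only an illustration, not
RAIDframe or kernel code: a pipe stands in for the async fd (an
assumption; O_NONBLOCK traditionally has no effect on regular disk
files), the 4096-byte "page" is a hypothetical size, and the helper
names set_nonblocking(), read_would_block() and write_into_buffer()
are made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Switch a descriptor to non-blocking mode; 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * read(2) on an empty non-blocking pipe: nothing is buffered yet, so the
 * call returns at once with EWOULDBLOCK/EAGAIN instead of sleeping.
 * Returns 1 if that happened, 0 if it did not, -1 on setup failure.
 */
int read_would_block(void)
{
    int p[2], result;
    char buf[16];
    ssize_t n;

    if (pipe(p) == -1)
        return -1;
    if (set_nonblocking(p[0]) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    n = read(p[0], buf, sizeof buf);
    result = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    close(p[0]);
    close(p[1]);
    return result;
}

/*
 * write(2) on an empty non-blocking pipe: while buffer space is available
 * the data is copied into the buffer and the call returns immediately.
 * Returns the byte count reported by write(2), or -1 on failure.
 */
ssize_t write_into_buffer(size_t len)
{
    static char chunk[4096];    /* one hypothetical "page" of data */
    int p[2];
    ssize_t n;

    assert(len <= sizeof chunk);
    if (pipe(p) == -1)
        return -1;
    if (set_nonblocking(p[1]) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    memset(chunk, 0xA5, len);
    n = write(p[1], chunk, len);
    close(p[0]);
    close(p[1]);
    return n;
}
```

Calling read_would_block() and write_into_buffer(4096) from a small
main() should show the asymmetry: the read comes back immediately
without data, while the write is accepted in full as long as the pipe
buffer has room for it.
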
The next consideration after assumed fault tolerance requirements is
the performance/money trade-off for hardware RAID-5 vs. software.

I think performance will be secondary, so a performance argument is
really secondary; you can throw money at the RAID-5 performance issues
to make them go away.

So even if there is a significant performance penalty (1.538 times
slower is a significant penalty, IMO), if your application requires
RAID-5, then it requires it at any cost.

And if performance isn't an issue at that point, then pointing at user
space threads as a bottleneck is the wrong thing to do (and it isn't
even the bottleneck it is blamed as being; the overhead from
non-interleaved reads (which are effectively interleaved read-ahead
requests, followed by serial copy-from-cache, which means "about as
fast as you can get, since you have to copy anyway") is negligible
compared to the overhead you are already willing to eat to get the
fault tolerance).

There's a tiny increment, true, but it's much less than the additional
overhead you'd get from kernel thread context switching on a UP
kernel, or even on an SMP kernel, if you didn't have thread-CPU
affinity.

So the fact that FreeBSD currently has user space threads is pretty
much a red herring. The performance penalty for user space threads,
and the performance benefit for kernel space threads, won't affect
this particular (I/O bound) application anyway, unless your box (1) is
SMP and (2) has CPU affinity code; if it does, you can cut the time
from 16 ms for two operations to 8 ms for two operations (still 128%
of the time it would take with hardware RAID-5).

The losses from user space threading in RAIDFrame are negligible.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message