From owner-freebsd-commit  Thu Oct 26 14:33:33 1995
Return-Path: owner-commit
Received: (from root@localhost)
          by freefall.freebsd.org (8.6.12/8.6.6) id OAA27692
          for freebsd-commit-outgoing; Thu, 26 Oct 1995 14:33:33 -0700
Received: (from root@localhost)
          by freefall.freebsd.org (8.6.12/8.6.6) id OAA27674
          for cvs-all-outgoing; Thu, 26 Oct 1995 14:33:28 -0700
Received: (from root@localhost)
          by freefall.freebsd.org (8.6.12/8.6.6) id OAA27663
          for cvs-sys-outgoing; Thu, 26 Oct 1995 14:33:24 -0700
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id OAA27633
          ; Thu, 26 Oct 1995 14:32:54 -0700
Received: (from terry@localhost)
          by phaeton.artisoft.com (8.6.11/8.6.9) id OAA21688;
          Thu, 26 Oct 1995 14:24:07 -0700
From: Terry Lambert
Message-Id: <199510262124.OAA21688@phaeton.artisoft.com>
Subject: Re: SYSCALL IDEAS [Was: cvs commit: src/sys/kern sysv_msg.c sysv_sem.c sysv_shm.c]
To: dyson@freefall.freebsd.org (John Dyson)
Date: Thu, 26 Oct 1995 14:24:07 -0700 (MST)
Cc: terry@lambert.org, bde@zeta.org.au, CVS-commiters@freefall.freebsd.org,
    bde@freefall.freebsd.org, cvs-sys@freefall.freebsd.org,
    hackers@freebsd.org, swallace@ece.uci.edu
In-Reply-To: <199510261356.GAA06844@freefall.freebsd.org> from "John Dyson" at Oct 26, 95 06:56:19 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 4035
Sender: owner-commit@freebsd.org
Precedence: bulk

> The VM limitations are due to extreme performance hits if we have to
> go to a (long long) type representation of a page.  I propose that we
> (in fact we will) move from page offsets to page indexes.  For a 32 bit
> machine, it gives us at least 8TB (and probably more with sign extension
> fixes) for file/filesystem sizes.  For a 64 bit machine it would give us
> even more (by a factor of 4,000,000).
> As far as the API is concerned,
> we can use whatever we want -- long, long long, or whatever.  We will
> just have that terrible, limiting capability of supporting only 8TB files on
> 32 bit machines with 4K pagesizes.

I think we have to work in the scope of device block addressing, which
is 2^31 * 512 or 1TB, as opposed to page addressing at 4TB.  The 8TB
figure, I think, represents the value without the use of the "improved"
UFS indirect block handling that came with 4.4.

There are a number of ideas (like page anonymity based page protection)
which can't be implemented without a large statistical range for the
hash -- typically larger than 32 bits, with machine memory sizes running
into the 100's of MB.

For the Alpha, at least (a nominally 64 bit machine), the address range
for real memory is restricted to less than 64 bits.

The problem that arises in all these cases is that the buffer cache
mapping of file data for large ranges requires a linear relationship
throughout the file instead of a smaller linear "window" onto the file.

Addressing the issue completely satisfactorily for 64 bit block and file
offsets on 32 bit Intel architectures would require a domain/range based
offset + length mapping, allowing multiple mapping windows per file.

Such an approach would not have the "long long" drawbacks that would
otherwise be introduced, though there would be *some* (lesser) overhead
involved.

> We haven't worked on the sign extension problems, because simply we do not
> support files > 4GB (or is it 2GB???) period right now.  I don't believe that
> there is a problem with block devices (we do NOT use vmio for those.)  But
> additionally we do not support mmaping them right now.

It's 2G of file, 1T of file system at present, with a single 32 bit
sector offset for a max of 8G based on the DOS partitioning and
disklabel issues.  I.e., a very big partition is allowable, but it must
start in the 0-8G range, and disklabel limits the length.
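
As a rough check on the figures above, here is a minimal C sketch of the
arithmetic.  The helper and constant names are hypothetical (they are not
taken from any FreeBSD header); it only assumes a signed 32 bit index, so
31 usable bits, applied at byte, 512 byte sector, and 4K page granularity:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants; the names are assumptions, not kernel symbols. */
#define SECTOR_BYTES      512u   /* bytes per device block */
#define PAGE_BYTES_4K     4096u  /* bytes per page on a 32 bit Intel machine */
#define SIGNED_32_BITS    31u    /* usable bits in a signed 32 bit index */

/*
 * Bytes reachable by an index that is `index_bits` wide when each
 * index value names one `unit`-byte object (byte, sector, or page).
 */
static uint64_t
addressable_bytes(unsigned index_bits, uint64_t unit)
{
	assert(index_bits < 64);             /* keep the shift well defined */
	uint64_t count = (uint64_t)1 << index_bits;
	assert(unit <= UINT64_MAX / count);  /* product must fit in 64 bits */
	return count * unit;
}

/* Signed 32 bit byte offset: the 2G file limit. */
static uint64_t
file_offset_limit(void)
{
	return addressable_bytes(SIGNED_32_BITS, 1);
}

/* Signed 32 bit sector number: 2^31 * 512, the 1T file system limit. */
static uint64_t
device_block_limit(void)
{
	return addressable_bytes(SIGNED_32_BITS, SECTOR_BYTES);
}

/* Signed 32 bit page index with 4K pages: 2^31 * 4096, the quoted 8TB. */
static uint64_t
page_index_limit(void)
{
	return addressable_bytes(SIGNED_32_BITS, PAGE_BYTES_4K);
}
```

Under those assumptions the three helpers give 2^31, 2^40, and 2^43 bytes
respectively, which is where the 2G, 1T, and 8TB numbers in this thread
come from.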

I'd like to address the 2G file size problem.  The 2G limit currently
makes the use of quad off_t's in our internal interfaces a laughable and
gratuitous barrier to source compatibility with legacy code (the only
kind we have, unless you know about a commercial venture that I don't).

> The changes have been so vast that there has been significant ugliness
> added to the code.  That is being worked on, and I suggest that if there
> are some architectural problems that you see -- 'corrected' code would be
> helpful.  Note also some sort of performance analysis and architectural
> impact review is desirable.  All I can say is that I spent nearly a year
> working on the most horrible OS code that I ever saw -- SVR4, and I don't
> want us to go down the low performance path that they did.  They got both
> the hackery and low performance.  At least we are working on cleaning up
> the hackery aspects, including that which was inherited from Mach (because
> of the differences in the philosophy -- Mach VM and the original BSD port
> was certainly interesting.)

I agree with all of this.  It seems that the most interesting places to
work are the boundaries, and right now the 2G file size limit is one of
those, at least for me.  At the very least, I'd like to see the system
limits on VM mappable space go away as part of the necessary changes.

Has any consideration been given to pulling in the NetBSD non-vmio based
changes, or to making the vmio/non-vmio switch a bit smoother and less
intrusive?


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.