From: "Kamal R. Prasad" <kamalp@acm.org>
Date: Thu, 02 Dec 2004 23:32:54 +0530
To: Andre Oppermann
Cc: Sam, Scott Long, hackers@freebsd.org, current@freebsd.org
Subject: Re: My project wish-list for the next 12 months

Andre Oppermann wrote:
> Sam wrote:
>
>> On Thu, 2 Dec 2004, Andre Oppermann wrote:
>>
>>> Scott Long wrote:
>>>
>>>> 5. Clustered FS support.  SANs are all the rage these days, and
>>>> clustered filesystems that allow data to be distributed across many
>>>> storage endpoints and accessed concurrently through the SAN are very
>>>> powerful.  RedHat recently bought Sistina and re-opened the GFS source
>>>> code, so exploring this would be very interesting.
>>>
>>> There are certain steps that can be taken one at a time.  For example,
>>> it should be relatively easy to mount snapshots (ro) from more than one
>>> machine.  The next step would be to mount a full 'rw' filesystem as 'ro'
>>> on other boxes.  This would require cache and sector invalidation
>>> broadcasting from the 'rw' box to the 'ro' mounts.  The holy grail, of
>>> course, is to mount the same filesystem 'rw' on more than one box,
>>> preferably more than two.  This requires some more involved
>>> synchronization and locking on top of the cache invalidation, plus
>>> making sure that the multi-'rw' cluster stays alive if one of the
>>> participants freezes and stops responding.
>>>
>>> Scrolling through the UFS/FFS code, I think the first one is 2-3 days
>>> of work, the second 2-4 weeks, and the third 2-3 months to get it
>>> right.  If someone would throw up the money...
>>
>> You might also design in consideration for data redundancy.  Right now
>> GFS largely relies on the SAN box to export already-redundant RAID
>> disks.  GFS sits on a "cluster aware" LVM layer that is supposed to be
>> able to do mirroring and striping, but I'm told it's not stable enough
>> for "production" use.
>
> Data redundancy would require a UFS/FFS redesign.  I'm 'only' talking
> about enhancing UFS/FFS while keeping everything on-disk the same (plus
> some more elements).

If you add redundancy code into UFS/FFS, it will slow down performance
even for those not seeking redundancy.  A better way would be to have
another filesystem implementation, like VxFS (the Veritas filesystem).
I'm not sure whether they have published papers or put their techniques
into the public domain.

regards
-kamal
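
To make Andre's second step above a bit more concrete, here is a minimal
sketch in C of the kind of cache/sector invalidation record the 'rw' box
might broadcast to the 'ro' mounts after a write.  All of the structure,
field, and constant names here are made up for illustration; nothing in
this sketch exists in FreeBSD or UFS/FFS today.

    /*
     * Hypothetical sketch only: the kind of message an 'rw' node could
     * broadcast so that 'ro' peers drop stale cached buffers.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* What kind of cached object the peers must throw away. */
    enum inval_type {
            INVAL_SECTOR = 1,       /* a raw disk sector range  */
            INVAL_INODE  = 2,       /* an in-core inode         */
            INVAL_DIRBLK = 3        /* a cached directory block */
    };

    /* One invalidation record, carried over whatever transport the SAN offers. */
    struct inval_msg {
            uint32_t  im_magic;     /* sanity check                 */
            uint32_t  im_seq;       /* per-sender sequence number   */
            uint16_t  im_type;      /* one of enum inval_type       */
            uint16_t  im_pad;
            uint64_t  im_start;     /* first sector or inode number */
            uint64_t  im_count;     /* how many objects to drop     */
    };

    #define INVAL_MAGIC     0x494e564cU     /* "INVL" */

    /* Fill in a message for a sector range just rewritten by the 'rw' node. */
    static void
    inval_msg_init(struct inval_msg *m, uint32_t seq, uint64_t sector,
        uint64_t count)
    {
            memset(m, 0, sizeof(*m));
            m->im_magic = INVAL_MAGIC;
            m->im_seq = seq;
            m->im_type = INVAL_SECTOR;
            m->im_start = sector;
            m->im_count = count;
    }

    int
    main(void)
    {
            struct inval_msg m;

            /* The 'rw' node rewrote 16 sectors starting at 123456. */
            inval_msg_init(&m, 1, 123456, 16);
            printf("would broadcast: seq=%u type=%u start=%llu count=%llu\n",
                (unsigned)m.im_seq, (unsigned)m.im_type,
                (unsigned long long)m.im_start,
                (unsigned long long)m.im_count);
            return (0);
    }

Each 'ro' mount receiving such a record would invalidate or re-read the
affected buffers.  The multi-'rw' case Andre describes as the third step
would additionally need ordering guarantees and a distributed lock manager
on top of this basic invalidation traffic.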