From: Garance A Drosehn <gad@FreeBSD.org>
Date: Mon, 27 Jun 2011 14:39:52 -0400
To: mdf@FreeBSD.org
Cc: freebsd-fs@FreeBSD.org, Robert Watson
Subject: Re: [rfc] 64-bit inode numbers

On 6/24/11 6:21 PM, mdf@FreeBSD.org wrote:
> On Fri, Jun 24, 2011 at 2:07 PM, Garance A Drosehn wrote:
>
>> The AFS cell at RPI has approximately 40,000 AFS volumes, and each
>> volume should have its own dev_t (IMO).
>>
>> Please realize that I do not mind if people feel that there is no
>> need to increase the size of dev_t at this time, and that we should
>> wait until we see more of a demand for increasing it.  But given the
>> project to increase the size of inode numbers, I thought this was a
>> good time to also ask about dev_t.  I ask about it every few years :-)
>>
> I don't see why 32 bits are anywhere close to becoming tight to
> represent 40k unique values.  Is there something wrong with how each
> new dev_t is computed, that runs out of space quicker than this
> implies?
>
> Thanks,
> matthew

The 40K values are just for the AFS volumes at RPI.  AFS presents the
entire world as a single filesystem, with the RPI cell as just one
small part of that worldwide filesystem.  The public CellServDB.master
file lists 200 cells, all of which would be available at the same time
to any user sitting on a single machine which has AFS installed.  And
those are just the official public AFS cells.  Organizations can (and
do) have private AFS cells which are not part of the official public
list.

I mentioned the 40K volumes at RPI because someone said "I do not
expect to see hundreds of thousands of mounts on a single system".
My example was just to show that I can access 40 thousand AFS volumes
in a single unix *command*, without even leaving RPI.
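
To make the dev_t question more concrete, here is a rough sketch of
the problem as I see it (this is not OpenAFS code, and the helper
afs_cell_index() is made up for illustration).  Each AFS volume is
identified by a (cell, volume ID) pair, and volume IDs are 32-bit
values that are only unique within their own cell.  With a 64-bit
dev_t a client could pack a locally-assigned cell index next to the
full volume ID; with a 32-bit dev_t it has to fold the two values
together, so two distinct volumes can land on the same dev_t:

    #include <stdint.h>

    /* Hypothetical helper: a small per-client index assigned to
     * each cell this client knows about (e.g. from CellServDB). */
    uint32_t afs_cell_index(const char *cellname);

    /* With a 64-bit dev_t the full (cell, volume) identity fits:
     * cell index in the high 32 bits, volume ID in the low 32. */
    static inline uint64_t
    afs_dev64(const char *cell, uint32_t volid)
    {
            return ((uint64_t)afs_cell_index(cell) << 32) | volid;
    }

    /* With a 32-bit dev_t the two values must be folded together
     * (here with a simple multiplicative hash), so volumes from
     * different cells can collide on the same dev_t. */
    static inline uint32_t
    afs_dev32(const char *cell, uint32_t volid)
    {
            return (afs_cell_index(cell) * 2654435761u) ^ volid;
    }

The 64-bit version never collides; the 32-bit version can collide as
soon as you talk to more than one cell, and stat() would then report
two different files as having the same (st_dev, st_ino) pair, which
confuses tools like find, tar, and cp that rely on that pair being
unique.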
That example was not meant to show how many volumes are reachable
under all of /afs.  Also, it was really easy for me to come up with
the number of AFS volumes in the RPI cell.  I'd be reluctant to try
to probe all of the publicly-reachable AFS cells to come up with a
real number for how many AFS volumes there are in the world.

(Aside: actually there are more like 60K AFS volumes at RPI, but at
least 20K of those are not readily accessible via unix commands, so
I said 40K.  And most users at RPI couldn't even access 40K of those
AFS volumes, but I suspect I can because I'm an AFS admin.)

One reason RPI has so many AFS volumes is that each user has their
own AFS volume for their home directory.  Given the way AFS works,
that is a very reasonable thing to do.  In fact, it would almost be
stupid to *not* give every user their own AFS volume.

Now imagine the WWW, where every single http://www.place.tld/~username
on the entire planet were on a different disk volume, and any single
user on a single system could access any combination of those disk
volumes within a single login session.  The WWW is a world-wide web.
AFS is meant as a world-wide distributed file system.  When working
on a world-wide scale, you hit larger numbers.  I think many people
who have not worked with AFS keep thinking of it the same way they
think of NFS, but AFS was designed with much larger-scale deployment
in mind.

Again, I don't mind if we don't wish to tackle a larger dev_t right
now, and I definitely do not want the 64-bit ino_t project to get
bogged down in a debate over a larger dev_t.  But I have been working
with OpenAFS for ten years now, and it is definitely true that a
larger dev_t would be helpful for that specific filesystem.  It may
be that some other solution would be even better, so I don't want to
push this one too much.

-- 
Garance Alistair Drosehn            =     gad@gilead.netel.rpi.edu
Senior Systems Programmer           or    gad@freebsd.org
Rensselaer Polytechnic Institute    or    drosih@rpi.edu