Date:      Thu, 1 Sep 2005 19:20:33 +0300 (EEST)
From:      Dmitry Pryanishnikov <dmitry@atlantis.dp.ua>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        freebsd-arch@freebsd.org
Subject:   Re: kern/85503: panic: wrong dirclust using msdosfs in RELENG_6
Message-ID:  <20050901183311.D62325@atlantis.atlantis.dp.ua>


Hello!

> Date:      Thu, 1 Sep 2005 23:47:15 +1000 (EST)
> From:      Bruce Evans <bde@zeta.org.au>
>>        LIST_ENTRY(vnode)       v_hashlist;
>>        u_int                   v_hash;
>>
>> I think it's feasible and useful to upgrade type of v_hash to at least 
>> off_t.
>
> This is not needed yet.
>
> Filesystems with more than 4G files are not supported yet, since ino_t
> is 32 bits and is used in critical APIs (struct stat...).  Also,

  Sorry, I don't agree with you. The current situation is ugly: not only
does it force us to play dirty tricks within filesystems in order to
generate unique 32-bit inode numbers, but it also creates an artificial
limit on the maximum number of files for 32-bit architectures. E.g., on
FreeBSD/ia64 u_int is 64 bits, and thus it would be no problem for its API
to create and handle more than 4G files/fs. But such a file system will be
incompatible with FreeBSD/i386! Isn't this ugly? u_int has nothing to do
with storage size, while off_t has. It is clear that no medium whose size
fits in off_t can hold more files than off_t can count, while we can't
guarantee this for u_int, which is bound to CPU capabilities. I think UNIX
is about compatibility between different architectures, isn't it?
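
  A minimal sketch of the concern, assuming a 32-bit v_hash (as on i386)
and a 64-bit off_t. The type and function names are hypothetical
illustrations, not the kernel's code: two file IDs that differ only above
bit 31 become indistinguishable once narrowed to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two candidate widths of v_hash. */
typedef uint32_t hash32_t;	/* assumption: u_int is 32 bits */
typedef int64_t  hash64_t;	/* assumption: off_t is 64 bits */

/* Narrowing a 64-bit file ID to 32 bits drops the upper half. */
static hash32_t
narrow_hash(uint64_t file_id)
{
	return (hash32_t)file_id;
}

/* A 64-bit hash keeps every ID below 2^63 distinct. */
static hash64_t
wide_hash(uint64_t file_id)
{
	return (hash64_t)file_id;
}

/* Two IDs exactly 2^32 apart: same narrow hash, different wide hash. */
static void
self_check(void)
{
	uint64_t a = 5;
	uint64_t b = 5 + (1ULL << 32);

	assert(narrow_hash(a) == narrow_hash(b));
	assert(wide_hash(a) != wide_hash(b));
}
```

Calling self_check() passes silently; the point is only that uniqueness
has to be manufactured by the filesystem when the hash is narrower than
the ID space.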

> So all current file systems need to generate unique 32-bit inode
> numbers.  This may be difficult, but once it is done I think the inode
                  ^^^^^^^^^^^^^^^^

   ...and may be close to impossible. What if, e.g., Microsoft invents,
say, FAT-2005 with variable-length directory entries? I'm not sure that it
would be possible to generate a 32-bit pseudo-inode for every third-party
filesystem. And it's very bad that we can't handle >4G files/fs at all.

> For msdosfs, the inode number is essentially the byte offset divided by
> the size of a directory entry.  The size is 32, so this breaks at a byte
> offset of 128G instead of 4G.  Details:

  This is also imperfect: it creates a lot of pain and limitations for

options         MSDOSFS_LARGE
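
To put numbers on the quoted limit, here is a small sketch of the
arithmetic. The helper names are hypothetical and this is not the actual
msdosfs code; it only assumes what the quote states: a pseudo-inode number
equal to the byte offset of the directory entry divided by the 32-byte
entry size, stored in a 32-bit ino_t.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one directory entry in bytes, as stated in the quoted text. */
#define DIRENT_SIZE 32u

/* Hypothetical helper: pseudo-inode number for a directory entry that
 * starts at the given byte offset on the volume. */
static uint64_t
pseudo_ino(uint64_t dirent_byte_offset)
{
	return dirent_byte_offset / DIRENT_SIZE;
}

/* Returns 1 if the pseudo-inode number still fits in a 32-bit ino_t. */
static int
fits_in_ino32(uint64_t dirent_byte_offset)
{
	return pseudo_ino(dirent_byte_offset) <= UINT32_MAX;
}

/* 2^32 entries of 32 bytes each cover 2^37 bytes, i.e. 128G. */
static void
self_check(void)
{
	uint64_t limit = 1ULL << 37;

	assert(fits_in_ino32(limit - DIRENT_SIZE));	/* last entry that fits */
	assert(!fits_in_ino32(limit));			/* first one that does not */
}
```

With these definitions an entry at byte offset 4G maps to pseudo-inode
2^27, and the first offset that no longer fits is 2^37 bytes = 128G,
which matches the figure in the quote.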

So, while I understand the complexity of such a transition, it's clear
that as a long-term solution ino_t should be upgraded to the size of off_t
everywhere. As for the short term... well, msdosfs isn't the worst case.

>
> Bruce
>

Sincerely, Dmitry
-- 
Atlantis ISP, System Administrator
e-mail:  dmitry@atlantis.dp.ua
nic-hdl: LYNX-RIPE


