Date:      Mon, 27 Feb 2012 21:21:46 -0500
From:      Robert Banfield <rbanfield@weogeo.com>
To:        freebsd-questions@freebsd.org
Subject:   Re: "find" not traversing all directories on a single zfs file system
Message-ID:  <4F4C3A3A.9080903@weogeo.com>
In-Reply-To: <4F4C094C.1010900@infracaninophile.co.uk>
References:  <4F4BFB09.9090002@weogeo.com> <4F4C094C.1010900@infracaninophile.co.uk>

On 02/27/2012 05:53 PM, Matthew Seaman wrote:
>
> These are all actual directories -- no symbolic link or anything like
> that?  I assume permissions are not the problem?  All directories have
> at least mode r_x for your user id? (Hmmm... but you are logged in as
> root -- can't be that then.)  How about ACLs?  Are you using those at
> all on your filesystem?

There are no symbolic links, nor any ACLs at all anywhere on the 
system.  All the directories have rwx for root, and permissions are not 
a problem.
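
A quick spot check, with the path as a hypothetical example (a '+' 
suffix in the mode column of ls would indicate an ACL):

# ls -ld /zfs_mount /zfs_mount/somedir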


> The symptoms you are observing are definitely incorrect, and not at all
> what the vast majority of find(1) users would experience.  Something is
> definitely a bit fubar on your machine.  It would be useful to try and
> establish if it is the find(1) program giving bogus results, or whether
> it is some other part of the system.  Do other methods of printing out
> the filesystem contents suffer from the same problem -- e.g. 'ls -R .' or
> 'tar -cvf /dev/null .'?

ls -R appears to be traversing all subdirectories.
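
As a rough cross-check (mount point path illustrative; the ls count may 
be off by one for the top-level directory, and assumes no file names 
themselves end in a colon):

# find /zfs_mount -type d | wc -l
# ls -R /zfs_mount | grep -c ':$'

The two counts should agree when find traverses everything.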

> Is there anything in the system log or printed
> on the console?  (Note: I always find it useful to enable the
> console.log and all.log by uncommenting the relevant lines in
> /etc/syslog.conf and following the other instructions there.)

da0 runs the operating system.  da1-12 are set up as a RAIDZ2 with 2 hot 
spares.

# zpool status
   pool: tank0
  state: ONLINE
  scan: none requested
config:

     NAME                 STATE     READ WRITE CKSUM
     tank0                ONLINE       0     0     0
       raidz2-0           ONLINE       0     0     0
         label/zfsdisk1   ONLINE       0     0     0
         label/zfsdisk2   ONLINE       0     0     0
         label/zfsdisk3   ONLINE       0     0     0
         label/zfsdisk4   ONLINE       0     0     0
         label/zfsdisk5   ONLINE       0     0     0
         label/zfsdisk6   ONLINE       0     0     0
         label/zfsdisk7   ONLINE       0     0     0
         label/zfsdisk8   ONLINE       0     0     0
         label/zfsdisk9   ONLINE       0     0     0
         label/zfsdisk10  ONLINE       0     0     0
     spares
       label/zfsdisk11    AVAIL
       label/zfsdisk12    AVAIL


# glabel status
                                       Name  Status  Components
gptid/d49367f4-5cfc-11e1-be4b-000423b4b110     N/A  da0p1
                             label/zfsdisk1     N/A  da1
                             label/zfsdisk2     N/A  da2
                             label/zfsdisk3     N/A  da3
                             label/zfsdisk4     N/A  da4
                             label/zfsdisk5     N/A  da5
                             label/zfsdisk6     N/A  da6
                             label/zfsdisk7     N/A  da7
                             label/zfsdisk8     N/A  da8
                             label/zfsdisk9     N/A  da9
                            label/zfsdisk10     N/A  da10
                            label/zfsdisk11     N/A  da11
                            label/zfsdisk12     N/A  da12
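
(For reference, these are glabel(8) labels, created with commands along 
the lines of:

# glabel label zfsdisk1 da1

glabel stores its metadata in the last sector of each disk.)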


These messages appear in the output of dmesg:

GEOM: da1: the primary GPT table is corrupt or invalid.
GEOM: da1: using the secondary instead -- recovery strongly advised.
(repeat for da2-da12)

GEOM: da1: corrupt or invalid GPT detected.
GEOM: da1: GPT rejected -- may not be recoverable.
GEOM: da1: corrupt or invalid GPT detected.
GEOM: da1: GPT rejected -- may not be recoverable.
(repeat for da2-da12)
GEOM: label/zfsdisk1: corrupt or invalid GPT detected.
GEOM: label/zfsdisk1: GPT rejected -- may not be recoverable.

Could this be related, or a separate issue?
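
In case it is related, the partition state GEOM sees on each disk can 
be inspected non-destructively with gpart(8), e.g.:

# gpart show da1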

> Also, is this 9.0-RELEASE straight from the installation media, or did
> you compile it yourself?  If you compiled it yourself, what compiler did
> you use (gcc or clang)?  What optimization and what architecture
> settings -- trying to tweak such things for maximum optimization
> frequently leads to disappointment.

This is straight from the 64-bit memstick install.  I have tried both 
the stock /usr/bin/find and one compiled from /usr/src/usr.bin/find/; 
both give the same results.  I have no ZFS tweaks other than 
zfs_enable="YES" in /etc/rc.conf.  Because this machine has 16GB of 
RAM, I believe prefetch is enabled automatically.
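
A way to confirm (sysctl name as shipped in 9.0-RELEASE):

# sysctl vfs.zfs.prefetch_disable

which reports 0 when prefetch is active.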



I have some additional information that I didn't notice until actually 
digging into the log file, and it is quite interesting.  One of the 
directories contains 82,206 subdirectories, laid out like this:

/zfs_mount/directoryA/token[1-82206]/various_tileset_files

Looking at the output of find, here is what I see:

Lines 1-9996943: normal find output, as good as can be.
Lines 9996944-10062479: entries for the token subdirectories themselves 
only; find descended into none of them.

Notice that 10062479 - 9996944 + 1 = 65536 = 2^16.

So, of the 82,206 subdirectories, the first 82206 - 2^16 = 16670 were 
traversed, and the final 2^16 = 65536 were not.  The plot thickens...
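
For reference, this is roughly how I counted: a token directory that 
contributes only its own line to the find output (nothing beneath it) 
was not descended into.  Assuming the output is saved in find.out and 
every token directory really does contain files:

# awk -F/ '$3 == "directoryA" && $4 ~ /^token/ { seen[$4]++ }
      END { for (d in seen) if (seen[d] == 1) n++; print n }' find.out

which prints the number of untraversed directories -- 65536 here.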


