Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 13 Dec 2007 19:22:06 +0100
From:      cpghost <cpghost@cordula.ws>
To:        "Mark Evans" <mbe2@bayou.com>
Cc:        questions@freebsd.org
Subject:   Re: ls -l takes a forever to finish.
Message-ID:  <20071213192206.547cbffe@epia-2.farid-hajji.net>
In-Reply-To: <002201c83cf5$2948a560$0d00a8c0@bayoucshaffer>
References:  <005901c8313f$f7048b70$0d00a8c0@bayoucshaffer> <474CA49D.50306@FreeBSD.org> <002001c831d5$80ad8670$0d00a8c0@bayoucshaffer> <a969fbd10711280752v7d38070x5f34d9d652ec4f7f@mail.gmail.com> <003101c831da$a405bc50$0d00a8c0@bayoucshaffer> <20071129122043.A9040@wojtek.tensor.gdynia.pl> <20071129084244.eaba6f7a.wmoran@potentialtech.com> <20071129154230.5ba29b43@epia-2.farid-hajji.net> <002201c83cf5$2948a560$0d00a8c0@bayoucshaffer>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 12 Dec 2007 13:28:23 -0600
"Mark Evans" <mbe2@bayou.com> wrote:

> this program seems to have the same issues with it.

[Please don't top post.]

Of course, if "ls -lf" has those issues, "sortls.py" will
have them too, because it just runs it and sorts its output
externally with another sorting algorithm. sortls.py speeds
up "ls -l" considerably for huge (10,000+ entries) directories
by using another sorting algorithm, it doesn't do anything else.

Just to ask again: while you're waiting for "ls -lf", what
does "top" say? Is that process accumulating CPU time, or
is it just sitting around waiting, waiting, waiting...?
Are you using NFS or another file system where stat(2) is
expensive?

> Thanks
> Mark
> 
> 
> ----- Original Message ----- 
> From: "cpghost" <cpghost@cordula.ws>
> To: "Bill Moran" <wmoran@potentialtech.com>
> Cc: "Wojciech Puchar" <wojtek@wojtek.tensor.gdynia.pl>; 
> <questions@freebsd.org>; "Mark Evans" <mbe2@bayou.com>
> Sent: Thursday, November 29, 2007 8:42 AM
> Subject: Re: ls -l takes a forever to finish.
> 
> 
> > On Thu, 29 Nov 2007 08:42:44 -0500
> > Bill Moran <wmoran@potentialtech.com> wrote:
> >
> >> In response to Wojciech Puchar <wojtek@wojtek.tensor.gdynia.pl>:
> >>
> >> > > ls | wc
> >> >
> >> > strange. i did
> >> >
> >> > [wojtek@wojtek ~/b]$ a=0;while [ $a -lt 10000 ];do mkdir
> >> > $a;a=$[a+1];done
> >> >
> >> > completed <25 seconds on 1Ghz CPU
> >> >
> >> > ls takes 0.1 seconds user time, ls -l takes 0.3 second user time.
> >> >
> >> > unless you have 486/33 or slower system there is something wrong.
> >>
> >> Another possible scenario is that the directory is badly
> >> fragmented. Unless something has changed since I last researched
> >> this (which is possible) FreeBSD doesn't manage directory
> >> fragmentation during use. If you're constantly adding and removing
> >> files, it's possible that the directory entry is such a mess that
> >> it takes ls a long time to process it.
> >
> > Yes, that's also possible. But sorting is really the culprit here:
> > it *is* possible to create a directory with filenames in such a way
> > that it triggers Quicksort's O(N^2) worst case instead of O(N log
> > N).
> >
> > The following Python (2.5) program calls "ls -lf" and sorts its
> > output with Python's own stable sort() routine (which is NOT
> > qsort(3)). On a directory with 44,000 entries, it runs orders of
> > magnitude faster than "ls -l", even though it has to use the
> > decorate-sort-undecorate idiom to sort the output according
> > according the filename, and it is interpreted rather than compiled!
> >
> > I guess that replacing qsort(3) in
> > /usr/src/lib/libc/gen/fts.c:fts_sort()
> > with another sort algorithm which doesn't
> > expose this anomaly would solve that problem.

--------------------- cut here ------------------ cut here ------------
#!/usr/bin/env python
# sortls.py -- sort output of ls -lf with python's stable sort routine.

import os

def sort_ls_lf(path):
    "Sort the output of ls -lf path"
    os.chdir(path)
    lines = os.popen("ls -lf", "r").readlines()
    dsu = [ (line.split()[-1], line) for line in lines ]
    dsu.sort()
    return ''.join(tupl[1] for tupl in dsu)

if __name__ == '__main__':
    import sys
    if len(sys.argv) < 2:
        print >>sys.stderr, "Usage:", sys.argv[0], "path"
        sys.exit(1)
    path = sys.argv[1]
    
    try:
        print sort_ls_lf(path)
    except IOError:
        pass   # silently absorb broken pipe and other errors
--------------------- cut here ------------------ cut here ------------

-cpghost.

-- 
Cordula's Web. http://www.cordula.ws/



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20071213192206.547cbffe>