Date:      Fri, 18 Oct 2002 06:50:04 -0700 (PDT)
From:      abc@anchorageinternet.org
To:        freebsd-bugs@FreeBSD.org
Subject:   Re: misc/44195: globbing/argument limits
Message-ID:  <200210181350.g9IDo4e7023662@freefall.freebsd.org>

The following reply was made to PR misc/44195; it has been noted by GNATS.

From: abc@anchorageinternet.org
To: Peter Pentchev <roam@ringlet.net>
Cc:  
Subject: Re: misc/44195: globbing/argument limits
Date: Fri, 18 Oct 2002 13:46:20 GMT

 ok - thanks - you were very helpful.
 i put this stuff in little scripts
 on my web site so hopefully people can
 see some simple techniques and avoid
 bugging you guys like i did :)
 
 i did check around *quite a bit* for the things you
 answered, and failed to find answers as good as
 the ones you provided.  i was getting grumpy
 with some things that were frustrating and
 you made me a happy FBSD user once again :)
 
 ps.  i leave the following in this email
      so i have a copy in my sent mail and
      can study your answers more.  nothing
      new follows.  thank you very much.
 
 > On Fri, Oct 18, 2002 at 11:57:36AM +0000, abc@anchorageinternet.org wrote:
 > > > > >Number:         44195
 > > > > >Category:       misc
 > > > > >Synopsis:       globbing/argument limits
 > > > > >Originator:     Joe Public
 > > > > >Release:        i386 FreeBSD 4.7-RELEASE
 > > > > >Organization:
 > > > > no org
 > > > > >Environment:
 > > > > ^^^^^^^^^^^^^^^^^^^^^^^^
 > > > > >Description:
 > > > > argument limits painful to users in days of 100GB drives.
 > > > > >How-To-Repeat:
 > > > > try a command and give it a few thousand arguments,
 > > > 
 > > > It is not a matter of how many arguments you give to a command, it is
 > > > simply a matter of how *long* the command line becomes.  Lugging around
 > > > a multimegabyte command line buffer through shells, execv() system calls
 > > > and such would be a *major* strain on your system.
 > > 
 > > i would've assumed the command line stays in place in memory,
 > > and only a pointer is passed around - and checking the exec(3)
 > > manpage seems to show this is the case in fact.
 > 
 > Not when exec(3) invokes the execve(2) system call, as stated in the
 > very first paragraph of the exec(3) manual page.  The execve(2) system
 > call needs to copy the arguments to kernel space to examine them, and
 > then to build a single command line for the new process.
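
The limit discussed here is the system's cap on the combined size of the argument and environment strings handed to execve(2). A minimal sketch for inspecting it, assuming a POSIX getconf(1) is available; the variable names are illustrative and not from the thread:

```shell
#!/bin/sh
# Ask the system for the maximum combined size (in bytes) of the
# argument list and environment that execve(2) will accept.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is $arg_max bytes"

# Rough size of the current environment, which counts against the same limit.
# tr strips the padding some wc(1) implementations add.
env_bytes=$(env | wc -c | tr -d ' ')
echo "environment currently uses about $env_bytes bytes"
```

A glob that expands to more bytes than ARG_MAX allows is what produces the "Argument list too long" error, regardless of how many separate arguments it contains.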
 > 
 > > > > like a file-modifying command in a folder with 6000 files.
 > > > > find(1) is too slow, and combining it with xargs is a kludge.
 > > > 
 > > > If you mean that 'find -exec' is too slow, then I would argue that using
 > > > -exec is the kludge, when xargs(1) is available.  I am pretty sure that
 > > > the find(1) and xargs(1) utilities were actually developed together,
 > > > with a common goal in mind, that goal being *exactly* processing of
 > > > multiple files in one go.
 > > 
 > > ok - interesting - i appreciate you explaining this.
 > > it should be in the FAQ or something.
 > >  
 > > > The -exec primary to find(1) is extremely inefficient when dealing with
 > > > many files - it spawns a new process for each file it finds, which, as
 > > > you note, is too slow.  The xargs utility will do a much better job; I
 > > > would be very interested in what exactly you consider to be a kludge
 > > > about it.
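
The difference in process count can be seen with a small experiment. This is only a sketch: the scratch directory and file names are made up, and it assumes a find(1) and xargs(1) that support -print0 and -0.

```shell
#!/bin/sh
# Build a scratch directory with 50 empty files (hypothetical names).
demo_dir=$(mktemp -d)
i=1
while [ "$i" -le 50 ]; do
    : > "$demo_dir/file$i"
    i=$((i + 1))
done

# -exec ... \; starts one process per file, so "run" is printed once per file.
per_file=$(find "$demo_dir" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# xargs packs many names into each invocation, so far fewer processes start.
batched=$(find "$demo_dir" -type f -print0 | xargs -0 sh -c 'echo run' sh | wc -l | tr -d ' ')

echo "find -exec invocations: $per_file"
echo "xargs invocations:      $batched"
rm -rf "$demo_dir"
```

The null-separated form also keeps file names containing spaces or newlines intact, which the plain pipe does not.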
 > > 
 > > i consider it to be a kludge when you have to:
 > > 
 > > find -s "$I" ! -type d | xargs tar rvf "$I.tar" && \
 > > gzip -f9 "$I.tar" && mv "$I.tar.gz" "$I.tgz"
 > > 
 > > just to create a sorted tar/gzip archive
 > > of a directory tree.  hehe - as i look at
 > > it - i say to myself "this is bullshit" :).
 > 
 > As noted in my response to your other PR, this particular use of find(1)
 > and tar(1) may be optimized :)  Besides, the only "kludge" in that
 > example is the need to update the tarball incrementally using tar's 'r'
 > command instead of 'c'; I, personally, would not consider that too large
 > a price to pay for being able to process the whole list of files at all.
 > 
 > > i mean - UNIX hackers have got to be smarter
 > > than to demand all that from a user just to
 > > accomplish such a minor ordinary task.
 > 
 > Yep, see both above and below :)
 > 
 > > also, when you do something like:
 > > 
 > >     find / \!   \(  -path \*/bin/\*     -or -path \*/lib/\*         \
 > >                 -or -path \*/libexec/\* -or -path /usr/games/\*     \
 > >                 -or -path \*/sbin/\*    -or -path /boot/\*          \
 > >                 -or -path /dev/\*       -or -path /modules/\*       \
 > >                 -or -path /proc/\*      -or -path /root/\*      \)  \
 > >                 -type f -exec chown     root:wheel {}               \;\
 > >                         -exec chmod     0644 {}                     \;
 > > 
 > > ie, something like a find(1) with 2 -exec's, xargs fails,
 > > and you are forced to double or triple (or more) the code
 > > it takes - according to the number of -exec's you need to
 > > perform - i consider this to be a kludge as well.
 > 
 > If you need to execute multiple commands, there are several things you
 > might do.
 > 
 > The simplest is to create a small shell script, and use xargs(1) to
 > execute it; the shell script runs chown, chmod, or whatever, on all its
 > arguments.
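
A rough sketch of the helper-script approach follows. The helper, its OWNER and MODE variables, and the scratch files are all hypothetical; the owner is set to the current user so the sketch runs without root, whereas the example in the thread would use root:wheel and 0644.

```shell
#!/bin/sh
# Hypothetical helper script: applies two commands to every file it is given.
helper=$(mktemp)
cat > "$helper" <<'EOF'
#!/bin/sh
# Owner and mode are taken from the environment (an assumption of this sketch).
chown "${OWNER:?}" "$@" && chmod "${MODE:?}" "$@"
EOF

# Scratch files to operate on.
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/b" "$demo_dir/c"
chmod 0600 "$demo_dir"/*

# One find pass, one xargs pass; both commands run on each batch of names.
find "$demo_dir" -type f -print0 |
    OWNER="$(id -u):$(id -g)" MODE=0644 xargs -0 sh "$helper"

# Count the regular files that ended up with mode 0644.
modes=$(ls -l "$demo_dir" | grep -c '^-rw-r--r--')
echo "files now mode 0644: $modes"
rm -rf "$demo_dir" "$helper"
```

Adding a third or fourth command then means adding a line to the helper, not another pass over the file tree.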
 > 
 > Another way would be capturing find(1)'s output into a file, then
 > running xargs(1) as many times as needed, redirecting its input to
 > read this file; something like:
 > 
 >   find / \! \( ... \) > filelist
 >   xargs chown root:wheel < filelist
 >   xargs chmod 0644 < filelist
 > 
 > Still another way might avoid the temporary file altogether, with some
 > creative file descriptor hackery.  I *think* I have done this before,
 > but right now, I cannot remember the proper incantations to make the
 > shell duplicate find(1)'s output to a new file descriptor, run xargs
 > from fd 1's output, then run another copy of xargs, making it read from
 > the file descriptor that find(1)'s output was duplicated to.  I know it
 > is possible, it is just that I cannot remember how to do it :)
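
One way to get a similar effect without a temporary file is sketched below. Note that it swaps in a different technique from the file descriptor duplication described above: a named pipe plus tee(1) feeds the same list to two xargs processes. The paths are made up, and the owner is the current user so that it runs unprivileged.

```shell
#!/bin/sh
# Scratch files to operate on, plus a FIFO kept outside the searched tree.
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/b"
chmod 0600 "$demo_dir/a" "$demo_dir/b"
fifo_dir=$(mktemp -d)
fifo="$fifo_dir/list"
mkfifo "$fifo"

# Second consumer: reads the duplicated list from the FIFO in the background.
xargs -0 chown "$(id -u):$(id -g)" < "$fifo" &

# tee copies find's output into the FIFO while passing it down the pipe
# to the first consumer.
find "$demo_dir" -type f -print0 | tee "$fifo" | xargs -0 chmod 0644

# Wait for the background xargs to finish before inspecting the result.
wait

modes=$(ls -l "$demo_dir" | grep -c '^-rw-r--r--')
echo "files now mode 0644: $modes"
rm -rf "$demo_dir" "$fifo_dir"
```

The two commands run concurrently here, so this only suits commands that do not depend on each other's results.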
 > 
 > > > PS. This will very probably be my last post on this subject, and nobody
 > > > should be surprised if this PR is closed very soon; what with the recent
 > > > mailing list "activity", it scores big on my troll indicator.  I could
 > > > be wrong, of course, but I'm just stating my opinion here.
 > > 
 > > ok - well - no trolling - just installed 4.7, and hit
 > > a point of frustration with things that have been
 > > bugging me over the years - that don't seem to
 > > get fixed or improve.
 > > 
 > > i truly value the effort you made to try to explain,
 > > though as stated, i still don't see the problem
 > > in fixing things.
 > 
 > Apologies for the above paragraph of mine and my attitude in somewhat
 > summarily dismissing your other PR's at first; there has been quite a
 > bit of trolling on the various FreeBSD lists recently, and there have
 > been a couple of bogus PR's filed in the process, so I was a bit
 > trigger-happy there.
 > 
 > G'luck,
 > Peter
 > 
 > -- 
 > Peter Pentchev	roam@ringlet.net	roam@FreeBSD.org
 > PGP key:	http://people.FreeBSD.org/~roam/roam.key.asc
 > Key fingerprint	FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
 > This would easier understand fewer had omitted.
