Date:      Fri, 18 Oct 2002 05:50:06 -0700 (PDT)
From:      Peter Pentchev <roam@ringlet.net>
To:        freebsd-bugs@FreeBSD.org
Subject:   Re: misc/44195: globbing/argument limits
Message-ID:  <200210181250.g9ICo6dV059756@freefall.freebsd.org>

The following reply was made to PR misc/44195; it has been noted by GNATS.

From: Peter Pentchev <roam@ringlet.net>
To: abc@anchorageinternet.org
Cc: bug-followup@freebsd.org, "Kerr, Greg" <greg@kerr1.com>,
	"Choudhury, Raj" <raj.choudhury@de.opel.com>
Subject: Re: misc/44195: globbing/argument limits
Date: Fri, 18 Oct 2002 15:48:13 +0300

 On Fri, Oct 18, 2002 at 11:57:36AM +0000, abc@anchorageinternet.org wrote:
 > > > >Number:         44195
 > > > >Category:       misc
 > > > >Synopsis:       globbing/argument limits
 > > > >Originator:     Joe Public
 > > > >Release:        i386 FreeBSD 4.7-RELEASE
 > > > >Organization:
 > > > no org
 > > > >Environment:
 > > > ^^^^^^^^^^^^^^^^^^^^^^^^
 > > > >Description:
 > > > argument limits painful to users in days of 100GB drives.
 > > > >How-To-Repeat:
 > > > try a command and give it a few thousand arguments,
 > >
 > > It is not a matter of how many arguments you give to a command, it is
 > > simply a matter of how *long* the command line becomes.  Lugging around
 > > a multimegabyte command line buffer through shells, execv() system calls
 > > and such would be a *major* strain on your system.
 >
 > i would've assumed the command line stays in place in memory,
 > and only a pointer is passed around - and checking the exec(3)
 > manpage seems to show this is the case in fact.
 
 Not when exec(3) invokes the execve(2) system call, as stated in the
 very first paragraph of the exec(3) manual page.  The execve(2) system
 call needs to copy the arguments to kernel space to examine them, and
 then to build a single command line for the new process.
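 
 A quick way to see that the limit is on total length rather than on the
 argument count is to ask the system for ARG_MAX and then watch xargs(1)
 split a long list into several invocations.  This is only a sketch; the
 200000 figure is an arbitrary number picked for illustration, and the
 batch count you get depends on your xargs defaults.

```shell
# Upper bound, in bytes, on arguments plus environment passed to execve(2).
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX on this system: $arg_max bytes"

# Feed 200000 short arguments to xargs.  Each echo it spawns prints one
# line, so the line count equals the number of separate command lines
# xargs had to build to stay under its limits.
batches=$(awk 'BEGIN { for (i = 1; i <= 200000; i++) print i }' \
    | xargs echo | wc -l | tr -d ' ')
echo "xargs needed $batches invocations of echo"
```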
 
 > > > like in file modifying command a folder with 6000 files.
 > > > find(1) is too slow, and combining it with xargs is a kludge.
 > >
 > > If you mean that 'find -exec' is too slow, then I would argue that using
 > > -exec is the kludge, when xargs(1) is available.  I am pretty sure that
 > > the find(1) and xargs(1) utilities were actually developed together,
 > > with a common goal in mind, that goal being *exactly* processing of
 > > multiple files in one go.
 >
 > ok - interesting - i appreciate you explaining this.
 > it should be in the FAQ or something.
 >
 > > The -exec primary to find(1) is extremely inefficient when dealing with
 > > many files - it spawns a new process for each file it finds, which, as
 > > you note, is too slow.  The xargs utility will do a much better job; I
 > > would be very interested in what exactly you consider to be a kludge
 > > about it.
 >
 > i consider it to be a kludge when you have to:
 >
 > find -s "$I" ! -type d | xargs tar rvf "$I.tar" && \
 > gzip -f9 "$I.tar" && mv "$I.tar.gz" "$I.tgz"
 >
 > just to create a sorted tar/gzip archive
 > of a directory tree.  hehe - as i look at
 > it - i say to myself "this is bullshit" :).
 
 As noted in my response to your other PR, this particular use of find(1)
 and tar(1) may be optimized :)  Besides, the only "kludge" in that
 example is the need to update the tarball incrementally using tar's 'r'
 command instead of 'c'; I, personally, would not consider that too large
 a price to pay for being able to process the whole list of files at all.
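 
 For what it is worth, here is one possible shortcut, which may or may
 not match what was said in the other PR: if your tar(1) accepts -T to
 read the file list from standard input, the whole thing fits in a single
 pipeline with no 'r' updates and no separate gzip step.  This is a
 sketch only; the "demo" tree and its files are made up for illustration,
 sort(1) stands in for find's -s flag so that it runs on systems lacking
 -s, and file names containing newlines would break it.

```shell
# Build a tiny throwaway tree to archive (hypothetical names).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
I=demo
mkdir -p "$I/sub"
echo one > "$I/b.txt"
echo two > "$I/a.txt"
echo three > "$I/sub/c.txt"

# List every non-directory, sort the names, and let tar read that list
# from stdin (-T -) while creating a gzip-compressed archive in one go.
find "$I" ! -type d | sort | tar -czf "$I.tgz" -T -

# Show the archive members in the order they were stored.
tar -tzf "$I.tgz"
```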
 
 > i mean - UNIX hackers have got to be smarter
 > than to demand all that from a user just to
 > accomplish such a minor ordinary task.
 
 Yep, see both above and below :)
 
 > also, when you do something like:
 >
 >     find / \!   \(  -path \*/bin/\*     -or -path \*/lib/\*         \
 >                 -or -path \*/libexec/\* -or -path /usr/games/\*     \
 >                 -or -path \*/sbin/\*    -or -path /boot/\*          \
 >                 -or -path /dev/\*       -or -path /modules/\*       \
 >                 -or -path /proc/\*      -or -path /root/\*      \)  \
 >                 -type f -exec chown     root:wheel {}               \;\
 >                         -exec chmod     0644 {}                     \;
 >
 > i.e., with something like a find(1) with 2 -exec's, xargs fails,
 > and you are forced to double or triple (or more) the code
 > it takes - according to the number of -exec's you need to
 > perform - i consider this to be a kludge as well.
 
 If you need to execute multiple commands, there are several things you
 might do.
 
 The simplest is to create a small shell script, and use xargs(1) to
 execute it; the shell script runs chown, chmod, or whatever, on all its
 arguments.
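 
 A minimal sketch of that approach follows.  The script name fixperms.sh,
 the throwaway directory and the file names are all made up for the
 example, and it chowns to the current user so that it can be tried
 without root; for the case above you would pass root:wheel instead.

```shell
# Throwaway tree with two files that start out mode 0600.
workdir=$(mktemp -d)
mkdir "$workdir/tree"
touch "$workdir/tree/one" "$workdir/tree/two"
chmod 0600 "$workdir/tree/one" "$workdir/tree/two"

# Hypothetical helper: first argument is the owner, the rest are files.
cat > "$workdir/fixperms.sh" <<'EOF'
#!/bin/sh
owner=$1
shift
chown "$owner" "$@" && chmod 0644 "$@"
EOF

# Current user and group, so the chown succeeds without root.
owner="$(id -u):$(id -g)"

# xargs appends as many file names as fit to each run of the helper,
# so both commands are executed once per batch, not once per file.
find "$workdir/tree" -type f | xargs sh "$workdir/fixperms.sh" "$owner"

ls -l "$workdir/tree"
```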
 
 Another way would be capturing find(1)'s output into a file, then
 running xargs(1) as many times as needed, redirecting its input to
 read this file; something like:
 
   find / \! \( ... \) > filelist
   xargs chown root:wheel < filelist
   xargs chmod 0644 < filelist
 
 Still another way might avoid the temporary file altogether, with some
 creative file descriptor hackery.  I *think* I have done this before,
 but right now, I cannot remember the proper incantations to make the
 shell duplicate find(1)'s output to a new file descriptor, run xargs
 from fd 1's output, then run another copy of xargs, making it read from
 the file descriptor that find(1)'s output was duplicated to.  I know it
 is possible, it is just that I cannot remember how to do it :)
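 
 What follows is not the file descriptor duplication described above,
 just a different route to the same result that I am more sure of: a
 named pipe plus tee(1), so that find's output reaches two xargs
 processes without a temporary file list.  Treat it as a sketch; the
 directory and file names are invented, the two commands run
 concurrently rather than one after the other, and the chown uses the
 current user so that it works without root.

```shell
# Throwaway tree with two files that start out mode 0600.
workdir=$(mktemp -d)
mkdir "$workdir/tree"
touch "$workdir/tree/a" "$workdir/tree/b"
chmod 0600 "$workdir/tree/a" "$workdir/tree/b"

# Named pipe that will carry the second copy of the file list.
mkfifo "$workdir/list.fifo"

# Second consumer: started first, in the background, reading the FIFO.
xargs chmod 0644 < "$workdir/list.fifo" &

# tee writes one copy of find's output into the FIFO and passes the
# other copy down the ordinary pipe to the first consumer.
find "$workdir/tree" -type f \
    | tee "$workdir/list.fifo" \
    | xargs chown "$(id -u):$(id -g)"

# Wait for the background xargs to finish before looking at the result.
wait
ls -l "$workdir/tree"
```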
 
 > > PS. This will very probably be my last post on this subject, and nobody
 > > should be surprised if this PR is closed very soon; what with the recent
 > > mailing list "activity", it scores big on my troll indicator.  I could
 > > be wrong, of course, but I'm just stating my opinion here.
 >
 > ok - well - no trolling - just installed 4.7, and hit
 > a point of frustration with things that have been
 > bugging me over the years - that don't seem to
 > get fixed or improve.
 >
 > i truly value the effort you made to try to explain,
 > though as stated, i still don't see the problem
 > in fixing things.
 
 Apologies for the above paragraph of mine and my attitude in somewhat
 summarily dismissing your other PR's at first; there has been quite a
 bit of trolling on the various FreeBSD lists recently, and there have
 been a couple of bogus PR's filed in the process, so I was a bit
 trigger-happy there.
 
 G'luck,
 Peter
 
 -- 
 Peter Pentchev	roam@ringlet.net	roam@FreeBSD.org
 PGP key:	http://people.FreeBSD.org/~roam/roam.key.asc
 Key fingerprint	FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
 This would easier understand fewer had omitted.
