From: Brian Candler
To: freebsd-current@freebsd.org
Date: Wed, 16 Nov 2005 16:15:40 +0000
Subject: Order of files with 'cp'
Message-ID: <20051116161540.GB4383@uk.tiscali.com>

I've noticed on FreeBSD-5.4 and -6.0 that the order in which 'cp' copies multiple files does not match the order they're given on the command line. This is noticeable when the target server is remote and/or slow (e.g. NFS; USB flash device).

I guess it's not especially important, but it's slightly annoying: I have a dumb USB MP3 player, and it plays the tracks in the raw order they appear in the filesystem, not by sorting filenames or anything like that.

I've had a look through the code, and it seems that cp calls fts_open() with the list of files in argv; fts_open() then does a qsort() on the arguments, using the comparison function mastercmp() provided by cp:

/*
 * mastercmp --
 *	The comparison function for the copy order.  The order is to copy
 *	non-directory files before directory files.  The reason for this
 *	is because files tend to be in the same cylinder group as their
 *	parent directory, whereas directories tend not to be.  Copying the
 *	files first reduces seeking.
 */

This seems reasonable enough, but I think it would be good to preserve order when all the arguments are files. This could be done at little expense. I can think of several ways:

(1) /usr/src/bin/cp/cp.c

    Update mastercmp() so that it falls back to comparing the argv[] indexes whenever it would otherwise return 0. I thought the fts_number member of the FTSENT structure could be used for this purpose, although it is currently being used as a one-bit flag (pflag/dne). This flag could be moved to a high bit instead.

(2) /usr/src/lib/libc/gen/fts.c

    Before calling qsort(), call the comparison function on each pair of adjacent items in turn. If this returns -1 or 0 in every case, then the list is already ordered and there is no need to call qsort(), which is not a stable sort and may reorder items that compare as equal. This covers the common cases where the sources are either all files or all directories.

(3) Replace the call to qsort() with a stable sort, e.g. mergesort(). I think cp's mastercmp() will still need some tweaking in that case so that two directories compare as equal, e.g.

	if (a_info == FTS_D)
		return (-1);
	if (b_info == FTS_D)
		return (1);
	return (0);

becomes

	if (a_info == FTS_D && b_info != FTS_D)
		return (-1);
	if (b_info == FTS_D && a_info != FTS_D)
		return (1);
	return (0);

Anyone have any thoughts on this?

Regards,

Brian.
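
For illustration, option (1) might look roughly like the sketch below. It does not touch the real fts(3) structures: struct entry, entry_cmp() and sort_entries() are hypothetical stand-ins for FTSENT, mastercmp() and the sort inside fts_open(), and the argv_index field plays the role suggested for fts_number. The primary ordering simply copies the return values from the snippet above; the only new part is the tie-break on the original argument position, which makes the result independent of whether the underlying sort is stable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parts of FTSENT that matter here. */
struct entry {
	const char *name;	/* path as given on the command line */
	int is_dir;		/* nonzero for a directory (FTS_D in cp) */
	size_t argv_index;	/* position in the original argument list */
};

/*
 * Primary ordering as in the snippet above: a directory compares less
 * than a non-directory.  Ties fall back to the argv position, so two
 * distinct entries never compare as equal.
 */
int
entry_cmp(const void *pa, const void *pb)
{
	const struct entry *a = pa;
	const struct entry *b = pb;

	if (a->is_dir && !b->is_dir)
		return (-1);
	if (b->is_dir && !a->is_dir)
		return (1);
	if (a->argv_index < b->argv_index)
		return (-1);
	if (a->argv_index > b->argv_index)
		return (1);
	return (0);
}

/* Record each entry's original position, then sort with qsort(). */
void
sort_entries(struct entry *v, size_t n)
{
	size_t i;

	assert(v != NULL || n == 0);
	if (n == 0)
		return;
	for (i = 0; i < n; i++)
		v[i].argv_index = i;
	qsort(v, n, sizeof(*v), entry_cmp);
}
```

Calling sort_entries() on an array built from the command-line arguments leaves all plain files in the order they were given, with any directories grouped together in their own original order.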