From: Bob Proulx
To: freebsd-questions@freebsd.org
Date: Thu, 15 Oct 2020 21:05:04 -0600
Subject: Re: sh scripting question
Message-ID: <20201015204226099763897@bob.proulx.com>
In-Reply-To: <24456.60388.135834.43951@jerusalem.litteratus.org>

Robert Huff wrote:
> I have a file ("files.list") with a list of filenames, similar to
>     /path A/path B/FreeBSD is great.txt
> (note the embedded spaces)

Oh you are tormenting us now. :-)  There are some subtle issues here.

> If I use
>
>     for FILE in `cat files.list`
>
> FILE will be set to "/path".
> How do I get it to read the entire string?
> Or am I using the wrong tool?

For the pedantic: you want to temporarily set IFS and you want to use
read with the -r option.  It probably won't matter in your exact case
above, because you want the entire line, you do not have leading or
trailing spaces, and you are not using any escapes in your file names.
But if you did, then it would matter.
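
To see the difference concretely, here is a minimal sketch.  The
scratch file created with mktemp and the variable names (list,
for_count, read_count, last) are made up for illustration and are not
from the original question.  It only counts how many times each loop
body runs for the one-line example file.

```shell
#!/bin/sh
# Hypothetical scratch file standing in for files.list.
list=$(mktemp "${TMPDIR:-/tmp}/files.list.XXXXXX")
printf '%s\n' '/path A/path B/FreeBSD is great.txt' > "$list"

# An unquoted command substitution is split on spaces, tabs and
# newlines, so the loop runs once per word, not once per line.
for_count=0
for FILE in $(cat "$list"); do
    for_count=$((for_count + 1))
done

# read consumes one line per call; IFS= and -r keep it verbatim.
read_count=0
last=
while IFS= read -r FILE; do
    read_count=$((read_count + 1))
    last=$FILE
done < "$list"

printf 'for loop iterations:  %s\n' "$for_count"
printf 'read loop iterations: %s\n' "$read_count"
printf 'last line read: |%s|\n' "$last"
rm -f "$list"
```

With the single example line this should report 5 iterations for the
for loop (one per space-separated word) and 1 for the read loop.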

First see the sh man page and see this passage.

    Backslashes are treated specially, unless the -r option is
    specified.  If a backslash is followed by a newline, the backslash
    and the newline will be deleted.  If a backslash is followed by any
    other character, the backslash will be deleted and the following
    character will be treated as though it were not in IFS, even if it
    is.

If your brains are not yet leaking out your ears then go here:

    https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_05

And see this line:

    2. If the value of IFS is null, no field splitting shall be performed.

That is why setting IFS= (the same as IFS="", where the "" adds
nothing) turns off word splitting.  It's a special case.  If IFS is set
to the empty string then no field splitting is performed.  (The
standard is much more clear and forceful on this topic than the man
page.)

Therefore both are needed to completely handle a data line with
potential spaces and backslashes in it.  If file1 contains:

    /path A/path B/FreeBSD is great.txt
    /path A/path C/FreeBSD is fun.txt

Then we can read it with both IFS= and read -r this way.

    while IFS= read -r line; do
        printf "|%s|\n" "$line"
    done < file1
    |/path A/path B/FreeBSD is great.txt|
    |/path A/path C/FreeBSD is fun.txt|

And if there are leading or trailing spaces in file1 then those spaces
will be brought through verbatim without trimming.

Maybe these test cases that poke at the corners will make things a
little more clear.

    $ printf "%s\n" " foo \t bar " | while read line; do printf "|%s|\n" "$line"; done
    |foo t bar|

    $ printf "%s\n" " foo \t bar " | while read -r line; do printf "|%s|\n" "$line"; done
    |foo \t bar|

    $ printf "%s\n" " foo \t bar " | while IFS= read -r line; do printf "|%s|\n" "$line"; done
    | foo \t bar |

Another hint is that the shell handles for and while loops in a
pipeline by creating a subshell for them, which means that variables
set inside the loop do not survive into the parent shell.
That's why it is good and convenient to use a pipeline on the command
line: it prevents the loop from setting variables in the command line
shell.  But in a script it is usually better to use a
"... done < file1" redirection, so that the loop runs within the
currently existing shell process and the variables it sets remain
available after the loop.

Hope this helps! :-)
Bob
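
A minimal sketch of the subshell point, for anyone who wants to try
it.  The variable names (count_pipe, count_redirect, tmp) and the
two-line sample input are invented for illustration.  It assumes a
shell that runs every part of a pipeline in a subshell, as FreeBSD sh,
dash, and bash with default settings do; ksh93 and zsh run the last
pipeline element in the current shell and would behave differently.

```shell
#!/bin/sh
# Loop fed by a pipeline: the loop body runs in a subshell, so the
# increments are lost when the pipeline finishes.
count_pipe=0
printf '%s\n' 'a b' 'c d' | while IFS= read -r line; do
    count_pipe=$((count_pipe + 1))
done

# Same loop fed by a redirection: it runs in the current shell, so
# the counter keeps its value afterwards.
tmp=$(mktemp "${TMPDIR:-/tmp}/subshell-demo.XXXXXX")
printf '%s\n' 'a b' 'c d' > "$tmp"
count_redirect=0
while IFS= read -r line; do
    count_redirect=$((count_redirect + 1))
done < "$tmp"
rm -f "$tmp"

printf 'pipeline:    %s\n' "$count_pipe"
printf 'redirection: %s\n' "$count_redirect"
```

Under the stated assumption the pipeline counter stays at 0 while the
redirection counter ends at 2.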