Date:      Thu, 15 Oct 2020 21:05:04 -0600
From:      Bob Proulx <bob@proulx.com>
To:        freebsd-questions@freebsd.org
Subject:   Re: sh scripting question
Message-ID:  <20201015204226099763897@bob.proulx.com>
In-Reply-To: <24456.60388.135834.43951@jerusalem.litteratus.org>
References:  <24456.60388.135834.43951@jerusalem.litteratus.org>

Robert Huff wrote:
> 	I have a file ("files.list") with a list of filenames, similar to
> 	/path A/path B/FreeBSD is great.txt
> 	(note the embedded spaces)

Oh you are tormenting us now.  :-)  There are some subtle issues here.

> 	If I use
> 
> 	for FILE in `cat files.list`
> 
> 	FILE will be set to "/path".
> 	How do I get it to read the entire string?
> 	Or am I using the wrong tool?

To be pedantically correct you want a while loop with read instead of
the for loop, and with it you want to temporarily set IFS and use the
-r option.  Those last two probably won't matter in your exact case
above because you want the entire line, you do not have leading or
trailing spaces, and you are not using any escapes in your file names.
But if you did then it would matter.

First look at the sh man page, in the description of the read builtin,
for this passage.

    Backslashes are treated specially, unless the -r option is
    specified.  If a backslash is followed by a newline, the backslash
    and the newline will be deleted.  If a backslash is followed by
    any other character, the backslash will be deleted and the
    following character will be treated as though it were not in IFS,
    even if it is.

If your brains are not yet leaking out your ears then go here:

  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_05

And see this line:

  2. If the value of IFS is null, no field splitting shall be performed.

That is why setting IFS= (the same as IFS="", just without the
redundant quotes) turns off word splitting.  It's a special case: if
IFS is the empty string then no field splitting is performed.  (The
standard is much clearer and more forceful on this topic than the man
page.)
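
Here is a small sketch of that special case and of what "temporarily"
means.  The temp file and the variable names (tmp, kept, trimmed,
saved_ifs) are made up for illustration.  Because read is not a special
builtin, the IFS= prefix should apply to that one command only, so IFS
in the rest of the script is left alone.

```shell
#!/bin/sh
# Illustrative sketch; tmp, kept, trimmed and saved_ifs are hypothetical names.
tmp=$(mktemp "${TMPDIR:-/tmp}/ifsdemo.XXXXXX") || exit 1
printf '  padded value  \n' > "$tmp"

saved_ifs=$IFS

# Empty IFS for this one read only: no field splitting, padding is kept.
IFS= read -r kept < "$tmp"

# Default IFS: leading and trailing blanks are stripped.
read -r trimmed < "$tmp"

printf '|%s|\n' "$kept"
printf '|%s|\n' "$trimmed"

# The prefix assignment did not leak into the rest of the script.
if [ "$IFS" = "$saved_ifs" ]; then
  echo "IFS unchanged"
fi

rm -f "$tmp"
```

The first line printed keeps its padding and the second does not.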

And therefore both IFS= and -r are needed to completely handle a data
line with potential spaces and backslashes in it.

If file1 contains:

    /path A/path B/FreeBSD is great.txt
    /path A/path C/FreeBSD is fun.txt

Then we can read it with both IFS= and read -r this way.

    while IFS= read -r line; do
      printf "|%s|\n" "$line"
    done < file1

    |/path A/path B/FreeBSD is great.txt|
    |/path A/path C/FreeBSD is fun.txt|

And if there are leading or trailing spaces in file1 then those spaces
will be brought through verbatim without trimming.
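
For comparison with the for loop from the question, here is a rough
sketch that runs both forms over the same list.  It builds a throwaway
list with mktemp rather than touching a real files.list, and the
counter names (for_count, while_count, last) are hypothetical.

```shell
#!/bin/sh
# Illustrative sketch; the list file and counter names are made up.
list=$(mktemp "${TMPDIR:-/tmp}/files.list.XXXXXX") || exit 1
printf '%s\n' '/path A/path B/FreeBSD is great.txt' \
              '/path A/path C/FreeBSD is fun.txt' > "$list"

# for + command substitution: the output is split at every space and newline.
for_count=0
for FILE in $(cat "$list"); do
  for_count=$((for_count + 1))
done

# while + IFS= read -r: one iteration per line, embedded spaces intact.
while_count=0
while IFS= read -r FILE; do
  while_count=$((while_count + 1))
  last=$FILE
done < "$list"

printf 'for loop: %s items, while loop: %s lines\n' "$for_count" "$while_count"
printf 'last line read: |%s|\n' "$last"

rm -f "$list"
```

The for loop sees ten separate words where the while loop sees two
complete file names.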

Maybe these test cases that poke at the corners will make things a
little clearer.

    $ printf "%s\n" " foo \t  bar   " | while read line; do printf "|%s|\n" "$line"; done
    |foo t  bar|

    $ printf "%s\n" " foo \t  bar   " | while read -r line; do printf "|%s|\n" "$line"; done
    |foo \t  bar|

    $ printf "%s\n" " foo \t  bar   " | while IFS= read -r line; do printf "|%s|\n" "$line"; done
    | foo \t  bar   |

Another hint is that the shell runs a for or while loop that is part
of a pipeline in a subshell.  That means variables set inside the loop
are not visible in the parent shell afterward.  On the command line
that makes a pipeline convenient, because it keeps the loop from
setting variables in your interactive shell.  But in a script it is
usually better to use a "... done < file1" redirection, so that the
loop runs in the current shell process and the variables it sets are
still there after the loop finishes.
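
A minimal sketch of the difference, assuming a shell such as FreeBSD sh
that runs every part of a pipeline in a subshell (some shells, ksh and
zsh for example, run the last stage in the current shell and would
behave differently).  The temp file and the counters piped and
redirected are hypothetical names.

```shell
#!/bin/sh
# Illustrative sketch; tmp, piped and redirected are hypothetical names.
tmp=$(mktemp "${TMPDIR:-/tmp}/subshell.XXXXXX") || exit 1
printf '%s\n' '/path A/path B/FreeBSD is great.txt' \
              '/path A/path C/FreeBSD is fun.txt' > "$tmp"

# Pipeline: the loop runs in a subshell, so its count is lost afterward.
piped=0
cat "$tmp" | while IFS= read -r line; do
  piped=$((piped + 1))
done
printf 'piped=%s\n' "$piped"

# Redirection: the loop runs in the current shell, so the count survives.
redirected=0
while IFS= read -r line; do
  redirected=$((redirected + 1))
done < "$tmp"
printf 'redirected=%s\n' "$redirected"

rm -f "$tmp"
```

Under that assumption the first count stays at 0 while the second
reaches 2.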

Hope this helps! :-)

Bob


