Date:      Fri, 05 Nov 2010 10:40:58 -0700
From:      Devin Teske <dteske@vicor.com>
To:        Cyrille Lefevre <cyrille.lefevre-lists@laposte.net>
Cc:        freebsd-rc@freebsd.org
Subject:   Re: sysrc(8) -- a sysctl(8)-like utility for managing rc.conf(5)
Message-ID:  <1288978858.7362.154.camel@localhost.localdomain>
In-Reply-To: <4CD3731C.6020501@laposte.net>
References:  <1286925182.32724.18.camel@localhost.localdomain> <1286996709.32724.60.camel@localhost.localdomain> <1287448781.5713.3.camel@localhost.localdomain> <1287510629.25599.2.camel@localhost.localdomain> <D763F474-8F19-4C65-B23F-78C9B137A8FE@vicor.com> <1288746388.7362.4.camel@localhost.localdomain> <17B64023-A64A-40DA-9CBC-A601710AB5BB@vicor.com> <1288919368.7362.35.camel@localhost.localdomain> <4CD3731C.6020501@laposte.net>

On Fri, 2010-11-05 at 03:59 +0100, Cyrille Lefevre wrote:
> On 05/11/2010 02:09, Devin Teske wrote:
>
> hi, how about something like (untested, but should work) :
>
> # no $(cat << EOF needed here, so, no extra \ and $quot and so needed

The `$(cat << EOF' is very much needed.

The expansion rules for text appearing between "<<EOF" and "EOF" are
very different from those of the alternatives.

Let's take a look:

Assignment Type: Double-quoted value...
Example: awkscript="..."
Word-Expansions Performed:
   - Escaped-sequence/Backslash expansion
   - Parameter/Variable Expansion
   - Command Substitution
   - Arithmetic Expansion

Assignment Type: Single-quoted value...
Example: awkscript='...'
Word-Expansions Performed:
   - None

Assignment Type: Embedded ``Here-Document''
Example: awkscript=$( cat << EOF ... EOF )
Word-Expansions Performed:
   - Parameter/Variable Expansion
   - Command substitution
   - Arithmetic Expansion

Assignment Type: Embedded _Literal_ ``Here-Document''
Example: awkscript=$( cat << "EOF" ... EOF )
Word-Expansions Performed:
   - None

More on ``here-documents'': man 1 sh | less +/here-document
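
The four cases above can be compared side by side. This is only a sketch: the variable `name` and the sample text are made up for illustration, and printf is used for output so the result does not depend on which shell runs it. Note that an unquoted here-document still treats a backslash before a dollar sign as an escape (which is why the awk script writes \$0), while quotation marks inside it need no escaping at all.

```shell
#!/bin/sh
# Hypothetical variable; not part of the original script.
name=world

# Double-quoted: \" and \$ are unescaped, $name is expanded.
dq="print \"$name\" \$0"

# Single-quoted: nothing is expanded.
sq='print "$name" \$0'

# Embedded here-document: $name is expanded, \$ yields a literal $,
# and the quotation marks are ordinary characters.
hd=$( cat << EOF
print "$name" \$0
EOF
)

# Embedded literal here-document (quoted delimiter): nothing is expanded.
lit=$( cat << "EOF"
print "$name" \$0
EOF
)

printf '%s\n' "$dq" "$sq" "$hd" "$lit"
```

The double-quoted and here-document forms end up with the same text; the difference is how much escaping the source needs to get there.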

Take another look at the contents of awkscript. You'll notice:

a) I make use of backslashes in the awk script.
   e.g. if ( ! match(\$0, /$regex/) ) { print; next }
b) I use shell variables in the awk script.
   e.g. if ( t1 ~ /[$bbtick\\\$\\\\]/ )

Therefore, if I were to:

1. Switch to double-quoted value:
   - I'd have to double-escape the backslashes
2. Switch to single-quoted value:
   - Can't do it, I need parameter expansion
3. Switch to embedded _literal_ here-document:
   - Can't do it, I need parameter expansion

Options 2 and 3 are not valid, though you are right, I could achieve the
same thing with a compound string...

E.g.
awkscript='...'"$shell_variable"'...'

And in truth, this would save me two more fork()s (the sub-shell
assignment and the call to cat(1)). However, I think something is
lost in readability when you resort to compound strings. Not to
mention that I measured the performance gain of compound strings
over embedded here-documents, and the difference is not significant.
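
To make the trade-off concrete, here is a small sketch that builds the same one-line awk fragment both ways. The value given to `varname` is hypothetical, and the fragment is far shorter than the real awkscript; the only point is that both forms produce identical text.

```shell
#!/bin/sh
# Hypothetical value; in the script this comes from a function argument.
varname=hostname

# Compound string: single-quoted awk text, double-quoted shell variable.
compound='BEGIN { regex = "^[[:space:]]*'"$varname"'=" }'

# Embedded here-document: costs a sub-shell and a cat(1), reads as plain awk.
heredoc=$( cat << EOF
BEGIN { regex = "^[[:space:]]*$varname=" }
EOF
)

if [ "$compound" = "$heredoc" ]; then
	same=yes
else
	same=no
fi
printf 'identical: %s\n' "$same"
```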

Last, but not least...

Let me explain why I used $apos, $quot, $bquot and $bbtick.

If you're using a syntax-highlighting editor such as vim or gvim, the
stray apostrophes, stray quotes, and stray backticks messed up syntax
highlighting for all code appearing after the embedded here-document
explicitly because they were strays.

If you look closer, you'll see that I do indeed make copious use of
quotation marks within the awkscript; however, I use $apos, $quot,
$bquot, and $bbtick wherever a stray mark is needed expressly to prevent
destruction of syntax highlighting. Furthermore, I already considered
the thought of throwing in a stray ": mark" or "# mark" after the
awkscript block, but that too failed due to vim's synchronous syntax
highlighting rules for shell-scripts (see the `sync' options
in /usr/local/share/vim/vim72/syntax/sh.vim for additional details).
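
As a rough illustration of the technique, the sketch below keeps every quotation mark inside the here-document balanced and pulls the odd ones in through shell variables. The definitions of `apos` and `bquot` here are assumptions (the thread does not show how the script sets them), and the awk program is a toy.

```shell
#!/bin/sh
# Assumed definitions; the thread does not show how the script sets these.
apos="'"
bquot='\"'

# Every quotation mark written inside the here-document is paired; the
# unpaired ones arrive through parameter expansion.
awkscript=$( cat << EOF
BEGIN { t1 = "$apos"; t2 = "$bquot"; print t1 t2 }
EOF
)

out=$( awk "$awkscript" )
printf '%s\n' "$out"
```

The expanded text handed to awk contains the apostrophe and the escaped quote, but the shell source that the editor has to highlight never does.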


> awkscript='
> # %s/\\$0/$0/;s/\\\\/\\/g
> 	BEGIN { ...; regex="^[[:space:]]*" varname "=" }

Won't work.

You've replaced my usage of an embedded here-document (which performs
parameter expansion) with single-quote assignment (which does not
perform parameter expansion).

The point of failure here is that you've taken $varname (which is a
positional parameter passed to the shell script function) and turned it
into an awk variable (which is never assigned).

So right off the bat, your awk script will fail because regex will be
assigned a value of:

^[[:space:]]*=

Now, again, you can get around this problem by using a compound string,
like so:

awkscript='
BEGIN { ...; regex="^[[:space:]]*'"$varname"'=" }

But that's not as readable because it requires you to mentally switch
from reading awk code to reading shell, back to awk in the same line.
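
The failure mode described above is easy to reproduce in isolation. In this sketch the value of `varname` is hypothetical, and the awk program is reduced to the single assignment under discussion. The second call uses awk's -v option, which is one way to assign the variable from the shell.

```shell
#!/bin/sh
# Hypothetical value standing in for the function's first argument.
varname=hostname

# varname is never assigned inside awk, so it concatenates as "".
unset_regex=$( awk 'BEGIN { regex = "^[[:space:]]*" varname "="; print regex }' )

# Assigning it with -v gives awk the shell's value.
set_regex=$( awk -v varname="$varname" \
	'BEGIN { regex = "^[[:space:]]*" varname "="; print regex }' )

printf '%s\n' "$unset_regex" "$set_regex"
```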


> ...
> 	if ( ! match($0, regex) ) { print; next }

Fail due to bad regex value (see prior comments)


> ...
> 	if ( t1 ~ /[\'\$\\]/ )

Since we're picking nits:
- the backslash before the apostrophe is not needed.


> ...
> 	else if ( t1 == apos ) {

apos needs to be a shell-expanded parameter/variable, otherwise you'll
be forced to declare apos in the awkscript, which itself (without using
compound strings) will force vim syntax highlighting to break (something
that's not really important for the awkscript itself, but it's rather
disconcerting to see entire oceans of miscolored text AFTER the embedded
here-document).


> 		sub("^" apos "[^" apos "]*", "", value)
> 		if ( length(value) == 0 ) t2 = ""
> 		sub("^" apos, "", value)

Same as above.


> ...
> 	else if ( t1 == "\"" ) {
> 		sub(/^"[^"]*/, "", value)
> 		if ( length(value) == 0 ) t2 = ""
> 		sub(/^"/, "", value)

$quot was used to protect the vim syntax highlighting from stray marks.
This block breaks that whereas my block preserved highlighting (using
vim version 7.2.411 here).


> ...
> 		t1 = t2 = "\""

Again, a stray mark. The reason is that a backslash does not escape a
quotation mark within a here-document, so syntax highlighting sees
three quotes, _not_ an escaped quote surrounded by two double-quotes.


> ...
> 		else if ( t1 ~ /[[:space:]];#]/ )
> # parentheses aren't needed here, or wrap them as before
> 			t1 = t2 = "\""

Huh?

My code:

else if ( t1 ~ /[^[:space:];#]/ ) {
	t1 = t2 = "$bquot"
	sub(/^[^[:space:]]*/, "", value)
}

Your suggested replacement (???):

else if ( t1 ~ /[[:space:]];#]/ )
	t1 = t2 = "\""

*cough*

You've removed the "NOT" operator (^) from my character class expansion,
removed the substitution line, and killed the braces. This code does not
do the same thing.

Either that, or you were attempting to combine the detection of the non-
quoted value-assignment and the null-assignment. This is wrong. These
cannot be combined or else the results differ.


> ...
> 	printf "%s%c%s%c%s\n", substr(\$0, 0, matchlen), \
> 		t1, awk_new_value, t2, value
> '

Just as before with $regex (which had $varname which is $1 which is the
positional parameter passed to the shell function), this will fail
because awk_new_value is not defined by awk.

Further, you've forgotten to convert \$0 to $0 (considering that you've
switched the context from an embedded here-document which performs
parameter expansion -- hence the escape of the dollar-sign -- versus a
single-quoted assignment which does _NOT_ perform parameter expansion --
and hence no need for the preceding back-slash).


>
> # ... | ... doesn't need a final \ when wrapped after the |
> local awk_new_value="$( echo "$new_value" |
> 	awk '{ gsub(/\\/, "\\\\"); gsub(/"/, "\\\""); print }' )"

Wrong. Fail. And here's why...

You are correct that a $( ... ) block can traverse newlines.

However, what $( ... ) functionally performs is a sub-shell. Each line
within the $( ... ) syntax is taken as a single-line of shell to be
executed.

Therefore, by deleting the back-slash at the end of the line, you've
turned one statement into two.

One statement:

echo "$new_value" | \
	awk '{ gsub(/\\/, "\\\\"); gsub(/"/, "\\\""); print }'

(also written as -- if you have the guts to stomach line-wrapping given
ts=8 indentation within the script)

echo "$new_value" | awk '{ gsub(/\\/, "\\\\"); gsub(/"/, "\\\""); print }'

However, you've turned the above into two statements:

First statement:
	echo "$new_value" |
Second statement:
	awk '{ gsub(/\\/, "\\\\"); gsub(/"/, "\\\""); print }'

The first statement will cause a shell syntax error within the sub-
shell.


> ...
> # you missed the " here
> 	new_contents=$( tail -r "$file" 2> /dev/null )

No, I did not. And here's why...

Many people are confused on this issue. Let me clarify.

When you make an assignment to a variable in the bare name-space using
the $( ... ) or ` ... ` syntaxes, quotes are optional.

When you make an assignment to a variable in the positional argument
name-space using the $( ... ) or ` ... ` syntaxes, quotes are _required_
(if you want to preserve spaces).

Valid Examples:

# bare name-space
foo=$( echo "   123   456   ")
echo "'$foo'"
# produces: '   123   456   '

# bare name-space
foo=` echo "   123   456   " `
echo "'$foo'"
# produces: '   123   456   '

# positional argument name-space
env foo="$( echo "   123   456   " )"
echo "'$foo'"
# produces: '   123   456   '

# positional argument name-space (within a function)
local foo="$( echo "   123   456   " )"
echo "'$foo'"
# produces: '   123   456   '

# positional argument name-space
[ "$( echo "   123   456   " )" = "xxx" ]
# works fine

Invalid Examples:

# positional argument name-space
env foo=$( echo "   123   456   " )
# produces: env: 123: No such file or directory

# positional argument name-space (within a function)
local foo=$( echo "   123   456   " )
# produces: local: 123: bad variable name

# positional argument name-space
[ $( echo "   123   456   " ) = "xxx" ]
# produces: [: 123: unexpected operator
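
The difference between the two name-spaces can also be seen by counting arguments. This is a sketch only: `count_args` is a hypothetical helper, the default IFS is assumed, and printf stands in for echo so that the result is the same under other shells.

```shell
#!/bin/sh
# Helper (hypothetical) that reports how many arguments it received.
count_args() { echo "$#"; }

# Bare assignment: no field splitting, so the spaces survive unquoted.
foo=$( printf '%s' "   123   456   " )

# As a command argument: unquoted splits into fields, quoted stays whole.
unquoted=$( count_args $( printf '%s' "   123   456   " ) )
quoted=$( count_args "$( printf '%s' "   123   456   " )" )

printf "'%s' %s %s\n" "$foo" "$unquoted" "$quoted"
```

The unquoted substitution arrives as two arguments ("123" and "456"), which is what trips up env(1) and test(1) in the invalid examples above.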


> # you may want to use printf "%s" "$new_contents" instead of echo
> # to avoid \ sequences interpretation if any
> 	new_contents=$( echo "$new_contents" |
> 			awk -v varname="$varname" -v apos="'" \
> 			    -v new_value="$new_value" "$awkscript")

No.

You are confused about the order in which the shell performs each
different word expansion.

Expansion of escape sequences is not performed on expanded parameters.

Here's proof:

#!/bin/sh
foo='123\456'
echo "'$foo'"
# produces: '123\456'

Here's the one-liner proof (first with compound strings):

/bin/sh -c 'foo='"'"'123\456'"'"'; echo "'"'"'$foo'"'"'"'

again (with alternate-form compound strings):

/bin/sh -c "foo='"'123\456'"'; echo "'"'"'"'$foo'"'"'"'

again (without compound strings; with escapes):

/bin/sh -c "foo='123\\456'; echo \"'\$foo'\""

again (with compound strings and escapes):

/bin/sh -c "foo='"'123\456'"'; echo \"'\$foo'\""

again (simpler -- as quotes are optional on second part)

/bin/sh -c "foo='123\\456'; echo \$foo"

again:

/bin/sh -c 'foo='"'"'123\456'"'"'; echo $foo'


All of the above produce the same output (well, almost, the last two
lack surrounding apostrophes -- which were themselves illustrative of
the fact that parameter expansion within double-quotes, even with
surrounding text, also lacks escape-sequence expansion) and serve to
illustrate that it is a common misconception that escape-sequences
within variables need imply that one needs to use printf instead of
echo. Rather, the truth is that one never needs to escape the escape-
sequences within a variable unless the command-line that the parameter
was expanded in is itself then re-evaluated (e.g. foo='\\' && eval echo
$foo).

It is perfectly safe to pass a variable containing escape-sequences,
newlines, spaces, variables, etc. to echo.

Just make sure that the variable is quoted (and even then,
the only thing you lose by not having it quoted is that multiple spaces
will become one -- because rather than appearing as one argument, it
will appear as multiple arguments and echo cannot conceivably preserve
spaces between arguments because it is the shell that obliterates the
spaces between positional parameters, not echo).
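
Both points, that expansion alone does not re-process backslashes and that unquoted expansion collapses runs of spaces, can be checked with a short sketch. The sample values are made up, and printf is used for the backslash cases only so that the sketch behaves the same under shells other than FreeBSD's /bin/sh.

```shell
#!/bin/sh
foo='a\\b'          # two literal backslashes
spaced='x   y'      # three spaces

# Expanding a parameter does not re-process its backslashes.
plain=$( printf '%s\n' "$foo" )

# eval parses the expanded text a second time, so \\ collapses to \.
reeval=$( eval "printf '%s\n' $foo" )

# Unquoted, the value is split into two arguments; echo joins with one space.
collapsed=$( echo $spaced )
kept=$( echo "$spaced" )

printf '%s\n' "$plain" "$reeval" "$collapsed" "$kept"
```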


>
> of course, same remarks about the later awk script :-)

Which would be equally faulty. The above remarks apply just as well to
your remarks about the later awk script. :-)

>
> also, %s|/bin/sh|$_PATH_BSHELL| && _PATH_BSHELL=/bin/sh

What could this possibly buy you? Other than increased CPU cycles caused
by foisting further expansions upon the runtime execution?

This script is coded for a single target: FreeBSD-CURRENT

Abstracting the shell in the above manner gives you nothing, though I
understand why you'd want it.

Alternatively, if you really desired the ability to switch the shell,
you'd do something like:

_SH=$SHELL
...
$_SH ...

This would allow you two nice things...

1. If someone modifies your script to change the invocation line:

from:
#!/bin/sh
to:
#!/usr/bin/env sh
or:
#!/usr/bin/env bash
or:
#!/usr/local/bin/bash
or:
#!/bin/bash
or:
#!/usr/bin/bash
etc...

_SH will inherit the correct value from $SHELL and use the same script
interpreter that is interpreting the script itself.

2. If someone modifies your script to change the _SH line:

from:
_SH=$SHELL
to:
_SH=/bin/sh
or:
_SH=/usr/bin/bash
etc...

The script can use a different sub-interpreter than the invocation
interpreter.

However, I will use neither for two very good reasons:

a. when I do call /bin/sh in certain places, I need to pass specific
command-line options which may be shell-specific, and thus it should not
be considered allowable for some user to come along and change the
following:

/bin/sh -n /etc/defaults/rc.conf

to use some other shell because some other shell (be it inheritance of
the invocation shell or otherwise) may not support the `-n' flag (which
btw, checks syntax of a script without executing it).

b. If /bin/sh doesn't exist, you've got real problems. Not to mention,
it's likely that the script itself will not execute considering that the
invocation line is itself: #!/bin/sh
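
For readers unfamiliar with the `-n' flag mentioned in (a), here is a minimal sketch of a syntax check that executes nothing. The temporary file and its contents are made up for the demonstration; the thread's own example checks /etc/defaults/rc.conf instead.

```shell
#!/bin/sh
# Temporary file for the sketch only.
tmp=$( mktemp ) || exit 1

# A well-formed assignment passes the check.
printf '%s\n' 'hostname="example"' > "$tmp"
if /bin/sh -n "$tmp" 2> /dev/null; then good=ok; else good=bad; fi

# An unterminated if-statement is a syntax error; -n reports it without
# running anything.
printf '%s\n' 'if true; then' > "$tmp"
if /bin/sh -n "$tmp" 2> /dev/null; then broken=ok; else broken=bad; fi

rm -f "$tmp"
printf '%s %s\n' "$good" "$broken"
```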

If you truly want to write shell agnostic code, you can, and I have.
This involves using env(1) to find your shell by-name rather than by-
path (and env(1) is always at /usr/bin/env on every POSIX-compliant OS
including BSD*, SunOS, *BSD, Solaris, Linux, Cygwin, and Mac OS X). It
also involves using a dependency-calculating structure much like the
first iterations of this script (see the beginnings of this thread over
on freebsd-hackers@ mailing list archives under the "sysconf" thread).

However, as I'll point out again, this script is coded for a single
target (FreeBSD-Current) and thus will not employ such abstractions so
commonly associated with platform agnosticism and/or embedded-OS
programming. In other words, we will rely on things that exist in the
base structure of FreeBSD and make no qualms about directly referencing
pathnames that should exist.

And in mentioning this, here's the current dependency list:

# Dependencies (sorted alphabetically):
#
#    awk(1)    cat(1)     chmod(1)   chown(8)    chroot(8)   env(1)
#    grep(1)   jexec(8)   jls(8)*    mktemp(1)   mv(1)       rm(1)
#    sh(1)     stat(1)    tail(1)
#
# *optional

Last, but not least...

Some pathnames have been hard-coded for security purposes. For example,
in the script, you will find three instances where PATH expansion is not
trusted and full-pathnames are used:

/bin/sh (6 occurrences)
/usr/sbin/jexec (2 occurrences)
/usr/sbin/chroot (1 occurrence)

The only reason I can possibly see to assign these full paths to
variables would be if we were going to target another OS which puts
these in different places. However, as it currently stands, this script
targets FreeBSD, and all versions of FreeBSD agree on the location of
these utilities.

If it is discovered that this script is even usable on another BSD-style
OS that uses the rc.conf(5) files in a similar manner (that
is, /etc/defaults/rc.conf must exist and must declare the
source_rc_confs() function and must also define the rc_conf_files
variable), then I will consider localizing the paths to these
executables to variables IF (and only IF) said target platform uses
different pathnames. However, as it currently stands, even MidnightBSD
(which has taken this script into its base -- Thanks Lucas Holt!) has
these executables right where I expect them to be.



>
> PS : for non french user :
> %s/quot/dquot/;%s/apos/squot/;s/bquotquot/bquot/

Might you have instead meant:

%s/[^b]?quot/dquot/g
%s/apos/squot/g

Not sure where you got the bquotquot (not in my code).

NOTE: The leading '%' indicates to me that perhaps you meant this as a
vi/vim find/replace command (no '%' would instead indicate sed(1) or
perl(1)). You cannot perform multiple '%s///' operators separated by a
semi-colon. You would get the error "E488: Trailing characters" in vim
and the error "Usage: [line [,line]] s [[/;]RE[/;]repl[/;] [cgr] [count]
[#lp]]." in vi (but I digress from picking nits)



>
> Regards,
>
> Cyrille Lefevre
-- 
Cheers,
Devin Teske

-> CONTACT INFORMATION <-
Business Solutions Consultant II
FIS - fisglobal.com
510-735-5650 Mobile
510-621-2038 Office
510-621-2020 Office Fax
909-477-4578 Home/Fax
devin.teske@fisglobal.com

-> LEGAL DISCLAIMER <-
This message  contains confidential  and proprietary  information
of the sender,  and is intended only for the person(s) to whom it
is addressed. Any use, distribution, copying or disclosure by any
other person  is strictly prohibited.  If you have  received this
message in error,  please notify  the e-mail sender  immediately,
and delete the original message without making a copy.

-> END TRANSMISSION <-



